issues: 144630996

id: 144630996
node_id: MDU6SXNzdWUxNDQ2MzA5OTY=
number: 810
title: correct DJF mean
user: 10194086
state: closed
locked: 0
comments: 4
created_at: 2016-03-30T15:36:42Z
updated_at: 2022-04-06T16:19:47Z
closed_at: 2016-05-04T12:56:30Z
author_association: MEMBER
assignee, milestone, active_lock_reason, draft, pull_request: (empty)

This started as a question; I am adding it here for reference. Comments are welcome.

There are several ways to calculate time series of seasonal data (starting from monthly or daily data):

```python
# load libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# Create Example Dataset
time = pd.date_range('2000.01.01', '2010.12.31', freq='M')
data = np.random.rand(*time.shape)
ds = xr.DataArray(data, coords=dict(time=time))

# (1) using resample
# (signature as of this issue; newer xarray spells it ds.resample(time='Q-FEB').mean())
ds_res = ds.resample('Q-FEB', 'time')
ds_res = ds_res.sel(time=ds_res['time.month'] == 2)
ds_res = ds_res.groupby('time.year').mean('time')

# (2) this is wrong
ds_season = ds.where(ds['time.season'] == 'DJF').groupby('time.year').mean('time')

# (3) using where and rolling
# mask other months with nan
ds_DJF = ds.where(ds['time.season'] == 'DJF')

# rolling mean -> only Jan is not nan
# however, we lose Jan/Feb in the first year and Dec in the last
ds_DJF = ds_DJF.rolling(min_periods=3, center=True, time=3).mean()

# make annual mean
ds_DJF = ds_DJF.groupby('time.year').mean('time')

ds_res.plot(marker='*')
ds_season.plot()
ds_DJF.plot()

plt.show()
```

(1) The first uses resample with 'Q-FEB' as the argument. This works fine. It does include the incomplete seasons, though: Jan/Feb of the first year, and Dec of the last year, which ends up in last year + 1. Whether this makes sense is debatable. One case where this does not work is when the data set covers, say, two regions and you want DJF for one and NovDecJan for the other.
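For reference, approach (1) can be sketched with the keyword form of resample that newer xarray versions use. This is only a sketch under assumptions: 'QS-DEC' (quarters starting in December) is swapped in for 'Q-FEB', so each DJF season is labelled by its 1 December instead of the end of February, and the names da, n_months and djf are made up for the example. Counting the months per quarter makes it possible to drop the incomplete seasons at both ends.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Monthly series, Jan 2000 .. Dec 2002. Each value is the month number,
# so the seasonal means are easy to check by hand.
time = pd.date_range("2000-01-01", "2002-12-01", freq="MS")
da = xr.DataArray(time.month.values.astype(float), coords={"time": time}, dims="time")

# Quarters starting in December: DJF, MAM, JJA, SON.
# Each quarter is labelled by its first day (assumption: 'QS-DEC' instead of 'Q-FEB').
seasonal = da.resample(time="QS-DEC").mean()
n_months = da.resample(time="QS-DEC").count()

# Keep only DJF quarters that were built from all three months.
is_djf = seasonal["time"].dt.month == 12
djf = seasonal.where(is_djf & (n_months == 3), drop=True)

print(djf)
```

With this labelling, the winter 2000/01 appears under December 2000; shifting the year label by one is a matter of convention.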

(2) Using 'time.season' together with groupby('time.year') is wrong, as it combines Jan, Feb and Dec from the same calendar year.
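A small made-up series makes the problem with (2) visible. In this sketch only the winter 2000/01 (Dec 2000 to Feb 2001) is set to 100, so a correct DJF mean for that winter would be 100; the names da and per_calendar_year are illustrative.

```python
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2000-01-01", "2002-12-01", freq="MS")

# 100 for Dec 2000, Jan 2001, Feb 2001 (one contiguous winter), 0 elsewhere.
winter = (time >= pd.Timestamp("2000-12-01")) & (time <= pd.Timestamp("2001-02-01"))
da = xr.DataArray(np.where(winter, 100.0, 0.0), coords={"time": time}, dims="time")

# Approach (2): mask to DJF months, then average per calendar year.
djf_only = da.where(da["time"].dt.season == "DJF")
per_calendar_year = djf_only.groupby("time.year").mean("time")

print(per_calendar_year)
```

Grouping by calendar year splits that one winter: Dec 2000 is averaged with Jan/Feb 2000, and Jan/Feb 2001 with Dec 2001, so no year reaches 100.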

(3) The third uses where and rolling, and you lose 'incomplete' seasons. If you replace ds.where(ds['time.season'] == 'DJF') with ds.groupby('time.month').where(summer_months), where summer_months is a boolean array, it also works for non-standard 'summers' (or seasons) across the globe.
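Approach (3) with a non-standard season might look like the sketch below. It assumes a Nov-Dec-Jan season and selects the months with dt.month.isin rather than the groupby('time.month').where(...) form mentioned above; the variable names are illustrative.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical non-standard season: Nov-Dec-Jan.
months = [11, 12, 1]

# Monthly series, Jan 2000 .. Dec 2002, with the month number as value.
time = pd.date_range("2000-01-01", "2002-12-01", freq="MS")
da = xr.DataArray(time.month.values.astype(float), coords={"time": time}, dims="time")

# Mask every month outside the season with NaN.
masked = da.where(da["time"].dt.month.isin(months))

# Centered 3-month window that requires all three values:
# only the window centered on December is complete.
seasonal = masked.rolling(time=3, center=True, min_periods=3).mean()

# Drop the NaN positions, then label each season by the year of its center month.
per_year = seasonal.dropna("time").groupby("time.year").mean("time")

print(per_year)
```

As in the original, the incomplete seasons at the start and end of the record drop out, here Jan 2000 and Nov/Dec 2002.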

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/810/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
