html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1844#issuecomment-1163265245,https://api.github.com/repos/pydata/xarray/issues/1844,1163265245,IC_kwDOAMm_X85FVgTd,2448579,2022-06-22T15:30:44Z,2022-06-22T15:30:44Z,MEMBER,"You can now do `month_day_str = da.time.dt.strftime(""%m-%d"")`

See https://strftime.org/ for more options","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,290023410
https://github.com/pydata/xarray/issues/1844#issuecomment-418191318,https://api.github.com/repos/pydata/xarray/issues/1844,418191318,MDEyOklzc3VlQ29tbWVudDQxODE5MTMxOA==,6628425,2018-09-03T20:51:37Z,2018-09-03T20:55:08Z,MEMBER,"Building on the above example, if you're OK with using a coordinate of strings, the following might be a little simpler way of defining the labels to use for grouping (this is perhaps closer to a single attribute solution):

```
In [14]: month_day_str = xr.DataArray(da.indexes['time'].strftime('%m-%d'), coords=da.coords,
    ...:                              name='month_day_str')
    ...:

In [15]: da.groupby(month_day_str).mean('time')
Out[15]:
array([2., 3.])
Coordinates:
  * month_day_str  (month_day_str) object '01-01' '03-01'
```

Note #2090 / #2144 would make this more straightforward.","{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,290023410
https://github.com/pydata/xarray/issues/1844#issuecomment-418188977,https://api.github.com/repos/pydata/xarray/issues/1844,418188977,MDEyOklzc3VlQ29tbWVudDQxODE4ODk3Nw==,6628425,2018-09-03T20:30:45Z,2018-09-03T20:30:45Z,MEMBER,"No worries @chiaral; I agree on the xarray side this isn't so well documented (you have to follow the link to the pandas description of the [datetime components](http://pandas.pydata.org/pandas-docs/stable/api.html#time-date-components)).
Unfortunately there is not a simple attribute for grouping by matching month and day. It is possible to define your own vector of integers for this purpose, however. Perhaps you've already found a workaround, but just in case, here is one way to define a ""modified ordinal day"" that you can use in a `groupby` call:

```
In [1]: import xarray as xr

In [2]: from datetime import datetime

In [3]: dates = [datetime(1999, 1, 1), datetime(1999, 3, 1),
   ...:          datetime(2000, 1, 1), datetime(2000, 3, 1)]
   ...:

In [4]: da = xr.DataArray([1, 2, 3, 4], coords=[dates], dims=['time'])

In [5]: not_leap_year = xr.DataArray(~da.indexes['time'].is_leap_year, coords=da.coords)

In [6]: march_or_later = da.time.dt.month >= 3

In [7]: ordinal_day = da.time.dt.dayofyear

In [8]: modified_ordinal_day = ordinal_day + (not_leap_year & march_or_later)

In [9]: modified_ordinal_day = modified_ordinal_day.rename('modified_ordinal_day')

In [10]: modified_ordinal_day
Out[10]:
array([ 1, 61,  1, 61])
Coordinates:
  * time     (time) datetime64[ns] 1999-01-01 1999-03-01 2000-01-01 2000-03-01

In [11]: da.groupby(modified_ordinal_day).mean('time')
Out[11]:
array([2., 3.])
Coordinates:
  * modified_ordinal_day  (modified_ordinal_day) int64 1 61
```

Note if we use the standard ordinal day we get three groups, because of the difference between non-leap and leap years:

```
In [12]: ordinal_day
Out[12]:
array([ 1, 60,  1, 61])
Coordinates:
  * time     (time) datetime64[ns] 1999-01-01 1999-03-01 2000-01-01 2000-03-01

In [13]: da.groupby(ordinal_day).mean('time')
Out[13]:
array([2., 2., 4.])
Coordinates:
  * dayofyear  (dayofyear) int64 1 60 61
```","{""total_count"": 5, ""+1"": 5, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,290023410
https://github.com/pydata/xarray/issues/1844#issuecomment-417855365,https://api.github.com/repos/pydata/xarray/issues/1844,417855365,MDEyOklzc3VlQ29tbWVudDQxNzg1NTM2NQ==,6628425,2018-09-01T12:09:25Z,2018-09-01T12:09:25Z,MEMBER,"@chiaral if I understand correctly, your data does use a standard calendar, but the issue is that you would like to group values based on matching month and day numbers (e.g. all January 1st's, all January 6th's, ..., all March 2nd's etc.) rather than matching ""days since December 31st of the preceding year,"" which is what the `dayofyear` attribute corresponds to. Is that right?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,290023410
https://github.com/pydata/xarray/issues/1844#issuecomment-417694660,https://api.github.com/repos/pydata/xarray/issues/1844,417694660,MDEyOklzc3VlQ29tbWVudDQxNzY5NDY2MA==,1217238,2018-08-31T15:09:56Z,2018-08-31T15:09:56Z,MEMBER,@chiaral You should take a look at CFTimeIndex which was designed specifically to solve this problem: http://xarray.pydata.org/en/stable/time-series.html#non-standard-calendars-and-dates-outside-the-timestamp-valid-range,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,290023410
https://github.com/pydata/xarray/issues/1844#issuecomment-359129344,https://api.github.com/repos/pydata/xarray/issues/1844,359129344,MDEyOklzc3VlQ29tbWVudDM1OTEyOTM0NA==,1217238,2018-01-20T00:49:33Z,2018-01-20T00:49:56Z,MEMBER,"You can do this in a single step with `xarray.apply_ufunc()`, which is a sort of more flexible/powerful interface to xarray's broadcasting arithmetic.
Extending the [toy weather example](http://xarray.pydata.org/en/stable/examples/weather-data.html) from the docs:

```python
import xarray as xr
import numpy as np
import pandas as pd
import seaborn as sns  # pandas aware plotting library

np.random.seed(123)

times = pd.date_range('2000-01-01', '2001-12-31', name='time')
annual_cycle = np.sin(2 * np.pi * (np.array(times.dayofyear) / 365.25 - 0.28))

base = 10 + 15 * annual_cycle.reshape(-1, 1)
tmin_values = base + 3 * np.random.randn(annual_cycle.size, 3)
tmax_values = base + 10 + 3 * np.random.randn(annual_cycle.size, 3)

ds = xr.Dataset({'tmin': (('time', 'location'), tmin_values),
                 'tmax': (('time', 'location'), tmax_values)},
                {'time': times, 'location': ['IA', 'IN', 'IL']})

# new code
ds_mean = ds.groupby('time.month').mean('time')
ds_std = ds.groupby('time.month').std('time')
xr.apply_ufunc(lambda x, m, s: (x - m) / s,
               ds.groupby('time.month'), ds_mean, ds_std)
```

The other way (about twice as slow) is to chain two calls to `groupby()`:

```python
(ds.groupby('time.month') - ds_mean).groupby('time.month') / ds_std
```

I'll mark this as a documentation issue in case anyone wants to add an example to the docs.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,290023410