html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/557#issuecomment-751265292,https://api.github.com/repos/pydata/xarray/issues/557,751265292,MDEyOklzc3VlQ29tbWVudDc1MTI2NTI5Mg==,26384082,2020-12-25T15:49:52Z,2020-12-25T15:49:52Z,NONE,"In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity. If this issue remains relevant, please comment here or remove the `stale` label; otherwise it will be marked as closed automatically.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,104484316
https://github.com/pydata/xarray/issues/557#issuecomment-427335183,https://api.github.com/repos/pydata/xarray/issues/557,427335183,MDEyOklzc3VlQ29tbWVudDQyNzMzNTE4Mw==,1217238,2018-10-05T11:33:45Z,2018-10-05T11:33:45Z,MEMBER,`dataset.time.dt.year.isin(elnino_years)` should work already with.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,104484316
https://github.com/pydata/xarray/issues/557#issuecomment-427291294,https://api.github.com/repos/pydata/xarray/issues/557,427291294,MDEyOklzc3VlQ29tbWVudDQyNzI5MTI5NA==,6883049,2018-10-05T08:46:12Z,2018-10-05T08:46:12Z,CONTRIBUTOR,"I thought this issue was long forgotten ; ) The fact is that today I still call this seltime function very often in my code, so I am glad to see it back, thank you. I like the .isin(elnino_years) syntax, and I see that it is consistent with pandas. Something like dataset.time.year.isin(elnino_years) would be very nice too; right now it is just a ""to_index()"" away, as this works: dataset.time.to_index().year.isin(elnino_years). 
However, I am not sure how this is related to multidimensional indexing, as time is only one dimension.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,104484316
https://github.com/pydata/xarray/issues/557#issuecomment-427279439,https://api.github.com/repos/pydata/xarray/issues/557,427279439,MDEyOklzc3VlQ29tbWVudDQyNzI3OTQzOQ==,1217238,2018-10-05T07:59:47Z,2018-10-05T08:00:28Z,MEMBER,"I don’t think there’s an easy way to do this with vectorized indexing, but if we supported multidimensional indexing with boolean keys as proposed in https://github.com/pydata/xarray/issues/1887 (equivalent to where with drop=True) we could write `pr_dataset[pr_dataset['time.year'].isin(elnino_years)]`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,104484316
https://github.com/pydata/xarray/issues/557#issuecomment-427078606,https://api.github.com/repos/pydata/xarray/issues/557,427078606,MDEyOklzc3VlQ29tbWVudDQyNzA3ODYwNg==,1197350,2018-10-04T16:12:43Z,2018-10-04T16:12:43Z,MEMBER,Is there a way to do this easily now with vectorized indexing?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,104484316
https://github.com/pydata/xarray/issues/557#issuecomment-137478562,https://api.github.com/repos/pydata/xarray/issues/557,137478562,MDEyOklzc3VlQ29tbWVudDEzNzQ3ODU2Mg==,6883049,2015-09-03T15:05:35Z,2015-09-04T09:49:54Z,CONTRIBUTOR,"I agree with your arguments against creating too many datetime specific methods. However, I am not convinced about using .sel with 'time.year'; it looks a bit hackish to me. Maybe all ""CDO methods"" can be integrated in one single ""seltimes"" method. Please look at the function below. 
The time_coord argument lets you choose the time coordinate that you want to filter, and by using numpy.logical_and it is possible to do the whole operation in one single function call. It could be turned into a method. What do you think?

``` python
%matplotlib inline
import xray
import numpy as np
from matplotlib import pyplot as plt

ifile = ""HadCRUT.4.3.0.0.median.nc""
ixd = xray.open_dataset(ifile)
```

You can download the example file from here http://www.metoffice.gov.uk/hadobs/hadcrut4/data/4.3.0.0/gridded_fields/HadCRUT.4.3.0.0.median_netcdf.zip

Now comes the function.

``` python
def seltime(ixd, time_coord, **kwargs):
    """"""
    Select time steps by groups of years, months, etc.

    Parameters
    ----------
    ixd : xray.Dataset or xray.DataArray, input dataset.
        Will be ""self"" if the function is turned into a method.
    time_coord : str, name of the time coordinate to use.
        Needed to avoid ambiguities.
    **kwargs : String or iterable.
        Currently year, month, day and hour are supported.

    Returns
    -------
    xray.Dataset or xray.DataArray filtered by the constraints
    defined by the kwargs.
    """"""
    ntimes = len(ixd.coords[time_coord])
    # Start with every time step selected.
    time_mask = np.repeat(True, ntimes)
    for time_unit, time_values in kwargs.items():
        if time_unit not in (""year"", ""month"", ""day"", ""hour""):
            raise KeyError(time_unit)
        # Name of the virtual variable, e.g. ""time.year"".
        time_str = ""{}.{}"".format(time_coord, time_unit)
        time_mask_temp = np.in1d(ixd[time_str], time_values)
        # Keep only the steps that satisfy every constraint so far.
        time_mask = np.logical_and(time_mask, time_mask_temp)
    oxd = ixd.sel(**{time_coord: time_mask})
    return oxd
```

It works, and it looks fast enough. 
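To see the mask logic without downloading the file, here is a minimal sketch of the same idea using only the Python standard library. It is not xray code: `select_times` and the synthetic monthly `times` list are hypothetical names made up for illustration, and plain lists stand in for the numpy mask.

```python
from datetime import datetime


def select_times(times, **kwargs):
    # Hypothetical stand-in for seltime: keep the datetimes whose
    # year/month/day/hour all fall in the requested values.
    allowed = ('year', 'month', 'day', 'hour')
    mask = [True] * len(times)
    for unit, values in kwargs.items():
        if unit not in allowed:
            raise KeyError(unit)
        # Accept a single value or an iterable of values.
        wanted = {values} if isinstance(values, int) else set(values)
        # Logical AND of the running mask with this unit's membership test.
        mask = [keep and getattr(t, unit) in wanted
                for keep, t in zip(mask, times)]
    return [t for t, keep in zip(times, mask) if keep]


# Made-up monthly time axis covering 1957 to 1959.
times = [datetime(y, m, 16) for y in (1957, 1958, 1959) for m in range(1, 13)]
selected = select_times(times, year=(1958,), month=(1, 2, 3, 4))
print([t.strftime('%Y-%m') for t in selected])
# prints ['1958-01', '1958-02', '1958-03', '1958-04']
```

As in seltime, each keyword narrows the selection, so only the time steps that match every constraint survive. On the real dataset, the function above gives:
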
``` python
elnino_years = (1958, 1973, 1983, 1987, 1998)
months = (1, 2, 3, 4)
# %timeit does not keep variables assigned inside the timed statement,
# so time the call first and then run it once more to keep the result.
%timeit seltime(ixd, ""time"", year=elnino_years, month=months)
elnino_xd = seltime(ixd, ""time"", year=elnino_years, month=months)
elnino_xd.time
```

```
1000 loops, best of 3: 1.05 ms per loop

array(['1958-01-16T12:00:00.000000000Z', '1958-02-15T00:00:00.000000000Z',
       '1958-03-16T12:00:00.000000000Z', '1958-04-16T00:00:00.000000000Z',
       '1973-01-16T13:00:00.000000000+0100', '1973-02-15T01:00:00.000000000+0100',
       '1973-03-16T13:00:00.000000000+0100', '1973-04-16T01:00:00.000000000+0100',
       '1983-01-16T13:00:00.000000000+0100', '1983-02-15T01:00:00.000000000+0100',
       '1983-03-16T13:00:00.000000000+0100', '1983-04-16T02:00:00.000000000+0200',
       '1987-01-16T13:00:00.000000000+0100', '1987-02-15T01:00:00.000000000+0100',
       '1987-03-16T13:00:00.000000000+0100', '1987-04-16T02:00:00.000000000+0200',
       '1998-01-16T13:00:00.000000000+0100', '1998-02-15T01:00:00.000000000+0100',
       '1998-03-16T13:00:00.000000000+0100', '1998-04-16T02:00:00.000000000+0200'], dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 1958-01-16T12:00:00 1958-02-15 ...
Attributes:
    start_month: 1
    start_year: 1850
    end_month: 4
    end_year: 2015
    long_name: time
    standard_name: time
    axis: T
```

If we average in time we can see the warm tongue in the Pacific.

``` python
plt.figure(figsize=(12, 5))
elnino_xd.temperature_anomaly.mean(""time"").plot()
plt.gca().invert_yaxis()
```

![output_3_0](https://cloud.githubusercontent.com/assets/6883049/9661527/62021132-525b-11e5-8d99-d4b68cbde03a.png)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,104484316
https://github.com/pydata/xarray/issues/557#issuecomment-137257727,https://api.github.com/repos/pydata/xarray/issues/557,137257727,MDEyOklzc3VlQ29tbWVudDEzNzI1NzcyNw==,2443309,2015-09-02T21:57:54Z,2015-09-02T21:57:54Z,MEMBER,"I'd also like to see this come into being. 
My two cents on the syntax is that your second idea is best:

``` Python
pr_dataset.sel('time.year', elnino_years)
```

I don't care much for option 3 since the variable name is being used as an attribute (what if my time variable is called ""Times""?).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,104484316
https://github.com/pydata/xarray/issues/557#issuecomment-137182504,https://api.github.com/repos/pydata/xarray/issues/557,137182504,MDEyOklzc3VlQ29tbWVudDEzNzE4MjUwNA==,1217238,2015-09-02T17:37:56Z,2015-09-02T17:37:56Z,MEMBER,"Currently, the way to do this is to create a boolean indexer, with something like the following:

```
pr_dataset.sel(time=np.in1d(pr_dataset['time.year'], elnino_years))
```

I agree that this is overly verbose and we can come up with something better. I'm not quite happy with `selyear`, though:

- It's ambiguous which datetime variable the year refers to (there can be more than one time variable on some datasets).
- I'm also not a big fan of adding a bunch of new API methods that are datetime specific -- it creates a lot of noise (pandas has an issue with this).

Something like `pr_dataset.sel(year=elnino_years)` would be the ideal fix for this second concern (we've discussed this in another issue, can't remember which one now), but it's still ambiguous which time variable it refers to.

So, some other possible ways to spell this:

1. `pr_dataset.sel('time', year=elnino_years)`
2. `pr_dataset.sel('time.year', elnino_years)`
3. `pr_dataset.sel.time.year(elnino_years)`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,104484316
https://github.com/pydata/xarray/issues/557#issuecomment-137134794,https://api.github.com/repos/pydata/xarray/issues/557,137134794,MDEyOklzc3VlQ29tbWVudDEzNzEzNDc5NA==,5356122,2015-09-02T15:32:17Z,2015-09-02T15:32:17Z,MEMBER,"Just curious, how would we currently do this?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,104484316