id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
149130368,MDU6SXNzdWUxNDkxMzAzNjg=,830,"""Reverse"" groupby method for split/apply/combine",5629061,closed,0,,,5,2016-04-18T12:00:04Z,2020-10-04T16:06:58Z,2020-10-04T16:06:58Z,NONE,,,,"When dealing with high-dimensional data, algorithms often involve operations or aggregation on a particular dimension only, whilst keeping all other dimensions in the dataset. For example, I might know that I want to average all data along the time axis, and I'm indifferent to the other dimensions present, i.e. I want my algorithm to work whenever there is a time axis, and to be indifferent to the presence/lack of any other dimensions. Mapping this kind of implementation to xarray is awkward though because I can only use `groupby()` for the split/apply/combine operation. For example, in xarray I have to do this: ``` averages = dataarray.groupby([dimensions excluding time dimension]).apply(my_method_that_works_on_time_dimension) ``` instead of this (where `aggregate_over()` is my ""reverse"" groupby method): ``` averages = dataarray.aggregate_over([time_dimension]).apply(my_method_that_works_on_time_dimension) ``` For the first example I have to do some extra work: I have to write additional code to fetch all the dimensions in the array, remove the time dimension from that list, and then use that list with groupby, in order to make my code depend on the time dimension only. It would be really helpful to add an `aggregate_over()` method (name TBD of course!) as an alternative to `groupby()` that automates this extra work. 
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/830/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
192325490,MDU6SXNzdWUxOTIzMjU0OTA=,1143,timedelta64[D] is always coerced to timedelta64[ns],5629061,closed,0,,,5,2016-11-29T16:11:53Z,2019-01-22T19:21:18Z,2019-01-22T19:21:18Z,NONE,,,,"Hi guys, the following snippets show the issue... ``` xarray.DataArray([1,2,3,4]).astype('timedelta64[D]') #output is """""" array([ 86400000000000, 172800000000000, 259200000000000, 345600000000000], dtype='timedelta64[ns]') Coordinates: * dim_0 (dim_0) int64 0 1 2 3 """""" ``` Compare this with Pandas: ``` pandas.Series([1,2,3,4]).astype('timedelta64[D]') #output is """""" 0 1 days 1 2 days 2 3 days 3 4 days dtype: timedelta64[D] """""" ``` This behaviour becomes more problematic when trying to convert from timedelta[ns] to e.g. days as ints: ``` xarray.DataArray(pandas.Series([1,2,3,4]).astype('timedelta64[D]')).astype(int) #output is """""" array([ 86400000000000, 172800000000000, 259200000000000, 345600000000000]) Coordinates: * dim_0 (dim_0) int64 0 1 2 3 """""" ``` Again contrast that with pandas: ``` pandas.Series([1,2,3,4]).astype('timedelta64[D]').astype(int) #output is """""" 0 1 1 2 2 3 3 4 dtype: int64 """""" ``` Other variations of timedelta e.g. timedelta64[s], timedelta64[W] etc suffer from the same problem. 
Thanks","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1143/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
207477701,MDU6SXNzdWUyMDc0Nzc3MDE=,1267,"""in"" operator does not work as expected on DataArray dimensions",5629061,closed,0,,2856429,2,2017-02-14T10:35:41Z,2018-10-28T17:56:17Z,2018-10-28T17:56:17Z,NONE,,,,"As an example I have a DataArray called ""my_dataarray"" that looks something like this: ``` array([1, 2, 3]) Coordinates: * Type (Type) object 'Type 1' 'Type 2' 'Type 3' ``` 'Type' is a dimension on my DataArray. Note that 'Type' is also a DataArray that looks like this: ``` OrderedDict([('Type', array(['Type 1', 'Type 2', 'Type 3'], dtype='object'))]) ``` Let's say I run: ``` 'Type 1' in my_dataarray.Type ``` The result is False, even though 'Type 1' is in the ""Type"" dimension. To get the result I was expecting I need to run: ``` 'Type 1' in my_dataarray.Type.values ``` Stepping through the code, the problematic line is here: https://github.com/pydata/xarray/blob/20ec32430fac63a8976699d9528b5fdc1cd4125d/xarray/core/dataarray.py#L487 The test used for `__contains__(self, key)` on the Type dimension is whether the key is in the `_coords` of Type. This is probably the right thing to do when the DataArray is used for storing data, but probably not what we want if the DataArray is being used as a dimension - it should instead check if 'Type 1' is in the *values* of Type? 
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1267/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
149136587,MDExOlB1bGxSZXF1ZXN0NjY4NDQ0OTc=,831,"Add a ""reverse groupby"" / ""aggregate over"" method",5629061,closed,0,,,1,2016-04-18T12:27:08Z,2016-05-26T01:55:11Z,2016-05-26T01:55:11Z,NONE,,0,pydata/xarray/pulls/831,"Implements #830 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/831/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull