id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 98587746,MDU6SXNzdWU5ODU4Nzc0Ng==,508,Ignore missing variables when concatenating datasets?,1217238,closed,0,,,8,2015-08-02T06:03:57Z,2023-01-20T16:04:28Z,2023-01-20T16:04:28Z,MEMBER,,,,"Several users (@raj-kesavan, @richardotis, now myself) have wondered about how to concatenate xray Datasets with different variables. With the current `xray.concat`, you need to awkwardly create dummy variables filled with `NaN` in datasets that don't have them (or drop mismatched variables entirely). Neither of these are great options -- `concat` should have an option (the default?) to take care of this for the user. This would also be more consistent with `pd.concat`, which takes a more relaxed approach to matching dataframes with different variables (it does an outer join). ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/508/reactions"", ""total_count"": 6, ""+1"": 6, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 711626733,MDU6SXNzdWU3MTE2MjY3MzM=,4473,Wrap numpy-groupies to speed up Xarray's groupby aggregations,1217238,closed,0,,,8,2020-09-30T04:43:04Z,2022-05-15T02:38:29Z,2022-05-15T02:38:29Z,MEMBER,,,," **Is your feature request related to a problem? Please describe.** Xarray's groupby aggregations (e.g., `groupby(..).sum()`) are very slow compared to pandas, as described in https://github.com/pydata/xarray/issues/659. **Describe the solution you'd like** We could speed things up considerably (easily 100x) by wrapping the [numpy-groupies](https://github.com/ml31415/numpy-groupies) package. **Additional context** One challenge is how to handle dask arrays (and other duck arrays). 
In some cases it might make sense to apply the numpy-groupies function (using apply_ufunc), but in other cases it might be better to stick with the current indexing + concatenate solution. We could either pick some simple heuristics for choosing the algorithm to use on dask arrays, or could just stick with the current algorithm for now. In particular, it might make sense to stick with the current algorithm if there are many chunks in the arrays to aggregate along the ""grouped"" dimension (depending on the size of the unique group values).","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4473/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 269700511,MDU6SXNzdWUyNjk3MDA1MTE=,1672,Append along an unlimited dimension to an existing netCDF file,1217238,open,0,,,8,2017-10-30T18:09:54Z,2020-11-29T17:35:04Z,,MEMBER,,,,"This would be a nice feature to have for some use cases, e.g., for writing simulation time-steps: https://stackoverflow.com/questions/46951981/create-and-write-xarray-dataarray-to-netcdf-in-chunks It should be relatively straightforward to add, too, building on support for writing files with unlimited dimensions. User-facing API would probably be a new keyword argument to `to_netcdf()`, e.g., `extend='time'` to indicate the extended dimension.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1672/reactions"", ""total_count"": 21, ""+1"": 21, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 169274464,MDU6SXNzdWUxNjkyNzQ0NjQ=,939,Consider how to deal with the proliferation of decoder options on open_dataset,1217238,closed,0,,,8,2016-08-04T01:57:26Z,2020-10-06T15:39:11Z,2020-10-06T15:39:11Z,MEMBER,,,,"There are already lots of keyword arguments, and users want even more! 
(#843) Maybe we should use some sort of object to encapsulate desired options? ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/939/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 187625917,MDExOlB1bGxSZXF1ZXN0OTI1MjQzMjg=,1087,WIP: New DataStore / Encoder / Decoder API for review,1217238,closed,0,,,8,2016-11-07T05:02:04Z,2020-04-17T18:37:45Z,2020-04-17T18:37:45Z,MEMBER,,0,pydata/xarray/pulls/1087,"The goal here is to make something extensible that we can live with for quite some time, and to clean up the internals of xarray's backend interface. Most of these are analogues of existing xarray classes with a cleaned up interface. I have not yet worried about backwards compatibility or tests -- I would appreciate feedback on the approach here. Several parts of the logic exist for the sake of dask. I've included the word ""dask"" in comments to facilitate inspection by mrocklin. CC @rabernat, @pwolfram, @jhamman, @mrocklin -- for review CC @mcgibbon, @JoyMonteiro -- this is relevant to our discussion today about adding support for appending to netCDF files. Don't let this stop you from getting started on that with the existing interface, though.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1087/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 454168102,MDU6SXNzdWU0NTQxNjgxMDI=,3009,Xarray test suite failing with dask-master,1217238,closed,0,,,8,2019-06-10T13:21:50Z,2019-06-23T16:49:23Z,2019-06-23T16:49:23Z,MEMBER,,,,"There are a [wide variety of failures](https://travis-ci.org/pydata/xarray/jobs/543540695), mostly related to backends and indexing, e.g., `AttributeError: 'tuple' object has no attribute 'tuple'`. 
By the looks of it, something is going wrong with xarray's internal `ExplicitIndexer` objects, which are getting converted into something else. I'm pretty sure this is due to the recent merge of the `Array._meta` pull request: https://github.com/dask/dask/pull/4543 There are 81 test failures, but my guess is that there are probably only a handful (at most) of underlying causes.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3009/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 290320242,MDExOlB1bGxSZXF1ZXN0MTY0MjAzNzAz,1847,Use getitem_with_mask in reindex_variables,1217238,closed,0,,,8,2018-01-22T00:19:20Z,2018-05-23T21:13:42Z,2018-02-14T13:11:48Z,MEMBER,,0,pydata/xarray/pulls/1847,"This is an internal refactor of reindexing/alignment to use `Variable.getitem_with_mask`. As noted back in https://github.com/pydata/xarray/pull/1751#issuecomment-348380756, there is a nice improvement for alignment with dask (~100x improvement) but we are slower in several cases with NumPy (2-3x). 
ASV results (smaller ratio is better):

```
      before      after  ratio
  [e31cf43e] [5830f2f8]
      4.85ms     4.86ms   1.00  reindexing.Reindex.time_1d_coarse
     98.15ms    98.97ms   1.01  reindexing.Reindex.time_1d_fine_all_found
+    96.88ms   210.71ms   2.17  reindexing.Reindex.time_1d_fine_some_missing
     24.47ms    25.18ms   1.03  reindexing.Reindex.time_2d_coarse
    433.26ms   437.19ms   1.01  reindexing.Reindex.time_2d_fine_all_found
+   245.20ms   711.36ms   2.90  reindexing.Reindex.time_2d_fine_some_missing
-    23.78ms    12.79ms   0.54  reindexing.Reindex.time_reindex_coarse
-   409.89ms   230.75ms   0.56  reindexing.Reindex.time_reindex_fine_all_found
+   233.41ms   369.48ms   1.58  reindexing.Reindex.time_reindex_fine_some_missing
     14.39ms    14.20ms   0.99  reindexing.ReindexDask.time_1d_coarse
    184.07ms   182.64ms   0.99  reindexing.ReindexDask.time_1d_fine_all_found
-      1.44s   277.03ms   0.19  reindexing.ReindexDask.time_1d_fine_some_missing
     95.49ms    94.49ms   0.99  reindexing.ReindexDask.time_2d_coarse
    910.11ms   916.47ms   1.01  reindexing.ReindexDask.time_2d_fine_all_found
      failed   997.33ms    n/a  reindexing.ReindexDask.time_2d_fine_some_missing
```

Note that `reindexing.ReindexDask.time_2d_fine_some_missing` timed out previously, which I think indicates that it took longer than 60 seconds. 
- [x] Tests passed (for all non-documentation changes) - [x] Passes ``git diff upstream/master **/*py | flake8 --diff`` (remove if you did not edit any Python files) - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later) ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1847/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 171077425,MDU6SXNzdWUxNzEwNzc0MjU=,967,sortby() or sort_index() method for Dataset and DataArray,1217238,closed,0,,741199,8,2016-08-14T20:40:13Z,2017-05-12T00:29:12Z,2017-05-12T00:29:12Z,MEMBER,,,,"They should function like the pandas methods of the same name. Under the covers, I believe it would suffice to simply remap `ds.sort_index('time')` -> `ds.isel(time=ds.indexes['time'].argsort())`. 
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/967/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 197083082,MDExOlB1bGxSZXF1ZXN0OTkwNDA2MzE=,1179,Switch to shared Lock (SerializableLock if possible) for reading/writing,1217238,closed,0,,,8,2016-12-22T02:50:43Z,2017-01-04T17:12:58Z,2017-01-04T17:12:46Z,MEMBER,,0,pydata/xarray/pulls/1179,"Fixes #1172 The serializable lock will be useful for dask.distributed or multi-processing (xref #798, #1173, among others).","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1179/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 118711154,MDExOlB1bGxSZXF1ZXN0NTE3MjI1MDY=,666,Shift method for shifting data,1217238,closed,0,,,8,2015-11-24T21:53:11Z,2015-12-02T23:32:28Z,2015-12-02T23:32:28Z,MEMBER,,0,pydata/xarray/pulls/666,"Fixes #624 New `shift` method for shifting datasets or arrays along a dimension: ``` In [1]: import xray In [2]: array = xray.DataArray([5, 6, 7, 8], dims='x') In [3]: array.shift(x=2) Out[3]: array([ nan, nan, 5., 6.]) Coordinates: * x (x) int64 0 1 2 3 ``` Based on the API proposed for `roll` in https://github.com/xray/xray/issues/624 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/666/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 40225000,MDU6SXNzdWU0MDIyNTAwMA==,212,"Get ride of ""noncoordinates"" as a name?",1217238,closed,0,,740776,8,2014-08-14T05:52:30Z,2014-09-22T00:55:22Z,2014-09-22T00:55:22Z,MEMBER,,,,"As @ToddSmall has pointed out (in #202), ""noncoordinates"" is a confusing name -- it's something defined by what it isn't, not what it is. 
Unfortunately, our best alternative is ""variables"", which already has a lot of meaning from the netCDF world (and which we already use). Related: #211 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/212/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 33772168,MDExOlB1bGxSZXF1ZXN0MTYwMzc5NTA=,134,Fix concatenating Variables with dtype=datetime64,1217238,closed,0,,664063,8,2014-05-19T05:39:46Z,2014-06-28T01:08:03Z,2014-05-20T19:09:28Z,MEMBER,,0,pydata/xarray/pulls/134,"This is an alternative to #125 which I think is a little cleaner. Basically, there was a bug where `Variable.values` for datetime64 arrays always made a copy of values. This made it impossible to edit variable values in-place. @akleeman would appreciate your thoughts. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/134/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull