id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
274797981,MDU6SXNzdWUyNzQ3OTc5ODE=,1725,Switch our lazy array classes to use Dask instead?,6815844,open,0,,,9,2017-11-17T09:12:34Z,2023-09-15T15:51:41Z,,MEMBER,,,,"Ported from #1724, [comment](https://github.com/pydata/xarray/pull/1724#pullrequestreview-77354985) by @shoyer:

> In the long term, it would be nice to get rid of these uses of `_data`, maybe by switching entirely from our lazy array classes to Dask. The subtleties of checking `_data` vs `data` are undesirable, e.g., consider the bug on these lines: https://github.com/pydata/xarray/blob/1a012080e0910f3295d0fc26806ae18885f56751/xarray/core/formatting.py#L212-L213
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1725/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
527237590,MDU6SXNzdWU1MjcyMzc1OTA=,3562,Minimize `.item()` call,6815844,open,0,,,1,2019-11-22T14:44:43Z,2023-06-08T04:48:50Z,,MEMBER,,,,"#### MCVE Code Sample

I want to minimize the number of `.item()` calls within my data analysis. It often happens

1. when putting a 0d-DataArray into a slice

```python
da = xr.DataArray([0.5, 4.5, 2.5], dims=['x'], coords={'x': [0, 1, 2]})
da[: da.argmax()]
```
-> `TypeError: 'DataArray' object cannot be interpreted as an integer`

2. when using a 0d-DataArray for selecting

```python
da = xr.DataArray([0.5, 4.5, 2.5], dims=['x'], coords={'x': [0, 0, 2]})
da.sel(x=da['x'][0])
```
-> `IndexError: arrays used as indices must be of integer (or boolean) type`

In both cases, I need to call `.item()`.
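For comparison, numpy's own 0-d results can be used as indices directly because they implement `__index__`; a minimal numpy-only sketch of the desired behavior (illustrative, not an existing xarray API):

```python
import numpy as np

values = np.array([0.5, 4.5, 2.5])
i = values.argmax()   # a numpy integer scalar, usable as an index as-is
print(values[:i])     # no .item() needed here
```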
It is not a big issue, but I think it would be nice if xarray became more self-contained.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3562/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
675482176,MDU6SXNzdWU2NzU0ODIxNzY=,4325,Optimize ndrolling nanreduce,6815844,open,0,,,5,2020-08-08T07:46:53Z,2023-04-13T15:56:52Z,,MEMBER,,,,"In #4219 we added ndrolling. However, a nanreduce such as `ds.rolling(x=3, y=2).mean()` calls `np.nanmean`, which copies the strided array into a full array. This is memory-inefficient. We can implement in-house nanreduce methods for the strided array. For example, our `.nansum` currently does

make a strided array -> copy the array -> replace nan by 0 -> sum

but we could instead do

replace nan by 0 -> make a strided array -> sum

This is much more memory efficient.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4325/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
531087939,MDExOlB1bGxSZXF1ZXN0MzQ3NTkyNzE1,3587,boundary options for rolling.construct,6815844,open,0,,,4,2019-12-02T12:11:44Z,2022-06-09T14:50:17Z,,MEMBER,,0,pydata/xarray/pulls/3587,"- [x] Closes #2007, #2011
- [x] Tests added
- [x] Passes `black . && mypy . && flake8`
- [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

Added some boundary options for rolling.construct. Currently, the option names are inherited from `np.pad`: `['edge' | 'reflect' | 'symmetric' | 'wrap']`.
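For reference, a quick sketch of how these four `np.pad` modes behave on a tiny 1-D array (plain numpy, illustration only):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
for mode in ['edge', 'reflect', 'symmetric', 'wrap']:
    # pad one element on each side using the given boundary rule
    print(mode, np.pad(a, 1, mode=mode))
# edge      -> [1 1 2 3 4 4]
# reflect   -> [2 1 2 3 4 3]
# symmetric -> [1 1 2 3 4 4]
# wrap      -> [4 1 2 3 4 1]
```

`wrap` is the mode that corresponds to periodic boundaries.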
Do we want a more intuitive name, such as `periodic`?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3587/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
280875330,MDU6SXNzdWUyODA4NzUzMzA=,1772,nonzero method for xr.DataArray,6815844,open,0,,,5,2017-12-11T02:25:11Z,2022-04-01T10:42:20Z,,MEMBER,,,,"Applying `np.nonzero` to a `DataArray` returns a wrong result:

```python
In [4]: da = xr.DataArray(np.arange(12).reshape(4, 3), dims=['x', 'y'],
   ...:                   coords={'x': [0, 1, 2, 3], 'y': ['a', 'b', 'c']})
   ...: np.nonzero(da)
   ...:
Out[4]:
array([[0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
       [1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]])
Coordinates:
  * x        (x) int64 0 1 2 3
  * y        (y)
```

#### Output of `xr.show_versions()`

INSTALLED VERSIONS ------------------ commit: None python: 3.5.2.final.0 python-bits: 64 OS: Linux OS-release: 4.4.0-101-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.9.6-172-gc58d142 pandas: 0.21.0 numpy: 1.13.1 scipy: 0.19.1 netCDF4: None h5netcdf: None Nio: None bottleneck: 1.2.1 cyordereddict: None dask: 0.16.0 matplotlib: 2.0.2 cartopy: None seaborn: 0.7.1 setuptools: 36.5.0 pip: 9.0.1 conda: 4.3.30 pytest: 3.2.3 IPython: 6.0.0 sphinx: 1.6.3
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1772/reactions"", ""total_count"": 6, ""+1"": 6, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
898657012,MDU6SXNzdWU4OTg2NTcwMTI=,5361,Inconsistent behavior in groupby depending on the dimension order,6815844,open,0,,,1,2021-05-21T23:11:37Z,2022-03-29T11:45:32Z,,MEMBER,,,,"`groupby` works inconsistently depending on the dimension order of a `DataArray`. Furthermore, in some cases, this causes a corrupted object.
```python
In [4]: data = xr.DataArray(
   ...:     np.random.randn(4, 2),
   ...:     dims=['x', 'z'],
   ...:     coords={'x': ['a', 'b', 'a', 'c'], 'y': ('x', [0, 1, 0, 2])}
   ...: )
   ...:
   ...: data.groupby('x').mean()
Out[4]:
array([[ 0.95447186, -1.14467028],
       [ 0.76294958,  0.3751244 ],
       [-0.41030223, -1.35344548]])
Coordinates:
  * x        (x) object 'a' 'b' 'c'
Dimensions without coordinates: z
```

Here `groupby` works fine (although it drops the non-dimensional coordinate `y`; related to #3745). However, `groupby` does not give a correct result if we work on the second dimension:

```python
In [5]: data.T.groupby('x').mean()  # <--- change the dimension order, and do the same thing
Out[5]:
array([[ 0.95447186,  0.76294958, -0.41030223],
       [-1.14467028,  0.3751244 , -1.35344548]])
Coordinates:
  * x        (x) object 'a' 'b' 'c'
    y        (x) int64 0 1 0 2  # <-- the size must be 3!!
Dimensions without coordinates: z
```

The bug was discussed in #2944 and fixed there, but I found it is still present.
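The invariant being violated can be checked with plain numpy: group means are identical however the array is transposed. A small illustrative sketch (numpy only, not xarray code):

```python
import numpy as np

data = np.arange(8.0).reshape(4, 2)
labels = np.array(['a', 'b', 'a', 'c'])
uniq = np.unique(labels)

# group-mean over the labeled axis, once with the labels on axis 0 ...
out = np.stack([data[labels == u].mean(axis=0) for u in uniq])
# ... and once on the transposed array, with the labels on axis 1
out_t = np.stack([data.T[:, labels == u].mean(axis=1) for u in uniq], axis=1)

# the two orders must agree: each group yields exactly one mean row
assert np.allclose(out, out_t.T)
```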
#### Output of `xr.show_versions()`

INSTALLED VERSIONS ------------------ commit: 09d8a4a785fa6521314924fd785740f2d13fb8ee python: 3.7.7 (default, Mar 23 2020, 22:36:06) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 5.4.0-72-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.10.4 libnetcdf: 4.6.1 xarray: 0.16.1.dev30+g1d3dee08.d20200808 pandas: 1.1.3 numpy: 1.18.1 scipy: 1.5.2 netCDF4: 1.4.2 pydap: None h5netcdf: 0.8.0 h5py: 2.10.0 Nio: None zarr: None cftime: 1.2.1 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.6.0 distributed: 2.7.0 matplotlib: 3.2.2 cartopy: None seaborn: 0.10.1 numbagg: None pint: None setuptools: 46.1.1.post20200323 pip: 20.0.2 conda: None pytest: 5.2.1 IPython: 7.13.0 sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5361/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 359240638,MDU6SXNzdWUzNTkyNDA2Mzg=,2410,Updated text for indexing page,6815844,open,0,,,11,2018-09-11T22:01:39Z,2021-11-15T21:17:14Z,,MEMBER,,,,"We have a bunch of terms to describe the xarray structure, such as *dimension*, *coordinate*, *dimension coordinate*, etc.. Although it has been discussed in #1295 and we tried to use the consistent terminology in our docs, it looks still not easy for users to understand our functionalities. In #2399, @horta wrote a list of definitions (https://drive.google.com/file/d/1uJ_U6nedkNe916SMViuVKlkGwPX-mGK7/view?usp=sharing). I think it would be nice to have something like this in our docs. Any thought?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2410/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 254927382,MDU6SXNzdWUyNTQ5MjczODI=,1553,Multidimensional reindex,6815844,open,0,,,2,2017-09-04T03:29:39Z,2020-12-19T16:00:00Z,,MEMBER,,,,"From a discussion in #1473 [comment](https://github.com/pydata/xarray/pull/1473#issuecomment-326776669) It would be convenient if we have multi-dimensional `reindex` method, where we consider dimensions and coordinates of indexers. The proposed outline by @shoyer is + Given `reindex` arguments of the form `dim=array` where `array` is a 1D unlabeled array/list, convert them into `DataArray(array, [(dim, array)])`. + Do multi-dimensional indexing with broadcasting like `sel`, but fill in `NaN` for missing values (we could allow for customizing this with a `fill_value` argument). + Join coordinates like for `sel`, but coordinates from the indexers take precedence over coordinates from the object being indexed. 
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1553/reactions"", ""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 280673215,MDU6SXNzdWUyODA2NzMyMTU=,1771,Needs performance check / improvements in value assignment of DataArray,6815844,open,0,,,1,2017-12-09T03:42:41Z,2019-10-28T14:53:24Z,,MEMBER,,,,"https://github.com/pydata/xarray/blob/5e801894886b2060efa8b28798780a91561a29fd/xarray/core/dataarray.py#L482-L489 In #1746, we added a validation in `xr.DataArray.__setitem__` whether the coordinates consistency of array, key, and values are checked. In the current implementation, we call `xr.DataArray.__getitem__` to use the existing coordinate validation logic, but it does unnecessary indexing and it may decrease the `__setitem__` performance if the arrray is multidimensional. We may need to optimize the logic here. Is it reasonable to constantly monitor the performance of basic operations, such as `Dataset` construction, alignment, indexing, and assignment? (or are these operations too light to make a performance monitor?) cc @jhamman @shoyer ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1771/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue