id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 668256331,MDU6SXNzdWU2NjgyNTYzMzE=,4288,hue argument for xarray.plot.step() for plotting multiple histograms over shared bins,4666753,open,0,,,2,2020-07-30T00:30:37Z,2022-04-17T19:27:28Z,,CONTRIBUTOR,,,," **Is your feature request related to a problem? Please describe.** I love how efficiently we can plot line data for different observations using `xr.DataArray.plot(hue={hue coordinate name})` over a 2D array, and I have appreciated `xr.DataArray.plot.step()` for plotting histogram data using interval coordinates. Today, I wanted to plot/compare several histograms over the same set of bins. I figured I could write `xr.DataArray.plot.step(hue={...})`, but I found out that this functionality is not implemented. **Describe the solution you'd like** I think we should have a hue kwarg for `xr.DataArray.plot.step()`. When specified, we would be able to plot 2D data in the same way as `xr.DataArray.plot()`, except that we get a set of step plots instead of a set of line plots. **Describe alternatives you've considered** + Use `xr.DataArray.plot()` instead. This is effective for histograms with many bins, but inaccurately represents histograms with coarse bins + Manually call `xr.DataArray.plot.hist()` on each 1D subarray for each label on the hue coordinate, adding appropriate labels and legend. This is fine and my current solution, but I think it would be excellent to use the same shorthand that was developed for line plots. **Additional context** I didn't evaluate the other plotting functions implemented, but I suspect that others could appropriately consider a hue argument but do not yet support doing so. 
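A minimal sketch of the manual per-label workaround described above (the data, labels, and the hue-like coordinate name `obs` are hypothetical; assumes matplotlib and a 2D DataArray):

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# hypothetical 2D array: one histogram per value of the 'obs' coordinate
da = xr.DataArray(
    np.random.rand(2, 5),
    dims=('obs', 'bin'),
    coords={'obs': ['a', 'b']},
)

fig, ax = plt.subplots()
for label in da['obs'].values:
    # one step plot per hue value, mimicking the proposed hue= behavior
    da.sel(obs=label).plot.step(ax=ax, label=str(label))
ax.legend()
```

The proposed `hue` kwarg would collapse this loop into a single call, matching the line-plot shorthand.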
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4288/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1177669703,PR_kwDOAMm_X84029qU,6402,No chunk warning if empty,4666753,closed,0,,,6,2022-03-23T06:43:54Z,2022-04-09T20:27:46Z,2022-04-09T20:27:40Z,CONTRIBUTOR,,0,pydata/xarray/pulls/6402," - [x] Closes #6401 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6402/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 1177665302,I_kwDOAMm_X85GMb8W,6401,Unnecessary warning when specifying `chunks` opening dataset with empty dimension,4666753,closed,0,,,0,2022-03-23T06:38:25Z,2022-04-09T20:27:40Z,2022-04-09T20:27:40Z,CONTRIBUTOR,,,,"### What happened? I receive unnecessary warnings when opening Zarr datasets with empty dimensions/arrays using the `chunks` argument (for a non-empty dimension). If an array has zero size (due to an empty dimension), it is saved as a single chunk regardless of Dask chunking on other dimensions (#5742). If the `chunks` parameter is provided for other dimensions when loading the Zarr file (based on the expected chunksizes were the array nonempty), xarray gives a warning about potentially degraded performance from splitting the single chunk. ### What did you expect to happen? I expect no warning to be raised when there is no data: - performance degradation on an empty array should be negligible. - we don't always know if one of the dimensions is empty until loading. But we would use the `chunks` parameter for dimensions with consistent chunksizes (to specify a multiple of what's on disk) -- this is thrown off when other dimensions are empty. 
### Minimal Complete Verifiable Example ```Python import xarray as xr import numpy as np # each `a` is expected to be chunked separately ds = xr.Dataset({""x"": ((""a"", ""b""), np.empty((4, 0)))}).chunk({""a"": 1}) # but when we save it, it gets saved as a single chunk ds.to_zarr(""tmp.zarr"") # so if we open it up with expected chunksizes (not knowing that b is empty): ds2 = xr.open_zarr(""tmp.zarr"", chunks={""a"": 1}) # we get a warning :( ``` ### Relevant log output ```Python {...}/miniconda3/envs/new-majiq/lib/python3.8/site-packages/xarray/core/dataset.py:410: UserWarning: Specified Dask chunks (1, 1, 1, 1) would separate on disks chunk shape 4 for dimension a. This could degrade performance. (chunks = {'a': (1, 1, 1, 1), 'b': (0,)}, preferred_chunks = {'a': 4, 'b': 1}). Consider rechunking after loading instead. _check_chunks_compatibility(var, output_chunks, preferred_chunks) ``` ### Anything else we need to know? This can be fixed by only calling `_check_chunks_compatibility()` whenever `var` is nonempty (PR forthcoming). 
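The proposed fix can be sketched with a stand-in for the internal check (the function and argument names below are illustrative, not xarray internals):

```python
import numpy as np

def maybe_check_chunks(var, output_chunks, check):
    # proposed guard: skip the chunk-compatibility warning for
    # zero-size variables, where performance cannot degrade anyway
    if var.size == 0:
        return
    check(var, output_chunks)

calls = []
maybe_check_chunks(np.empty((4, 0)), {'a': (1, 1, 1, 1)},
                   lambda v, c: calls.append(c))
maybe_check_chunks(np.empty((4, 3)), {'a': (1, 1, 1, 1)},
                   lambda v, c: calls.append(c))
print(len(calls))  # 1: the empty array is skipped
```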
### Environment INSTALLED VERSIONS ------------------ commit: None python: 3.8.12 | packaged by conda-forge | (default, Jan 30 2022, 23:42:07) [GCC 9.4.0] python-bits: 64 OS: Linux OS-release: 5.4.72-microsoft-standard-WSL2 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: C.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.1 libnetcdf: None xarray: 2022.3.0 pandas: 1.4.1 numpy: 1.22.2 scipy: 1.8.0 netCDF4: None pydap: None h5netcdf: None h5py: 3.6.0 Nio: None zarr: 2.11.1 cftime: 1.5.1.1 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.4 dask: 2022.01.0 distributed: 2022.01.0 matplotlib: 3.5.1 cartopy: None seaborn: 0.11.2 numbagg: None fsspec: 2022.01.0 cupy: None pint: None sparse: None setuptools: 59.8.0 pip: 22.0.4 conda: None pytest: 7.0.1 IPython: 8.1.1 sphinx: 4.4.0","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6401/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 552987067,MDU6SXNzdWU1NTI5ODcwNjc=,3712,"[Documentation/API?] {DataArray,Dataset}.sortby is stable sort?",4666753,open,0,,,0,2020-01-21T16:27:37Z,2022-04-09T02:26:34Z,,CONTRIBUTOR,,,,"I noticed that `{DataArray,Dataset}.sortby()` are implemented using `np.lexsort()`, which is a stable sort. Can we expect this function to remain a stable sort in the future even if the implementation is changed for some reason? It is not explicitly stated in the docs that the sorting will be stable. If this function is meant to always be stable, I think the documentation should explicitly state this. 
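Since `sortby` is built on `np.lexsort`, the current stable behavior is easy to demonstrate on tied keys:

```python
import numpy as np

# two entries share key 0 and two share key 1; a stable sort keeps
# each tied pair in its original relative order
key = np.array([1, 0, 1, 0])
order = np.lexsort((key,))
print(order.tolist())  # [1, 3, 0, 2]: index 1 stays before 3, and 0 before 2
```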
If not, I think it would be helpful to have an optional argument to ensure that the sort is kept stable in case the implementation changes in the future.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3712/reactions"", ""total_count"": 3, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 1}",,,13221727,issue 1076377174,I_kwDOAMm_X85AKDZW,6062,Import hangs when matplotlib installed but no display available,4666753,closed,0,,,3,2021-12-10T03:12:55Z,2021-12-29T07:56:59Z,2021-12-29T07:56:59Z,CONTRIBUTOR,,,,"**What happened**: On a device with no display available, importing xarray without setting the matplotlib backend hangs on import of matplotlib.pyplot since #5794 was merged. **What you expected to happen**: I expect to be able to run `import xarray` without needing to mess with environment variables or import matplotlib and change the default backend. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6062/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 1076384122,PR_kwDOAMm_X84vp8Ru,6064,"Revert ""Single matplotlib import""",4666753,closed,0,,,3,2021-12-10T03:24:54Z,2021-12-29T07:56:59Z,2021-12-29T07:56:59Z,CONTRIBUTOR,,0,pydata/xarray/pulls/6064,"Revert pydata/xarray#5794, which causes failure to import when used without display (issue #6062). 
- [ ] Closes #6102 - [ ] Closes #6062","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6064/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 1033142897,I_kwDOAMm_X849lIJx,5883,Failing parallel writes to_zarr with regions parameter?,4666753,closed,0,,,1,2021-10-22T03:33:02Z,2021-10-22T18:37:06Z,2021-10-22T18:37:06Z,CONTRIBUTOR,,,," **What happened**: Following guidance on how to use regions keyword in `xr.Dataset.to_zarr()`, I wrote a multithreaded program that makes independent writes to each index along an axis. But, when I use more than one thread, some of these writes fail. **What you expected to happen**: I expect all the writes to take place safely so long as the regions I write to do not overlap (they do not). **Minimal Complete Verifiable Example**: ```python path = ""tmp.zarr"" NTHREADS = 4 # when 1, things work as expected import multiprocessing.dummy as mp # threads, instead of processes import numpy as np import dask.array as da import xarray as xr # dummy values for metadata xr.Dataset( {""x"": ((""a"", ""b""), -da.ones((10, 7), chunks=(None, 1)))}, {""apple"": (""a"", -da.ones(10, dtype=int, chunks=(1,)))}, ).to_zarr(path, mode=""w"", compute=False) # actual values to save ds = xr.Dataset( {""x"": ((""a"", ""b""), np.random.uniform(size=(10, 7)))}, {""apple"": (""a"", np.arange(10))}, ) # save them using NTHREADS with mp.Pool(NTHREADS) as p: p.map( lambda idx: ds.isel(a=slice(idx, 1 + idx)).to_zarr(path, mode=""r+"", region=dict(a=slice(idx, 1 + idx))), range(10) ) ds_roundtrip = xr.open_zarr(path).load() # open what we just saved over multiple threads # perfect match for x on some slices of a, but when NTHREADS > 1, x has very different value or NaN on other slices of a xr.testing.assert_allclose(ds, ds_roundtrip) # fails when NTHREADS > 1. 
``` **Anything else we need to know?**: + this behavior is the same if coordinate ""apple"" (over a) is changed to be coordinate ""a"" (index over dimension) + if dummy dataset had ""apple"" defined using dask, I observed `ds_roundtrip` having all correct values of ""apple"" (but not ""x""). *But*, if it was defined as a numpy array, I observed `ds_roundtrip` having incorrect values of ""apple"" (in addition to ""x""). **Environment**:
Output of xr.show_versions() ``` INSTALLED VERSIONS ------------------ commit: None python: 3.8.10 | packaged by conda-forge | (default, May 11 2021, 07:01:05) [GCC 9.3.0] python-bits: 64 OS: Linux OS-release: 5.4.72-microsoft-standard-WSL2 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: C.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.10.6 libnetcdf: 4.8.0 xarray: 0.19.0 pandas: 1.3.3 numpy: 1.21.2 scipy: 1.7.1 netCDF4: 1.5.7 pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: 2.10.1 cftime: 1.5.1 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2021.08.1 distributed: 2021.08.1 matplotlib: 3.4.1 cartopy: None seaborn: 0.11.2 numbagg: None pint: None setuptools: 58.2.0 pip: 21.3 conda: None pytest: None IPython: 7.28.0 sphinx: None ```
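As a sanity check on the claim that the writes are disjoint, here is a small hypothetical helper (not xarray API) applied to the ten slices used above:

```python
def regions_overlap(slices, length):
    # returns True if any two slices cover a common index in [0, length)
    covered = [False] * length
    for s in slices:
        for i in range(*s.indices(length)):
            if covered[i]:
                return True
            covered[i] = True
    return False

# the ten regions written above: a=slice(idx, idx + 1) for idx in range(10)
regions = [slice(idx, idx + 1) for idx in range(10)]
print(regions_overlap(regions, 10))  # False: the writes really are disjoint
```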
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5883/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 980605063,MDExOlB1bGxSZXF1ZXN0NzIwODExNjQ4,5742,Fix saving chunked datasets with zero length dimensions,4666753,closed,0,,,2,2021-08-26T20:12:08Z,2021-10-10T00:12:34Z,2021-10-10T00:02:42Z,CONTRIBUTOR,,0,pydata/xarray/pulls/5742,"This fixes #5741 by loading to memory all variables with zero length before saving with `Dataset.to_zarr()` - [x] Closes #5741 - [x] Tests added - [x] Passes `pre-commit run --all-files` - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5742/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 980549418,MDU6SXNzdWU5ODA1NDk0MTg=,5741,Dataset.to_zarr fails on dask array with zero-length dimension (ZeroDivisionError),4666753,closed,0,,,0,2021-08-26T18:57:00Z,2021-10-10T00:02:42Z,2021-10-10T00:02:42Z,CONTRIBUTOR,,,," **What happened**: I have an `xr.Dataset` with a dask-array-valued variable including a zero-length dimension (other variables are non-empty). I tried saving it to zarr, but it fails with a zero division error. **What you expected to happen**: I expect it to save without any errors. **Minimal Complete Verifiable Example**: the following commands fail. ```python import numpy as np import xarray as xr ds = xr.Dataset( {""x"": ((""a"", ""b"", ""c""), np.empty((75, 0, 30))), ""y"": ((""a"", ""c""), np.random.normal(size=(75, 30)))}, {""a"": np.arange(75), ""b"": [], ""c"": np.arange(30)}, ).chunk({}) ds.to_zarr(""fails.zarr"") # RAISES ZeroDivisionError ``` **Anything else we need to know?**: If we load all the empty arrays to numpy, it is able to save correctly. 
That is: ```python ds[""x""].load() # run on all variables that have a zero dimension ds.to_zarr(""works.zarr"") # successfully runs ``` I'll make a PR using this solution, but not sure if this is a deeper bug that should be fixed in zarr or in a nicer way. **Environment**:
Output of xr.show_versions() ``` INSTALLED VERSIONS ------------------ commit: None python: 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:22:27) [GCC 9.3.0] python-bits: 64 OS: Linux OS-release: 5.4.72-microsoft-standard-WSL2 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: C.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.10.6 libnetcdf: 4.8.0 xarray: 0.19.0 pandas: 1.2.4 numpy: 1.20.2 scipy: 1.7.1 netCDF4: 1.5.6 pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: 2.9.3 cftime: 1.5.0 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2021.08.1 distributed: 2021.08.1 matplotlib: 3.4.1 cartopy: None seaborn: None numbagg: None pint: None setuptools: 49.6.0.post20210108 pip: 21.0.1 conda: None pytest: None IPython: 7.22.0 sphinx: None ```
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5741/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 559958918,MDExOlB1bGxSZXF1ZXN0MzcxMDM0MDY2,3752,Fix swap_dims() index names (issue #3748),4666753,closed,0,,,5,2020-02-04T20:25:18Z,2020-02-24T23:33:05Z,2020-02-24T22:34:59Z,CONTRIBUTOR,,0,pydata/xarray/pulls/3752," - [x] Closes #3748 - [x] Tests added - [x] Passes `isort -rc . && black . && mypy . && flake8` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3752/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 559841620,MDU6SXNzdWU1NTk4NDE2MjA=,3748,`swap_dims()` incorrectly changes underlying index name,4666753,closed,0,,,1,2020-02-04T16:41:25Z,2020-02-24T22:34:58Z,2020-02-24T22:34:58Z,CONTRIBUTOR,,,,"#### MCVE Code Sample ```python import xarray as xr # create data array with named dimension and named coordinate x = xr.DataArray([1], {""idx"": [2], ""y"": (""idx"", [3])}, [""idx""], name=""x"") # what's our current index? (idx, this is fine) x.indexes # prints ""idx: Int64Index([2], dtype='int64', name='idx')"" # swap dim so that y is our dimension, what's index now? x.swap_dims({""idx"": ""y""}).indexes # prints ""y: Int64Index([3], dtype='int64', name='idx')"" ``` The dimension name is appropriately swapped but the pandas index name is incorrect. #### Expected Output ```python # swap dim so that y is our dimension, what's index now? x.swap_dims({""idx"": ""y""}).indexes # prints ""y: Int64Index([3], dtype='int64', name='y')"" ``` #### Problem Description This is a problem because running `x.swap_dims({""idx"": ""y""}).to_dataframe()` gives a dataframe with columns `[""x"", ""idx""]` and index `""idx""`. 
This gives ambiguous names and drops the original name, while the DataArray string representation gives no indication that this might be happening. #### Output of ``xr.show_versions()``
INSTALLED VERSIONS ------------------ commit: None python: 3.8.1 | packaged by conda-forge | (default, Jan 29 2020, 15:06:10) [Clang 9.0.1 ] python-bits: 64 OS: Darwin OS-release: 18.7.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.3 xarray: 0.15.0 pandas: 0.25.3 numpy: 1.18.1 scipy: 1.4.1 netCDF4: 1.5.3 pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: 1.0.4.2 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.10.1 distributed: 2.10.0 matplotlib: None cartopy: None seaborn: None numbagg: None setuptools: 45.1.0.post20200119 pip: 20.0.2 conda: None pytest: 5.3.5 IPython: 7.12.0 sphinx: None
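Until the fix lands, a hedged workaround is to rebuild the dimension coordinate from a renamed pandas index after swapping:

```python
import xarray as xr

x = xr.DataArray([1], {'idx': [2], 'y': ('idx', [3])}, ['idx'], name='x')
swapped = x.swap_dims({'idx': 'y'})
# reassign the dimension coordinate from a pandas index carrying the
# correct name, so .indexes and .to_dataframe() agree with the new dim
fixed = swapped.assign_coords(y=swapped.indexes['y'].rename('y'))
print(fixed.indexes['y'].name)  # y
```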
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3748/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue