id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1656130602,I_kwDOAMm_X85itowq,7726,open_zarr: PermissionError with multiple processes despite use of ProcessSynchronizer,34257249,open,0,,,0,2023-04-05T18:55:12Z,2023-04-06T01:37:32Z,,CONTRIBUTOR,,,,"### What happened? Several processes read from and write to an xarray dataset stored in .zarr format, on a network. The write operations write to existing regions. These regions are not aligned to chunks, [therefore](https://zarr.readthedocs.io/en/stable/tutorial.html#parallel-computing-and-synchronization) I use a [ProcessSynchronizer](https://zarr.readthedocs.io/en/stable/api/sync.html#zarr.sync.ProcessSynchronizer). The ProcessSynchronizer points to a local folder on SSD, separate from the actual stored array. After several hundred read/write operations, I get permission errors like the one below. So far I have failed to reproduce the error with an MCVE. The file `0` that gave a permission error is the chunk of coordinates of a certain dimension in the dimension folder `dim_yyy`: ``` dim_yyy |-- .zarray |-- .zattrs `-- 0 ``` ### What did you expect to happen? No permission error. ### Minimal Complete Verifiable Example ```Python **I have failed so far to reproduce the error with an MVCE. 
Here is my attempt.** from pathlib import Path import dask.array as da import pandas as pd import xarray as xr from dask.distributed import Client from zarr.sync import ProcessSynchronizer if __name__ == ""__main__"": path_store = Path(aaa) path_synchronizer = Path(bbb) # must exist and must not be the same location as the store # create and save a dataset to zarr s0, s1, s2 = 10, 10, 10 temperature = da.random.random((s0, s1, s2), chunks=[s0, s1, s2]) precipitation = da.random.random((s0, s1, s2), chunks=[s0, s1, s2]) lon = da.random.random((s0, s1)) lat = da.random.random((s0, s1)) time = pd.date_range(""2014-09-06"", periods=s2) reference_time = pd.Timestamp(""2014-09-05"") ds = xr.Dataset( data_vars=dict( temperature=([""x"", ""y"", ""time""], temperature), precipitation=([""x"", ""y"", ""time""], precipitation), ), coords=dict( lon=([""x"", ""y""], lon), lat=([""x"", ""y""], lat), time=time, reference_time=reference_time, ), attrs=dict(description=""Weather related data.""), ) print(f""{ds=}"") ds.to_zarr(path_store, mode=""w"") def read_write(path_store: Path): """"""lazily opens the dataset, then writes into a region. 
Comment/uncomment to use synchronizer"""""" synchronizer = ProcessSynchronizer(path_synchronizer) for b in range(100): # open the saved dataset # xr.open_zarr(path_store, synchronizer=synchronizer) ds = xr.open_zarr(path_store) # process a region dst = ( ds.temperature.isel(x=slice(0, 5), y=slice(0, 5), time=slice(0, 5)) .to_dataset() .load() ) dst[""temperature""] = -dst[""temperature""] dst = dst.drop_vars([""time"", ""reference_time""]) # save the region to the zarr store dst.to_zarr( path_store, region={ ""x"": slice(0, 5), ""y"": slice(0, 5), ""time"": slice(0, 5), }, # synchronizer=synchronizer, ) # independent processes that perform read and write operations with Client(processes=True) as client: futures = [client.submit(read_write, path_store) for a in range(1000)] client.gather(futures) ``` ### MVCE confirmation - [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [ ] Complete example — the example is self-contained, including all data and the text of any traceback. - [ ] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [X] New issue — a search of GitHub Issues suggests this is not a duplicate. 
### Relevant log output ```Python return xr.open_zarr(path, synchronizer=synchronizer) File ""C:\anaconda3\lib\site-packages\xarray\backends\zarr.py"", line 787, in open_zarr ds = open_dataset( File ""C:\anaconda3\lib\site-packages\xarray\backends\api.py"", line 539, in open_dataset backend_ds = backend.open_dataset( File ""C:\anaconda3\lib\site-packages\xarray\backends\zarr.py"", line 862, in open_dataset ds = store_entrypoint.open_dataset( File ""C:\anaconda3\lib\site-packages\xarray\backends\store.py"", line 43, in open_dataset ds = Dataset(vars, attrs=attrs) File ""C:\anaconda3\lib\site-packages\xarray\core\dataset.py"", line 604, in __init__ variables, coord_names, dims, indexes, _ = merge_data_and_coords( File ""C:\anaconda3\lib\site-packages\xarray\core\merge.py"", line 575, in merge_data_and_coords return merge_core( File ""C:\anaconda3\lib\site-packages\xarray\core\merge.py"", line 755, in merge_core collected = collect_variables_and_indexes(aligned, indexes=indexes) File ""C:\anaconda3\lib\site-packages\xarray\core\merge.py"", line 365, in collect_variables_and_indexes variable = as_variable(variable, name=name) File ""C:\anaconda3\lib\site-packages\xarray\core\variable.py"", line 168, in as_variable obj = obj.to_index_variable() File ""C:\anaconda3\lib\site-packages\xarray\core\variable.py"", line 624, in to_index_variable return IndexVariable( File ""C:\anaconda3\lib\site-packages\xarray\core\variable.py"", line 2844, in __init__ self._data = PandasIndexingAdapter(self._data) File ""C:\anaconda3\lib\site-packages\xarray\core\indexing.py"", line 1420, in __init__ self.array = safe_cast_to_index(array) File ""C:\anaconda3\lib\site-packages\xarray\core\indexes.py"", line 177, in safe_cast_to_index index = pd.Index(np.asarray(array), **kwargs) File ""C:\anaconda3\lib\site-packages\xarray\core\indexing.py"", line 524, in __array__ return np.asarray(array[self.key], dtype=None) File ""C:\anaconda3\lib\site-packages\xarray\backends\zarr.py"", line 68, in 
__getitem__ return array[key.tuple] File ""C:\anaconda3\lib\site-packages\zarr\core.py"", line 821, in __getitem__ result = self.get_basic_selection(pure_selection, fields=fields) File ""C:\anaconda3\lib\site-packages\zarr\core.py"", line 947, in get_basic_selection return self._get_basic_selection_nd(selection=selection, out=out, File ""C:\anaconda3\lib\site-packages\zarr\core.py"", line 990, in _get_basic_selection_nd return self._get_selection(indexer=indexer, out=out, fields=fields) File ""C:\anaconda3\lib\site-packages\zarr\core.py"", line 1285, in _get_selection self._chunk_getitem(chunk_coords, chunk_selection, out, out_selection, File ""C:\anaconda3\lib\site-packages\zarr\core.py"", line 1994, in _chunk_getitem cdata = self.chunk_store[ckey] File ""C:\anaconda3\lib\site-packages\zarr\storage.py"", line 1085, in __getitem__ return self._fromfile(filepath) File ""C:\anaconda3\lib\site-packages\zarr\storage.py"", line 1059, in _fromfile with open(fn, 'rb') as f: PermissionError: [Errno 13] Permission denied: 'xxx.zarr\\dim_yyy/0' ``` ### Anything else we need to know? _No response_ ### Environment
INSTALLED VERSIONS ------------------ commit: None python: 3.10.9 | packaged by Anaconda, Inc. | (main, Mar 1 2023, 18:18:15) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 85 Stepping 7, GenuineIntel byteorder: little LC_ALL: None LANG: None LOCALE: ('English_United States', '1252') libhdf5: 1.10.6 libnetcdf: None xarray: 2022.11.0 pandas: 1.5.3 numpy: 1.23.5 scipy: 1.10.0 netCDF4: None pydap: None h5netcdf: None h5py: 3.7.0 Nio: None zarr: 2.14.2 cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.5 dask: 2022.7.0 distributed: None matplotlib: 3.7.0 cartopy: None seaborn: 0.12.2 numbagg: None fsspec: 2022.11.0 cupy: None pint: None sparse: None flox: None numpy_groupies: None setuptools: 65.6.3 pip: 23.0.1 conda: 23.1.0 pytest: 7.1.2 IPython: 8.10.0 sphinx: 5.0.2
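A possible stopgap, not a fix for the underlying race: retry the failing call when a PermissionError is raised. This is a minimal sketch, assuming the error is transient (for example, another process is replacing the chunk file at that moment). `retry_on_permission_error`, `attempts` and `base_delay` are hypothetical names for illustration, not part of xarray or zarr.

```python
import time
from functools import wraps


def retry_on_permission_error(attempts=5, base_delay=0.05):
    # Hypothetical helper: retry a callable when it raises PermissionError,
    # sleeping base_delay * 2**n seconds between tries (exponential backoff).
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for n in range(attempts):
                try:
                    return func(*args, **kwargs)
                except PermissionError:
                    if n == attempts - 1:
                        raise  # out of retries: surface the original error
                    time.sleep(base_delay * 2**n)

        return wrapper

    return decorator
```

The decorator could wrap a small function that performs the `xr.open_zarr(path_store, synchronizer=synchronizer)` call from the example above. If the error persists after all retries, it is re-raised unchanged, so a genuine permission problem is not hidden.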
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7726/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1584580877,I_kwDOAMm_X85ecskN,7527,DataArray.idxmax converts coordinates into float64 by default,34257249,open,0,,,0,2023-02-14T17:45:07Z,2023-02-14T17:51:33Z,,CONTRIBUTOR,,,,"### What happened? This is the same example as in [DataArray.idxmax](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.idxmax.html#xarray.Dataset.idxmax), but along the ""y"" dimension instead. The starting ""y"" coordinates are of type int: `[-1,0,1]` The return values of argmax are of type int64: good. The return values of idxmax are of type float64: bad. ### What did you expect to happen? If no fill operation is needed, the return values of idxmax should have the same dtype as the input coordinate. Otherwise, the return dtype may change depending on the type of the fill value. ### Minimal Complete Verifiable Example ```Python import numpy as np import xarray as xr array = xr.DataArray( [ [2.0, 1.0, 2.0, 0.0, -2.0], [-4.0, np.nan, 2.0, np.nan, -2.0], [np.nan, np.nan, 1.0, np.nan, np.nan], ], dims=[""y"", ""x""], coords={""y"": [-1, 0, 1], ""x"": np.arange(5.0) ** 2}, ) print(array.argmax(dim=""y"").dtype) print(array.idxmax(dim=""y"").dtype) ``` ### MVCE confirmation - [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [X] Complete example — the example is self-contained, including all data and the text of any traceback. - [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [X] New issue — a search of GitHub Issues suggests this is not a duplicate. 
### Relevant log output ```Python In [41]: print(array.argmax(dim=""y"").dtype) int64 In [42]: print(array.idxmax(dim=""y"").dtype) float64 ``` ### Anything else we need to know? Suggestions: - change [these two lines](https://github.com/pydata/xarray/blob/main/xarray/core/computation.py#L2086-L2087): ``` if skipna or (skipna is None and array.dtype.kind in na_dtypes): # Put the NaN values back in after removing them ``` into ``` if (skipna or (skipna is None and array.dtype.kind in na_dtypes)) and allna.any(): # Put the NaN values back in after removing them, if any ``` - or maybe instead, it is a bug from `DataArray.where`: this [`res = res.where(~allna, fill_value)`](https://github.com/pydata/xarray/blob/main/xarray/core/computation.py#L2088) should not change the array type if `not allna.any()`? Actually, it is a known limitation of `where`: #3570 ### Environment
INSTALLED VERSIONS ------------------ commit: None python: 3.9.13 (main, Aug 25 2022, 23:51:50) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 158 Stepping 13, GenuineIntel byteorder: little LC_ALL: None LANG: None LOCALE: ('English_United States', '1252') libhdf5: 1.10.6 libnetcdf: None xarray: 0.20.1 pandas: 1.4.4 numpy: 1.24.2 scipy: 1.9.1 netCDF4: None pydap: None h5netcdf: None h5py: 3.7.0 Nio: None zarr: 2.13.3 cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.5 dask: 2022.7.0 distributed: 2022.7.0 matplotlib: 3.5.2 cartopy: None seaborn: 0.11.2 numbagg: None fsspec: 2022.7.1 cupy: None pint: None sparse: None setuptools: 63.4.1 pip: 23.0 conda: 22.9.0 pytest: 7.1.2 IPython: 7.31.1 sphinx: 5.0.2
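For reference, the dtype promotion behind this report can be seen with plain NumPy. This is a minimal sketch, assuming the NaN fill value is what forces the float64 result. `fill_where_needed` is a hypothetical helper that illustrates the guard suggested above; it is not xarray's implementation.

```python
import numpy as np

coords = np.array([-1, 0, 1])            # integer coordinate labels
allna = np.array([False, False, False])  # no all-NaN slices in this example

# Unconditional fill: the NaN fill value promotes the result to float64,
# even though no element is actually replaced.
always_filled = np.where(~allna, coords, np.nan)


def fill_where_needed(values, mask, fill_value=np.nan):
    # Hypothetical guard: only fill (and possibly change dtype) when needed.
    if not mask.any():
        return values
    return np.where(~mask, values, fill_value)


guarded = fill_where_needed(coords, allna)

print(always_filled.dtype)            # float64
print(guarded.dtype == coords.dtype)  # True
```

With the guard, the integer dtype survives whenever nothing has to be filled, and the result only becomes float64 when at least one entry is masked.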
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7527/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1452123685,I_kwDOAMm_X85WjaYl,7294,DataArray.transpose with transpose_coords=True does not change coords order,34257249,open,0,,,6,2022-11-16T19:02:27Z,2022-11-24T20:40:32Z,,CONTRIBUTOR,,,,"### What happened? I used [DataArray.transpose](https://docs.xarray.dev/en/stable/generated/xarray.DataArray.transpose.html) with `transpose_coords=True` to change the coords order from `starting_dims = ""dim_0"", ""dim_1"", ""dim_2""` to `reordered_dims = ""dim_2"", ""dim_1"", ""dim_0""`. The order of dims was correctly transposed but the order of coords remained unchanged. ### What did you expect to happen? I expected the transposed coords to be in the new order: `reordered_dims = ""dim_2"", ""dim_1"", ""dim_0""` ### Minimal Complete Verifiable Example ```Python import numpy as np import pandas as pd import xarray as xr np.random.seed(0) temperature = np.random.randn(4, 4, 3) dim_0_values = [1, 2, 3, 4] dim_1_values = [5, 6, 7, 8] dim_2_values = pd.date_range(""2014-09-06"", periods=3) starting_dims = ""dim_0"", ""dim_1"", ""dim_2"" da = xr.DataArray( data=temperature, dims=starting_dims, coords=dict( dim_0=dim_0_values, dim_1=dim_1_values, dim_2=dim_2_values, ), attrs=dict( description=""Ambient temperature."", units=""degC"", ), ) print(f""{da.dims=}"") print(f""{da.coords.keys()=}"") reordered_dims = ""dim_2"", ""dim_1"", ""dim_0"" print(f""{da.transpose(*reordered_dims).dims=}"") print(f""{da.transpose(*reordered_dims).coords.keys()=}"") print(f""{da.transpose(*reordered_dims, transpose_coords=True).coords.keys()=}"") ``` ### MVCE confirmation - [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. 
- [X] Complete example — the example is self-contained, including all data and the text of any traceback. - [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [X] New issue — a search of GitHub Issues suggests this is not a duplicate. ### Relevant log output ```Python da.dims=('dim_0', 'dim_1', 'dim_2') da.coords.keys()=KeysView(Coordinates: * dim_0 (dim_0) int32 1 2 3 4 * dim_1 (dim_1) int32 5 6 7 8 * dim_2 (dim_2) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08) da.transpose(*reordered_dims).dims=('dim_2', 'dim_1', 'dim_0') da.transpose(*reordered_dims).coords.keys()=KeysView(Coordinates: * dim_0 (dim_0) int32 1 2 3 4 * dim_1 (dim_1) int32 5 6 7 8 * dim_2 (dim_2) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08) da.transpose(*reordered_dims, transpose_coords=True).coords.keys()=KeysView(Coordinates: * dim_0 (dim_0) int32 1 2 3 4 * dim_1 (dim_1) int32 5 6 7 8 * dim_2 (dim_2) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08) ``` ### Anything else we need to know? _No response_ ### Environment
INSTALLED VERSIONS ------------------ commit: None python: 3.9.12 (main, Apr 4 2022, 05:22:27) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 85 Stepping 7, GenuineIntel byteorder: little LC_ALL: None LANG: None LOCALE: ('English_United States', '1252') libhdf5: 1.10.6 libnetcdf: None xarray: 2022.6.0 pandas: 1.4.2 numpy: 1.21.5 scipy: 1.9.3 netCDF4: None pydap: None h5netcdf: None h5py: 3.6.0 Nio: None zarr: 2.13.2 cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.4 dask: 2022.02.1 distributed: 2022.2.1 matplotlib: 3.5.1 cartopy: None seaborn: 0.11.2 numbagg: None fsspec: 2022.02.0 cupy: None pint: None sparse: None flox: None numpy_groupies: None setuptools: 61.2.0 pip: 22.3.1 conda: 4.12.0 pytest: 7.1.1 IPython: 8.2.0 sphinx: 4.4.0
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7294/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1449069429,I_kwDOAMm_X85WXwt1,7289,typo in xarray/doc/user-guide/reshaping.rst,34257249,closed,0,,,2,2022-11-15T02:47:56Z,2022-11-15T15:20:44Z,2022-11-15T15:20:44Z,CONTRIBUTOR,,,,"### What is your issue? line 23 An ellipsis (...) can be use -> An ellipsis (...) can be use**d**","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7289/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue