id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
2211106929,I_kwDOAMm_X86DytBx,8882,"to_zarr silently loses data when using append_dim, if chunks are different to zarr store",140395181,closed,0,,,4,2024-03-27T15:27:02Z,2024-03-29T14:35:51Z,2024-03-29T14:35:51Z,NONE,,,,"### What happened?

When writing a chunked DataArray to an existing zarr store, appending along an existing dimension of the store, I have found that some data are not written if multiple array chunks map to one zarr chunk.

I appreciate it is probably bad practice to have different chunk sizes in my DataArray and zarr store, but I think it's a realistic scenario that needs to be caught.

This may be related to, or the same underlying issue as, #8371. Perhaps the checks mentioned in https://github.com/pydata/xarray/issues/8371#issuecomment-1814589157 are somehow getting bypassed?

Using zarr's ThreadSynchronizer is the only way I have found to ensure that all the data gets written.

### What did you expect to happen?
I expected that either

- to_zarr would recognise the different chunk sizes, and re-chunk or wait for all the chunks to be written
- or an error would be raised, given that the mismatch results in loss of data in an unpredictable way

### Minimal Complete Verifiable Example

```Python
import xarray as xr
import numpy as np
from matplotlib import pyplot as plt

x_coords = np.arange(10)
y_coords = np.arange(10)
t_coords = np.array([np.datetime64('2020-01-01').astype('datetime64[ns]')])
data = np.ones((10,10))

# Repeat the write/append cycle four times; the result differs between runs.
for i in range(4):
    plt.subplot(1,4,i+1)

    # Initial write: chunks of 5 along x and y define the store's chunking.
    da = xr.DataArray(data.reshape((-1,10,10)),
                      dims = ['time','x','y'],
                      coords = {'x':x_coords, 'y':y_coords, 'time':t_coords},
                      ).chunk({'x':5, 'y':5,'time':1}).rename('foo')
    da.to_zarr('foo.zarr', mode='w')

    # Append along time with chunks of 1, so many array chunks map to one zarr chunk.
    new_time = np.array([np.datetime64('2021-01-01').astype('datetime64[ns]')])
    da2 = xr.DataArray(data.reshape((-1,10,10)),
                       dims = ['time','x','y'],
                       coords = {'x':x_coords, 'y':y_coords, 'time':new_time},
                       ).chunk({'x':1, 'y':1,'time':1}).rename('foo')
    da2.to_zarr('foo.zarr',append_dim='time', mode='a')

    # Plot the appended time step; it should be all ones.
    plt.imshow(xr.open_zarr('foo.zarr').isel(time=-1).foo.values)
```

### MVCE confirmation

- [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
- [ ] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [ ] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

### Relevant log output

_No response_

### Anything else we need to know?

Output from the plots above:

![image](https://github.com/pydata/xarray/assets/140395181/1982344f-7db8-4e80-a3f3-b031747cacad)

### Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.4 | packaged by conda-forge | (main, Jun 10 2023, 18:08:17) [GCC 12.2.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-1041-azure
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C.UTF-8
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2
xarray: 2024.2.0
pandas: 2.2.1
numpy: 1.26.4
scipy: 1.12.0
netCDF4: 1.6.5
pydap: installed
h5netcdf: 1.3.0
h5py: 3.10.0
Nio: None
zarr: 2.17.1
cftime: 1.6.3
nc_time_axis: 1.4.1
iris: None
bottleneck: 1.3.8
dask: 2024.3.1
distributed: 2024.3.1
matplotlib: 3.8.3
cartopy: 0.22.0
seaborn: 0.13.2
numbagg: None
fsspec: 2024.3.1
cupy: None
pint: 0.23
sparse: 0.15.1
flox: 0.9.5
numpy_groupies: 0.10.2
setuptools: 69.2.0
pip: 24.0
conda: 24.1.2
pytest: 8.1.1
mypy: None
IPython: 8.22.2
sphinx: None
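
### Illustration of the chunk mismatch

The condition described under 'What happened?' (several array chunks mapping to one zarr chunk) can be sketched in plain Python, without xarray or zarr. The helper name `shared_store_chunks` and its parameters are hypothetical and exist only for this illustration; this is not how xarray performs its checks. It assumes one dimension, positive chunk sizes, and an array that starts at index `offset` of the store.

```python
from collections import defaultdict


def shared_store_chunks(array_chunks, store_chunk, offset=0):
    '''Return {store chunk index: [array chunk indices]} for every store
    chunk that receives writes from more than one array chunk.

    Hypothetical helper for illustration only, not part of xarray or zarr.
    '''
    writers = defaultdict(list)
    start = offset
    for i, size in enumerate(array_chunks):
        stop = start + size
        # Store chunks covered by the half-open index range [start, stop).
        first = start // store_chunk
        last = (stop - 1) // store_chunk
        for k in range(first, last + 1):
            writers[k].append(i)
        start = stop
    # Keep only the store chunks that have several writers.
    return {k: v for k, v in writers.items() if len(v) > 1}


# The x dimension of da2 in the example: ten chunks of 1, store chunks of 5.
print(shared_store_chunks((1,) * 10, 5))
# The x dimension of da in the example: aligned with the store.
print(shared_store_chunks((5, 5), 5))
```

A non-empty result marks the store chunks where parallel writers could overwrite each other unless the writes are synchronized or the array is first rechunked to match the store. An empty result means every store chunk has a single writer along that dimension.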
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8882/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue