
xarray issue #8882: to_zarr silently loses data when using append_dim, if chunks are different to zarr store

Opened by harryC-space-intelligence · state: closed (completed) · 4 comments · author association: NONE
Created 2024-03-27T15:27:02Z · updated 2024-03-29T14:35:51Z · closed 2024-03-29T14:35:51Z

What happened?

When writing a chunked DataArray to an existing zarr store, appending along an existing dimension of the store, I have found that some data are not written if multiple array chunks map onto a single zarr chunk.

I appreciate it is probably bad practice to have different chunk sizes in my DataArray and zarr store, but I think it's a realistic scenario that needs to be caught.

This may be related to, or share the same underlying issue as, #8371. Perhaps the checks mentioned in https://github.com/pydata/xarray/issues/8371#issuecomment-1814589157 are somehow being bypassed? Using zarr's ThreadSynchronizer is the only way I have found to ensure that all the data gets written (a sketch of this follows the MVCE below).
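
To make the mismatch concrete, here is the chunk arithmetic for the MVCE below (an illustration added for clarity, not output from xarray):

```python
# Spatial chunk shapes from the MVCE below: the store is written with 5x5
# zarr chunks, while the array appended afterwards is chunked 1x1.
zarr_chunk = (5, 5)
dask_chunk = (1, 1)

# Number of dask tasks that concurrently read-modify-write one zarr chunk:
tasks_per_chunk = (zarr_chunk[0] // dask_chunk[0]) * (zarr_chunk[1] // dask_chunk[1])
print(tasks_per_chunk)  # 25 writers racing on each zarr chunk
```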

What did you expect to happen?

I expected that either

  • to_zarr would recognise the different chunk sizes, and re-chunk the array (done manually in the sketch after this list) or wait for all the chunks to be written
  • or an error would be raised, given that the current behaviour loses data in an unpredictable way
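
For reference, the first expectation can be approximated by hand today. This is a hedged sketch, not xarray API: it assumes the `da2` array and `foo.zarr` store from the MVCE in the next section, and reads the store's chunk shape from the `chunks` entry that `open_zarr` records in each variable's encoding.

```python
import xarray as xr

# Align the incoming array's dask chunks with the zarr store's chunks
# before appending, so each dask task owns whole zarr chunks.
store = xr.open_zarr('foo.zarr')
store_chunks = dict(zip(store['foo'].dims, store['foo'].encoding['chunks']))

da2_aligned = da2.chunk(store_chunks)  # da2: the array to append (see MVCE below)
da2_aligned.to_zarr('foo.zarr', append_dim='time', mode='a')
```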

Minimal Complete Verifiable Example

```python
import xarray as xr
import numpy as np
from matplotlib import pyplot as plt

x_coords = np.arange(10)
y_coords = np.arange(10)
t_coords = np.array([np.datetime64('2020-01-01').astype('datetime64[ns]')])
data = np.ones((10, 10))

# Set up a 1x4 grid of subplot axes for the output plots.
for i in range(4):
    plt.subplot(1, 4, i + 1)

# Initial write: the store is created with (time=1, x=5, y=5) chunks.
da = xr.DataArray(data.reshape((-1, 10, 10)),
                  dims=['time', 'x', 'y'],
                  coords={'x': x_coords, 'y': y_coords, 'time': t_coords},
                  ).chunk({'x': 5, 'y': 5, 'time': 1}).rename('foo')

da.to_zarr('foo.zarr', mode='w')

# Append a second time step whose dask chunks (1x1 spatially) are smaller
# than the store's zarr chunks (5x5), so many tasks write into each chunk.
new_time = np.array([np.datetime64('2021-01-01').astype('datetime64[ns]')])

da2 = xr.DataArray(data.reshape((-1, 10, 10)),
                   dims=['time', 'x', 'y'],
                   coords={'x': x_coords, 'y': y_coords, 'time': new_time},
                   ).chunk({'x': 1, 'y': 1, 'time': 1}).rename('foo')

da2.to_zarr('foo.zarr', append_dim='time', mode='a')

# The appended slice should be all ones; dropped chunks show up as gaps.
plt.imshow(xr.open_zarr('foo.zarr').isel(time=-1).foo.values)
```
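
Two things may help when reproducing, continuing from the example above (a hedged sketch, not part of the original report): a programmatic check for lost values, and the ThreadSynchronizer workaround mentioned earlier. `synchronizer` is an argument that `to_zarr` accepts and passes through to zarr.

```python
import numpy as np
import xarray as xr
import zarr

# Every value in the appended slice was written as 1.0, so any other value
# is data that was silently dropped during the append.
appended = xr.open_zarr('foo.zarr').isel(time=-1).foo.values
print('lost values:', int(np.sum(appended != 1)))

# The workaround from the report: re-run the append with zarr's
# ThreadSynchronizer so concurrent writes to one zarr chunk are serialised.
da2.to_zarr('foo.zarr', append_dim='time', mode='a',
            synchronizer=zarr.ThreadSynchronizer())
```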

MVCE confirmation

  • [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [ ] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

Output from the plots above: [attached image not preserved in this export]

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.11.4 | packaged by conda-forge | (main, Jun 10 2023, 18:08:17) [GCC 12.2.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-1041-azure
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C.UTF-8
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2

xarray: 2024.2.0
pandas: 2.2.1
numpy: 1.26.4
scipy: 1.12.0
netCDF4: 1.6.5
pydap: installed
h5netcdf: 1.3.0
h5py: 3.10.0
Nio: None
zarr: 2.17.1
cftime: 1.6.3
nc_time_axis: 1.4.1
iris: None
bottleneck: 1.3.8
dask: 2024.3.1
distributed: 2024.3.1
matplotlib: 3.8.3
cartopy: 0.22.0
seaborn: 0.13.2
numbagg: None
fsspec: 2024.3.1
cupy: None
pint: 0.23
sparse: 0.15.1
flox: 0.9.5
numpy_groupies: 0.10.2
setuptools: 69.2.0
pip: 24.0
conda: 24.1.2
pytest: 8.1.1
mypy: None
IPython: 8.22.2
sphinx: None
