issue_comments: 1099643203


| field | value |
| --- | --- |
| html_url | https://github.com/pydata/xarray/issues/6456#issuecomment-1099643203 |
| issue_url | https://api.github.com/repos/pydata/xarray/issues/6456 |
| id | 1099643203 |
| node_id | IC_kwDOAMm_X85BizlD |
| user | 5635139 |
| created_at | 2022-04-14T21:31:37Z |
| updated_at | 2022-04-14T21:31:37Z |
| author_association | MEMBER |
| issue | 1197117301 |

> @max-sixty could you explain which bit isn't working for you? The initial example I shared works fine in colab for me, so that might be a you problem. The second one required specifying the chunks when making the datasets (I've edited above).

Right, you changed the example after I responded.

> But this bug report was more about the fact that overwriting was converting data to NaNs (in two different ways depending on the code apparently).
>
> In my case there is no longer any need to do the overwriting, but this doesn't seem like the expected behaviour of overwriting, and I'm sure there are some valid reasons to overwrite data - hence me opening the bug report.

Something surprising is indeed going on here. To focus on the surprising part:

```python
print(ds3.low_dim.values)

ds3.to_zarr('zarr_bug.zarr', mode='w')

print(ds3.low_dim.values)
```

returns:

```
[[2. 3. 2. ... 8. 0. 9.]
 [6. 2. 6. ... 2. 4. 3.]
 [0. 8. 8. ... 6. 5. 4.]
 ...
 [1. 0. 5. ... 2. 0. 3.]
 [5. 5. 7. ... 9. 6. 2.]
 [5. 7. 8. ... 4. 8. 9.]]
[[nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 ...
 [ 1. 0. 5. ... 2. 0. 3.]
 [ 5. 5. 7. ... 9. 6. 2.]
 [ 5. 7. 8. ... 4. 8. 9.]]
```

Similarly:

```python
In [50]: ds3.low_dim.count().compute()
Out[50]: <xarray.DataArray 'low_dim' ()>
array(1000000)

In [51]: ds3.to_zarr('zarr_bug.zarr', mode='w')
Out[51]: <xarray.backends.zarr.ZarrStore at 0x16a27c6d0>

In [55]: ds3.low_dim.count().compute()
Out[55]: <xarray.DataArray 'low_dim' ()>
array(500000)
```

So merely writing to the Zarr store is changing the result in memory. I'm not sure what the cause is.
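
One way to narrow it down (a sketch only: it assumes `ds3` came from `xr.open_zarr` on the same `zarr_bug.zarr` store, which the snippets above don't actually show) is to force the data into memory before overwriting:

```python
import xarray as xr

# Assumption, not from the original report: ds3 is dask-backed and its
# chunks lazily re-read 'zarr_bug.zarr' when computed.
ds3 = xr.open_zarr('zarr_bug.zarr')

# Materialise every chunk as an in-memory numpy array first.
ds3.load()

# Then overwrite the store. If the count below no longer drops, the NaNs
# were probably lazy chunks re-reading a store that mode='w' had already
# deleted and was in the middle of rewriting.
ds3.to_zarr('zarr_bug.zarr', mode='w')

print(ds3.low_dim.count().values)
```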

We can still massively reduce the size of this example: it currently involves pickling, has a bunch of repeated code, etc. Does it work without the pickling? What if `ds3 = xr.concat([ds1, ds1.copy(deep=True)])`, etc.?
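
For concreteness, a sketch of that kind of reduction (the shapes, chunking, and `dim="x"` are placeholders of mine, not the original code; whether the NaNs still appear is exactly the question):

```python
import numpy as np
import xarray as xr

# Placeholder construction: small random data, chunked so the array is dask-backed.
ds1 = xr.Dataset({"low_dim": (("x", "y"), np.random.rand(10, 10))}).chunk({"x": 5})

# No pickling, no repeated code: just concat a deep copy onto the original.
# dim="x" is an assumption here; xr.concat requires an explicit dimension.
ds3 = xr.concat([ds1, ds1.copy(deep=True)], dim="x")

print(ds3.low_dim.count().compute())   # 200 non-NaN values before the write
ds3.to_zarr("zarr_bug.zarr", mode="w")
print(ds3.low_dim.count().compute())   # does this still drop after the write?
```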
