issue_comments: 406705740 — pydata/xarray#2300, user 1530840, 2018-07-20
https://github.com/pydata/xarray/issues/2300#issuecomment-406705740

Ah, that's great. I do see some improvement. Specifically, I can now set chunks using xarray, successfully write to zarr, and reopen the store. However, on reopening I find that the chunks have been applied inconsistently: some fields have the expected chunk size, whereas some small fields hold the entire variable in one chunk. Furthermore, trying to write a second time with `to_zarr` leads to: `NotImplementedError: Specified zarr chunks (100,) would overlap multiple dask chunks ((100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 4),). This is not implemented in xarray yet. Consider rechunking the data using chunk() or specifying different chunks in encoding.` Reapplying the original chunks with `xr.Dataset.chunk` succeeds, and `ds.chunks` no longer reports "inconsistent chunks", but trying to write still produces the same error.
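For reference, a minimal sketch of the round trip that triggers this for me. The variable name, sizes, and store paths are all illustrative, but the shape (3004 elements chunked at 100, i.e. thirty full chunks plus a remainder of 4) matches the tuple in the error above:

```python
import numpy as np
import xarray as xr

# Illustrative dataset: 3004 elements chunked at 100 gives dask chunks
# (100, 100, ..., 100, 4), matching the error message above.
ds = xr.Dataset({"a": ("x", np.arange(3004))}).chunk({"x": 100})

ds.to_zarr("example.zarr", mode="w")  # first write succeeds
ds2 = xr.open_zarr("example.zarr")    # reopen; zarr chunks (100,) land in encoding

# Second write fails: the stored encoding says zarr chunks (100,), but the
# trailing dask chunk of 4 does not line up with that, so xarray raises
# NotImplementedError rather than writing across zarr chunk boundaries.
ds2.to_zarr("example2.zarr", mode="w")
```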

I also tried loading my entire dataset into memory, allowing the initial `to_zarr` to default to zarr's chunking heuristics. Reading and writing a second time again results in the same error: `NotImplementedError: Specified zarr chunks (63170,) would overlap multiple dask chunks ((63170, 63170, 63170, 63170, 63170, 63170, 63170, 63169),). This is not implemented in xarray yet. Consider rechunking the data using chunk() or specifying different chunks in encoding.` I tried this round-tripping experiment with my monkey patches, and it works for a sequence of read/write/read/write... without any intervention in between. This only works for zarr's default chunking, however, since the patch to `xr.backends.zarr._determine_zarr_chunks` overrides whatever chunks are on the originating dataset.
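The in-memory variant is the same story. A sketch, with the dimension sized (505359 = 7 × 63170 + 63169) to mirror the chunk counts in the error above, though whether zarr's heuristic actually picks 63170 here depends on the dtype and total array size:

```python
import numpy as np
import xarray as xr

# Illustrative in-memory dataset (no dask chunks), so the first to_zarr
# falls through to zarr's own chunking heuristic.
ds_mem = xr.Dataset({"a": ("x", np.arange(505_359))})
ds_mem.to_zarr("roundtrip.zarr", mode="w")

ds3 = xr.open_zarr("roundtrip.zarr")      # dask chunks now mirror zarr's choice,
                                          # with a smaller chunk at the end
ds3.to_zarr("roundtrip2.zarr", mode="w")  # same NotImplementedError as above
```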

Curious: is there any downside in xarray to using datasets with inconsistent chunks? I take it that this is a supported configuration, since xarray allows it to happen and only raises that error when you call `ds.chunks`, which seems to be a convenience property for viewing chunks across a whole dataset, provided the dataset happens to have consistent chunks. A minimal sketch of what I mean is below.
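Here the variable names and sizes are made up; the point is just that per-variable chunks are fine while the dataset-level property errors out:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        "big": ("x", np.arange(1000)),
        "small": ("x", np.arange(1000)),
    }
)
# Chunk the two variables differently along the shared dimension "x".
ds["big"] = ds["big"].chunk({"x": 100})
ds["small"] = ds["small"].chunk({"x": 1000})  # whole variable in one chunk

ds["big"].chunks    # fine: ((100, 100, ..., 100),)
ds["small"].chunks  # fine: ((1000,),)
ds.chunks           # raises ValueError: inconsistent chunks
```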

One other thing to add: it might be nice to have an option that allows zarr auto-chunking even when `chunks != {}`. I don't know how sensitive zarr performance is to chunk sizes, but it would be nice to have some form of sane auto-chunking available when you don't want to bother choosing chunks manually.
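For what it's worth, zarr itself exposes such a heuristic when you create arrays directly (zarr-python 2.x API; the array here is just a placeholder): passing `chunks=True` asks zarr to guess a chunk shape.

```python
import zarr

# chunks=True tells zarr to pick a chunk shape itself, based on the
# array's shape and dtype (zarr-python 2.x behaviour).
z = zarr.zeros((1_000_000,), chunks=True, dtype="f8")
print(z.chunks)  # whatever the heuristic chose
```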
