html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2371#issuecomment-431993989,https://api.github.com/repos/pydata/xarray/issues/2371,431993989,MDEyOklzc3VlQ29tbWVudDQzMTk5Mzk4OQ==,1530840,2018-10-22T21:27:22Z,2018-10-22T21:27:22Z,NONE,"Fixed by https://github.com/numpy/numpy/pull/11777
Released in https://github.com/numpy/numpy/releases/tag/v1.15.1","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,351343574
https://github.com/pydata/xarray/issues/2300#issuecomment-406732486,https://api.github.com/repos/pydata/xarray/issues/2300,406732486,MDEyOklzc3VlQ29tbWVudDQwNjczMjQ4Ng==,1530840,2018-07-20T21:33:08Z,2018-07-20T21:33:08Z,NONE,"I took a closer look and noticed my one-dimensional fields of size 505359 were reporting a chunksize of 63170. Turns out that's enough to come up with a minimal repro:
```python
>>> xr.__version__
'0.10.8'
>>> ds=xr.Dataset({'foo': (['bar'], np.zeros((505359,)))})
>>> ds.to_zarr('test.zarr')
>>> ds2=xr.open_zarr('test.zarr')
>>> ds2
Dimensions: (bar: 505359)
Dimensions without coordinates: bar
Data variables:
foo (bar) float64 dask.array
>>> ds2.foo.encoding
{'chunks': (63170,), 'compressor': Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0), 'filters': None, '_FillValue': nan, 'dtype': dtype('float64')}
>>> ds2.to_zarr('test2.zarr')
```
raises
```
NotImplementedError: Specified zarr chunks (63170,) would overlap multiple dask chunks ((63170, 63170, 63170, 63170, 63170, 63170, 63170, 63169),). This is not implemented in xarray yet. Consider rechunking the data using `chunk()` or specifying different chunks in encoding.
```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,342531772
https://github.com/pydata/xarray/issues/2300#issuecomment-406705740,https://api.github.com/repos/pydata/xarray/issues/2300,406705740,MDEyOklzc3VlQ29tbWVudDQwNjcwNTc0MA==,1530840,2018-07-20T19:36:08Z,2018-07-20T19:38:03Z,NONE,"Ah, that's great. I do see *some* improvement. Specifically, I can now set chunks using xarray, and successfully write to zarr, and reopen it. However, when reopening it I do find that the chunks have been inconsistently applied (some fields have the expected chunksize whereas some small fields have the entire variable in one chunk). Furthermore, trying to write a second time with `to_zarr` leads to:
```
*** NotImplementedError: Specified zarr chunks (100,) would overlap multiple dask chunks ((100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 4),). This is not implemented in xarray yet. Consider rechunking the data using `chunk()` or specifying different chunks in encoding.
```
Trying to reapply the original chunks with `xr.Dataset.chunk` succeeds, and `ds.chunks` no longer reports ""inconsistent chunks"", but trying to write still produces the same error.
I also tried loading my entire dataset into memory, allowing the initial `to_zarr` to default to zarr's chunking heuristics. Trying to read and write a second time again results in the same error:
```
NotImplementedError: Specified zarr chunks (63170,) would overlap multiple dask chunks ((63170, 63170, 63170, 63170, 63170, 63170, 63170, 63169),). This is not implemented in xarray yet. Consider rechunking the data using `chunk()` or specifying different chunks in encoding.
```
I tried this round-tripping experiment with my monkey patches, and it works for a sequence of read/write/read/write... without any intervention in between. This only works for default zarr-chunking, however, since the patch to `xr.backends.zarr._determine_zarr_chunks` overrides whatever chunks are on the originating dataset.
Curious: is there any downside in xarray to using datasets with inconsistent chunks? I take it this is a supported configuration, since xarray allows it to happen and only raises that error from `ds.chunks`, which seems to be a convenience accessor for viewing chunks across a whole dataset that happens to have consistent chunks...?
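To illustrate what I mean by inconsistent chunks, here is a tiny made-up example (names and sizes are arbitrary): two variables share a dimension but use different dask chunking, and only the summary accessor complains.

```python
import dask.array as da
import xarray as xr

# Two variables share dim 'x' but are chunked differently.
ds = xr.Dataset({
    'a': (('x',), da.zeros(10, chunks=5)),
    'b': (('x',), da.zeros(10, chunks=2)),
})

try:
    ds.chunks  # summarizing per-dimension chunks fails here
except ValueError as err:
    print(type(err).__name__, err)
```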
One other thing to add: it might be nice to have an option to allow zarr auto-chunking even when `chunks!={}`. I don't know how sensitive zarr performance is to chunksizes, but it'd be nice to have some form of sane auto-chunking available when you don't want to bother with manually choosing.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,342531772
https://github.com/pydata/xarray/pull/1702#issuecomment-389041983,https://api.github.com/repos/pydata/xarray/issues/1702,389041983,MDEyOklzc3VlQ29tbWVudDM4OTA0MTk4Mw==,1530840,2018-05-15T04:51:03Z,2018-05-15T04:51:03Z,NONE,"Just doing some garbage-collection; it looks like this was somehow fixed. This works in 0.10.0 and 0.10.3:
```
>>> ds = xr.Dataset({'a': ('b', [])})
>>> xr.Dataset.equals(ds, xr.Dataset.from_dict(ds.to_dict()))
True
```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,272325640
https://github.com/pydata/xarray/issues/1599#issuecomment-333247131,https://api.github.com/repos/pydata/xarray/issues/1599,333247131,MDEyOklzc3VlQ29tbWVudDMzMzI0NzEzMQ==,1530840,2017-09-29T21:48:58Z,2017-09-29T21:48:58Z,NONE,"@nicain, for sure. Probably best for the API's sake to stick to the simplicity of a flag.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,261727170
https://github.com/pydata/xarray/issues/1599#issuecomment-333240395,https://api.github.com/repos/pydata/xarray/issues/1599,333240395,MDEyOklzc3VlQ29tbWVudDMzMzI0MDM5NQ==,1530840,2017-09-29T21:12:15Z,2017-09-29T21:12:15Z,NONE,"Could have a callable `serializer` kwarg that defaults to `np.ndarray.tolist`. I have a use case where I would pass in `np.ndarray.tobytes` for this. But then again, I could just use `numpy=True` or `tolist=False` and then walk the dict myself.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,261727170