issue_comments: 795114188

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/4380#issuecomment-795114188 https://api.github.com/repos/pydata/xarray/issues/4380 795114188 MDEyOklzc3VlQ29tbWVudDc5NTExNDE4OA== 743508 2021-03-10T09:00:48Z 2021-03-10T09:00:48Z CONTRIBUTOR

I'm running into the same issue when I:

  1. Load input from a Zarr data source
  2. Queue some processing (delayed dask ufuncs)
  3. Re-chunk using chunk() to get the dask task size I want
  4. Use to_zarr to trigger the computation (dask distributed backend) and save to a new store on disk

I get the chunk size mismatch error, which I work around by manually overwriting the encoding['chunks'] value. That seems unintuitive to me. Since I'm going from one Zarr store to another, I assumed that calling chunk() would set the chunk size for both the dask arrays and the Zarr output, because calling to_zarr on a dask-backed array only works if the dask chunks and the Zarr encoding chunks match.
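To make the mismatch concrete, here is a rough, standard-library-only model of the kind of alignment check involved. It is an assumption about the rule, not xarray's actual code: it treats the write as safe when every dask chunk except the last is a whole multiple of the encoded Zarr chunk. The function name `dask_chunks_fit_zarr` is made up for illustration.

```python
from typing import Sequence


def dask_chunks_fit_zarr(dask_chunks: Sequence[int], zarr_chunk: int) -> bool:
    """Simplified model (assumption): all dask chunks except the last must be
    a whole multiple of the Zarr chunk size carried over in encoding['chunks']."""
    return all(size % zarr_chunk == 0 for size in dask_chunks[:-1])


# Input store was chunked by 10, then chunk() rechunked the dask array to 25:
# 25 is not a multiple of 10, so the stale encoding no longer fits.
print(dask_chunks_fit_zarr((25, 25, 25, 25), 10))      # False

# Rechunking to a multiple of the old Zarr chunk size would still fit.
print(dask_chunks_fit_zarr((20, 20, 20, 20, 20), 10))  # True

# Overwriting encoding['chunks'] to 25 makes the two agree again.
print(dask_chunks_fit_zarr((25, 25, 25, 25), 25))      # True
```

Under this model, the surprise is that chunk() changes only the first argument, while the second is still whatever the input store recorded.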

I didn't realize the overwrite_encoded_chunks option existed, but it's also a bit confusing that, to get the right chunk size on the output, I need to set the overwrite option on the input.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  686608969