
issues: 1717787692

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1717787692 I_kwDOAMm_X85mY1ws 7853 Surprising behavior of DataArray.chunk when using automatic chunksize determination 4753005 closed 1     2 2023-05-19T20:31:25Z 2023-08-01T16:27:19Z 2023-08-01T16:27:19Z NONE      

What is your issue?

I have a DataArray da with dims (x, y), and additional coordinates such as x_coord on dim x. If I try to chunk this array using da.chunk(chunks={'x': 'auto'}), I end up with a situation where:

1. The data themselves are chunked along x with chunksize a.
2. The x coordinate itself is not chunked.
3. The x_coord coordinate on dim x is chunked, with chunksize b != a.

As far as I can tell, da.chunk(chunks={'x': 'auto'}) determines the chunksize separately for each "thing" (data, variable, coordinate, etc.) on the x dimension. What I expected was for it to determine one chunksize based on the data in the array, then apply that chunksize (or no chunking) to each coordinate as well. Maybe there could be an option to yield unified chunks by default.
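
To illustrate why a separate "auto" decision per variable could give different chunksizes, here is a simplified model in plain Python. It is not dask's or xarray's actual algorithm: the helper auto_chunk_len is hypothetical, and the 128 MiB byte budget, the array shape, and the 8-byte item size are assumptions chosen for the example. The idea is only that a chunk length picked to fit a byte budget depends on how many elements sit behind each step along x, which differs between the 2-D data and a 1-D coordinate.

```python
import math


def auto_chunk_len(shape, axis, itemsize, target_bytes):
    """Hypothetical helper: longest chunk along `axis` that fits in
    `target_bytes` when every other axis is kept as a single chunk."""
    other = math.prod(n for i, n in enumerate(shape) if i != axis)
    bytes_per_step = other * itemsize
    return max(1, min(shape[axis], target_bytes // bytes_per_step))


target = 128 * 2**20  # assumed per-chunk byte budget
nx, ny = 100_000_000, 10  # assumed dims (x, y)

# 2-D data on (x, y) versus a 1-D coordinate on x, both with 8-byte items.
data_len = auto_chunk_len((nx, ny), axis=0, itemsize=8, target_bytes=target)
coord_len = auto_chunk_len((nx,), axis=0, itemsize=8, target_bytes=target)

print(data_len, coord_len)  # the two lengths along x differ
```

Under this model, the only way to get matching chunks along x is to compute the length once from the data and reuse it for every coordinate on that dimension.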

I discovered this because after chunking, da.chunksizes raises a ValueError because of the mismatch between the data and x_coord. The proposed solution, calling da.unify_chunks(), then results in irregular chunksizes on both the data and x_coord. To get the behavior that I expected I have to call da.chunk(da.encoding['preferred_chunks']), which, incidentally, is also what I would have expected da.unify_chunks() to produce.
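
The irregular chunksizes are what one would expect if unifying means cutting wherever any of the inputs cuts. The sketch below models that in plain Python. It is an assumption about the behavior, not the library's implementation, and the helpers regular_chunks and unify as well as the sizes 20, 6, and 8 are made up for the example.

```python
from itertools import accumulate


def regular_chunks(n, size):
    """Chunk lengths for n elements in blocks of `size` (last may be shorter)."""
    full, rest = divmod(n, size)
    return (size,) * full + ((rest,) if rest else ())


def unify(*chunkings):
    """Common refinement: place a boundary wherever any input has one."""
    edges = sorted({e for c in chunkings for e in accumulate(c)})
    return tuple(b - a for a, b in zip([0] + edges, edges))


data_chunks = regular_chunks(20, 6)   # chunksize a = 6 along x
coord_chunks = regular_chunks(20, 8)  # chunksize b = 8 along x
unified = unify(data_chunks, coord_chunks)

print(data_chunks, coord_chunks, unified)
```

Two regular but mismatched chunkings produce a refinement that is regular in neither size, which matches the symptom described above. Rechunking everything to a single chosen size avoids this.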

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7853/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed 13221727 issue
