issue_comments: 822566735
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/pydata/xarray/issues/4554#issuecomment-822566735 | https://api.github.com/repos/pydata/xarray/issues/4554 | 822566735 | MDEyOklzc3VlQ29tbWVudDgyMjU2NjczNQ== | 20629530 | 2021-04-19T15:37:30Z | 2021-04-19T15:37:30Z | CONTRIBUTOR | Took a look and it seems to originate from the stacking part and something in

In

```python3
import xarray as xr
import dask.array as dsa

nz, ny, nx = (10, 20, 30)
data = dsa.ones((nz, ny, nx), chunks=(1, 5, nx))
da = xr.DataArray(data, dims=['z', 'y', 'x'])
da.chunks
# ((1, 1, 1, 1, 1, 1, 1, 1, 1, 1), (5, 5, 5, 5), (30,))

stk = da.stack(zy=['z', 'y'])
print(stk.dims, stk.chunks)
# ('x', 'zy') ((30,), (20, 20, 20, 20, 20, 20, 20, 20, 20, 20))
# Merged chunks!
```

And then I went down the rabbit hole (ok it's not that deep) and it all goes down here: https://github.com/pydata/xarray/blob/e0358e586079c12525ce60c4a51b591dc280713b/xarray/core/variable.py#L1507

In

```python
# Let's stack as xarray does: x, z, y -> x, zy
data_t = data.transpose(2, 0, 1)  # Dask array with shape (30, 10, 20), the same as
new_data = data_t.reshape((30, -1), merge_chunks=True)  # True is the default, this is the same call as in xarray
new_data.chunks
# ((30,), (20, 20, 20, 20, 20, 20, 20, 20, 20, 20))

new_data = data_t.reshape((30, -1), merge_chunks=False)
new_data.shape  # I'm printing shape because chunks is too large, but see the bug:
# (30, 6000) instead of (30, 200)!!!

# Doesn't happen when we do not transpose. So let's reshape data as z, y, x -> zy, x
new_data = data.reshape((-1, 30), merge_chunks=True)
new_data.chunks
# ((5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5), (30,))
# Chunks were not merged? But this is the output expected by paigem.

new_data = data.reshape((-1, 30), merge_chunks=False)
new_data.chunks
# ((5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5), (30,))
# That's what I expected with merge_chunks=False.
```

For |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
732910109 |
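
To make the two chunk layouts reported in the comment easier to compare, here is a minimal pure-Python sketch of the arithmetic behind them. It only models the bookkeeping of flattening `(z, y)` into `zy` in C order; it does not call dask and does not claim to reproduce how dask decides between the layouts. The helper names `unmerged_chunks` and `merged_chunks` are hypothetical, introduced here for illustration.

```python
from itertools import product


def unmerged_chunks(z_chunks, y_chunks):
    """Chunks of the flattened zy axis with one output chunk per (z, y) chunk pair.

    Assumption: this layout is only contiguous in C order when every z chunk
    has size 1, so anything else is rejected.
    """
    if any(c != 1 for c in z_chunks):
        raise ValueError("z chunks must all have size 1 for this layout")
    return tuple(y for _, y in product(z_chunks, y_chunks))


def merged_chunks(z_chunks, y_chunks):
    """Chunks of the flattened zy axis when y is first collapsed to one chunk."""
    ny = sum(y_chunks)
    return tuple(z * ny for z in z_chunks)


z_chunks = (1,) * 10   # z: 10 chunks of size 1, as in the comment
y_chunks = (5,) * 4    # y: 4 chunks of size 5

flat_unmerged = unmerged_chunks(z_chunks, y_chunks)  # 40 chunks of 5
flat_merged = merged_chunks(z_chunks, y_chunks)      # 10 chunks of 20

# Either layout must cover the full flattened length nz * ny = 200.
print(len(flat_unmerged), sum(flat_unmerged))
print(len(flat_merged), sum(flat_merged))
```

Both layouts sum to 200 along `zy`, which is why a reported shape of `(30, 6000)` cannot be a valid result for this reshape.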