issue_comments: 1176645772
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/pydata/xarray/issues/4205#issuecomment-1176645772 | https://api.github.com/repos/pydata/xarray/issues/4205 | 1176645772 | IC_kwDOAMm_X85GIjCM | 20629530 | 2022-07-06T20:00:09Z | 2022-07-06T20:00:32Z | CONTRIBUTOR | I have the same problem in xarray 2022.3.0. The issue is that this creates unnecessary dask tasks in the graph, and some operations acting on the coordinates unexpectedly trigger dask computations. "Unexpected" because the coordinates at the beginning of the process were not chunked. So computation that was expected to happen in the main thread (or not happen at all) is now happening in the dask workers. An example:
```python3
import numpy as np
import xarray as xr
from dask.diagnostics import ProgressBar

# A 2D variable
da = xr.DataArray(
    np.ones((12, 10)),
    dims=('x', 'y'),
    coords={'x': np.arange(12), 'y': np.arange(10)}
)

# A 1D variable sharing a dim with da
db = xr.DataArray(
    np.ones((12,)),
    dims=('x'),
    coords={'x': np.arange(12)}
)

# A non-dimension coordinate
cx = xr.DataArray(np.zeros((12,)), dims=('x',), coords={'x': np.arange(12)})

# Assign it to da and db
da = da.assign_coords(cx=cx)
db = db.assign_coords(cx=cx)

# We need to chunk along y
da = da.chunk({'y': 1})

# Notice how
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
651945063 |
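
To see which coordinates end up chunked, a small diagnostic along the following lines may help. This is only a sketch: `dask_backed_coords` is a hypothetical helper name, not part of xarray, and it relies on `.chunks` being `None` for coordinates held in memory. The result for `cx` after chunking is printed rather than assumed, since it may differ between xarray versions.

```python
import numpy as np
import xarray as xr


def dask_backed_coords(obj):
    """Map each coordinate name to True if it is backed by a chunked (dask) array.

    Hypothetical helper for illustration; not an xarray API.
    """
    # `.chunks` is None for coordinates held in memory (NumPy or pandas index).
    return {name: coord.chunks is not None for name, coord in obj.coords.items()}


# Same layout as the example in the comment: a 2D variable with a
# non-dimension coordinate `cx` that only depends on `x`.
da = xr.DataArray(
    np.ones((12, 10)),
    dims=("x", "y"),
    coords={"x": np.arange(12), "y": np.arange(10)},
)
da = da.assign_coords(cx=("x", np.zeros(12)))

before = dask_backed_coords(da)
after = dask_backed_coords(da.chunk({"y": 1}))  # requires dask to be installed

for name in da.coords:
    print(f"{name}: dask-backed before chunk = {before[name]}, after = {after[name]}")
```

Comparing `before` and `after` shows whether chunking along `y` also touched coordinates that do not have `y` as a dimension.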