issues: 280942467
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
280942467 | MDU6SXNzdWUyODA5NDI0Njc= | 1774 | xarray mean generates unstable dask name (hash) | 27647769 | closed | 0 | 2 | 2017-12-11T09:11:31Z | 2017-12-11T20:43:35Z | 2017-12-11T20:43:35Z | NONE | Code Sample

```python
import dask.array as da
import xarray as xr
import numpy as np

# create dask array
x = da.ones((5, 5), chunks=(2, 2))
# create xarray array
x2 = xr.DataArray(x, dims=['d1', 'd2'])

# print dask name after taking mean
x = x.mean(axis=0)
print(x.name)

# print dask name after taking mean
x2 = x2.mean(dim='d1')
print(x2.data.name)

# confirm both functions do the same
print(np.allclose(x.compute(), x2.data.compute()))
```

Problem Description

Running the above sample three times outputs:

```
mean_agg-aggregate-9716da6e38d695dbff18f713d787e614
mean_agg-aggregate-02c33c19e6209edbe409749388d2f9f0
True

mean_agg-aggregate-9716da6e38d695dbff18f713d787e614
mean_agg-aggregate-2f59be8ef8c35336717fdcd7744bd167
True

mean_agg-aggregate-9716da6e38d695dbff18f713d787e614
mean_agg-aggregate-822994428d6b4cdea8e5c134711e5609
True
```

which shows the dask name (hash) generated using the xarray mean is unstable.

Expected Output

For processing large datasets, it's convenient to have a stable hash name in order to save intermediate results or compare them among developers.
Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1774/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
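The behaviour reported in this issue comes down to whether a task name is a pure function of its inputs. The sketch below illustrates that idea using only the standard library. It is not dask's or xarray's implementation, and the helpers `stable_name` and `unstable_name` are hypothetical names introduced here. It assumes (the report does not say so) that the instability comes from one input that cannot be hashed deterministically, which is modelled with a random UUID.

```python
import hashlib
import uuid


def stable_name(prefix, *args):
    """Hypothetical helper: the token depends only on the arguments' repr."""
    token = hashlib.md5(repr(args).encode("utf-8")).hexdigest()
    return f"{prefix}-{token}"


def unstable_name(prefix, *args):
    """Hypothetical helper: a random UUID stands in for an input that
    cannot be hashed deterministically, so the token changes every call."""
    salted = args + (uuid.uuid4().hex,)
    token = hashlib.md5(repr(salted).encode("utf-8")).hexdigest()
    return f"{prefix}-{token}"


# Shape, chunks and axis taken from the code sample in the issue body.
args = ((5, 5), (2, 2), 0)

# Same inputs give the same name on every call and in every process.
print(stable_name("mean_agg-aggregate", *args))
print(stable_name("mean_agg-aggregate", *args))

# A non-deterministic input yields a fresh name each time.
print(unstable_name("mean_agg-aggregate", *args))
print(unstable_name("mean_agg-aggregate", *args))
```

In this model, a name can only be reused to look up saved intermediate results or to compare runs between developers when every input contributes a deterministic token.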