issue_comments: 998992357

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/4738#issuecomment-998992357 https://api.github.com/repos/pydata/xarray/issues/4738 998992357 IC_kwDOAMm_X847i2nl 13301940 2021-12-21T18:14:15Z 2021-12-21T18:14:15Z MEMBER

Okay... I think the following comment is still valid:

The issue appears to be caused by the coordinates which are used in `__dask_tokenize__`.

Whether tokenization is deterministic appears to depend on whether the Dataset/DataArray contains non-dimension coordinates or dimension coordinates:

```python
In [2]: ds = xr.tutorial.open_dataset('rasm')
```

```python
In [39]: a = ds.isel(time=0)

In [40]: a
Out[40]:
<xarray.Dataset>
Dimensions:  (y: 205, x: 275)
Coordinates:
    time     object 1980-09-16 12:00:00
    xc       (y, x) float64 189.2 189.4 189.6 189.7 ... 17.65 17.4 17.15 16.91
    yc       (y, x) float64 16.53 16.78 17.02 17.27 ... 28.26 28.01 27.76 27.51
Dimensions without coordinates: y, x
Data variables:
    Tair     (y, x) float64 ...

In [41]: dask.base.tokenize(a) == dask.base.tokenize(a)
Out[41]: True
```

```python
In [42]: b = ds.isel(y=0)

In [43]: b
Out[43]:
<xarray.Dataset>
Dimensions:  (time: 36, x: 275)
Coordinates:
  * time     (time) object 1980-09-16 12:00:00 ... 1983-08-17 00:00:00
    xc       (x) float64 189.2 189.4 189.6 189.7 ... 293.5 293.8 294.0 294.3
    yc       (x) float64 16.53 16.78 17.02 17.27 ... 27.61 27.36 27.12 26.87
Dimensions without coordinates: x
Data variables:
    Tair     (time, x) float64 ...

In [44]: dask.base.tokenize(b) == dask.base.tokenize(b)
Out[44]: False
```

This looks like a bug in my opinion...
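For anyone who wants to repeat the check on other objects, here is a minimal sketch of the comparison made in `In [41]` and `In [44]`, wrapped in a helper. The helper `token_is_deterministic` and the two toy tokenizers are hypothetical names made up for illustration. They use only the standard library and are not how dask or xarray compute tokens; the random-id tokenizer is just one way a token can differ between calls, not a diagnosis of this issue.

```python
import hashlib
import uuid
from typing import Any, Callable


def token_is_deterministic(
    tokenize: Callable[[Any], str], obj: Any, repeats: int = 3
) -> bool:
    """Return True if tokenize(obj) gives the same token on every call."""
    tokens = {tokenize(obj) for _ in range(repeats)}
    return len(tokens) == 1


def content_token(obj: Any) -> str:
    # Toy tokenizer (assumption, not dask's): hash a stable text
    # representation of the object, so equal inputs give equal tokens.
    return hashlib.sha1(repr(obj).encode()).hexdigest()


def random_fallback_token(obj: Any) -> str:
    # Toy tokenizer (assumption, not dask's): ignore the content and
    # return a fresh random id, so repeated calls never agree.
    return uuid.uuid4().hex


print(token_is_deterministic(content_token, {"x": [1, 2, 3]}))          # True
print(token_is_deterministic(random_fallback_token, {"x": [1, 2, 3]}))  # False
```

With dask and xarray installed, the same helper should accept the real function, e.g. `token_is_deterministic(dask.base.tokenize, b)`, which would be expected to mirror the `False` seen above.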

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  775502974