issue_comments: 1099643203


| field | value |
| --- | --- |
| html_url | https://github.com/pydata/xarray/issues/6456#issuecomment-1099643203 |
| issue_url | https://api.github.com/repos/pydata/xarray/issues/6456 |
| id | 1099643203 |
| node_id | IC_kwDOAMm_X85BizlD |
| user | 5635139 |
| created_at | 2022-04-14T21:31:37Z |
| updated_at | 2022-04-14T21:31:37Z |
| author_association | MEMBER |
| issue | 1197117301 |

> @max-sixty could you explain which bit isn't working for you? The initial example I shared works fine in colab for me, so that might be a you problem. The second one required specifying the chunks when making the datasets (I've edited above).

Right, you changed the example after I responded.

> But this bug report was more about the fact that overwriting was converting data to NaNs (in two different ways depending on the code apparently).
>
> In my case there is no longer any need to do the overwriting, but this doesn't seem like the expected behaviour of overwriting, and I'm sure there are some valid reasons to overwrite data - hence me opening the bug report.

Something surprising is indeed going on here. To focus on the surprising part:

```python
print(ds3.low_dim.values)

ds3.to_zarr('zarr_bug.zarr', mode='w')

print(ds3.low_dim.values)
```

returns:

```
[[2. 3. 2. ... 8. 0. 9.]
 [6. 2. 6. ... 2. 4. 3.]
 [0. 8. 8. ... 6. 5. 4.]
 ...
 [1. 0. 5. ... 2. 0. 3.]
 [5. 5. 7. ... 9. 6. 2.]
 [5. 7. 8. ... 4. 8. 9.]]
[[nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 ...
 [ 1. 0. 5. ... 2. 0. 3.]
 [ 5. 5. 7. ... 9. 6. 2.]
 [ 5. 7. 8. ... 4. 8. 9.]]
```

Similarly:

```python
In [50]: ds3.low_dim.count().compute()
Out[50]: <xarray.DataArray 'low_dim' ()>
array(1000000)

In [51]: ds3.to_zarr('zarr_bug.zarr', mode='w')
Out[51]: <xarray.backends.zarr.ZarrStore at 0x16a27c6d0>

In [55]: ds3.low_dim.count().compute()
Out[55]: <xarray.DataArray 'low_dim' ()>
array(500000)
```

So merely writing to the Zarr store is changing the result in memory. I'm not sure what the cause is.
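
One way to narrow it down (a sketch only: it assumes `ds3` came from `xr.open_zarr` on the same `zarr_bug.zarr` store, which the snippets above don't actually show) is to force the data into memory before overwriting:

```python
import xarray as xr

# Assumption, not from the original report: ds3 is dask-backed and its
# chunks lazily re-read 'zarr_bug.zarr' when computed.
ds3 = xr.open_zarr('zarr_bug.zarr')

# Materialise every chunk as an in-memory numpy array first.
ds3.load()

# Then overwrite the store. If the count below no longer drops, the NaNs
# were probably lazy chunks re-reading a store that mode='w' had already
# deleted and was in the middle of rewriting.
ds3.to_zarr('zarr_bug.zarr', mode='w')

print(ds3.low_dim.count().values)
```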

We can still massively reduce the size of this example: it currently involves pickling, has a bunch of repeated code, etc. Does it work without the pickling? What if `ds3 = xr.concat([ds1, ds1.copy(deep=True)])`, etc.?
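
For concreteness, a sketch of that kind of reduction (the shapes, chunking, and `dim="x"` are placeholders of mine, not the original code; whether the NaNs still appear is exactly the question):

```python
import numpy as np
import xarray as xr

# Placeholder construction: small random data, chunked so the array is dask-backed.
ds1 = xr.Dataset({"low_dim": (("x", "y"), np.random.rand(10, 10))}).chunk({"x": 5})

# No pickling, no repeated code: just concat a deep copy onto the original.
# dim="x" is an assumption here; xr.concat requires an explicit dimension.
ds3 = xr.concat([ds1, ds1.copy(deep=True)], dim="x")

print(ds3.low_dim.count().compute())   # 200 non-NaN values before the write
ds3.to_zarr("zarr_bug.zarr", mode="w")
print(ds3.low_dim.count().compute())   # does this still drop after the write?
```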
