issue_comments

10 rows where issue = 340192831 sorted by updated_at descending

user 4

  • apatlpo 4
  • shoyer 3
  • rabernat 2
  • tinaok 1

author_association 3

  • MEMBER 5
  • CONTRIBUTOR 4
  • NONE 1

issue 1

  • can't store zarr after open_zarr and isel · 10
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
493431087 https://github.com/pydata/xarray/issues/2278#issuecomment-493431087 https://api.github.com/repos/pydata/xarray/issues/2278 MDEyOklzc3VlQ29tbWVudDQ5MzQzMTA4Nw== tinaok 46813815 2019-05-17T12:11:21Z 2019-05-17T14:03:38Z NONE

Hi, the second test case posted by apatlpo on 12 Jul 2018 breaks:

```python
import numpy as np
import xarray as xr

nx, ny, nt = 32, 32, 64
ds = xr.Dataset({}, coords={'x': np.arange(nx), 'y': np.arange(ny), 't': np.arange(nt)})
ds = ds.assign(v=ds.t*np.cos(np.pi/180./100*ds.x)*np.cos(np.pi/180./50*ds.y))
ds = ds.chunk({'t': 1, 'x': nx//2, 'y': ny//2})  # integer division keeps chunk sizes as ints
ds.to_zarr('data.zarr', mode='w')
```

```python
ds = xr.open_zarr('data.zarr')
ds = ds.chunk({'t': nt, 'x': nx//4, 'y': ny//4})
ds.to_zarr('data_rechunked.zarr', mode='w')
```

The error message is the following:

```
ValueError: Final chunk of Zarr array must be the same size or smaller than the first. The specified Zarr chunk encoding is (1, 16, 16), but (64,) in variable Dask chunks ((64,), (8, 8, 8, 8), (8, 8, 8, 8)) is incompatible. Consider rechunking using `chunk()`.
```

(If I add `del ds.v.encoding['chunks']` as follows, it does not break.)

```python
nx, ny, nt = 32, 32, 64
ds = xr.Dataset({}, coords={'x': np.arange(nx), 'y': np.arange(ny), 't': np.arange(nt)})
ds = ds.assign(v=ds.t*np.cos(np.pi/180./100*ds.x)*np.cos(np.pi/180./50*ds.y))
ds = ds.chunk({'t': 1, 'x': nx//2, 'y': ny//2})
ds.to_zarr('data.zarr', mode='w')
ds = xr.open_zarr('data.zarr')
del ds.v.encoding['chunks']  # drop the chunk shape inherited from the store
ds = ds.chunk({'t': nt, 'x': nx//4, 'y': ny//4})
ds.to_zarr('data_rechunked.zarr', mode='w')
```

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't store zarr after open_zarr and isel 340192831
404970837 https://github.com/pydata/xarray/issues/2278#issuecomment-404970837 https://api.github.com/repos/pydata/xarray/issues/2278 MDEyOklzc3VlQ29tbWVudDQwNDk3MDgzNw== shoyer 1217238 2018-07-13T22:37:23Z 2018-07-13T22:37:23Z MEMBER

https://github.com/pydata/xarray/blob/64a7d1144c78eacbcd2401d0aa06e86f4047b0a7/xarray/backends/netCDF4_.py#L208-L209

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't store zarr after open_zarr and isel 340192831
404873326 https://github.com/pydata/xarray/issues/2278#issuecomment-404873326 https://api.github.com/repos/pydata/xarray/issues/2278 MDEyOklzc3VlQ29tbWVudDQwNDg3MzMyNg== apatlpo 11750960 2018-07-13T15:48:46Z 2018-07-13T15:48:46Z CONTRIBUTOR

Could you please be more specific about where this is done for netCDF?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't store zarr after open_zarr and isel 340192831
404530618 https://github.com/pydata/xarray/issues/2278#issuecomment-404530618 https://api.github.com/repos/pydata/xarray/issues/2278 MDEyOklzc3VlQ29tbWVudDQwNDUzMDYxOA== shoyer 1217238 2018-07-12T14:25:02Z 2018-07-12T14:25:02Z MEMBER

We do Ryan's option 2 for netCDF files and it works pretty well.

On Thu, Jul 12, 2018 at 8:24 AM Ryan Abernathey notifications@github.com wrote:

Yes, this is the same underlying issue.

On Thu, Jul 12, 2018 at 2:59 PM Aurélien Ponte notifications@github.com wrote:

Note that there is also a fix here that is simply `del ds['v'].encoding['chunks']` prior to data storage.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't store zarr after open_zarr and isel 340192831
404510872 https://github.com/pydata/xarray/issues/2278#issuecomment-404510872 https://api.github.com/repos/pydata/xarray/issues/2278 MDEyOklzc3VlQ29tbWVudDQwNDUxMDg3Mg== rabernat 1197350 2018-07-12T13:24:51Z 2018-07-12T13:24:51Z MEMBER

Yes, this is the same underlying issue.

On Thu, Jul 12, 2018 at 2:59 PM Aurélien Ponte notifications@github.com wrote:

Note that there is also a fix here that is simply `del ds['v'].encoding['chunks']` prior to data storage.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't store zarr after open_zarr and isel 340192831
404503718 https://github.com/pydata/xarray/issues/2278#issuecomment-404503718 https://api.github.com/repos/pydata/xarray/issues/2278 MDEyOklzc3VlQ29tbWVudDQwNDUwMzcxOA== apatlpo 11750960 2018-07-12T12:59:44Z 2018-07-12T13:00:01Z CONTRIBUTOR

Note that there is also a fix for case 2 that is simply `del ds['v'].encoding['chunks']` prior to data storage.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't store zarr after open_zarr and isel 340192831
404503025 https://github.com/pydata/xarray/issues/2278#issuecomment-404503025 https://api.github.com/repos/pydata/xarray/issues/2278 MDEyOklzc3VlQ29tbWVudDQwNDUwMzAyNQ== apatlpo 11750960 2018-07-12T12:57:17Z 2018-07-12T12:57:35Z CONTRIBUTOR

With the same case, I get another error message, which may or may not reflect the same issue; maybe you can tell me. The error message is different, which is why I am posting this.

Starting from the same dataset:

```python
import numpy as np
import xarray as xr

nx, ny, nt = 32, 32, 64
ds = xr.Dataset({}, coords={'x': np.arange(nx), 'y': np.arange(ny), 't': np.arange(nt)})
ds = ds.assign(v=ds.t*np.cos(np.pi/180./100*ds.x)*np.cos(np.pi/180./50*ds.y))
ds = ds.chunk({'t': 1, 'x': nx//2, 'y': ny//2})  # integer division keeps chunk sizes as ints
ds.to_zarr('data.zarr', mode='w')
```

Case 1 works fine:

```python
ds = ds.chunk({'t': nt, 'x': nx//4, 'y': ny//4})
ds.to_zarr('data_rechunked.zarr', mode='w')
```

Case 2 breaks:

```python
ds = xr.open_zarr('data.zarr')
ds = ds.chunk({'t': nt, 'x': nx//4, 'y': ny//4})
ds.to_zarr('data_rechunked.zarr', mode='w')
```

with the following error message:

```
....
NotImplementedError: Specified zarr chunks (1, 16, 16) would overlap multiple dask chunks ((64,), (8, 8, 8, 8), (8, 8, 8, 8)). This is not implemented in xarray yet. Consider rechunking the data using `chunk()` or specifying different chunks in encoding.
```
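As a rough illustration of why the stored chunk shape and the new dask chunks clash in case 2, the check can be approximated in plain Python. This is a simplified sketch, not xarray's actual implementation: the function name `axes_with_overlap` is hypothetical, and it only tests the rule that every dask chunk except the last should be a whole multiple of the zarr chunk along the same axis.

```python
def axes_with_overlap(zarr_chunks, dask_chunks):
    """Return the axes on which one zarr chunk would span several dask chunks.

    zarr_chunks: one chunk length per axis, e.g. (1, 16, 16)
    dask_chunks: one tuple of chunk lengths per axis,
                 e.g. ((64,), (8, 8, 8, 8), (8, 8, 8, 8))
    """
    bad_axes = []
    for axis, (zchunk, dchunks) in enumerate(zip(zarr_chunks, dask_chunks)):
        # Interior dask chunks must be whole multiples of the zarr chunk;
        # the last dask chunk is allowed to be a ragged remainder.
        if any(dchunk % zchunk != 0 for dchunk in dchunks[:-1]):
            bad_axes.append(axis)
    return bad_axes


# Values taken from the error message in case 2.
print(axes_with_overlap((1, 16, 16), ((64,), (8, 8, 8, 8), (8, 8, 8, 8))))  # prints [1, 2]
```

On this simplified reading, the `x` and `y` axes (dask chunks of 8 against stored chunks of 16) are the ones that conflict, which is consistent with the write succeeding once the stale `chunks` entry is removed from the encoding.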

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't store zarr after open_zarr and isel 340192831
404429223 https://github.com/pydata/xarray/issues/2278#issuecomment-404429223 https://api.github.com/repos/pydata/xarray/issues/2278 MDEyOklzc3VlQ29tbWVudDQwNDQyOTIyMw== rabernat 1197350 2018-07-12T08:15:43Z 2018-07-12T08:16:02Z MEMBER

> Any idea about how serious this is and/or where it's coming from?

The source of the bug is that the encoding metadata `chunks` (which describes the chunk size of the underlying zarr store) is automatically getting populated when you load the zarr store (`ds = xr.open_zarr('data.zarr')`), and this encoding metadata is being preserved as you transform (sub-select) the dataset. Some possible solutions would be to:

  1. Not put `chunks` into encoding at all.
  2. Figure out a way to strip `chunks` when performing selection operations or other operations that change shape.

Idea 1 is easier but would mean discarding some relevant metadata about encoding. This would break round-tripping of the un-modified zarr dataset.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't store zarr after open_zarr and isel 340192831
404415760 https://github.com/pydata/xarray/issues/2278#issuecomment-404415760 https://api.github.com/repos/pydata/xarray/issues/2278 MDEyOklzc3VlQ29tbWVudDQwNDQxNTc2MA== apatlpo 11750960 2018-07-12T07:25:36Z 2018-07-12T07:25:36Z CONTRIBUTOR

Thanks for the workaround suggestion. Apparently you also need to delete `chunks` for the `t` singleton coordinate, though. In the end, the workaround looks like:

```python
ds = xr.open_zarr('data.zarr')
del ds['v'].encoding['chunks']
del ds['t'].encoding['chunks']
ds.isel(t=0).to_zarr('data_t0.zarr', mode='w')
```

Any idea about how serious this is and/or where it's coming from?
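The workaround above has to be repeated for every variable that carries a `chunks` entry, including coordinates such as `t`. A minimal sketch of a helper that does this in one pass follows; the name `drop_chunk_encoding` is hypothetical and not part of xarray, and it assumes only that `ds.variables` maps names to objects with a mutable `encoding` dict.

```python
def drop_chunk_encoding(ds):
    """Remove the 'chunks' entry from the encoding of every variable in ds.

    Works in place and also returns ds so the call can be chained.
    Coordinates are covered too, because ds.variables includes them.
    """
    for var in ds.variables.values():
        # pop() with a default does nothing when the key is absent
        var.encoding.pop("chunks", None)
    return ds
```

With the dataset from this thread, usage would look like `ds = drop_chunk_encoding(xr.open_zarr('data.zarr'))` followed by `ds.isel(t=0).to_zarr('data_t0.zarr', mode='w')`.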

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't store zarr after open_zarr and isel 340192831
404277786 https://github.com/pydata/xarray/issues/2278#issuecomment-404277786 https://api.github.com/repos/pydata/xarray/issues/2278 MDEyOklzc3VlQ29tbWVudDQwNDI3Nzc4Ng== shoyer 1217238 2018-07-11T19:06:20Z 2018-07-11T19:06:20Z MEMBER

Yes, this is definitely a bug.

One workaround is to explicitly remove the broken chunks encoding from the loaded dataset, e.g., `del ds['v'].encoding['chunks']`

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't store zarr after open_zarr and isel 340192831

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);