issue_comments
4 rows where issue = 1197117301 and user = 5635139 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1099643203 | https://github.com/pydata/xarray/issues/6456#issuecomment-1099643203 | https://api.github.com/repos/pydata/xarray/issues/6456 | IC_kwDOAMm_X85BizlD | max-sixty 5635139 | 2022-04-14T21:31:37Z | 2022-04-14T21:31:37Z | MEMBER |
Right, you changed the example after I responded
Something surprising is indeed going on here. To focus on the surprising part:

```python
print(ds3.low_dim.values)
ds3.to_zarr('zarr_bug.zarr', mode='w')
print(ds3.low_dim.values)
```

returns:

Similarly:

```python
In [50]: ds3.low_dim.count().compute()
Out[50]:
<xarray.DataArray 'low_dim' ()>
array(1000000)

In [51]: ds3.to_zarr('zarr_bug.zarr', mode='w')
Out[51]: <xarray.backends.zarr.ZarrStore at 0x16a27c6d0>

In [55]: ds3.low_dim.count().compute()
Out[55]:
<xarray.DataArray 'low_dim' ()>
array(500000)
```

So it's changing the result in memory just from writing to the Zarr store. I'm not sure what the cause is. We can still massively reduce the size of this example: it's currently doing pickling, got a bunch of repeated code, etc. Does it work without the pickling? What if |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Writing a a dataset to .zarr in a loop makes all the data NaNs 1197117301 | |
1095585081 | https://github.com/pydata/xarray/issues/6456#issuecomment-1095585081 | https://api.github.com/repos/pydata/xarray/issues/6456 | IC_kwDOAMm_X85BTU05 | max-sixty 5635139 | 2022-04-11T21:29:27Z | 2022-04-11T21:29:27Z | MEMBER | @tbloch1 it doesn't copy into someone else's Python atm; that's the "C" part of MCVE... |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Writing a a dataset to .zarr in a loop makes all the data NaNs 1197117301 | |
1094412198 | https://github.com/pydata/xarray/issues/6456#issuecomment-1094412198 | https://api.github.com/repos/pydata/xarray/issues/6456 | IC_kwDOAMm_X85BO2em | max-sixty 5635139 | 2022-04-10T23:46:53Z | 2022-04-10T23:46:53Z | MEMBER |
Or GH Discussions! But it would need a smaller MCVE |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Writing a a dataset to .zarr in a loop makes all the data NaNs 1197117301 | |
1093253883 | https://github.com/pydata/xarray/issues/6456#issuecomment-1093253883 | https://api.github.com/repos/pydata/xarray/issues/6456 | IC_kwDOAMm_X85BKbr7 | max-sixty 5635139 | 2022-04-08T19:05:12Z | 2022-04-08T19:05:12Z | MEMBER | Hi @tbloch1, thanks for the issue. So I understand: is this loading the existing dataset, adding a slice, and then writing the whole result? Have you considered using For the example, would it be possible to slim that down a bit further? Does it happen with one read & write after the initial one? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Writing a a dataset to .zarr in a loop makes all the data NaNs 1197117301 |
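
The most recent comment above narrows the problem to one check: evaluate a summary of the data, perform the write, then evaluate the same summary again (the thread reports the non-NaN count dropping from 1000000 to 500000). A minimal sketch of that pattern follows. The helper name `summary_before_and_after` is hypothetical and not part of xarray, and the runnable demo uses a plain list as a stand-in so that it needs no third-party packages.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def summary_before_and_after(
    summary: Callable[[], T],
    action: Callable[[], object],
) -> Tuple[T, T]:
    """Evaluate summary(), run action(), then evaluate summary() again.

    Hypothetical helper for illustration: if the two results differ,
    the action changed what the summary sees.
    """
    before = summary()
    action()
    after = summary()
    return before, after


# Stand-in demo: the "action" removes one element, so the summary changes.
values = [1.0, 2.0, 3.0]
before, after = summary_before_and_after(lambda: len(values), values.pop)
print(before, after)  # prints "3 2"

# Assumed usage with the dataset from the thread (needs xarray, dask and zarr,
# and a `ds3` built as in the original report, which is not shown here):
#
# before, after = summary_before_and_after(
#     lambda: int(ds3.low_dim.count().compute()),
#     lambda: ds3.to_zarr("zarr_bug.zarr", mode="w"),
# )
```

Passing the summary and the write as callables keeps the check independent of how the dataset was constructed, which may help when slimming the reproduction down to a smaller MCVE.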
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);