
issue_comments


3 rows where author_association = "MEMBER", issue = 546562676, and user = 1197350, sorted by updated_at descending



Columns: id, html_url, issue_url, node_id, user, created_at, updated_at (sort column, descending), author_association, body, reactions, performed_via_github_app, issue
id: 573910792
html_url: https://github.com/pydata/xarray/issues/3668#issuecomment-573910792
issue_url: https://api.github.com/repos/pydata/xarray/issues/3668
node_id: MDEyOklzc3VlQ29tbWVudDU3MzkxMDc5Mg==
user: rabernat (1197350)
created_at: 2020-01-13T22:50:41Z
updated_at: 2020-01-13T22:50:48Z
author_association: MEMBER

body:

It would be wonderful if we could translate this complex xarray issue into a minimally simple zarr issue. Then the zarr devs can decide whether this use case is compatible with the zarr spec or not.

reactions:
{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: open_mfdataset: support for multiple zarr datasets (546562676)
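
For illustration only, a zarr-only reduction of the kind this comment asks for might look like the sketch below; the store path and array name are placeholders, not taken from the issue.

# Hypothetical zarr-only reproduction, with xarray taken out of the picture.
# The store path and array name are placeholders, not from the issue report.
import zarr

store_path = "/nfs/visible/only/to/workers/dataset.zarr"  # assumed path
group = zarr.open_group(store_path, mode="r")  # succeeds only where the path is reachable
arr = group["temperature"]                     # placeholder array name
print(arr.shape, arr.dtype)                    # metadata-only read
print(arr[0])                                  # reading values triggers chunk I/O
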
id: 572369966
html_url: https://github.com/pydata/xarray/issues/3668#issuecomment-572369966
issue_url: https://api.github.com/repos/pydata/xarray/issues/3668
node_id: MDEyOklzc3VlQ29tbWVudDU3MjM2OTk2Ng==
user: rabernat (1197350)
created_at: 2020-01-09T03:42:23Z
updated_at: 2020-01-09T03:42:23Z
author_association: MEMBER

body:

Thanks for these detailed reports!

The scenario you are describing--trying to open a file that is not accessible at all from the client--is certainly not something we ever considered when designing this. It is a miracle to me that it does work with netCDF.

I think you are on track with the serialization diagnostics. I believe that @jhamman has the best understanding of this topic. He implemented the parallel mode in open_mfdataset. Perhaps he can give some suggestions.

In the meantime, it seems worth asking the obvious question...how hard would it be to mount the NFS volume on the client? That would avoid having to go down this route.

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: open_mfdataset: support for multiple zarr datasets (546562676)
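
For reference, the parallel open path mentioned in this comment looks roughly like the sketch below; the scheduler address and the file pattern are placeholders, and it assumes the NFS volume is mounted on the client, as the comment suggests.

# Sketch of the open_mfdataset parallel path (placeholder paths and addresses).
# Assumes the data files are readable from the client process as well as the workers.
import xarray as xr
from dask.distributed import Client

client = Client("tcp://scheduler-host:8786")  # hypothetical scheduler address

ds = xr.open_mfdataset(
    "/mnt/nfs/data/*.nc",                     # placeholder file glob
    combine="by_coords",
    parallel=True,                            # open each file inside a dask task
)
print(ds)
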
id: 572205386
html_url: https://github.com/pydata/xarray/issues/3668#issuecomment-572205386
issue_url: https://api.github.com/repos/pydata/xarray/issues/3668
node_id: MDEyOklzc3VlQ29tbWVudDU3MjIwNTM4Ng==
user: rabernat (1197350)
created_at: 2020-01-08T18:51:06Z
updated_at: 2020-01-08T18:51:06Z
author_association: MEMBER

body:

Hi @dmedv -- thanks a lot for raising this issue here!

One clarification question: is there just a single zarr store you are trying to read? Or are you trying to combine multiple stores, like open_mfdataset does with multiple netcdf files?

> Some of the data is only available on the workers, not on the client.

Can you provide more detail about how the zarr data is distributed across the different workers and the client?

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: open_mfdataset: support for multiple zarr datasets (546562676)
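
To make the clarification question concrete, the two cases it distinguishes might be sketched as below; the store paths and the concatenation dimension are placeholders rather than details from the issue.

# Illustrative sketch of the two cases in the question above (placeholder paths).
import xarray as xr

# Case 1: a single zarr store.
ds_single = xr.open_zarr("/data/store.zarr")  # hypothetical path

# Case 2: several zarr stores combined along a shared dimension,
# analogous to what open_mfdataset does with multiple netCDF files.
paths = ["/data/store_2000.zarr", "/data/store_2001.zarr"]  # placeholders
datasets = [xr.open_zarr(p) for p in paths]
ds_multi = xr.concat(datasets, dim="time")    # assumed concatenation dimension
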

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
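
As a rough example, the filtered view described at the top of this page could be reproduced against this schema with Python's sqlite3 module; the database filename below is an assumption, since it is not shown on this page.

# Reproduces the "3 rows where ..." filter shown above; "github.db" is an assumed
# filename for the SQLite database behind this Datasette instance.
import sqlite3

conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT id, [user], created_at, updated_at, author_association, body
    FROM issue_comments
    WHERE author_association = 'MEMBER'
      AND issue = 546562676
      AND [user] = 1197350
    ORDER BY updated_at DESC
    """
).fetchall()
for comment_id, user_id, created, updated, assoc, body in rows:
    print(comment_id, updated, assoc)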