issue_comments

9 rows where author_association = "MEMBER", issue = 291332965 (Drop coordinates on loading large dataset.) and user = 1197350 (rabernat), sorted by updated_at descending

364494085 · rabernat (1197350) · MEMBER · created 2018-02-09T17:03:06Z · updated 2018-02-09T17:03:06Z
https://github.com/pydata/xarray/issues/1854#issuecomment-364494085

@jhamman, chunking in lat and lon should not be necessary here. My understanding is that dask/dask#2364 made sure that the indexing operation happens before the concat.
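One rough way to check whether that fusion is actually happening is to compare the dask task graph before and after dask's own optimization pass, which is where slicing gets pushed down into the read tasks. This is only a sketch: the file pattern, chunk size, and variable name `t2m` are placeholders, not from the original report.

```python
import dask
import xarray as xr

# hypothetical file pattern and variable name
ds = xr.open_mfdataset('*.nc', chunks={'time': 248})
arr = ds['t2m'].sel(latitude=10, longitude=10).data

# dask.optimize applies the array-level optimizations (task fusion, slicing)
# that normally run at compute time; if the single-point selection is being
# pushed into the per-file reads, the optimized graph should be much smaller
(optimized,) = dask.optimize(arr)
print(len(dict(arr.__dask_graph__())), '->', len(dict(optimized.__dask_graph__())))
```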

One possibility is that the files have HDF-level chunking / compression, as discussed in #1440. That could be screwing this up.
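One way to check for HDF5-level chunking or compression is to inspect a single file with h5py. This assumes the files are netCDF4/HDF5; the filename and variable name below are placeholders.

```python
import h5py

# hypothetical filename and variable name
with h5py.File('one_of_the_files.nc', 'r') as f:
    var = f['t2m']
    print('HDF5 chunks:', var.chunks)        # None means contiguous storage
    print('compression:', var.compression)   # e.g. 'gzip', or None
```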

364490209 · rabernat (1197350) · MEMBER · created 2018-02-09T16:50:13Z · updated 2018-02-09T16:50:13Z
https://github.com/pydata/xarray/issues/1854#issuecomment-364490209

Also, maybe you can post this dataset somewhere online for us to play around with?

364489976 · rabernat (1197350) · MEMBER · created 2018-02-09T16:49:30Z · updated 2018-02-09T16:49:30Z
https://github.com/pydata/xarray/issues/1854#issuecomment-364489976

Did you try my workaround?

364465016 · rabernat (1197350) · MEMBER · created 2018-02-09T15:26:40Z · updated 2018-02-09T15:26:40Z
https://github.com/pydata/xarray/issues/1854#issuecomment-364465016

The way this should work is that the selection of a single point should happen before the data is concatenated. It is up to dask to properly "fuse" these two operations. It seems like that is failing for some reason.

As a temporary workaround, you could preprocess the data to select only the specific point before concatenating:

```python
def select_point(ds):
    return ds.sel(latitude=10, longitude=10)

ds = xr.open_mfdataset('*.nc', preprocess=select_point)
```

But you shouldn't have to do this to get good performance here.

364462150 · rabernat (1197350) · MEMBER · created 2018-02-09T15:16:54Z · updated 2018-02-09T15:16:54Z
https://github.com/pydata/xarray/issues/1854#issuecomment-364462150

This sounds similar to #1396, which I thought was resolved (but is still marked as open).

364461729 · rabernat (1197350) · MEMBER · created 2018-02-09T15:15:28Z · updated 2018-02-09T15:15:28Z
https://github.com/pydata/xarray/issues/1854#issuecomment-364461729

Can you just try your full example without the `chunks` argument and see if it works any better?
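For reference, a sketch of that call without `chunks`, reusing the placeholder path from the earlier example; `open_mfdataset` then chooses the chunking itself (by default each input file becomes a single dask chunk).

```python
import xarray as xr

# same call as before, but letting open_mfdataset pick the chunking
ds = xr.open_mfdataset('path/to/ncs/*.nc')
ds_point = ds.sel(latitude=10, longitude=10)
repr(ds_point)
```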

364456174 · rabernat (1197350) · MEMBER · created 2018-02-09T14:56:36Z · updated 2018-02-09T14:56:36Z
https://github.com/pydata/xarray/issues/1854#issuecomment-364456174

No, I meant this:

```python
ds = xr.open_mfdataset('path/to/ncs/*.nc', chunks={'time': 127})
ds_point = ds.sel(latitude=10, longitude=10)
repr(ds_point)
```

Also, your comment says that "127 is normally the size of the time dimension in each file", but the info you posted indicates that it's 248. Can you also try `open_mfdataset` without the `chunks` argument?

364447957 · rabernat (1197350) · MEMBER · created 2018-02-09T14:26:46Z · updated 2018-02-09T14:26:46Z
https://github.com/pydata/xarray/issues/1854#issuecomment-364447957

I am puzzled by this. Selecting a single point should not require loading into memory the whole dataset.

Can you post the output of `repr(ds.sel(latitude=10, longitude=10))`?
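For context, a minimal sketch of what is being asked for, reusing the `ds` from the `open_mfdataset` example above; on a dask-backed dataset the selection should stay lazy, so the `repr` itself ought to be cheap.

```python
# selecting a single point only builds a task graph; nothing is read from
# disk until .compute()/.load() (or a plot) triggers it
ds_point = ds.sel(latitude=10, longitude=10)
print(repr(ds_point))               # should be quick and show dask-backed variables
print(ds_point.nbytes / 1e6, 'MB')  # size the selection *would* occupy in memory
```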

360779298 · rabernat (1197350) · MEMBER · created 2018-01-26T13:04:31Z · updated 2018-01-26T13:04:31Z
https://github.com/pydata/xarray/issues/1854#issuecomment-360779298

Can you provide a bit more info about the structure of the individual files?

Open a single file and call `ds.info()`, then paste the output here.
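A minimal sketch of that request (the filename is a placeholder):

```python
import xarray as xr

# open one of the individual files and print its netCDF-style summary
ds = xr.open_dataset('one_of_the_files.nc')
ds.info()
```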

