
issue_comments


8 rows where author_association = "CONTRIBUTOR", issue = 94328498, and user = 4295853, sorted by updated_at descending

288867744 · pwolfram (4295853) · CONTRIBUTOR · created 2017-03-23T21:36:07Z · updated 2017-03-23T21:36:07Z
https://github.com/pydata/xarray/issues/463#issuecomment-288867744

@ajoros should correct me if I'm wrong, but it sounds like everything is working for his use case.

Reactions: +1 × 1 · Issue: open_mfdataset too many files (94328498)
288832707 · pwolfram (4295853) · CONTRIBUTOR · created 2017-03-23T19:21:57Z · updated 2017-03-23T19:21:57Z
https://github.com/pydata/xarray/issues/463#issuecomment-288832707

@ajoros, #1198 was just merged, so the bleeding-edge version of xarray is the one to try!

Reactions: none · Issue: open_mfdataset too many files (94328498)
288830741 · pwolfram (4295853) · CONTRIBUTOR · created 2017-03-23T19:14:23Z · updated 2017-03-23T19:14:23Z
https://github.com/pydata/xarray/issues/463#issuecomment-288830741

@ajoros, can you try something like pip install -v --force-reinstall git+ssh://git@github.com/pwolfram/xarray@fix_too_many_open_files to see whether #1198 fixes the problem with your dataset? Note that you need open_mfdataset(..., autoclose=True).

@shoyer should correct me if I'm wrong, but we are almost ready to merge the code in this PR, and it would be a great "in the field" check if you could try it out soon.

Reactions: none · Issue: open_mfdataset too many files (94328498)
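
As a concrete illustration of the suggested check, here is a minimal sketch in Python. The glob pattern is hypothetical; autoclose is the keyword argument the comment above says #1198 requires:

    import xarray as xr

    # autoclose=True tells xarray to close each netCDF file after reading
    # from it, so the number of simultaneously open files stays bounded.
    # 'output/*.nc' is a placeholder for your own dataset's files.
    ds = xr.open_mfdataset('output/*.nc', autoclose=True)
    print(ds)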
288414991 · pwolfram (4295853) · CONTRIBUTOR · created 2017-03-22T14:25:37Z · updated 2017-03-22T14:25:37Z
https://github.com/pydata/xarray/issues/463#issuecomment-288414991

We are very close on #1198 and will be merging soon. This would be a great time for everyone to ensure that #1198 resolves this issue before we merge.

Reactions: none · Issue: open_mfdataset too many files (94328498)
263723460 · pwolfram (4295853) · CONTRIBUTOR · created 2016-11-29T22:39:25Z · updated 2016-11-29T23:30:59Z
https://github.com/pydata/xarray/issues/463#issuecomment-263723460

I just realized I didn't say thank you to @shoyer et al. for the advice and help. Please forgive my rudeness.

Reactions: none · Issue: open_mfdataset too many files (94328498)
263721589 · pwolfram (4295853) · CONTRIBUTOR · created 2016-11-29T22:31:25Z · updated 2016-11-29T22:31:25Z
https://github.com/pydata/xarray/issues/463#issuecomment-263721589

@shoyer, if I understand correctly, the best approach as you see it is to build on opener via #1128, recognizing that this will essentially be "upgraded" sometime in the future, right?

Reactions: none · Issue: open_mfdataset too many files (94328498)
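
For readers unfamiliar with the opener idea from #1128, the gist is to store a file path and open the file only around each read, rather than holding a handle open for the dataset's lifetime. A rough illustration of the pattern, not xarray's actual implementation (the class and method names are hypothetical):

    import netCDF4

    class LazyNetCDF:
        """Open a netCDF file only for the duration of each read."""

        def __init__(self, path):
            self.path = path  # keep the path, not an open handle

        def read(self, variable, index):
            # Open, read, and close around every access, trading repeated
            # open() calls for a bounded number of open file handles.
            with netCDF4.Dataset(self.path) as nc:
                return nc.variables[variable][index]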
263693540 · pwolfram (4295853) · CONTRIBUTOR · created 2016-11-29T20:46:20Z · updated 2016-11-29T20:47:30Z
https://github.com/pydata/xarray/issues/463#issuecomment-263693540

@shoyer, you probably have the best feel for the most effective solution to this problem in terms of fixing the issue, performance, longer-term utility, etc. Is there a clear winner among the following (potentially non-exhaustive) options?

  1. LRU cache from #798
  2. Building on opener #1128
  3. New wrapper functionality as discussed above for NcML
  4. Use of PyReshaper (i.e., acknowledging in the short term that a change to xarray / dask may be somewhat out of scope for the current design goals)

My current analysis:

I could see our team using PyReshaper because our data output format already has inertia, but this adds complexity to a workflow that intuitively should be handled inside xarray. However, I think we want to get around the file-number limitation eventually because it is an issue that multiple groups keep bringing up, and while PyReshaper is perhaps the simplest solution, it is specific to our uses and not necessarily general. Towards a general solution, we would intuitively pay a fixed-cost performance penalty for the opener solution, but it may be the simplest and cleanest approach, at least for the short term. However, we may need the LRU cache eventually to bridge xarray / dask-distributed (a rough sketch of the LRU idea follows this entry), so implementing opener could turn out to be wasted effort in the long term. The NcML approach has the flavor of a solution along the lines of PyReshaper, although my limited experience with PyReshaper and NcML precludes a more rigorous analysis. We can follow up with @kmpaul on this point if that would be helpful moving forward.

Reactions: none · Issue: open_mfdataset too many files (94328498)
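
As background on option 1, an LRU cache of file handles caps the number of simultaneously open files by closing the least recently used handle whenever the cap is reached. A rough sketch of the concept, not the #798 implementation (the class name and default size are hypothetical):

    from collections import OrderedDict

    import netCDF4

    class FileCache:
        """Keep at most maxsize netCDF files open, evicting the least
        recently used handle when the limit is reached."""

        def __init__(self, maxsize=128):
            self.maxsize = maxsize
            self._handles = OrderedDict()  # path -> open netCDF4.Dataset

        def open(self, path):
            if path in self._handles:
                self._handles.move_to_end(path)  # mark as most recently used
                return self._handles[path]
            if len(self._handles) >= self.maxsize:
                _, oldest = self._handles.popitem(last=False)  # evict LRU
                oldest.close()
            handle = netCDF4.Dataset(path)
            self._handles[path] = handle
            return handle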
263418422 · pwolfram (4295853) · CONTRIBUTOR · created 2016-11-28T22:42:55Z · updated 2016-11-28T22:43:32Z
https://github.com/pydata/xarray/issues/463#issuecomment-263418422

We (+ @milenaveneziani and @xylar) are running into this issue again. Ideally this should be resolved, and after following up with everyone on strategy I may have another look at this issue if it sounds straightforward to fix.

@shoyer and @mrocklin, if I understand correctly, incorporation of the LRU cache could help with this problem assuming time series were sliced into small chunks for access, correct? We would still run into problems, however, if there were, say, 10^6 files and we wanted a time series spanning all of them, right? If so, we may need a more robust solution than just the LRU cache. In the short term, PyReshaper may provide a temporary solution for us. cc @kmpaul to provide some perspective here too regarding use of https://github.com/NCAR/PyReshaper.

Reactions: none · Issue: open_mfdataset too many files (94328498)


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
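
For reference, the filtered view above can be reproduced directly against the underlying SQLite database with a few lines of Python; a minimal sketch, assuming a local copy of the database in a hypothetical file github.db:

    import sqlite3

    conn = sqlite3.connect("github.db")  # hypothetical local database file
    rows = conn.execute(
        """
        SELECT id, created_at, updated_at, body
        FROM issue_comments
        WHERE author_association = 'CONTRIBUTOR'
          AND issue = 94328498
          AND user = 4295853
        ORDER BY updated_at DESC
        """
    ).fetchall()
    for comment_id, created, updated, body in rows:
        print(comment_id, updated, body[:60])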