
issue_comments


6 rows where author_association = "MEMBER", issue = 283388962, and user = 1217238, sorted by updated_at descending


367377682 · shoyer (1217238) · 2018-02-21T16:09:00Z · MEMBER
https://github.com/pydata/xarray/pull/1793#issuecomment-367377682

> I don't totally understand the scipy constraints on incremental writes but could that be playing a factor here?

I'm pretty sure SciPy supports incremental reads but not incremental writes. In general the entire netCDF file needs to get written at once. Certainly it's not possible to update only part of an array -- scipy needs it in memory as a NumPy array to copy its raw data to the netCDF file.

Reactions: none · Issue: fix distributed writes (283388962)
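The point about scipy's backend can be sketched with `scipy.io.netcdf_file` (the file name here is illustrative): the variable's data is assigned as a complete in-memory NumPy array and copied to disk when the file is closed, whereas partial reads of an existing file work fine.

```python
import numpy as np
from scipy.io import netcdf_file

# Writing: the variable's data must exist as a complete in-memory
# NumPy array; scipy copies its raw data to the netCDF file on close.
f = netcdf_file("example.nc", "w")  # file name is illustrative
f.createDimension("t", 3)
v = f.createVariable("v", np.float64, ("t",))
v[:] = np.arange(3.0)  # the whole array, assigned in memory
f.close()              # the netCDF file is written out in one go

# Reading back supports partial/incremental access:
f = netcdf_file("example.nc", "r", mmap=False)
data = f.variables["v"][:]
f.close()
print(data)  # [0. 1. 2.]
```

There is no scipy API for appending to or updating part of a variable already on disk, which is what an incremental distributed write would need.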
362697439 · shoyer (1217238) · 2018-02-02T20:26:30Z · MEMBER
https://github.com/pydata/xarray/pull/1793#issuecomment-362697439

A simpler way to handle locking for now (but with possibly subpar performance) would be to use a single global distributed lock.

As for autoclose, perhaps we should make the default autoclose=None, which becomes True if using dask-distributed (or maybe using dask in general?) and otherwise is False.

Reactions: none · Issue: fix distributed writes (283388962)
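The single-global-lock idea can be sketched locally with `threading.Lock` standing in for `dask.distributed.Lock` (which exposes the same acquire/release interface, but cluster-wide): every write task takes the one lock, so at most one write runs at a time — correct, if possibly slow.

```python
import threading

# One global lock shared by every write task (a local stand-in for
# dask.distributed.Lock, which would serialize across workers instead).
GLOBAL_WRITE_LOCK = threading.Lock()

active = 0       # writers currently inside the critical section
max_active = 0   # high-water mark; with a global lock it must stay at 1

def write_chunk(chunk_id):
    global active, max_active
    with GLOBAL_WRITE_LOCK:
        active += 1
        max_active = max(max_active, active)
        # ... write chunk `chunk_id` to the file here ...
        active -= 1

threads = [threading.Thread(target=write_chunk, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(max_active)  # 1: no two writes ever overlapped
```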
362673882 · shoyer (1217238) · 2018-02-02T18:57:33Z · MEMBER
https://github.com/pydata/xarray/pull/1793#issuecomment-362673882

We might always need to use autoclose=True with distributed. The problem is that in xarray's default mode of operation, we open a netCDF file (without using dask) to create variables, dimensions and attributes, keeping the file open. Then we write the data using dask (via AbstractWritableDataStore.sync()), but the original file is still open.

As for the lock, we need locking both:
  • Per process: only one thread can use HDF5 for reading/writing at the same time.
  • Per file: only one worker can read/write a file at the same time.

Reactions: none · Issue: fix distributed writes (283388962)
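The two locking levels above can be sketched with a process-wide HDF5 lock plus a registry of per-file locks; a writer must hold both. All names here are illustrative, not xarray's actual implementation.

```python
import threading
from collections import defaultdict

# Per process: HDF5 is not thread-safe, so one lock guards all HDF5 calls.
HDF5_LOCK = threading.Lock()

# Per file: one lock per path, so two workers never write one file at once.
_file_locks = defaultdict(threading.Lock)
_registry_lock = threading.Lock()

def get_file_lock(path):
    with _registry_lock:  # guard the lock registry itself
        return _file_locks[path]

def locked_write(path, do_write):
    # Acquire the per-file lock first, then the process-wide HDF5 lock.
    with get_file_lock(path):
        with HDF5_LOCK:
            do_write(path)

written = []
locked_write("out.nc", lambda p: written.append(p))  # path is illustrative
print(written)  # ['out.nc']
```

In a distributed setting the per-file lock would additionally need to be visible across workers (e.g. a scheduler-coordinated lock), which a plain `threading.Lock` cannot provide.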
362465131 · shoyer (1217238) · 2018-02-02T02:17:34Z · MEMBER
https://github.com/pydata/xarray/pull/1793#issuecomment-362465131

Looking into this a little bit, this looks like a dask-distributed bug to me. Somehow Client.get() is returning a tornado.concurrent.Future object, even though sync=True.

Reactions: none · Issue: fix distributed writes (283388962)
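The expected contract — with sync=True the call blocks and returns plain values, not a Future — can be sketched with the stdlib. This is an analogy for the intended behavior, not distributed's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def get(func, args, sync=True):
    # Submit work asynchronously; a Future represents the pending result.
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(func, *args)
        if sync:
            return future.result()  # block and unwrap: caller gets the value
        return future               # async mode: caller gets the Future itself

print(get(sum, ([1, 2, 3],)))  # 6 -- a plain value, not a Future
```

The bug described above is the inverse: a Future leaking out of the API even though the caller asked for synchronous results.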
360547539 · shoyer (1217238) · 2018-01-25T17:57:29Z · MEMBER
https://github.com/pydata/xarray/pull/1793#issuecomment-360547539

Has anyone successfully used dask.array.store() with the distributed scheduler?

Reactions: none · Issue: fix distributed writes (283388962)
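For reference, `dask.array.store()` with an explicit lock and the default threaded scheduler can be exercised as below; this is a minimal local sketch, and with the distributed scheduler the lock argument would need to be something cluster-aware (e.g. a distributed lock) rather than a `threading.Lock`.

```python
import threading
import numpy as np
import dask.array as da

# Source: a lazy dask array in four chunks.
x = da.ones((4, 4), chunks=(2, 2))

# Target: a preallocated in-memory array standing in for a file on disk.
target = np.zeros((4, 4))

# store() computes each chunk and writes it into the target;
# the lock serializes the write phase across threads.
da.store(x, target, lock=threading.Lock())
print(target.sum())  # 16.0
```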
352906316 · shoyer (1217238) · 2017-12-19T22:29:47Z · MEMBER
https://github.com/pydata/xarray/pull/1793#issuecomment-352906316

yes, see https://github.com/pydata/xarray/issues/1464#issuecomment-341329662

Reactions: none · Issue: fix distributed writes (283388962)

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
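The page's filter (author_association = "MEMBER", issue = 283388962, user = 1217238, newest first) can be reproduced against the schema above with Python's stdlib sqlite3. The rows inserted here are illustrative, and the REFERENCES clauses are omitted since the referenced tables are not created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT, [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY, [node_id] TEXT,
   [user] INTEGER, [created_at] TEXT, [updated_at] TEXT,
   [author_association] TEXT, [body] TEXT, [reactions] TEXT,
   [performed_via_github_app] TEXT, [issue] INTEGER
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
""")

# Two illustrative rows; only the first matches the page's filter.
conn.execute(
    "INSERT INTO issue_comments (id, user, updated_at, author_association, issue)"
    " VALUES (?, ?, ?, ?, ?)",
    (367377682, 1217238, "2018-02-21T16:09:00Z", "MEMBER", 283388962))
conn.execute(
    "INSERT INTO issue_comments (id, user, updated_at, author_association, issue)"
    " VALUES (?, ?, ?, ?, ?)",
    (1, 42, "2018-01-01T00:00:00Z", "NONE", 283388962))

rows = conn.execute("""
    SELECT id, updated_at FROM issue_comments
    WHERE author_association = 'MEMBER'
      AND issue = 283388962 AND user = 1217238
    ORDER BY updated_at DESC
""").fetchall()
print(rows)  # [(367377682, '2018-02-21T16:09:00Z')]
```

The two indexes defined above cover the `issue` and `user` predicates of exactly this kind of query.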
Powered by Datasette · About: xarray-datasette