
issue_comments


8 rows where author_association = "MEMBER", issue = 283388962 and user = 306380, sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
362773103 https://github.com/pydata/xarray/pull/1793#issuecomment-362773103 https://api.github.com/repos/pydata/xarray/issues/1793 MDEyOklzc3VlQ29tbWVudDM2Mjc3MzEwMw== mrocklin 306380 2018-02-03T03:13:04Z 2018-02-03T03:13:04Z MEMBER

Honestly we don't have a very clean mechanism for this. Probably you want to look at dask.context._globals['get']. This should either be None, which means "use the collection's default" (dask.threaded.get in your case) or a callable. If you're using the distributed scheduler then this will be a method of a Client object. Again, not a very clean thing to test for. My apologies.
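
A minimal sketch of that check, assuming the 2018-era dask API where the active scheduler is stored in dask.context._globals (the helper name here is hypothetical):

import dask

def using_distributed_scheduler():
    # None means "use the collection's default" (dask.threaded.get in this case)
    get = dask.context._globals.get("get")
    if get is None:
        return False
    # With the distributed scheduler, ``get`` is a bound method of a Client object
    owner = getattr(get, "__self__", None)
    return type(owner).__name__ == "Client"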

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  fix distributed writes 283388962
362698024 https://github.com/pydata/xarray/pull/1793#issuecomment-362698024 https://api.github.com/repos/pydata/xarray/issues/1793 MDEyOklzc3VlQ29tbWVudDM2MjY5ODAyNA== mrocklin 306380 2018-02-02T20:28:55Z 2018-02-02T20:28:55Z MEMBER

Performance-wise, Dask locks will probably add 1-10ms of communication overhead (probably on the lower end of that), plus whatever contention there will be from locking. You can make these locks as fine-grained as you want at no cost, for example by defining a lock per filename with Lock(filename) (which would presumably reduce contention).
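
A rough sketch of the lock-per-filename idea using distributed's Lock; the helper function is hypothetical and assumes a running distributed client:

from distributed import Client, Lock

client = Client()  # connect to (or start) a distributed scheduler

def append_to(filename, text):
    # One lock per file name: writers only contend when they target the same file
    with Lock(filename):
        with open(filename, "a") as f:
            f.write(text)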

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  fix distributed writes 283388962
362673511 https://github.com/pydata/xarray/pull/1793#issuecomment-362673511 https://api.github.com/repos/pydata/xarray/issues/1793 MDEyOklzc3VlQ29tbWVudDM2MjY3MzUxMQ== mrocklin 306380 2018-02-02T18:56:16Z 2018-02-02T18:56:16Z MEMBER

SerializableLock isn't appropriate here if you want inter-process locking. Dask's lock is probably better here if you're running with the distributed scheduler.

On Feb 2, 2018 1:38 PM, "Joe Hamman" notifications@github.com wrote:

The tests failure indicates that the netcdf4/h5netcdf libraries cannot open the file in write/append mode, and it seems that is because the file is already open (by another process).

Two questions:

  1. autoclose is False in to_netcdf. That generally makes sense to me, but I'm concerned that we're not being explicit enough about closing the file after each process is done interacting with it. Do we have a way to lock until the file is closed?
  2. The lock we're using is dask's SerializableLock. Is that the correct Lock to be using? There is also the distributed.Lock.

xref: dask/dask#1892 https://github.com/dask/dask/issues/1892

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/pydata/xarray/pull/1793#issuecomment-362657475, or mute the thread https://github.com/notifications/unsubscribe-auth/AASszNohUQirqOLvJZ_5dkoQ74icEsCkks5tQ0w4gaJpZM4RHpBe .
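
A hedged sketch of the distinction being discussed; the chooser function is hypothetical, but the two lock types are dask.utils.SerializableLock and distributed.Lock:

from dask.utils import SerializableLock
from distributed import Lock as DistributedLock

def choose_write_lock(filename, on_distributed_scheduler):
    # SerializableLock only coordinates threads inside a single process; once
    # pickled to another worker process it becomes an independent lock, so it
    # cannot serialize writes to a file shared across processes.
    if on_distributed_scheduler:
        return DistributedLock(filename)   # cluster-wide lock
    return SerializableLock(filename)      # per-process thread lock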

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  fix distributed writes 283388962
362590407 https://github.com/pydata/xarray/pull/1793#issuecomment-362590407 https://api.github.com/repos/pydata/xarray/issues/1793 MDEyOklzc3VlQ29tbWVudDM2MjU5MDQwNw== mrocklin 306380 2018-02-02T13:46:18Z 2018-02-02T13:46:18Z MEMBER

For reference, the line

computed = restored.compute()

would have to be replaced with

(computed,) = yield c.compute(restored)

to get the same result. However, there were a few more calls to compute hidden in various functions (like to_netcdf) that would be tricky to make asynchronous, so I opted to switch to synchronous style instead.
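
For illustration, a toy version of the two styles, using the generator-based gen_cluster harness distributed had at the time (the small array stands in for whatever objects the real test builds):

import dask.array as da
from distributed.utils_test import gen_cluster

def sync_style():
    restored = da.arange(10, chunks=5)
    # Blocking call; fine when using the synchronous API
    return restored.compute()

@gen_cluster(client=True)
def test_async_style(c, s, a, b):
    restored = da.arange(10, chunks=5)
    # Inside a coroutine-style test, the blocking .compute() becomes a yielded future
    computed = yield c.compute(restored)
    assert computed.sum() == 45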

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  fix distributed writes 283388962
362589762 https://github.com/pydata/xarray/pull/1793#issuecomment-362589762 https://api.github.com/repos/pydata/xarray/issues/1793 MDEyOklzc3VlQ29tbWVudDM2MjU4OTc2Mg== mrocklin 306380 2018-02-02T13:43:33Z 2018-02-02T13:43:33Z MEMBER

I've pushed a fix for the future error. We were using a coroutine-style test with synchronous style code. More information here: http://distributed.readthedocs.io/en/latest/develop.html#writing-tests

In the future I suspect that the with cluster style tests will be easier to use for anyone not familiar with async programming. They're a little more opaque (you don't have access to the scheduler or workers), but probably match the API that you expect most people to use in practice.
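
The "with cluster" style mentioned above looks roughly like this (a sketch based on distributed's test utilities, not a test from this PR):

from distributed import Client
from distributed.utils_test import cluster

def test_with_cluster_style():
    # The scheduler and workers run in subprocesses; the test body is ordinary
    # blocking code, with no coroutines or yields required.
    with cluster() as (scheduler, workers):
        with Client(scheduler["address"]) as c:
            assert c.submit(lambda x: x + 1, 1).result() == 2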

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  fix distributed writes 283388962
360548130 https://github.com/pydata/xarray/pull/1793#issuecomment-360548130 https://api.github.com/repos/pydata/xarray/issues/1793 MDEyOklzc3VlQ29tbWVudDM2MDU0ODEzMA== mrocklin 306380 2018-01-25T17:59:34Z 2018-01-25T17:59:34Z MEMBER

I can take a look at the "future not iterable" issue sometime tomorrow.

Has anyone successfully used dask.array.store() with the distributed scheduler?

My guess is that this would be easy with a friendly storage target. I'm not sure though. cc @jakirkham who has been active on this topic recently.
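
As a sketch of what a "friendly storage target" might look like: chunk-aligned writes into a zarr directory store under the distributed scheduler (the path and sizes here are made up):

import dask.array as da
import zarr
from distributed import Client

client = Client()  # distributed scheduler

x = da.random.random((1000, 1000), chunks=(250, 250))
# Chunk-aligned writes from different workers land in different zarr chunks,
# so no inter-process lock is needed
z = zarr.open("scratch.zarr", mode="w", shape=x.shape,
              chunks=(250, 250), dtype=x.dtype)
da.store(x, z, lock=False)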

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  fix distributed writes 283388962
357105359 https://github.com/pydata/xarray/pull/1793#issuecomment-357105359 https://api.github.com/repos/pydata/xarray/issues/1793 MDEyOklzc3VlQ29tbWVudDM1NzEwNTM1OQ== mrocklin 306380 2018-01-12T00:23:09Z 2018-01-12T00:23:09Z MEMBER

I don't know. I would want to look at the fail case locally. I can try to do this near term, no promises though :/

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  fix distributed writes 283388962
352908509 https://github.com/pydata/xarray/pull/1793#issuecomment-352908509 https://api.github.com/repos/pydata/xarray/issues/1793 MDEyOklzc3VlQ29tbWVudDM1MjkwODUwOQ== mrocklin 306380 2017-12-19T22:39:43Z 2017-12-19T22:39:43Z MEMBER

The zarr test seems a bit different. I think your issue here is that you are trying to use synchronous API with the async test harness. I've changed your test and pushed to your branch (hope you don't mind). Relevant docs are here: http://distributed.readthedocs.io/en/latest/develop.html#writing-tests

Async testing is nicer in many ways, but does require you to be a bit familiar with the async/tornado API. I also suspect that operations like to_zarr really aren't yet async friendly.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  fix distributed writes 283388962

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);