issue_comments


7 rows where issue = 180516114 and user = 1217238 sorted by updated_at descending




Columns: id, html_url, issue_url, node_id, user, created_at, updated_at, author_association, body, reactions, performed_via_github_app, issue
id: 391805626
html_url: https://github.com/pydata/xarray/issues/1026#issuecomment-391805626
issue_url: https://api.github.com/repos/pydata/xarray/issues/1026
node_id: MDEyOklzc3VlQ29tbWVudDM5MTgwNTYyNg==
user: shoyer (1217238)
created_at: 2018-05-24T17:59:31Z
updated_at: 2018-05-24T17:59:31Z
author_association: MEMBER

Indeed, it looks like this works now. Extending the example from the first post:

In [3]: ds.chunk({'x': 5}).thedata.groupby('thegroup').mean()
Out[3]:
<xarray.DataArray 'thedata' (thegroup: 2)>
dask.array<shape=(2,), dtype=float64, chunksize=(1,)>
Coordinates:
  * thegroup  (thegroup) object False True
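
A minimal way to reproduce that check (the dataset from the first post is not shown on this page, so thedata and thegroup are reconstructed here with made-up values):

import numpy as np
import xarray as xr

# Reconstructed stand-in: a data variable plus a boolean grouping coordinate,
# both along a shared dimension 'x' (values are invented).
ds = xr.Dataset(
    {'thedata': ('x', np.random.rand(10))},
    coords={'thegroup': ('x', np.arange(10) % 2 == 0)},
)

# Chunk along 'x', then group by the non-dimension coordinate and reduce.
result = ds.chunk({'x': 5}).thedata.groupby('thegroup').mean()
print(result)            # lazy dask-backed DataArray with thegroup: 2 (False/True)
print(result.compute())  # evaluates the two group means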

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: multidim groupby on dask arrays: dask.array.reshape error (180516114)
id: 286181363
html_url: https://github.com/pydata/xarray/issues/1026#issuecomment-286181363
issue_url: https://api.github.com/repos/pydata/xarray/issues/1026
node_id: MDEyOklzc3VlQ29tbWVudDI4NjE4MTM2Mw==
user: shoyer (1217238)
created_at: 2017-03-13T17:28:40Z
updated_at: 2017-03-13T17:28:40Z
author_association: MEMBER

This is what I was looking for:

Frozen(SortedKeysDict({'allpoints': (1, 1, 1, 1, 1......(allpoints)....., 1, 1), 'T': (11L,)}))

So in this case (where the chunk size is already 1), dask.array.reshape could actually work fine and the error is unnecessary (we don't have the exploding task issue). So this could potentially be fixed upstream in dask.

For now, the best work-around (because you don't have any memory concerns) is to "rechunk" into a single block along the last axis before reshaping, e.g., .chunk(allpoints=259200) or .chunk(allpoints=1e9) (or something arbitrarily large).
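
As a concrete sketch of this work-around (the array below is a small stand-in, not the reporter's data; any chunk size at least as large as the axis gives a single block):

import numpy as np
import xarray as xr

# Stand-in for the thread's stacked variable: 'dis' over (time, allpoints),
# with many single-element chunks along allpoints.
dis = xr.DataArray(np.random.rand(3, 12), dims=('time', 'allpoints'), name='dis')
dis = dis.chunk({'allpoints': 1})
print(dis.chunks)            # ((3,), (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1))

# Work-around: merge allpoints into a single block before the reshape-heavy step.
# Any size greater than or equal to the axis length collapses it to one chunk.
dis_one_block = dis.chunk({'allpoints': 1_000_000_000})
print(dis_one_block.chunks)  # ((3,), (12,))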

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: multidim groupby on dask arrays: dask.array.reshape error (180516114)
id: 286152275
html_url: https://github.com/pydata/xarray/issues/1026#issuecomment-286152275
issue_url: https://api.github.com/repos/pydata/xarray/issues/1026
node_id: MDEyOklzc3VlQ29tbWVudDI4NjE1MjI3NQ==
user: shoyer (1217238)
created_at: 2017-03-13T15:58:29Z
updated_at: 2017-03-13T15:58:29Z
author_association: MEMBER

@byersiiasa What matters for dask's reshape is the array shape and chunk shape, all of which you should see when you print a dask.array (or xarray.DataArray containing one). What is the size of the chunking along time and allpoints?
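
For reference, a toy example of where those numbers show up (dimension names and sizes here are invented):

import numpy as np
import xarray as xr

arr = xr.DataArray(np.zeros((30, 100)), dims=('time', 'allpoints'))
arr = arr.chunk({'time': 10, 'allpoints': 25})

print(arr)                 # repr shows it is dask-backed and, in recent versions, the chunk size
print(arr.shape)           # (30, 100), the array shape
print(arr.chunks)          # ((10, 10, 10), (25, 25, 25, 25)), chunk sizes per dimension
print(arr.data.chunksize)  # (10, 25), the shape of a single block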

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: multidim groupby on dask arrays: dask.array.reshape error (180516114)
id: 286123584
html_url: https://github.com/pydata/xarray/issues/1026#issuecomment-286123584
issue_url: https://api.github.com/repos/pydata/xarray/issues/1026
node_id: MDEyOklzc3VlQ29tbWVudDI4NjEyMzU4NA==
user: shoyer (1217238)
created_at: 2017-03-13T14:29:12Z
updated_at: 2017-03-13T14:29:12Z
author_association: MEMBER

That array is loaded in numpy already - can you share the dask version?

On Mon, Mar 13, 2017 at 2:57 AM byersiiasa notifications@github.com wrote:

<xarray.DataArray 'dis' (time: 30, allpoints: 259200)>
array([[ 9.969210e+36,  9.969210e+36,  9.969210e+36, ...,  9.969210e+36,  9.969210e+36,  9.969210e+36],
       [ 9.969210e+36,  9.969210e+36,  9.969210e+36, ...,  9.969210e+36,  9.969210e+36,  9.969210e+36],
       [ 9.969210e+36,  9.969210e+36,  9.969210e+36, ...,  9.969210e+36,  9.969210e+36,  9.969210e+36],
       ...,
       [ 9.969210e+36,  9.969210e+36,  9.969210e+36, ...,  9.969210e+36,  9.969210e+36,  9.969210e+36],
       [ 9.969210e+36,  9.969210e+36,  9.969210e+36, ...,  9.969210e+36,  9.969210e+36,  9.969210e+36],
       [ 9.969210e+36,  9.969210e+36,  9.969210e+36, ...,  9.969210e+36,  9.969210e+36,  9.969210e+36]])
Coordinates:
  * time       (time) datetime64[ns] 1971-01-01 1972-01-01 1973-01-01 ...
  * allpoints  (allpoints) MultiIndex
  - lon        (allpoints) float64 -179.8 -179.8 -179.8 -179.8 -179.8 -179.8 ...
  - lat        (allpoints) float64 89.75 89.25 88.75 88.25 87.75 87.25 86.75 ...

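The tell is the repr quoted above: a numpy-backed DataArray prints its values, while a dask-backed one shows a dask.array<...> placeholder instead. A small illustration with made-up data:

import numpy as np
import xarray as xr

a = xr.DataArray(np.ones((2, 3)), dims=('time', 'allpoints'))
print(type(a.data))   # <class 'numpy.ndarray'>, i.e. loaded in memory
print(a.chunks)       # None, no dask chunks

b = a.chunk({'allpoints': 1})
print(type(b.data))   # <class 'dask.array.core.Array'>
print(b.chunks)       # ((2,), (1, 1, 1))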

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: multidim groupby on dask arrays: dask.array.reshape error (180516114)
id: 285893380
html_url: https://github.com/pydata/xarray/issues/1026#issuecomment-285893380
issue_url: https://api.github.com/repos/pydata/xarray/issues/1026
node_id: MDEyOklzc3VlQ29tbWVudDI4NTg5MzM4MA==
user: shoyer (1217238)
created_at: 2017-03-11T19:23:55Z
updated_at: 2017-03-11T19:23:55Z
author_association: MEMBER

@byersiiasa can you share what stacked.dis looks like?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: multidim groupby on dask arrays: dask.array.reshape error (180516114)
id: 250997873
html_url: https://github.com/pydata/xarray/issues/1026#issuecomment-250997873
issue_url: https://api.github.com/repos/pydata/xarray/issues/1026
node_id: MDEyOklzc3VlQ29tbWVudDI1MDk5Nzg3Mw==
user: shoyer (1217238)
created_at: 2016-10-02T21:38:30Z
updated_at: 2016-10-02T21:38:30Z
author_association: MEMBER

It would look something like this:

1. Verify that chunks are the same on all dask arrays to be stacked.
2. Use np.ravel with map_blocks to flatten each block independently.
3. Construct the appropriate (non-sorted) MultiIndex to label the flattened elements.
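
A rough sketch of steps 2 and 3, under invented names and chunk sizes; this is not xarray's implementation, and da.concatenate over the .blocks accessor stands in here for the map_blocks formulation:

from itertools import product

import dask.array as da
import numpy as np
import pandas as pd

# Hypothetical 2-D dask array chunked over (lat, lon).
lat = np.linspace(89.75, -89.75, 360)
lon = np.linspace(-179.75, 179.75, 720)
data = da.random.random((lat.size, lon.size), chunks=(180, 360))

# 1. In the real case, first verify every array to be stacked has chunks
#    identical to data.chunks; here there is only one array.
nrow, ncol = data.numblocks

# 2. Flatten each block independently and concatenate in block order,
#    so no reshape ever crosses a chunk boundary.
flat = da.concatenate([data.blocks[i, j].reshape(-1)
                       for i, j in product(range(nrow), range(ncol))])

# 3. Build the matching (non-sorted, block-major) MultiIndex so the labels
#    line up with the block-concatenated element order.
row_edges = np.cumsum((0,) + data.chunks[0])
col_edges = np.cumsum((0,) + data.chunks[1])
pairs = []
for i, j in product(range(nrow), range(ncol)):
    pairs.extend(product(lat[row_edges[i]:row_edges[i + 1]],
                         lon[col_edges[j]:col_edges[j + 1]]))
index = pd.MultiIndex.from_tuples(pairs, names=['lat', 'lon'])

assert len(index) == flat.shape[0]  # one label per flattened element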

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: multidim groupby on dask arrays: dask.array.reshape error (180516114)
id: 250986266
html_url: https://github.com/pydata/xarray/issues/1026#issuecomment-250986266
issue_url: https://api.github.com/repos/pydata/xarray/issues/1026
node_id: MDEyOklzc3VlQ29tbWVudDI1MDk4NjI2Ng==
user: shoyer (1217238)
created_at: 2016-10-02T18:20:36Z
updated_at: 2016-10-02T18:20:36Z
author_association: MEMBER

This was an intentional change -- see https://github.com/dask/dask/pull/1469

Previously, we created lots of teeny tasks, which tended to negate any out-of-core benefits. The problem is that reshape promises an order for the elements it reshapes, and that order tends to split across existing chunks of dask arrays.
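
A tiny numpy illustration of that order constraint (the 2 x 6 shape and the (3, 3) column chunking are invented): label each element by the chunk it would live in, then flatten in C order.

import numpy as np

# Which chunk each element belongs to if the array is chunked as (2, (3, 3)):
chunk_id = np.array([[0, 0, 0, 1, 1, 1],
                     [0, 0, 0, 1, 1, 1]])

print(chunk_id.reshape(-1))
# [0 0 0 1 1 1 0 0 0 1 1 1]
# The flattened (C-order) output interleaves the two input chunks, so a
# chunked reshape has to split and recombine blocks to honour that order.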

We could work around this in xarray by adding custom logic to stack that keeps chunks together when reshaping, but we can't do this upstream in dask because we need to make sure we keep all the arrays aligned.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: multidim groupby on dask arrays: dask.array.reshape error (180516114)

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
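
The selection shown at the top of this page can be reproduced against that schema, for example with Python's sqlite3 module (the database filename below is an assumption about the deployment):

import sqlite3

conn = sqlite3.connect('github.db')  # hypothetical path to the Datasette database
rows = conn.execute(
    '''
    SELECT id, [user], created_at, updated_at, author_association, body
    FROM issue_comments
    WHERE issue = ? AND [user] = ?
    ORDER BY updated_at DESC
    ''',
    (180516114, 1217238),
).fetchall()
print(len(rows))  # expect 7, matching the row count above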