issue_comments

6 rows where author_association = "NONE" and issue = 180516114 sorted by updated_at descending


All 6 comments are by byersiiasa on the issue "multidim groupby on dask arrays: dask.array.reshape error" (180516114), with author_association NONE.
Comment 286381505 · byersiiasa · 2017-03-14T10:30:24Z
https://github.com/pydata/xarray/issues/1026#issuecomment-286381505

Thanks - this is working well.

Reverting to xarray 0.8.2 and dask 0.10.1 gave a combination that worked well for this particular task using delayed.

Reactions: none
Comment 286171415 · byersiiasa · 2017-03-13T16:58:06Z
https://github.com/pydata/xarray/issues/1026#issuecomment-286171415

@shoyer No chunking, as the dataset was quite small (360x720x30). Also, the calculation is along the time dimension, so this effectively disappears for each lat/lon. Hence my initial surprise that it was raising this chunk/reshape issue, since I thought all it has to do is unstack 'allpoints'.

If I print one of the dask arrays from within the function:

    print sT
    dask.array<from-va..., shape=(11L,), dtype=float64, chunksize=(11L,)>

This is 11L because the calculation returns 11 values per point to an xr.Dataset.

Others have no chunks because they are single values (one per point):

    print p_value
    dask.array<from-va..., shape=(), dtype=float64, chunksize=()>

The object (xr.Dataset) returned from the .apply function comes out with chunks:

    mle.chunks
    Frozen(SortedKeysDict({'allpoints': (1, 1, 1, 1, 1, ...(allpoints)..., 1, 1), 'T': (11L,)}))

and looks like:

    <xarray.Dataset>
    Dimensions:            (T: 11, allpoints: 259200)
    Coordinates:
      * T                  (T) int32 1 5 10 15 20 25 30 40 50 75 100
      * allpoints          (allpoints) MultiIndex
      - allpoints_level_0  (allpoints) float64 40.25 40.25 40.25 40.25 40.25 ...
      - allpoints_level_1  (allpoints) float64 22.75 23.25 23.75 24.25 24.75 ...
    Data variables:
        xi                 (allpoints) float64 -0.6906 -0.6906 -0.6906 -0.6906 ...
        mu                 (allpoints) float64 9.969e+36 9.969e+36 9.969e+36 ...
        sT                 (allpoints, T) float64 9.969e+36 9.969e+36 9.969e+36 ...
        KS_p_value         (allpoints) float64 3.8e-12 3.8e-12 3.8e-12 3.8e-12 ...
        sigma              (allpoints) float64 5.297e-24 5.297e-24 5.297e-24 ...
        KS_statistic       (allpoints) float64 0.6321 0.6321 0.6321 0.6321 ...
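The stack / per-point apply / unstack pattern being described can be sketched with plain NumPy (xarray/dask omitted for brevity; the shapes and the `fit_point` function below are illustrative stand-ins, not the original `combogev`):

```python
import numpy as np

# Small stand-ins for the (time, lat, lon) = (30, 360, 720) grid in the comment
time, nlat, nlon = 30, 4, 6
data = np.random.rand(time, nlat, nlon)

# "stack": collapse lat/lon into a single 'allpoints' axis
stacked = data.reshape(time, nlat * nlon)  # shape (30, 24)

# hypothetical per-point fit returning 11 values, mirroring the 11-level 'T' coordinate
def fit_point(series, n_levels=11):
    return np.linspace(series.min(), series.max(), n_levels)

# apply the fit independently to each point's time series
fitted = np.stack([fit_point(stacked[:, p]) for p in range(stacked.shape[1])])  # (24, 11)

# "unstack": restore the lat/lon grid while keeping the new T dimension
result = fitted.reshape(nlat, nlon, 11)
print(result.shape)  # (4, 6, 11)
```

The reshape in the last step is where the analogous dask operation was failing in this issue: with per-point chunks of size 1 along 'allpoints', dask's reshape could not always recombine them.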

Reactions: none
Comment 286152988 · byersiiasa · 2017-03-13T16:00:39Z
https://github.com/pydata/xarray/issues/1026#issuecomment-286152988

So, I'm not sure if this is helpful, but I'll leave these notes here just in case.

  • 0.11.0 - similar problem to @rabernat above
  • 0.10.1 - seems to work fine for what I wanted (delayed)
  • 0.9.0 - appeared to work OK, but actually I'm not convinced it was parallelising the tasks; it also resulted in massive memory issues
  • 0.14.0 - another problem; I can't remember what, but an issue to do with delayed, I think.

Reactions: none
Comment 286144002 · byersiiasa · 2017-03-13T15:33:25Z
https://github.com/pydata/xarray/issues/1026#issuecomment-286144002

I have been re-running that script you helped me with in Google groups: https://groups.google.com/forum/#!searchin/xarray/combogev%7Csort:relevance/xarray/nfNh40Zt3sU/WfhavtXgCAAJ

do you mean the delayed object from within the function? Perhaps:

    <bound method Array.visualize of dask.array<from-va..., shape=(11L,), dtype=float64, chunksize=(11L,)>>

or perhaps:

    Delayed('fit-3767d9ad6cfa517555b5800b3b5f4e41')

I am going to keep trying with different versions of dask, since this 0.9.0 doesn't seem to behave as it did previously.

Reactions: none
Comment 286062113 · byersiiasa · 2017-03-13T09:57:04Z
https://github.com/pydata/xarray/issues/1026#issuecomment-286062113

    <xarray.DataArray 'dis' (time: 30, allpoints: 259200)>
    array([[ 9.969210e+36, 9.969210e+36, 9.969210e+36, ..., 9.969210e+36, 9.969210e+36, 9.969210e+36],
           [ 9.969210e+36, 9.969210e+36, 9.969210e+36, ..., 9.969210e+36, 9.969210e+36, 9.969210e+36],
           [ 9.969210e+36, 9.969210e+36, 9.969210e+36, ..., 9.969210e+36, 9.969210e+36, 9.969210e+36],
           ...,
           [ 9.969210e+36, 9.969210e+36, 9.969210e+36, ..., 9.969210e+36, 9.969210e+36, 9.969210e+36],
           [ 9.969210e+36, 9.969210e+36, 9.969210e+36, ..., 9.969210e+36, 9.969210e+36, 9.969210e+36],
           [ 9.969210e+36, 9.969210e+36, 9.969210e+36, ..., 9.969210e+36, 9.969210e+36, 9.969210e+36]])
    Coordinates:
      * time       (time) datetime64[ns] 1971-01-01 1972-01-01 1973-01-01 ...
      * allpoints  (allpoints) MultiIndex
      - lon        (allpoints) float64 -179.8 -179.8 -179.8 -179.8 -179.8 -179.8 ...
      - lat        (allpoints) float64 89.75 89.25 88.75 88.25 87.75 87.25 86.75 ...
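The repeated 9.969210e+36 values in the dump above are netCDF's default float fill value leaking through as ordinary numbers. A minimal sketch of masking them to NaN so they don't poison statistics (the sample array is illustrative):

```python
import numpy as np

# netCDF's default float fill value, which appears as 9.969210e+36 in the repr above
FILL_VALUE = 9.969209968386869e+36

arr = np.array([1.5, FILL_VALUE, 2.5, FILL_VALUE])

# replace fill values with NaN so nan-aware reductions skip them
masked = np.where(np.isclose(arr, FILL_VALUE), np.nan, arr)
print(np.nanmean(masked))  # 2.0
```

In xarray this masking normally happens automatically when `_FillValue` is recorded in the file's attributes and decoding is enabled.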

Reactions: none
Comment 285851059 · byersiiasa · created 2017-03-11T07:51:57Z, updated 2017-03-12T14:53:35Z
https://github.com/pydata/xarray/issues/1026#issuecomment-285851059

Hi @rabernat and @shoyer, I have come across the same issue while re-running some old code, now using xarray 0.9.1 / dask 0.11.0. Was there any workaround or solution?

The issue occurs for me when trying to unstack 'allpoints', e.g.:

    mle = stacked.dis.groupby('allpoints').apply(combogev)
    dsmle = mle.unstack('allpoints')

Thanks

Also works with dask 0.9.0

Reactions: none

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
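The filtered view at the top of this page is a plain SELECT over the schema above. A self-contained sketch using Python's sqlite3 module (foreign-key clauses dropped and dummy rows inserted for brevity; the real table holds the six comments shown):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE issue_comments (
    html_url TEXT, issue_url TEXT, id INTEGER PRIMARY KEY, node_id TEXT,
    user INTEGER, created_at TEXT, updated_at TEXT, author_association TEXT,
    body TEXT, reactions TEXT, performed_via_github_app TEXT, issue INTEGER
)""")

# dummy rows standing in for the real comment data
rows = [
    (286381505, "NONE", 180516114, "2017-03-14T10:30:24Z"),
    (285851059, "NONE", 180516114, "2017-03-12T14:53:35Z"),
    (123456789, "MEMBER", 180516114, "2017-03-13T00:00:00Z"),
]
conn.executemany(
    "INSERT INTO issue_comments (id, author_association, issue, updated_at)"
    " VALUES (?, ?, ?, ?)",
    rows,
)

# the query behind the "rows where author_association = 'NONE' and
# issue = 180516114 sorted by updated_at descending" view
result = conn.execute(
    "SELECT id FROM issue_comments"
    " WHERE author_association = 'NONE' AND issue = 180516114"
    " ORDER BY updated_at DESC"
).fetchall()
print(result)  # [(286381505,), (285851059,)]
```

Sorting by `updated_at` works lexicographically here because the timestamps are stored as ISO 8601 strings, which order the same as their chronological order.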
Powered by Datasette · Queries took 143.351ms · About: xarray-datasette