issue_comments

5 rows where author_association = "MEMBER", issue = 279161550 and user = 1217238 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
349537178 https://github.com/pydata/xarray/issues/1759#issuecomment-349537178 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTUzNzE3OA== shoyer 1217238 2017-12-06T05:20:29Z 2017-12-06T05:20:29Z MEMBER

OK, I figured it out. We set dask.set_options(get=dask.get) in xarray/tests/__init__.py, which is convenient for debugging (warning: changing global state!). Naturally, that means dask ignores the value of the __dask_scheduler__ attribute.

I don't really have better ideas for how to test this, but it is reassuring that __dask_scheduler__ is the only attribute where we could reasonably expect to encounter this issue, and its implementation is basically trivial. I suppose we could add one basic sanity test that exercises each xarray data structure with the default scheduler. It might also make sense to change the scheduler explicitly (e.g., via a helper function for equality checks) rather than overriding it for all tests. (See the sketch below.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
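
The scheduler-override behaviour described in the comment above can be illustrated with a short sketch. This is not from the thread: it uses dask.config.set(scheduler='synchronous'), the modern replacement for dask.set_options(get=dask.get), and the printed values are illustrative.

```python
# Minimal sketch (assumes a current dask + xarray install): a globally
# configured scheduler takes precedence over a collection's
# __dask_scheduler__, which is why overriding the scheduler in
# xarray/tests/__init__.py meant the attribute was never exercised.
import dask
import dask.array as da
import xarray as xr

arr = xr.DataArray(da.arange(8, chunks=4), dims='x')

# xarray advertises its preferred scheduler via the dask collection protocol.
print(type(arr).__dask_scheduler__)

# Roughly equivalent in spirit to the old dask.set_options(get=dask.get):
# force the synchronous scheduler globally. dask.compute() then uses this
# setting and ignores __dask_scheduler__ entirely.
with dask.config.set(scheduler='synchronous'):
    (computed,) = dask.compute(arr)

print(computed)
```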
349391827 https://github.com/pydata/xarray/issues/1759#issuecomment-349391827 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTM5MTgyNw== shoyer 1217238 2017-12-05T18:13:18Z 2017-12-05T18:13:18Z MEMBER

I can remove all the mock-related code from test_dask.py entirely, and test_dataarray_with_dask_coords still passes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349149824 https://github.com/pydata/xarray/issues/1759#issuecomment-349149824 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTE0OTgyNA== shoyer 1217238 2017-12-05T00:04:46Z 2017-12-05T00:04:46Z MEMBER

> any suggestions on comprehensive ways to test things?

I still don't understand what actually went wrong here. It looks like we have test coverage for calling compute on an xarray.DataArray (see test_dataarray_with_dask_coords), but even though the exact example from that test fails at the REPL, the test passes when called with pytest:
```
In [5]: import xarray as xr

In [6]: import numpy as np

In [7]: import dask

In [8]: import dask.array as da

In [9]: import toolz
   ...: x = xr.Variable('x', da.arange(8, chunks=(4,)))
   ...: y = xr.Variable('y', da.arange(8, chunks=(4,)) * 2)
   ...: data = da.random.random((8, 8), chunks=(4, 4)) + 1
   ...: array = xr.DataArray(data, dims=['x', 'y'])
   ...: array.coords['xx'] = x
   ...: array.coords['yy'] = y
   ...:
   ...: assert dict(array.__dask_graph__()) == toolz.merge(data.__dask_graph__(),
   ...:                                                    x.__dask_graph__(),
   ...:                                                    y.__dask_graph__())
   ...:
   ...: (array2,) = dask.compute(array)
   ...:

ValueError                                Traceback (most recent call last)
<ipython-input-9-4aaee405bed6> in <module>()
     11                                        y.__dask_graph__())
     12
---> 13 (array2,) = dask.compute(array)

~/conda/envs/xarray-py36/lib/python3.6/site-packages/dask/base.py in compute(*args, **kwargs)
    334     results_iter = iter(results)
    335     return tuple(a if f is None else f(next(results_iter), *a)
--> 336                  for f, a in postcomputes)
    337
    338

~/conda/envs/xarray-py36/lib/python3.6/site-packages/dask/base.py in <genexpr>(.0)
    334     results_iter = iter(results)
    335     return tuple(a if f is None else f(next(results_iter), *a)
--> 336                  for f, a in postcomputes)
    337
    338

~/dev/xarray/xarray/core/dataarray.py in _dask_finalize(results, func, args, name)
    607     @staticmethod
    608     def _dask_finalize(results, func, args, name):
--> 609         ds = func(results, *args)
    610         variable = ds._variables.pop(_THIS_ARRAY)
    611         coords = ds._variables

~/dev/xarray/xarray/core/dataset.py in _dask_postcompute(results, info, *args)
    551                 func, args2 = v
    552                 r = results2.pop()
--> 553                 result = func(r, *args2)
    554             else:
    555                 result = v

~/dev/xarray/xarray/core/variable.py in _dask_finalize(results, array_func, array_args, dims, attrs, encoding)
    389             results = {k: v for k, v in results.items() if k[0] == name}  # cull
    390         data = array_func(results, *array_args)
--> 391         return Variable(dims, data, attrs=attrs, encoding=encoding)
    392
    393     @property

~/dev/xarray/xarray/core/variable.py in __init__(self, dims, data, attrs, encoding, fastpath)
    267         """
    268         self._data = as_compatible_data(data, fastpath=fastpath)
--> 269         self._dims = self._parse_dimensions(dims)
    270         self._attrs = None
    271         self._encoding = None

~/dev/xarray/xarray/core/variable.py in _parse_dimensions(self, dims)
    431             raise ValueError('dimensions %s must have the same length as the '
    432                              'number of data dimensions, ndim=%s'
--> 433                              % (dims, self.ndim))
    434         return dims
    435

ValueError: dimensions ('x',) must have the same length as the number of data dimensions, ndim=0
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349148049 https://github.com/pydata/xarray/issues/1759#issuecomment-349148049 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTE0ODA0OQ== shoyer 1217238 2017-12-04T23:55:28Z 2017-12-04T23:55:28Z MEMBER

> Any objection to having the .compute methods point to dask.compute if the dask.__version__ is appropriate?

Yes, this seems like a small win. (See the sketch below.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
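
A hypothetical sketch of the dispatch being agreed to above (not xarray's actual implementation): gate .compute() on the installed dask version, since only releases that ship the dask collection protocol (roughly dask 0.16 and later) understand xarray objects directly. The helper name and version cutoff are assumptions.

```python
# Hypothetical sketch only; names other than dask.compute and dask.__version__
# are illustrative, and the cutoff version is an assumption.
from packaging.version import Version
import dask

HAS_DASK_COLLECTION_PROTOCOL = Version(dask.__version__) >= Version("0.16.0")

def compute(self, **kwargs):
    """Meant as a DataArray/Dataset method body, shown standalone here."""
    if HAS_DASK_COLLECTION_PROTOCOL:
        # dask.compute() accepts any object implementing __dask_graph__,
        # __dask_keys__, __dask_postcompute__, and friends.
        (result,) = dask.compute(self, **kwargs)
        return result
    # Older dask: fall back to the object's own loading machinery.
    return self.copy(deep=False).load(**kwargs)
```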
349134458 https://github.com/pydata/xarray/issues/1759#issuecomment-349134458 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTEzNDQ1OA== shoyer 1217238 2017-12-04T22:54:47Z 2017-12-04T22:54:47Z MEMBER

I don't think you're doing anything wrong. This looks like a bug related to @mrocklin's recent addition of dask duck methods for xarray.

Here are the values being passed to xarray.Dataset._dask_postcompute:

    results = ('mean_agg-aggregate-21a3ca7e382bdeee43ea06bee1ce3feb', 0)
    info = [(True, <this-array>, (<function Variable._dask_finalize at 0x118156158>, (<function finalize at 0x118350730>, (), ('y',), None, None)))]
    args = (set(), {'y': 2}, None, None, None)

It seems like something is going wrong on the dask side: results should be the computed numpy array, not an uncomputed dask key. (See the sketch below.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
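
For reference, here is a minimal sketch of the dask custom-collection contract that Dataset._dask_postcompute participates in; the class below is illustrative, not xarray code. The point is that dask.compute() executes the graph first and only then calls the finalize function returned by __dask_postcompute__() on the already-computed values.

```python
import dask
from dask.local import get_sync  # synchronous scheduler

class MyCollection:
    """Toy dask collection; names other than the dunder methods are made up."""

    def __init__(self, key, graph):
        self._key = key
        self._graph = graph

    def __dask_graph__(self):
        return self._graph

    def __dask_keys__(self):
        return [self._key]

    def __dask_postcompute__(self):
        # dask.compute() calls finalize(results, *extra_args), where `results`
        # holds the computed values for __dask_keys__() -- concrete values,
        # never uncomputed dask keys.
        return MyCollection._finalize, ()

    @staticmethod
    def _finalize(results):
        return results[0]

    # The attribute at the heart of this issue: used only when no scheduler
    # is configured globally or passed to dask.compute().
    __dask_scheduler__ = staticmethod(get_sync)

coll = MyCollection('total', {'total': (sum, [1, 2, 3])})
(value,) = dask.compute(coll)
assert value == 6
```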

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
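
As a usage note, the filtered view at the top of this page maps to a plain SELECT against this schema. A minimal sketch, assuming a local SQLite copy of this Datasette database saved as github.db (the file name is an assumption):

```python
import sqlite3

conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT id, user, created_at, updated_at, body
    FROM issue_comments
    WHERE author_association = 'MEMBER'
      AND issue = 279161550
      AND user = 1217238
    ORDER BY updated_at DESC
    """
).fetchall()

for comment_id, user_id, created, updated, body in rows:
    # One summary line per comment: id, last update, first line of the body.
    first_line = body.splitlines()[0] if body else ""
    print(comment_id, updated, first_line)
```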