issue_comments


17 rows where issue = 279161550 sorted by updated_at descending


user 4

  • mrocklin 10
  • shoyer 5
  • jakirkham 1
  • max-sixty 1

author_association 2

  • MEMBER 16
  • NONE 1

issue 1

  • dask compute on reduction failes with ValueError · 17 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
349772394 https://github.com/pydata/xarray/issues/1759#issuecomment-349772394 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTc3MjM5NA== jakirkham 3019665 2017-12-06T20:57:51Z 2017-12-06T20:57:51Z NONE

Given the recent turn in discussion here, might be worthwhile to share some thoughts on issue ( https://github.com/dask/dask/issues/2694 ).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349765925 https://github.com/pydata/xarray/issues/1759#issuecomment-349765925 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTc2NTkyNQ== mrocklin 306380 2017-12-06T20:32:58Z 2017-12-06T20:32:58Z MEMBER

That seems sensible to me. It would also be a good way to ensure that XArray operations adhere to all of the dask.array checks.

from dask.array.utils import assert_eq
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349763381 https://github.com/pydata/xarray/issues/1759#issuecomment-349763381 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTc2MzM4MQ== max-sixty 5635139 2017-12-06T20:24:08Z 2017-12-06T20:24:08Z MEMBER

> our own assert_eq functions that both invoke the single-threaded scheduler, and also do a variety of other sanity checks like ensuring that the expected and computed dtypes and shapes are the same, that the keynames in graphs are sensible, etc.

What do you think about using that in our tests, i.e. from dask.array.utils import assert_eq as assert_dask_equals?

Or we could even check for a dask object in our own assert_equals and pass off to dask's?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349630978 https://github.com/pydata/xarray/issues/1759#issuecomment-349630978 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTYzMDk3OA== mrocklin 306380 2017-12-06T12:52:56Z 2017-12-06T12:52:56Z MEMBER

In the dask library itself we solve this by creating our own assert_eq functions that both invoke the single-threaded scheduler and do a variety of other sanity checks, such as ensuring that the expected and computed dtypes and shapes are the same, that the key names in graphs are sensible, etc.
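As a rough illustration of the idea described above (check metadata, then compute in the calling thread and compare values), here is a minimal sketch. It is not dask's `assert_eq`: `ToyLazy` and `toy_assert_eq` are hypothetical names, and the "lazy array" is just a plain Python object carrying a declared dtype, a declared shape, and a deferred task.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class ToyLazy:
    """Hypothetical lazy array: declared metadata plus a deferred computation."""
    dtype: str
    shape: Tuple[int, ...]
    task: Callable[[], list]

    def compute(self):
        # Runs in the calling thread, standing in for a single-threaded scheduler.
        return self.task()


def toy_assert_eq(lazy, expected, expected_dtype, expected_shape):
    """Check declared metadata first, then compute and compare the values."""
    assert lazy.dtype == expected_dtype, (lazy.dtype, expected_dtype)
    assert lazy.shape == expected_shape, (lazy.shape, expected_shape)
    result = lazy.compute()
    # The computed result must agree with the shape declared before computing.
    assert len(result) == expected_shape[0], "computed length disagrees with declared shape"
    assert result == expected, (result, expected)
    return True


ok = toy_assert_eq(ToyLazy("int64", (3,), lambda: [0, 2, 4]), [0, 2, 4], "int64", (3,))
```

The point of checking metadata separately is that a lazy object can report one shape or dtype and compute another; comparing only the final values would miss that.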

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349537178 https://github.com/pydata/xarray/issues/1759#issuecomment-349537178 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTUzNzE3OA== shoyer 1217238 2017-12-06T05:20:29Z 2017-12-06T05:20:29Z MEMBER

OK, I figured it out. We set dask.set_options(get=dask.get) in xarray/tests/__init__.py, which is convenient for debugging (warning: changing global state!). Naturally, that means dask ignores the value of the __dask_scheduler__ attribute whenever the test suite is imported.

I don't really have better ideas for how to test this, but it is reassuring that __dask_scheduler__ is the only attribute where we could reasonably expect to encounter this issue and its implementation is basically trivial. I suppose we could make one basic sanity test where we test each xarray data structure using the default scheduler. Possibly it would also make sense to change the scheduler explicitly (e.g., via a helper function for equality checks) rather than overloading it for all tests.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349452364 https://github.com/pydata/xarray/issues/1759#issuecomment-349452364 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTQ1MjM2NA== mrocklin 306380 2017-12-05T21:47:18Z 2017-12-05T21:47:18Z MEMBER

> I can remove all the mock related code from test_dask.py entirely and test_dataarray_with_dask_coords still passes.

It was just a guess. Something wacky is certainly happening though. I recommend copying my code from my last comment and running pytest on it either in the root directory or in xarray/tests. I found that the outcome differed depending on location.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349391827 https://github.com/pydata/xarray/issues/1759#issuecomment-349391827 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTM5MTgyNw== shoyer 1217238 2017-12-05T18:13:18Z 2017-12-05T18:13:18Z MEMBER

I can remove all the mock related code from test_dask.py entirely and test_dataarray_with_dask_coords still passes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349151434 https://github.com/pydata/xarray/issues/1759#issuecomment-349151434 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTE1MTQzNA== mrocklin 306380 2017-12-05T00:14:13Z 2017-12-05T00:14:13Z MEMBER

> I still don't understand what actually went wrong here. It looks like we have test coverage for calling compute on an xarray.DataArray (see test_dataarray_with_dask_coords), but even though the exact example from that test fails at the repl the test passes when called with pytest:

I experienced some odd behavior when testing this within the XArray test suite.

This file would pass when within xarray/tests/ but would fail when within the root directory:

```python
import numpy as np
import xarray as xr
import dask


def test_dask_reduction():
    data = xr.DataArray(np.random.random(size=(10, 2)),
                        dims=['samples', 'features']).chunk((5, 2))
    result = dask.compute(data.mean(axis=0))
```

I suspect some odd behavior around mock, but that's probably due to a general bias/lack of understanding of that module.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349151169 https://github.com/pydata/xarray/issues/1759#issuecomment-349151169 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTE1MTE2OQ== mrocklin 306380 2017-12-05T00:12:38Z 2017-12-05T00:12:38Z MEMBER

See https://github.com/pydata/xarray/pull/1760 for a potential fix

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349149824 https://github.com/pydata/xarray/issues/1759#issuecomment-349149824 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTE0OTgyNA== shoyer 1217238 2017-12-05T00:04:46Z 2017-12-05T00:04:46Z MEMBER

> any suggestions on comprehensive ways to test things?

I still don't understand what actually went wrong here. It looks like we have test coverage for calling compute on an xarray.DataArray (see test_dataarray_with_dask_coords), but even though the exact example from that test fails at the repl the test passes when called with pytest:

```
In [5]: import xarray as xr

In [6]: import numpy as np

In [7]: import dask

In [8]: import dask.array as da

In [9]: import toolz
   ...: x = xr.Variable('x', da.arange(8, chunks=(4,)))
   ...: y = xr.Variable('y', da.arange(8, chunks=(4,)) * 2)
   ...: data = da.random.random((8, 8), chunks=(4, 4)) + 1
   ...: array = xr.DataArray(data, dims=['x', 'y'])
   ...: array.coords['xx'] = x
   ...: array.coords['yy'] = y
   ...:
   ...: assert dict(array.__dask_graph__()) == toolz.merge(data.__dask_graph__(),
   ...:                                                    x.__dask_graph__(),
   ...:                                                    y.__dask_graph__())
   ...:
   ...: (array2,) = dask.compute(array)
   ...:

ValueError                                Traceback (most recent call last)
<ipython-input-9-4aaee405bed6> in <module>()
     11                                                    y.__dask_graph__())
     12
---> 13 (array2,) = dask.compute(array)

~/conda/envs/xarray-py36/lib/python3.6/site-packages/dask/base.py in compute(*args, **kwargs)
    334     results_iter = iter(results)
    335     return tuple(a if f is None else f(next(results_iter), *a)
--> 336                  for f, a in postcomputes)
    337
    338

~/conda/envs/xarray-py36/lib/python3.6/site-packages/dask/base.py in <genexpr>(.0)
    334     results_iter = iter(results)
    335     return tuple(a if f is None else f(next(results_iter), *a)
--> 336                  for f, a in postcomputes)
    337
    338

~/dev/xarray/xarray/core/dataarray.py in _dask_finalize(results, func, args, name)
    607     @staticmethod
    608     def _dask_finalize(results, func, args, name):
--> 609         ds = func(results, *args)
    610         variable = ds._variables.pop(_THIS_ARRAY)
    611         coords = ds._variables

~/dev/xarray/xarray/core/dataset.py in _dask_postcompute(results, info, *args)
    551                 func, args2 = v
    552                 r = results2.pop()
--> 553                 result = func(r, *args2)
    554             else:
    555                 result = v

~/dev/xarray/xarray/core/variable.py in _dask_finalize(results, array_func, array_args, dims, attrs, encoding)
    389             results = {k: v for k, v in results.items() if k[0] == name}  # cull
    390         data = array_func(results, *array_args)
--> 391         return Variable(dims, data, attrs=attrs, encoding=encoding)
    392
    393     @property

~/dev/xarray/xarray/core/variable.py in __init__(self, dims, data, attrs, encoding, fastpath)
    267         """
    268         self._data = as_compatible_data(data, fastpath=fastpath)
--> 269         self._dims = self._parse_dimensions(dims)
    270         self._attrs = None
    271         self._encoding = None

~/dev/xarray/xarray/core/variable.py in _parse_dimensions(self, dims)
    431             raise ValueError('dimensions %s must have the same length as the '
    432                              'number of data dimensions, ndim=%s'
--> 433                              % (dims, self.ndim))
    434         return dims
    435

ValueError: dimensions ('x',) must have the same length as the number of data dimensions, ndim=0
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349149217 https://github.com/pydata/xarray/issues/1759#issuecomment-349149217 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTE0OTIxNw== mrocklin 306380 2017-12-05T00:01:20Z 2017-12-05T00:01:20Z MEMBER

Also worth pointing out that this is likely the kind of bug that would have been caught with static typing

On Mon, Dec 4, 2017 at 6:55 PM, Stephan Hoyer notifications@github.com wrote:

> > Any objection to having the .compute methods point to dask.compute if the dask.__version__ is appropriate?
>
> Yes, this seems like a small win.


{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349148049 https://github.com/pydata/xarray/issues/1759#issuecomment-349148049 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTE0ODA0OQ== shoyer 1217238 2017-12-04T23:55:28Z 2017-12-04T23:55:28Z MEMBER

> Any objection to having the .compute methods point to dask.compute if the dask.__version__ is appropriate?

Yes, this seems like a small win.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349145345 https://github.com/pydata/xarray/issues/1759#issuecomment-349145345 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTE0NTM0NQ== mrocklin 306380 2017-12-04T23:43:05Z 2017-12-04T23:43:05Z MEMBER

Any objection to having the .compute methods point to dask.compute if the dask.__version__ is appropriate?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349144660 https://github.com/pydata/xarray/issues/1759#issuecomment-349144660 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTE0NDY2MA== mrocklin 306380 2017-12-04T23:40:18Z 2017-12-04T23:40:18Z MEMBER

Here is the problem. It was just a silly typo.

```diff
diff --git a/xarray/core/dataarray.py b/xarray/core/dataarray.py
index 0516b47..263860d 100644
--- a/xarray/core/dataarray.py
+++ b/xarray/core/dataarray.py
@@ -594,7 +594,7 @@ class DataArray(AbstractArray, BaseDataObject):
 
     @property
     def __dask_scheduler__(self):
-        return self._to_temp_dataset().__dask_optimize__
+        return self._to_temp_dataset().__dask_scheduler__
 
     def __dask_postcompute__(self):
         func, args = self._to_temp_dataset().__dask_postcompute__()
```
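
To see how a one-word typo like this can end in the `ValueError` from the traceback, here is a toy model of the compute path. It is not dask's or xarray's actual code: `toy_scheduler`, `toy_optimize`, `toy_compute`, `Fixed` and `Typo` are hypothetical stand-ins. The only behavior borrowed from the traceback is that the value returned by the scheduler call is consumed with `iter(results)`. Under that assumption, if the attribute hands back an optimizer (which returns a graph) instead of a scheduler (which returns values), iteration yields graph keys rather than computed data.

```python
# Toy model only: none of these names are real dask or xarray APIs.

def toy_scheduler(graph, keys):
    """Stand-in for a scheduler: run each task and return the computed values."""
    return [graph[key]() for key in keys]


def toy_optimize(graph, keys):
    """Stand-in for an optimizer: return a (possibly smaller) graph, not values."""
    return {key: graph[key] for key in keys}


class Fixed:
    scheduler = staticmethod(toy_scheduler)   # attribute returns a scheduler


class Typo:
    scheduler = staticmethod(toy_optimize)    # attribute returns the optimizer by mistake


def toy_compute(obj, graph, keys):
    """Ask the object for its scheduler, call it, then iterate over the results."""
    results = obj.scheduler(graph, keys)
    results_iter = iter(results)              # iterating a dict yields its keys
    return next(results_iter)


graph = {("mean_agg-aggregate", 0): lambda: 0.5}
keys = [("mean_agg-aggregate", 0)]

good = toy_compute(Fixed(), graph, keys)      # the computed value
bad = toy_compute(Typo(), graph, keys)        # the graph key, never computed
print(good, bad)
```

In the toy model, `bad` is a `(name, index)` tuple, which has the same shape as the uncomputed key reported earlier in the thread.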

@shoyer any suggestions on comprehensive ways to test things?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349141548 https://github.com/pydata/xarray/issues/1759#issuecomment-349141548 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTE0MTU0OA== mrocklin 306380 2017-12-04T23:25:43Z 2017-12-04T23:25:43Z MEMBER

While testing this, I oddly learned that adding the following line makes the test pass:

    from xarray.tests import mock

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349139597 https://github.com/pydata/xarray/issues/1759#issuecomment-349139597 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTEzOTU5Nw== mrocklin 306380 2017-12-04T23:17:47Z 2017-12-04T23:17:47Z MEMBER

I'll take a look

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550
349134458 https://github.com/pydata/xarray/issues/1759#issuecomment-349134458 https://api.github.com/repos/pydata/xarray/issues/1759 MDEyOklzc3VlQ29tbWVudDM0OTEzNDQ1OA== shoyer 1217238 2017-12-04T22:54:47Z 2017-12-04T22:54:47Z MEMBER

I don't think you're doing anything wrong. This looks like a bug related to @mrocklin's recent addition of dask duck methods for xarray.

Here are the values being passed to xarray.Dataset._dask_postcompute:

    results = ('mean_agg-aggregate-21a3ca7e382bdeee43ea06bee1ce3feb', 0)
    info = [(True, <this-array>, (<function Variable._dask_finalize at 0x118156158>, (<function finalize at 0x118350730>, (), ('y',), None, None)))]
    args = (set(), {'y': 2}, None, None, None)

It seems like something is going wrong on the dask side: results should be the computed numpy array, not an uncomputed dask key.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  dask compute on reduction failes with ValueError 279161550


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);