issues: 332762756
column | value
---|---
id | 332762756
node_id | MDU6SXNzdWUzMzI3NjI3NTY=
number | 2234
title | fillna error with distributed
user | 1197350
state | closed
locked | 0
assignee |
milestone |
comments | 3
created_at | 2018-06-15T12:54:54Z
updated_at | 2018-06-15T13:13:54Z
closed_at | 2018-06-15T13:13:54Z
author_association | MEMBER
active_lock_reason |
draft |
pull_request |
reactions | { "url": "https://api.github.com/repos/pydata/xarray/issues/2234/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
performed_via_github_app |
state_reason | completed
repo | 13221727
type | issue

body:

**Code Sample, a copy-pastable example if possible**

The following code works with the default dask threaded scheduler. It fails with distributed. I see the following error on the client side:

```
KilledWorker                              Traceback (most recent call last)
<ipython-input-7-5ed3c292af2e> in <module>()
----> 1 da.fillna(0.).mean().load()

/opt/conda/lib/python3.6/site-packages/xarray/core/dataarray.py in load(self, **kwargs)
    631         dask.array.compute
    632         """
--> 633         ds = self._to_temp_dataset().load(**kwargs)
    634         new = self._from_temp_dataset(ds)
    635         self._variable = new._variable

/opt/conda/lib/python3.6/site-packages/xarray/core/dataset.py in load(self, **kwargs)
    489
    490             # evaluate all the dask arrays simultaneously
--> 491             evaluated_data = da.compute(*lazy_data.values(), **kwargs)
    492
    493             for k, data in zip(lazy_data, evaluated_data):

/opt/conda/lib/python3.6/site-packages/dask/base.py in compute(*args, **kwargs)
    398     keys = [x.__dask_keys__() for x in collections]
    399     postcomputes = [x.__dask_postcompute__() for x in collections]
--> 400     results = schedule(dsk, keys, **kwargs)
    401     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    402

/opt/conda/lib/python3.6/site-packages/distributed/client.py in get(self, dsk, keys, restrictions, loose_restrictions, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, **kwargs)
   2157         try:
   2158             results = self.gather(packed, asynchronous=asynchronous,
-> 2159                                   direct=direct)
   2160         finally:
   2161             for f in futures.values():

/opt/conda/lib/python3.6/site-packages/distributed/client.py in gather(self, futures, errors, maxsize, direct, asynchronous)
   1560             return self.sync(self._gather, futures, errors=errors,
   1561                              direct=direct, local_worker=local_worker,
-> 1562                              asynchronous=asynchronous)
   1563
   1564     @gen.coroutine

/opt/conda/lib/python3.6/site-packages/distributed/client.py in sync(self, func, *args, **kwargs)
    650             return future
    651         else:
--> 652             return sync(self.loop, func, *args, **kwargs)
    653
    654     def __repr__(self):

/opt/conda/lib/python3.6/site-packages/distributed/utils.py in sync(loop, func, *args, **kwargs)
    273             e.wait(10)
    274     if error[0]:
--> 275         six.reraise(*error[0])
    276     else:
    277         return result[0]

/opt/conda/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
    691             if value.__traceback__ is not tb:
    692                 raise value.with_traceback(tb)
--> 693             raise value
    694         finally:
    695             value = None

/opt/conda/lib/python3.6/site-packages/distributed/utils.py in f()
    258             yield gen.moment
    259             thread_state.asynchronous = True
--> 260             result[0] = yield make_coro()
    261         except Exception as exc:
    262             error[0] = sys.exc_info()

/opt/conda/lib/python3.6/site-packages/tornado/gen.py in run(self)
   1097
   1098                     try:
-> 1099                         value = future.result()
   1100                     except Exception:
   1101                         self.had_exception = True

/opt/conda/lib/python3.6/site-packages/tornado/gen.py in run(self)
   1105                     if exc_info is not None:
   1106                         try:
-> 1107                             yielded = self.gen.throw(*exc_info)
   1108                         finally:
   1109                             # Break up a reference to itself

/opt/conda/lib/python3.6/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker)
   1437                             six.reraise(type(exception),
   1438                                         exception,
-> 1439                                         traceback)
   1440                     if errors == 'skip':
   1441                         bad_keys.add(key)

/opt/conda/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
    691             if value.__traceback__ is not tb:
    692                 raise value.with_traceback(tb)
--> 693             raise value
    694         finally:
    695             value = None

KilledWorker: ("('isna-mean_chunk-where-mean_agg-aggregate-74ec0f30171c1c667640f1f18df5f84b',)", 'tcp://10.20.197.7:43357')
```

This could very well be a distributed issue. Or a pandas issue. I'm not too sure what is going on. Why is pandas even involved at all?

**Problem description**

This should not raise an error. It worked fine in previous versions, but something in our latest environment has caused it to break.

**Expected Output**

Output of
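The original code sample did not survive this export, but the traceback shows the failing call was `da.fillna(0.).mean().load()` executed while a `dask.distributed` client was active (see the `distributed/client.py` frames). The sketch below is a hypothetical minimal reproduction under those assumptions: the synthetic chunked array and the local `Client()` stand in for the reporter's real dataset and cluster, which are not described in the export.

```python
# Hypothetical reproduction sketch: the data construction and cluster setup are
# assumptions; only the failing call, da.fillna(0.).mean().load(), comes from
# the traceback in the report.
import numpy as np
import dask.array
import xarray as xr
from dask.distributed import Client

# Synthetic data with missing values, chunked so xarray keeps it as a dask array.
values = np.random.rand(1000, 1000)
values[values > 0.5] = np.nan
da = xr.DataArray(
    dask.array.from_array(values, chunks=(250, 250)),
    dims=("x", "y"),
    name="example",
)

# Works with the default threaded scheduler:
print(da.fillna(0.).mean().compute())

# Once a distributed client exists, compute()/load() are routed through it;
# in the reporter's environment this is where KilledWorker was raised.
client = Client()  # local cluster for illustration; the report shows a tcp:// worker address
print(da.fillna(0.).mean().load())
```

Creating a `Client` makes the distributed scheduler the default for subsequent dask computations, which is why the same `.load()` call that succeeds under the threaded scheduler takes the code path through `distributed/client.py` seen in the traceback.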