html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1971#issuecomment-453865008,https://api.github.com/repos/pydata/xarray/issues/1971,453865008,MDEyOklzc3VlQ29tbWVudDQ1Mzg2NTAwOA==,2443309,2019-01-13T20:58:20Z,2019-01-13T20:58:20Z,MEMBER,Closing this now. The distributed integration test module seems to be covering our IO use cases well enough. I don't think we need to do anything here at this time.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302930480
https://github.com/pydata/xarray/issues/1971#issuecomment-392572591,https://api.github.com/repos/pydata/xarray/issues/1971,392572591,MDEyOklzc3VlQ29tbWVudDM5MjU3MjU5MQ==,6404167,2018-05-28T17:12:51Z,2018-05-28T17:13:56Z,CONTRIBUTOR,"Seems like the distributed scheduler is the advised one to use in general, so maybe some tests could be added for it. For disk IO in particular, it would be interesting to see the difference.

http://dask.pydata.org/en/latest/setup.html

> Note that the newer dask.distributed scheduler is often preferable even on single workstations. It contains many diagnostics and features not found in the older single-machine scheduler. The following pages explain in more detail how to set up Dask on a variety of local and distributed hardware.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302930480
https://github.com/pydata/xarray/issues/1971#issuecomment-371462262,https://api.github.com/repos/pydata/xarray/issues/1971,371462262,MDEyOklzc3VlQ29tbWVudDM3MTQ2MjI2Mg==,306380,2018-03-08T11:35:25Z,2018-03-08T11:35:25Z,MEMBER,"FWIW, most of the logic within the dask collections (array, dataframe, delayed) is only tested with `dask.local.get_sync`. This also makes the test suite much faster. Obviously, though, for things like writing to disk it's useful to check different schedulers.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302930480
https://github.com/pydata/xarray/issues/1971#issuecomment-371334589,https://api.github.com/repos/pydata/xarray/issues/1971,371334589,MDEyOklzc3VlQ29tbWVudDM3MTMzNDU4OQ==,2443309,2018-03-08T00:27:52Z,2018-03-08T00:27:52Z,MEMBER,"I managed to dig up some more information here. I was having a test failure in [`test_serializable_locks`](https://github.com/jhamman/xarray/blob/5290484ff2d9402dd16a8879351dd9ec1f2d4269/xarray/tests/test_distributed.py#L173-L188) resulting in a traceback that looks like:

```
...
        timeout_handle = self.add_timeout(self.time() + timeout, self.stop)
        self.start()
        if timeout is not None:
            self.remove_timeout(timeout_handle)
        if not future_cell[0].done():
>           raise TimeoutError('Operation timed out after %s seconds' % timeout)
E           tornado.ioloop.TimeoutError: Operation timed out after 10 seconds

../../../anaconda/envs/xarray36/lib/python3.6/site-packages/tornado/ioloop.py:457: TimeoutError
```

From then on we were using the distributed scheduler, and any tests that used dask resulted in an additional timeout (or similar error). Unfortunately, my attempts to provide an MCVE have come up short. If I can come up with one, I'll report it upstream, but as it is, I can't really transfer this behavior outside of my example.
cc @mrocklin","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302930480
https://github.com/pydata/xarray/issues/1971#issuecomment-371004338,https://api.github.com/repos/pydata/xarray/issues/1971,371004338,MDEyOklzc3VlQ29tbWVudDM3MTAwNDMzOA==,1217238,2018-03-07T02:48:16Z,2018-03-07T02:48:16Z,MEMBER,"Huh, that's interesting. Yes, I suppose we should at least consider parametric tests using both dask's multithreaded and distributed schedulers.

Though I'll note that for tests we actually set the default scheduler to dask's basic non-parallelized get, for easier debugging:
https://github.com/pydata/xarray/blob/54468e1924174a03e7ead3be8545f687f084f4dd/xarray/tests/__init__.py#L87

For #1793, the key thing would be to ensure that we run the tests in the isolated context without changing the default scheduler.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302930480
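
The comments above discuss exercising xarray's dask-backed IO under more than one scheduler. As a minimal, hedged sketch only (this is not xarray's actual test code; the fixture and test names, and the use of the modern `dask.config.set` API rather than the older `dask.set_options`, are assumptions for illustration), a pytest fixture could parametrize a round-trip test over the synchronous, threaded, and distributed schedulers roughly like this:

```python
# Hypothetical sketch: run one IO test under several dask schedulers.
# Not xarray's real test suite; names and structure are illustrative only.
import dask
import numpy as np
import pytest
import xarray as xr


@pytest.fixture(params=["synchronous", "threads", "distributed"])
def scheduler(request):
    """Configure a dask scheduler for the duration of one test run."""
    if request.param == "distributed":
        distributed = pytest.importorskip("distributed")
        # An in-process client registers itself as the default scheduler.
        with distributed.Client(processes=False) as client:
            yield client
    else:
        with dask.config.set(scheduler=request.param):
            yield request.param


def test_netcdf_roundtrip(tmp_path, scheduler):
    # Write and re-read a small chunked dataset under the selected scheduler.
    ds = xr.Dataset({"x": ("t", np.arange(10))}).chunk({"t": 5})
    path = tmp_path / "out.nc"
    ds.to_netcdf(path)
    with xr.open_dataset(path, chunks={"t": 5}) as actual:
        xr.testing.assert_identical(ds.compute(), actual.compute())
```

This mirrors the trade-off noted in the thread: the synchronous scheduler keeps debugging and runtimes simple, while the distributed parametrization covers the disk-IO cases where scheduler choice actually matters.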