html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/pull/1793#issuecomment-371710575,https://api.github.com/repos/pydata/xarray/issues/1793,371710575,MDEyOklzc3VlQ29tbWVudDM3MTcxMDU3NQ==,2443309,2018-03-09T04:31:05Z,2018-03-09T04:31:05Z,MEMBER,"Any final comments on this? If not, I'll probably merge this in the next day or two.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-371345709,https://api.github.com/repos/pydata/xarray/issues/1793,371345709,MDEyOklzc3VlQ29tbWVudDM3MTM0NTcwOQ==,2443309,2018-03-08T01:26:27Z,2018-03-08T01:26:27Z,MEMBER,"All the test are passing here. I would appreciate another round of reviews. @shoyer - all of your previous comments have been addressed. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-369078817,https://api.github.com/repos/pydata/xarray/issues/1793,369078817,MDEyOklzc3VlQ29tbWVudDM2OTA3ODgxNw==,2443309,2018-02-28T00:38:59Z,2018-02-28T00:38:59Z,MEMBER,I've added some additional tests and cleaned up the implementation a bit. I'd like to get reviews from a few folks and hopefully get this merged later this week. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-367493976,https://api.github.com/repos/pydata/xarray/issues/1793,367493976,MDEyOklzc3VlQ29tbWVudDM2NzQ5Mzk3Ng==,2443309,2018-02-21T22:15:09Z,2018-02-21T22:15:09Z,MEMBER,"Thanks all for the comments. I will clean this up a bit and request a full review later this week. A few things to note: 1. 
I have not tested `save_mfdataset` yet. In theory, it should work now but it will require some testing. I'll save that for another PR. 2. I will raise an informative error message when either h5netcdf or scipy are used to write files along with distributed. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-367492154,https://api.github.com/repos/pydata/xarray/issues/1793,367492154,MDEyOklzc3VlQ29tbWVudDM2NzQ5MjE1NA==,14314623,2018-02-21T22:09:24Z,2018-02-21T22:09:24Z,CONTRIBUTOR,"I would echo @rabernat in the sense that if distributed writes work for some backends, I would love to see it merged as soon as possible. Thanks for working on this, I am very excited about this feature.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-367377682,https://api.github.com/repos/pydata/xarray/issues/1793,367377682,MDEyOklzc3VlQ29tbWVudDM2NzM3NzY4Mg==,1217238,2018-02-21T16:09:00Z,2018-02-21T16:09:00Z,MEMBER,"> I don't totally understand the scipy constraints on incremental writes but could that be playing a factor here? I'm pretty sure SciPy supports incremental reads but not incremental writes. In general the entire netCDF file needs to get written at once. 
Certainly it's not possible to update only part of an array -- scipy needs it in memory as a NumPy array to copy its raw data to the netCDF file.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-367358183,https://api.github.com/repos/pydata/xarray/issues/1793,367358183,MDEyOklzc3VlQ29tbWVudDM2NzM1ODE4Mw==,1197350,2018-02-21T15:13:02Z,2018-02-21T15:13:02Z,MEMBER,"Given the fact that distributed writes never actually worked, I would be happy at this point if they worked for *some* backends (e.g. netCDF4 and zarr) but not all (e.g. scipy, h5netcdf). We could document this and make sure to raise an appropriate `NotImplementedError` if the user tries a distributed write with a non-supported backend. I make this suggestion because I know Joe has been slogging through this for a long time and it may be better to get it merged and live to fight another day. On the other hand, if Joe is optimistic about resolving the remaining backends, then by all means carry on!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-367232132,https://api.github.com/repos/pydata/xarray/issues/1793,367232132,MDEyOklzc3VlQ29tbWVudDM2NzIzMjEzMg==,2443309,2018-02-21T07:02:30Z,2018-02-21T07:02:30Z,MEMBER,"The battle of inches continues. Turning off HDF5's file locking fixes all the tests for netCDF4 (🎉 ). Scipy is not working and `h5netcdf` doesn't support `autoclose` so it isn't expected to work. 
@shoyer - I don't totally understand the scipy constraints on incremental writes but could that be playing a factor here?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-366605287,https://api.github.com/repos/pydata/xarray/issues/1793,366605287,MDEyOklzc3VlQ29tbWVudDM2NjYwNTI4Nw==,2443309,2018-02-19T07:11:37Z,2018-02-19T07:11:37Z,MEMBER,"I've this down to 4 test failures: ``` test_dask_distributed_netcdf_integration_test[NETCDF3_CLASSIC-True-scipy] test_dask_distributed_netcdf_integration_test[NETCDF3_CLASSIC-False-scipy] test_dask_distributed_netcdf_integration_test[NETCDF4_CLASSIC-False-netcdf4] test_dask_distributed_netcdf_integration_test[NETCDF4-False-netcdf4] ``` I think I'm ready for an initial review. I've made some changes to autoclose and sync so I'd like to get feedback on my approach before I spend too much time sorting out the last few failures. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-366585598,https://api.github.com/repos/pydata/xarray/issues/1793,366585598,MDEyOklzc3VlQ29tbWVudDM2NjU4NTU5OA==,2443309,2018-02-19T04:21:37Z,2018-02-19T04:21:37Z,MEMBER,This is mostly working now. I'm getting a test failure from open_dataset + distributed + autoclose so there is something to sort out there. 
,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-366557470,https://api.github.com/repos/pydata/xarray/issues/1793,366557470,MDEyOklzc3VlQ29tbWVudDM2NjU1NzQ3MA==,2443309,2018-02-18T23:18:28Z,2018-02-18T23:18:28Z,MEMBER,@shoyer - I have this working with the `netcdf4` backend with the `NETCDF3_CLASSIC` file format. I'm still having some locking issues with the HDF library and I'm not sure why. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-363273602,https://api.github.com/repos/pydata/xarray/issues/1793,363273602,MDEyOklzc3VlQ29tbWVudDM2MzI3MzYwMg==,2443309,2018-02-06T00:57:05Z,2018-02-06T00:57:05Z,MEMBER,"I think we're getting close. We're currently failing during the `sync` step and I'm hypothesizing that it is due to the file not being closed after the setup steps. That said, I wasn't able to pinpoint why/where we're missing a close. 
I think this traceback is pretty informative: ``` Traceback (most recent call last): File ""/Users/jhamman/anaconda/envs/xarray36/lib/python3.6/site-packages/distributed/worker.py"", line 1255, in add_task self.tasks[key] = _deserialize(function, args, kwargs, task) File ""/Users/jhamman/anaconda/envs/xarray36/lib/python3.6/site-packages/distributed/worker.py"", line 641, in _deserialize args = pickle.loads(args) File ""/Users/jhamman/anaconda/envs/xarray36/lib/python3.6/site-packages/distributed/protocol/pickle.py"", line 59, in loads return pickle.loads(x) File ""/Users/jhamman/Dropbox/src/xarray/xarray/backends/common.py"", line 445, in __setstate__ self.ds = self._opener(mode=self._mode) File ""/Users/jhamman/Dropbox/src/xarray/xarray/backends/netCDF4_.py"", line 204, in _open_netcdf4_group ds = nc4.Dataset(filename, mode=mode, **kwargs) File ""netCDF4/_netCDF4.pyx"", line 2015, in netCDF4._netCDF4.Dataset.__init__ File ""netCDF4/_netCDF4.pyx"", line 1636, in netCDF4._netCDF4._ensure_nc_success OSError: [Errno -101] NetCDF: HDF error: b'/var/folders/v0/qnh7jvgx5gnglpxfztxdlhk00000gn/T/tmpn_mo662_/temp-0.nc' distributed.scheduler - ERROR - error from worker tcp://127.0.0.1:63248: [Errno -101] NetCDF: HDF error: b'/var/folders/v0/qnh7jvgx5gnglpxfztxdlhk00000gn/T/tmpn_mo662_/temp-0.nc' HDF5-DIAG: Error detected in HDF5 (1.10.1) thread 140736093991744: #000: H5F.c line 586 in H5Fopen(): unable to open file major: File accessibilty minor: Unable to open file #001: H5Fint.c line 1305 in H5F_open(): unable to lock the file major: File accessibilty minor: Unable to open file #002: H5FD.c line 1839 in H5FD_lock(): driver lock request failed major: Virtual File Layer minor: Can't update object #003: H5FDsec2.c line 940 in H5FD_sec2_lock(): unable to lock file, errno = 35, error message = 'Resource temporarily unavailable' major: File accessibilty minor: Bad file ID accessed ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, 
""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-362773103,https://api.github.com/repos/pydata/xarray/issues/1793,362773103,MDEyOklzc3VlQ29tbWVudDM2Mjc3MzEwMw==,306380,2018-02-03T03:13:04Z,2018-02-03T03:13:04Z,MEMBER,"Honestly we don't have a very clean mechanism for this. Probably you want to look at `dask.context._globals['get']`. This should either be `None`, which means ""use the collection's default"" (`dask.threaded.get` in your case) or a callable. If you're using the distributed scheduler then this will be a method of a `Client` object. Again, not a very clean thing to test for. My apologies.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-362721418,https://api.github.com/repos/pydata/xarray/issues/1793,362721418,MDEyOklzc3VlQ29tbWVudDM2MjcyMTQxOA==,2443309,2018-02-02T22:05:57Z,2018-02-02T22:05:57Z,MEMBER,@mrocklin - What is the preferred method for determining which scheduler is being used?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-362698024,https://api.github.com/repos/pydata/xarray/issues/1793,362698024,MDEyOklzc3VlQ29tbWVudDM2MjY5ODAyNA==,306380,2018-02-02T20:28:55Z,2018-02-02T20:28:55Z,MEMBER,"Performance-wise Dask locks will probably add 1-10ms of communication overhead (probably on the lower end of that), plus whatever contention there will be from locking. 
You can make these locks as fine-grained as you want, for example by defining a lock per filename with `Lock(filename)` at no cost (which would presumably reduce contention).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-362697439,https://api.github.com/repos/pydata/xarray/issues/1793,362697439,MDEyOklzc3VlQ29tbWVudDM2MjY5NzQzOQ==,1217238,2018-02-02T20:26:30Z,2018-02-02T20:26:30Z,MEMBER,"A simpler way to handle locking for now (but with possibly subpar performance) would be to use a single global distributed lock. As for `autoclose`, perhaps we should make the default `autoclose=None`, which becomes `True` if using dask-distributed (or maybe using dask in general?) and otherwise is `False`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-362673882,https://api.github.com/repos/pydata/xarray/issues/1793,362673882,MDEyOklzc3VlQ29tbWVudDM2MjY3Mzg4Mg==,1217238,2018-02-02T18:57:33Z,2018-02-02T18:57:33Z,MEMBER,"We might always need to use `autoclose=True` with distributed. The problem is that in xarray's default mode of operation, we open a netCDF file (without using dask) to create variables, dimensions and attributes, keeping the file open. Then we write the data using dask (via `AbstractWritableDataStore.sync()`), but the original file is still open. As for the lock, we need locking both: - Per process: only one thread can use HDF5 for reading/writing at the same time. 
- Per file: only one worker can read/write a file at the same time.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-362673511,https://api.github.com/repos/pydata/xarray/issues/1793,362673511,MDEyOklzc3VlQ29tbWVudDM2MjY3MzUxMQ==,306380,2018-02-02T18:56:16Z,2018-02-02T18:56:16Z,MEMBER,"SerializableLock isn't appropriate here if you want inter process locking. Dask's lock is probably better here if you're running with the distributed scheduler. On Feb 2, 2018 1:38 PM, ""Joe Hamman"" wrote: > The tests failure indicates that the netcdf4/h5netcdf libraries cannot > open the file in write/append mode, and it seems that is because the file > is already open (by another process). > > Two questions: > > 1. autoclose is False to_netcdf. That generally makes sense to me but > I'm concerned that we're not being explicit enough about closing the file > after each process is done interacting with it. Do we have a way to lock > until the file is closed? > 2. The lock we're using is dask's SerializableLock. Is that the > correct Lock to be using? There is also the distributed.Lock. > > xref: dask/dask#1892 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-362657475,https://api.github.com/repos/pydata/xarray/issues/1793,362657475,MDEyOklzc3VlQ29tbWVudDM2MjY1NzQ3NQ==,2443309,2018-02-02T17:56:05Z,2018-02-02T17:56:05Z,MEMBER,"The tests failure indicates that the netcdf4/h5netcdf libraries cannot open the file in write/append mode, and it seems that is because the file is already open (by another process). Two questions: 1. 
`autoclose` is False in `to_netcdf`. That generally makes sense to me but I'm concerned that we're not being explicit enough about closing the file after each process is done interacting with it. Do we have a way to lock until the file is closed? 2. The lock we're using is dask's `SerializableLock`. Is that the correct Lock to be using? There is also the `distributed.Lock`. xref: https://github.com/dask/dask/issues/1892","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-362644064,https://api.github.com/repos/pydata/xarray/issues/1793,362644064,MDEyOklzc3VlQ29tbWVudDM2MjY0NDA2NA==,2443309,2018-02-02T17:03:59Z,2018-02-02T17:37:49Z,MEMBER,"Thanks @mrocklin for taking a look here. I reworked the tests a bit more to put the `to_netcdf` inside the distributed cluster section. Bad news is that the tests are failing again. The good news is we have a semi-informative error message that indicates we're missing a `lock` somewhere. Link to most descriptive failing test: https://travis-ci.org/pydata/xarray/jobs/336643000#L5076","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-362590407,https://api.github.com/repos/pydata/xarray/issues/1793,362590407,MDEyOklzc3VlQ29tbWVudDM2MjU5MDQwNw==,306380,2018-02-02T13:46:18Z,2018-02-02T13:46:18Z,MEMBER,"For reference, the line computed = restored.compute() would have to be replaced with (computed,) = yield c.compute(restored) To get the same result. 
However there were a few more calls to compute hidden in various functions (like `to_netcdf`) that would be tricky to make asynchronous, so I opted to switch to synchronous style instead.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-362589762,https://api.github.com/repos/pydata/xarray/issues/1793,362589762,MDEyOklzc3VlQ29tbWVudDM2MjU4OTc2Mg==,306380,2018-02-02T13:43:33Z,2018-02-02T13:43:33Z,MEMBER,"I've pushed a fix for the `future` error. We were using a coroutine-style test with synchronous style code. More information here: http://distributed.readthedocs.io/en/latest/develop.html#writing-tests In the future I suspect that the `with cluster` style tests will be easier to use for anyone not familiar with async programming. They're a little more opaque (you don't have access to the scheduler or workers), but probably match the API that you expect most people to use in practice.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-362465131,https://api.github.com/repos/pydata/xarray/issues/1793,362465131,MDEyOklzc3VlQ29tbWVudDM2MjQ2NTEzMQ==,1217238,2018-02-02T02:17:34Z,2018-02-02T02:17:34Z,MEMBER,"Looking into this a little bit, this looks like a dask-distributed bug to me. 
Somehow `Client.get()` is returning a `tornado.concurrent.Future` object, even though `sync=True`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-361106590,https://api.github.com/repos/pydata/xarray/issues/1793,361106590,MDEyOklzc3VlQ29tbWVudDM2MTEwNjU5MA==,2443309,2018-01-28T23:31:15Z,2018-01-28T23:31:15Z,MEMBER,"xref: https://github.com/pydata/xarray/issues/798 and https://github.com/dask/dask/issues/2488 which are both seem to be relevant to this discussion. I'm also remembering @pwolfram was quite involved with the original distributed integration so pinging him to see if he is interested in this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-360662747,https://api.github.com/repos/pydata/xarray/issues/1793,360662747,MDEyOklzc3VlQ29tbWVudDM2MDY2Mjc0Nw==,1197350,2018-01-26T02:05:19Z,2018-01-26T02:05:28Z,MEMBER,"I have definitely used the distributed scheduler with `dask.array.store` both via Zarr and via a custom store class I wrote: https://gist.github.com/rabernat/e54755e7de4eb5a93cc4e7f9f903e3cc But I cannot recall if I ever got it to work with netCDF.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-360659245,https://api.github.com/repos/pydata/xarray/issues/1793,360659245,MDEyOklzc3VlQ29tbWVudDM2MDY1OTI0NQ==,2443309,2018-01-26T01:43:52Z,2018-01-26T01:43:52Z,MEMBER,"Yes, the zarr backend here in xarray is also using `dask.array.store` and seems to work with distributed just fine. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-360590825,https://api.github.com/repos/pydata/xarray/issues/1793,360590825,MDEyOklzc3VlQ29tbWVudDM2MDU5MDgyNQ==,3019665,2018-01-25T20:29:58Z,2018-01-25T20:29:58Z,NONE,"Yep, using `dask.array.store` regularly with the `distributed` scheduler both on our cluster and in a local Docker image for testing. Am using Zarr Arrays as the targets for `store` to write to. Basically rechunk the data to match the chunking selected for the Zarr Array and then write out in parallel lock-free. Our cluster uses NFS for things like one's home directory. So these are accessible across nodes. Also there are other types of storage available that are a bit faster and still remain accessible across nodes. So these work pretty well.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-360548130,https://api.github.com/repos/pydata/xarray/issues/1793,360548130,MDEyOklzc3VlQ29tbWVudDM2MDU0ODEzMA==,306380,2018-01-25T17:59:34Z,2018-01-25T17:59:34Z,MEMBER,"I can take a look at the future not iterable issue sometime tomorrow. > Has anyone successfully used dask.array.store() with the distributed scheduler? My guess is that this would be easy with a friendly storage target. I'm not sure though. 
cc @jakirkham who has been active on this topic recently.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-360547539,https://api.github.com/repos/pydata/xarray/issues/1793,360547539,MDEyOklzc3VlQ29tbWVudDM2MDU0NzUzOQ==,1217238,2018-01-25T17:57:29Z,2018-01-25T17:57:29Z,MEMBER,Has anyone successfully used `dask.array.store()` with the distributed scheduler?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-360360093,https://api.github.com/repos/pydata/xarray/issues/1793,360360093,MDEyOklzc3VlQ29tbWVudDM2MDM2MDA5Mw==,1197350,2018-01-25T04:49:37Z,2018-01-25T04:49:37Z,MEMBER,"Kudos for pushing this forward. I don't have much help to offer, but I wanted to recognize your effort...this is hard stuff!","{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-360328682,https://api.github.com/repos/pydata/xarray/issues/1793,360328682,MDEyOklzc3VlQ29tbWVudDM2MDMyODY4Mg==,2443309,2018-01-25T01:14:05Z,2018-01-25T01:15:10Z,MEMBER,"I've just taken another swing at this and come up empty. I'm open to ideas in the following areas: 1. scipy backend is failing to roundtrip a length 1 datetime array: https://travis-ci.org/pydata/xarray/jobs/333068098#L4504 2. scipy, netcdf4, and h5netcdf backends are all failing inside dask-distributed: https://travis-ci.org/pydata/xarray/jobs/333068098#L4919 The good news here is that only 8 tests are failing after applying the array wrapper so I suspect we're quite close. I'm hoping @shoyer may have some ideas on (1) since I think he had implemented some scipy workarounds in the past. 
@mrocklin, I'm hoping you can point me in the right direction. All of these tests are reproducible locally. *(BTW, I have a use case that is going to need this functionality so I'm personally motivated to see it across the finish line)*","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-357105359,https://api.github.com/repos/pydata/xarray/issues/1793,357105359,MDEyOklzc3VlQ29tbWVudDM1NzEwNTM1OQ==,306380,2018-01-12T00:23:09Z,2018-01-12T00:23:09Z,MEMBER,"I don't know. I would want to look at the fail case locally. I can try to do this near term, no promises though :/","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-357069258,https://api.github.com/repos/pydata/xarray/issues/1793,357069258,MDEyOklzc3VlQ29tbWVudDM1NzA2OTI1OA==,2443309,2018-01-11T21:37:43Z,2018-01-11T21:37:43Z,MEMBER,"@mrocklin - I have a [test failing here](https://travis-ci.org/pydata/xarray/jobs/327790224#L5643) with a familiar message. E TypeError: 'Future' object is not iterable We saw this last week when debugging some pangeo things. Can you remind me what our solution was? ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-352908509,https://api.github.com/repos/pydata/xarray/issues/1793,352908509,MDEyOklzc3VlQ29tbWVudDM1MjkwODUwOQ==,306380,2017-12-19T22:39:43Z,2017-12-19T22:39:43Z,MEMBER,"The zarr test seems a bit different. I think your issue here is that you are trying to use synchronous API with the async test harness. I've changed your test and pushed to your branch (hope you don't mind). 
Relevant docs are here: http://distributed.readthedocs.io/en/latest/develop.html#writing-tests Async testing is nicer in many ways, but does require you to be a bit familiar with the async/tornado API. I also suspect that operations like `to_zarr` really aren't yet async friendly.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962 https://github.com/pydata/xarray/pull/1793#issuecomment-352906316,https://api.github.com/repos/pydata/xarray/issues/1793,352906316,MDEyOklzc3VlQ29tbWVudDM1MjkwNjMxNg==,1217238,2017-12-19T22:29:47Z,2017-12-19T22:29:47Z,MEMBER,"yes, see https://github.com/pydata/xarray/issues/1464#issuecomment-341329662","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962