html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/1793#issuecomment-371710575,https://api.github.com/repos/pydata/xarray/issues/1793,371710575,MDEyOklzc3VlQ29tbWVudDM3MTcxMDU3NQ==,2443309,2018-03-09T04:31:05Z,2018-03-09T04:31:05Z,MEMBER,"Any final comments on this? If not, I'll probably merge this in the next day or two.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-371345709,https://api.github.com/repos/pydata/xarray/issues/1793,371345709,MDEyOklzc3VlQ29tbWVudDM3MTM0NTcwOQ==,2443309,2018-03-08T01:26:27Z,2018-03-08T01:26:27Z,MEMBER,"All the tests are passing here. I would appreciate another round of reviews.
@shoyer - all of your previous comments have been addressed. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-369078817,https://api.github.com/repos/pydata/xarray/issues/1793,369078817,MDEyOklzc3VlQ29tbWVudDM2OTA3ODgxNw==,2443309,2018-02-28T00:38:59Z,2018-02-28T00:38:59Z,MEMBER,I've added some additional tests and cleaned up the implementation a bit. I'd like to get reviews from a few folks and hopefully get this merged later this week. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-367493976,https://api.github.com/repos/pydata/xarray/issues/1793,367493976,MDEyOklzc3VlQ29tbWVudDM2NzQ5Mzk3Ng==,2443309,2018-02-21T22:15:09Z,2018-02-21T22:15:09Z,MEMBER,"Thanks all for the comments. I will clean this up a bit and request a full review later this week.
A few things to note:
1. I have not tested `save_mfdataset` yet. In theory, it should work now but it will require some testing. I'll save that for another PR.
2. I will raise an informative error message when either h5netcdf or scipy are used to write files along with distributed. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-367232132,https://api.github.com/repos/pydata/xarray/issues/1793,367232132,MDEyOklzc3VlQ29tbWVudDM2NzIzMjEzMg==,2443309,2018-02-21T07:02:30Z,2018-02-21T07:02:30Z,MEMBER,"The battle of inches continues. Turning off HDF5's file locking fixes all the tests for netCDF4 (🎉 ). Scipy is not working and `h5netcdf` doesn't support `autoclose` so it isn't expected to work.
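For reference, a minimal sketch of one way to turn off HDF5 file locking. The comment above does not say which mechanism was used. This assumes the `HDF5_USE_FILE_LOCKING` environment variable read by recent HDF5 1.10 releases, and `hdf5_locking_disabled` is a hypothetical helper added only for illustration:

```python
import os

# Assumption: the HDF5 library consults this variable when it opens a file,
# so it is set here before netCDF4 or h5py would be imported.
os.environ['HDF5_USE_FILE_LOCKING'] = 'FALSE'


def hdf5_locking_disabled():
    # Hypothetical helper: report whether the variable is set to FALSE.
    return os.environ.get('HDF5_USE_FILE_LOCKING', '').upper() == 'FALSE'
```

With a distributed cluster, the variable would likely need to be set in every worker process, not only in the client.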
@shoyer - I don't totally understand the scipy constraints on incremental writes but could that be a factor here?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-366605287,https://api.github.com/repos/pydata/xarray/issues/1793,366605287,MDEyOklzc3VlQ29tbWVudDM2NjYwNTI4Nw==,2443309,2018-02-19T07:11:37Z,2018-02-19T07:11:37Z,MEMBER,"I've got this down to 4 test failures:
```
test_dask_distributed_netcdf_integration_test[NETCDF3_CLASSIC-True-scipy]
test_dask_distributed_netcdf_integration_test[NETCDF3_CLASSIC-False-scipy]
test_dask_distributed_netcdf_integration_test[NETCDF4_CLASSIC-False-netcdf4]
test_dask_distributed_netcdf_integration_test[NETCDF4-False-netcdf4]
```
I think I'm ready for an initial review. I've made some changes to autoclose and sync so I'd like to get feedback on my approach before I spend too much time sorting out the last few failures. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-366585598,https://api.github.com/repos/pydata/xarray/issues/1793,366585598,MDEyOklzc3VlQ29tbWVudDM2NjU4NTU5OA==,2443309,2018-02-19T04:21:37Z,2018-02-19T04:21:37Z,MEMBER,This is mostly working now. I'm getting a test failure from open_dataset + distributed + autoclose so there is something to sort out there. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-366557470,https://api.github.com/repos/pydata/xarray/issues/1793,366557470,MDEyOklzc3VlQ29tbWVudDM2NjU1NzQ3MA==,2443309,2018-02-18T23:18:28Z,2018-02-18T23:18:28Z,MEMBER,@shoyer - I have this working with the `netcdf4` backend for the `NETCDF3_CLASSIC` file format. I'm still having some locking issues with the HDF library and I'm not sure why. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-363273602,https://api.github.com/repos/pydata/xarray/issues/1793,363273602,MDEyOklzc3VlQ29tbWVudDM2MzI3MzYwMg==,2443309,2018-02-06T00:57:05Z,2018-02-06T00:57:05Z,MEMBER,"I think we're getting close. We're currently failing during the `sync` step and I'm hypothesizing that it is due to the file not being closed after the setup steps. That said, I wasn't able to pinpoint why/where we're missing a close. I think this traceback is pretty informative:
```
Traceback (most recent call last):
File ""/Users/jhamman/anaconda/envs/xarray36/lib/python3.6/site-packages/distributed/worker.py"", line 1255, in add_task
self.tasks[key] = _deserialize(function, args, kwargs, task)
File ""/Users/jhamman/anaconda/envs/xarray36/lib/python3.6/site-packages/distributed/worker.py"", line 641, in _deserialize
args = pickle.loads(args)
File ""/Users/jhamman/anaconda/envs/xarray36/lib/python3.6/site-packages/distributed/protocol/pickle.py"", line 59, in loads
return pickle.loads(x)
File ""/Users/jhamman/Dropbox/src/xarray/xarray/backends/common.py"", line 445, in __setstate__
self.ds = self._opener(mode=self._mode)
File ""/Users/jhamman/Dropbox/src/xarray/xarray/backends/netCDF4_.py"", line 204, in _open_netcdf4_group
ds = nc4.Dataset(filename, mode=mode, **kwargs)
File ""netCDF4/_netCDF4.pyx"", line 2015, in netCDF4._netCDF4.Dataset.__init__
File ""netCDF4/_netCDF4.pyx"", line 1636, in netCDF4._netCDF4._ensure_nc_success
OSError: [Errno -101] NetCDF: HDF error: b'/var/folders/v0/qnh7jvgx5gnglpxfztxdlhk00000gn/T/tmpn_mo662_/temp-0.nc'
distributed.scheduler - ERROR - error from worker tcp://127.0.0.1:63248: [Errno -101] NetCDF: HDF error: b'/var/folders/v0/qnh7jvgx5gnglpxfztxdlhk00000gn/T/tmpn_mo662_/temp-0.nc'
HDF5-DIAG: Error detected in HDF5 (1.10.1) thread 140736093991744:
#000: H5F.c line 586 in H5Fopen(): unable to open file
major: File accessibilty
minor: Unable to open file
#001: H5Fint.c line 1305 in H5F_open(): unable to lock the file
major: File accessibilty
minor: Unable to open file
#002: H5FD.c line 1839 in H5FD_lock(): driver lock request failed
major: Virtual File Layer
minor: Can't update object
#003: H5FDsec2.c line 940 in H5FD_sec2_lock(): unable to lock file, errno = 35, error message = 'Resource temporarily unavailable'
major: File accessibilty
minor: Bad file ID accessed
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-362721418,https://api.github.com/repos/pydata/xarray/issues/1793,362721418,MDEyOklzc3VlQ29tbWVudDM2MjcyMTQxOA==,2443309,2018-02-02T22:05:57Z,2018-02-02T22:05:57Z,MEMBER,@mrocklin - What is the preferred method for determining which scheduler is being used?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-362657475,https://api.github.com/repos/pydata/xarray/issues/1793,362657475,MDEyOklzc3VlQ29tbWVudDM2MjY1NzQ3NQ==,2443309,2018-02-02T17:56:05Z,2018-02-02T17:56:05Z,MEMBER,"The test failure indicates that the netcdf4/h5netcdf libraries cannot open the file in write/append mode, and it seems that is because the file is already open (by another process).
Two questions:
1. `autoclose` is False in `to_netcdf`. That generally makes sense to me but I'm concerned that we're not being explicit enough about closing the file after each process is done interacting with it. Do we have a way to lock until the file is closed?
2. The lock we're using is dask's `SerializableLock`. Is that the correct Lock to be using? There is also the `distributed.Lock`.
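On question 2, a rough sketch of the idea behind a token-based picklable lock such as dask's `SerializableLock`. `TokenLock` is a hypothetical stand-in written only for illustration, not dask's code. Only the token is pickled, and unpickling looks up (or creates) the lock for that token in the current process:

```python
import pickle
import threading
import uuid
from weakref import WeakValueDictionary


class TokenLock:
    '''Hypothetical picklable lock that is identified by a token.'''

    # One registry per process: token -> threading.Lock
    _locks = WeakValueDictionary()

    def __init__(self, token=None):
        self.token = token or str(uuid.uuid4())
        lock = TokenLock._locks.get(self.token)
        if lock is None:
            lock = threading.Lock()
            TokenLock._locks[self.token] = lock
        # The strong reference here keeps the registry entry alive.
        self.lock = lock

    def __enter__(self):
        self.lock.acquire()
        return self

    def __exit__(self, *exc):
        self.lock.release()

    def __getstate__(self):
        # Only the token travels through pickle, never the lock itself.
        return self.token

    def __setstate__(self, token):
        self.__init__(token)


original = TokenLock()
restored = pickle.loads(pickle.dumps(original))
same_lock_in_process = restored.lock is original.lock
```

Within one process the restored object shares the original lock. A worker in a different process would build its own lock from the token, so a lock of this kind cannot serialize file access across processes. That is the case a cluster-wide lock such as `distributed.Lock` is meant for.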
xref: https://github.com/dask/dask/issues/1892","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-362644064,https://api.github.com/repos/pydata/xarray/issues/1793,362644064,MDEyOklzc3VlQ29tbWVudDM2MjY0NDA2NA==,2443309,2018-02-02T17:03:59Z,2018-02-02T17:37:49Z,MEMBER,"Thanks @mrocklin for taking a look here. I reworked the tests a bit more to put the `to_netcdf` inside the distributed cluster section.
Bad news is that the tests are failing again. The good news is we have a semi-informative error message that indicates we're missing a `lock` somewhere.
Link to most descriptive failing test: https://travis-ci.org/pydata/xarray/jobs/336643000#L5076","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-361106590,https://api.github.com/repos/pydata/xarray/issues/1793,361106590,MDEyOklzc3VlQ29tbWVudDM2MTEwNjU5MA==,2443309,2018-01-28T23:31:15Z,2018-01-28T23:31:15Z,MEMBER,"xref: https://github.com/pydata/xarray/issues/798 and https://github.com/dask/dask/issues/2488 which both seem to be relevant to this discussion.
I'm also remembering @pwolfram was quite involved with the original distributed integration so pinging him to see if he is interested in this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-360659245,https://api.github.com/repos/pydata/xarray/issues/1793,360659245,MDEyOklzc3VlQ29tbWVudDM2MDY1OTI0NQ==,2443309,2018-01-26T01:43:52Z,2018-01-26T01:43:52Z,MEMBER,"Yes, the zarr backend here in xarray is also using `dask.array.store` and seems to work with distributed just fine. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-360328682,https://api.github.com/repos/pydata/xarray/issues/1793,360328682,MDEyOklzc3VlQ29tbWVudDM2MDMyODY4Mg==,2443309,2018-01-25T01:14:05Z,2018-01-25T01:15:10Z,MEMBER,"I've just taken another swing at this and come up empty. I'm open to ideas in the following areas:
1. scipy backend is failing to roundtrip a length 1 datetime array: https://travis-ci.org/pydata/xarray/jobs/333068098#L4504
2. scipy, netcdf4, and h5netcdf backends are all failing inside dask-distributed: https://travis-ci.org/pydata/xarray/jobs/333068098#L4919
The good news here is that only 8 tests are failing after applying the array wrapper so I suspect we're quite close. I'm hoping @shoyer may have some ideas on (1) since I think he had implemented some scipy workarounds in the past. @mrocklin, I'm hoping you can point me in the right direction.
All of these tests are reproducible locally.
*(BTW, I have a use case that is going to need this functionality so I'm personally motivated to see it across the finish line)*","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962
https://github.com/pydata/xarray/pull/1793#issuecomment-357069258,https://api.github.com/repos/pydata/xarray/issues/1793,357069258,MDEyOklzc3VlQ29tbWVudDM1NzA2OTI1OA==,2443309,2018-01-11T21:37:43Z,2018-01-11T21:37:43Z,MEMBER,"@mrocklin -
I have a [test failing here](https://travis-ci.org/pydata/xarray/jobs/327790224#L5643) with a familiar message.
E TypeError: 'Future' object is not iterable
We saw this last week when debugging some pangeo things. Can you remind me what our solution was?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283388962