id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
181017850,MDU6SXNzdWUxODEwMTc4NTA=,1037,attrs empty for open_mfdataset vs population for open_dataset,4295853,closed,0,,,4,2016-10-04T22:08:54Z,2019-02-02T06:30:20Z,2019-02-02T06:30:20Z,CONTRIBUTOR,,,,"Previously, a dataset would store `attrs` corresponding to netCDF global attributes.  For some reason, this behavior does not appear to be supported anymore.  Using this dataset: https://github.com/pydata/xarray-data/raw/master/rasm.nc

``` python
In [1]: import xarray as xr

In [2]: ds = xr.open_dataset('rasm.nc')
/Users/pwolfram/src/xarray/xarray/conventions.py:386: RuntimeWarning: Unable to decode time axis into full numpy.datetime64 objects, continuing using dummy netCDF4.datetime objects instead, reason: dates out of range
  result = decode_cf_datetime(example_value, units, calendar)

In [3]: ds
Out[3]: 
<xarray.Dataset>
Dimensions:  (time: 36, x: 275, y: 205)
Coordinates:
  * time     (time) object 1980-09-16T12:00:00 1980-10-17 ...
    yc       (y, x) float64 16.53 16.78 17.02 17.27 17.51 17.76 18.0 18.25 ...
    xc       (y, x) float64 189.2 189.4 189.6 189.7 189.9 190.1 190.2 190.4 ...
  * y        (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
Data variables:
    Tair     (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan ...
Attributes:
    title: /workspace/jhamman/processed/R1002RBRxaaa01a/lnd/temp/R1002RBRxaaa01a.vic.ha.1979-09-01.nc
    institution: U.W.
    source: RACM R1002RBRxaaa01a
    output_frequency: daily
    output_mode: averaged
    convention: CF-1.4
    references: Based on the initial model of Liang et al., 1994, JGR, 99, 14,415- 14,429.
    comment: Output from the Variable Infiltration Capacity (VIC) model.
    nco_openmp_thread_number: 1
    NCO: 4.3.7
    history: history deleted for brevity

In [4]: ds = xr.open_mfdataset('rasm.nc')

In [5]: ds
Out[5]: 
<xarray.Dataset>
Dimensions:  (time: 36, x: 275, y: 205)
Coordinates:
  * time     (time) object 1980-09-16T12:00:00 1980-10-17 ...
    yc       (y, x) float64 16.53 16.78 17.02 17.27 17.51 17.76 18.0 18.25 ...
    xc       (y, x) float64 189.2 189.4 189.6 189.7 189.9 190.1 190.2 190.4 ...
  * y        (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
Data variables:
    Tair     (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan ...

```

The attributes for `open_mfdataset` are missing whereas in previous versions of xarray I do not believe that this was the case because one of my scripts is failing because it does not obtain attributes when using the `open_mfdataset` initialization.

@shoyer and @jhamman, is this the expected behavior and was the prior behavior simply an unspecified side-effect of the code vs a design decision?   My preference would be to keep as many attributes as possible when using `open_mfdataset` to best provenance the dataset, i.e., `ds.attrs` should not be empty following initialization.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1037/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
180729538,MDU6SXNzdWUxODA3Mjk1Mzg=,1033,Extra arguments in templated doc strings are not being replaced properly,4295853,closed,0,,,3,2016-10-03T19:48:32Z,2019-01-26T15:08:30Z,2019-01-26T15:08:30Z,CONTRIBUTOR,,,,"For example, at http://xarray.pydata.org/en/stable/generated/xarray.Dataset.prod.html?highlight=prod, _func_ should actually be _prod_:

![screenshot 2016-10-03 13 48 07](https://cloud.githubusercontent.com/assets/4295853/19051376/09bf41fa-8970-11e6-903d-f10530aef61d.png)
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1033/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
144957100,MDU6SXNzdWUxNDQ5NTcxMDA=,813,Load fails following squeeze,4295853,closed,0,,,2,2016-03-31T16:57:13Z,2019-01-23T00:58:00Z,2019-01-23T00:58:00Z,CONTRIBUTOR,,,,"A `load` that follows a `squeeze` returns an error whereas a `squeeze` following a `load` does not.

For example,

``` python
test = acase.isel(Nb=layernum).sel(Np=np.where(idx)[1])
test = test.squeeze('Nr')
test.load()
```

produces the error

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-66-2a98e96bc20c> in <module>()
      1 test = acase.isel(Nb=layernum).sel(Np=np.where(idx)[1])
      2 test = test.squeeze('Nr')
----> 3 test.load()
      4 test = test.squeeze('Nr')

/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/core/dataset.pyc in load(self)
    355 
    356             for k, data in zip(lazy_data, evaluated_data):
--> 357                 self.variables[k].data = data
    358 
    359         return self

/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/core/variable.pyc in data(self, data)
    247         if data.shape != self.shape:
    248             raise ValueError(
--> 249                 ""replacement data must match the Variable's shape"")
    250         self._data = data
    251 

ValueError: replacement data must match the Variable's shape
```

whereas 

``` python
test = acase.isel(Nb=layernum).sel(Np=np.where(idx)[1])
test.load()
test = test.squeeze('Nr')
test.load()
```

works without error.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/813/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
142498006,MDU6SXNzdWUxNDI0OTgwMDY=,798,Integration  with dask/distributed (xarray backend design),4295853,closed,0,,,59,2016-03-21T23:18:02Z,2019-01-13T04:12:32Z,2019-01-13T04:12:32Z,CONTRIBUTOR,,,,"Dask (https://github.com/dask/dask) currently provides on-node parallelism for medium-size data problems.  However, large climate data sets will require multiple-node parallelism to analyze large climate data sets because this constitutes a big data problem.  A likely solution to this issue is integration of distributed (https://github.com/dask/distributed) with dask.  Distributed is now integrated with dask and its benefits are already starting to be realized, e.g., see http://matthewrocklin.com/blog/work/2016/02/26/dask-distributed-part-3.

Thus, this issue is designed to identify the steps needed to perform this integration, at a high-level. As stated by @shoyer, it will 

> definitely require some refactoring of the xarray backend system to make this work cleanly, but that's 
> OK -- the xarray backend system is indicated as experimental/internal API precisely because we 
> hadn't figured out all the use cases yet.""
> 
> To be honest, I've never been entirely happy with the design we took there (we use inheritance rather 
> than composition for backend classes), but we did get it to work for our use cases. Some refactoring 
> with an eye towards compatibility with dask distributed seems like a very worthwhile endeavor. We 
> do have the benefit of a pretty large test suite covering existing use cases.

Thus, we have the chance to make xarray big-data capable as well as provide improvements to the backend.

To this end, I'm starting this issue to help begin the design process following the xarray mailing list discussion some of us have been having (@shoyer, @mrocklin, @rabernat).

Task To Do List:
- [x] Verify asynchronous access error for `to_netcdf` output is resolved (e.g., https://github.com/pydata/xarray/issues/793)
- [x] LRU-cached file IO supporting serialization to robustly support HDF/NetCDF reads
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/798/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
387892184,MDU6SXNzdWUzODc4OTIxODQ=,2592,Deprecated autoclose option,4295853,closed,0,,,4,2018-12-05T18:41:38Z,2018-12-05T18:54:28Z,2018-12-05T18:54:28Z,CONTRIBUTOR,,,,"In updated versions of xarray we are getting a deprecation error for `autoclose`, e.g., at https://github.com/MPAS-Dev/MPAS-Analysis/pull/501/.  

A look through the issues is not transparent as to this reason and this issue is to collect the high-level information on this change.  

Is there an alternative use that should be considered instead?
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2592/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
148771214,MDU6SXNzdWUxNDg3NzEyMTQ=,826,Storing history of xarray operations,4295853,closed,0,,,5,2016-04-15T21:15:10Z,2018-09-26T16:28:08Z,2016-06-23T14:27:21Z,CONTRIBUTOR,,,,"It may be useful to keep track of operations applied to DataArrays and Datasets in order to enhance provenance of output netcdf datasets, particularly for scientific applications to enhance reproducibility.  However, this essentially would require keeping track of all the operations that were used to produce a given DataArray or Dataset.

Ideally, we would want this to eventually result in appending data to the 'history' attribute for calls to `*.to_netcdf(...)`.  This would keep track of data manipulation similar to nco/ncks/etc operations.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/826/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
217385961,MDU6SXNzdWUyMTczODU5NjE=,1332,Shape preserving `diff` via new keywords,4295853,closed,0,,,10,2017-03-27T21:49:52Z,2018-09-21T20:02:43Z,2018-09-21T20:02:43Z,CONTRIBUTOR,,,,"Currently, an operation such as `ds.diff('x')` will result in a smaller size dimension, e.g.,

```python
In [1]: import xarray as xr

In [2]: ds = xr.Dataset({'foo': (('x',), [1, 2, 3])}, {'x': [1, 2, 3]})

In [3]: ds
Out[3]: 
<xarray.Dataset>
Dimensions:  (x: 3)
Coordinates:
  * x        (x) int64 1 2 3
Data variables:
    foo      (x) int64 1 2 3

In [4]: ds.diff('x')
Out[4]: 
<xarray.Dataset>
Dimensions:  (x: 2)
Coordinates:
  * x        (x) int64 2 3
Data variables:
    foo      (x) int64 1 1
```

However, there are cases where the same size would be beneficial to keep so that you would get
```python
In [1]: import xarray as xr

In [2]: ds = xr.Dataset({'foo': (('x',), [1, 2, 3])}, {'x': [1, 2, 3]})

In [3]: ds.diff('x', preserve_shape=True, empty_value=0)
Out[3]: 
<xarray.Dataset>
Dimensions:  (x: 3)
Coordinates:
  * x        (x) int64 1 2 3
Data variables:
    foo      (x) int64 0 1 1
```

Is there interest in addition of a `preserve_shape=True` keyword such that it results in this shape-preserving behavior?  I'm proposing you could use this with `label='upper'` and `label='lower'`. 

`empty_value` could be a value or `empty_index` could be an index for the fill value.  If `empty_value=None` and `empty_index=None`, it would produce a `nan`.  

The reason I'm asking the community is because this is at least the second time I've encountered an application where this behavior would be helpful, e.g., computing ocean layer thicknesses from bottom depths.  A previous application was computation of a time step from time slice output and the desire to use this product in an approximated integral, e.g., 
```python
y*diff(t, label='lower', preserve_shape=True)
```
 where `y` and `t` are both of size `n`, which is effectively a left-sided Riemann sum.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1332/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
139956689,MDU6SXNzdWUxMzk5NTY2ODk=,789,Time limitation (between years 1678 and 2262) restrictive to climate community,4295853,closed,0,,,13,2016-03-10T17:21:17Z,2018-05-14T22:42:09Z,2018-05-14T22:42:09Z,CONTRIBUTOR,,,,"The restriction of

>    One unfortunate limitation of using datetime64[ns] is that it limits the native representation of dates to
>    those that fall between the years 1678 and 2262. When a netCDF file contains dates outside of these
>    bounds, dates will be returned as arrays of netcdftime.datetime objects.

is a potential roadblock inhibiting easy adoption of this library in the climate community. 
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/789/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
235278888,MDU6SXNzdWUyMzUyNzg4ODg=,1450,Should an `apply` method exist for `DataArray` similar to the definition for `Dataset`?,4295853,closed,0,,,3,2017-06-12T15:51:52Z,2017-06-13T14:14:27Z,2017-06-13T00:35:40Z,CONTRIBUTOR,,,,The method `apply` is defined for `Dataset`.  Is there a design reason why it is not defined for `DataArray`?  In general I think it would be good to have calculation methods apply to both `Dataset` and `DataArray` as possible but am thinking I'm missing a key design element here.,"{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1450/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
219043002,MDU6SXNzdWUyMTkwNDMwMDI=,1350,"where(..., drop=True) error",4295853,closed,0,,2444330,4,2017-04-03T19:53:33Z,2017-04-14T03:50:53Z,2017-04-14T03:50:53Z,CONTRIBUTOR,,,,"These results appear to be incorrect unless I'm missing something:
```python
In [1]: import xarray as xr

In [2]: import numpy as np

In [3]: array = xr.DataArray(np.zeros((1,2,3)), dims=['time','x','y'], coords={'x':np.arange(2)})

In [4]: array[0,1,1] = 1

In [5]: array.where(array !=0, drop=True)
Out[5]: 
<xarray.DataArray (time: 1, x: 1, y: 1)>
array([[[ 0.]]])
Coordinates:
  * x        (x) int64 1
Dimensions without coordinates: time, y

In [5]: array.where(array !=0, drop=True).values
Out[5]: array([[[ 0.]]])

In [7]: array.values[array.values !=0]
Out[7]: array([ 1.])
```

","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1350/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
217660739,MDExOlB1bGxSZXF1ZXN0MTEzMDQzNDI1,1336,"Marks slow, flaky, and failing tests",4295853,closed,0,,,13,2017-03-28T19:03:20Z,2017-04-07T04:36:08Z,2017-04-03T05:30:16Z,CONTRIBUTOR,,0,pydata/xarray/pulls/1336,"Closes #1309

","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1336/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
218291642,MDExOlB1bGxSZXF1ZXN0MTEzNDk3NzQ1,1342,Ensures drop=True case works with empty mask,4295853,closed,0,,,7,2017-03-30T18:45:34Z,2017-04-02T22:45:01Z,2017-04-02T22:43:53Z,CONTRIBUTOR,,0,pydata/xarray/pulls/1342,"Resolves error occurring for python 2.7 for the case of `where(mask, drop=True)` where the mask is empty.

 - [x] closes #1341
 - [X] tests added
 - [x]  tests passed
 - [x] passes ``git diff upstream/master | flake8 --diff``
 - [x] whatsnew entry
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1342/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
218277814,MDU6SXNzdWUyMTgyNzc4MTQ=,1341,"where(..., drop=True) failure for empty mask on python 2.7",4295853,closed,0,,,4,2017-03-30T17:55:38Z,2017-04-02T22:43:53Z,2017-04-02T22:43:53Z,CONTRIBUTOR,,,,"The following fails for 2.7 but not 3.5 (reproducible script at https://gist.github.com/89bd5bd62a475510b2611cbff8d5c67a):
```python
In [1]: import xarray as xr

In [2]: import numpy as np

In [3]: da = xr.DataArray(np.random.rand(100,10), dims=['nCells','nVertLevels'])

In [4]: mask = xr.DataArray(np.zeros((100,), dtype='bool'), dims='nCells')

In [5]: da.where(mask, drop=True)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-ca5cd9c083a9> in <module>()
----> 1 da.where(mask, drop=True)

/Users/pwolfram/src/xarray/xarray/core/common.pyc in where(self, cond, other, drop)
    681             outcond = cond.isel(**clip)
    682             indexers = {dim: outcond.get_index(dim) for dim in outcond.dims}
--> 683             outobj = self.sel(**indexers)
    684         else:
    685             outobj = self

/Users/pwolfram/src/xarray/xarray/core/dataarray.pyc in sel(self, method, tolerance, drop, **indexers)
    670             self, indexers, method=method, tolerance=tolerance
    671         )
--> 672         result = self.isel(drop=drop, **pos_indexers)
    673         return result._replace_indexes(new_indexes)
    674 

/Users/pwolfram/src/xarray/xarray/core/dataarray.pyc in isel(self, drop, **indexers)
    655         DataArray.sel
    656         """"""
--> 657         ds = self._to_temp_dataset().isel(drop=drop, **indexers)
    658         return self._from_temp_dataset(ds)
    659 

/Users/pwolfram/src/xarray/xarray/core/dataset.pyc in isel(self, drop, **indexers)
   1115         for name, var in iteritems(self._variables):
   1116             var_indexers = dict((k, v) for k, v in indexers if k in var.dims)
-> 1117             new_var = var.isel(**var_indexers)
   1118             if not (drop and name in var_indexers):
   1119                 variables[name] = new_var

/Users/pwolfram/src/xarray/xarray/core/variable.pyc in isel(self, **indexers)
    545             if dim in indexers:
    546                 key[i] = indexers[dim]
--> 547         return self[tuple(key)]
    548 
    549     def squeeze(self, dim=None):

/Users/pwolfram/src/xarray/xarray/core/variable.pyc in __getitem__(self, key)
    375         dims = tuple(dim for k, dim in zip(key, self.dims)
    376                      if not isinstance(k, (int, np.integer)))
--> 377         values = self._indexable_data[key]
    378         # orthogonal indexing should ensure the dimensionality is consistent
    379         if hasattr(values, 'ndim'):

/Users/pwolfram/src/xarray/xarray/core/indexing.pyc in __getitem__(self, key)
    465 
    466     def __getitem__(self, key):
--> 467         key = self._convert_key(key)
    468         return self._ensure_ndarray(self.array[key])
    469 

/Users/pwolfram/src/xarray/xarray/core/indexing.pyc in _convert_key(self, key)
    452         if any(not isinstance(k, (int, np.integer, slice)) for k in key):
    453             # key would trigger fancy indexing
--> 454             key = orthogonal_indexer(key, self.shape)
    455         return key
    456 

/Users/pwolfram/src/xarray/xarray/core/indexing.pyc in orthogonal_indexer(key, shape)
     77     """"""
     78     # replace Ellipsis objects with slices
---> 79     key = list(canonicalize_indexer(key, len(shape)))
     80     # replace 1d arrays and slices with broadcast compatible arrays
     81     # note: we treat integers separately (instead of turning them into 1d

/Users/pwolfram/src/xarray/xarray/core/indexing.pyc in canonicalize_indexer(key, ndim)
     65         return indexer
     66 
---> 67     return tuple(canonicalize(k) for k in expanded_indexer(key, ndim))
     68 
     69 

/Users/pwolfram/src/xarray/xarray/core/indexing.pyc in <genexpr>((k,))
     65         return indexer
     66 
---> 67     return tuple(canonicalize(k) for k in expanded_indexer(key, ndim))
     68 
     69 

/Users/pwolfram/src/xarray/xarray/core/indexing.pyc in canonicalize(indexer)
     62                                      'array indexing; all subkeys must be '
     63                                      'slices, integers or sequences of '
---> 64                                      'integers or Booleans' % indexer)
     65         return indexer
     66 

ValueError: invalid subkey array([], dtype=object) for integer based array indexing; all subkeys must be slices, integers or sequences of integers or Booleans
```
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1341/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
181033674,MDExOlB1bGxSZXF1ZXN0ODc5OTUzNzg=,1038,Attributes from netCDF4 intialization retained,4295853,closed,0,,,26,2016-10-04T23:51:48Z,2017-03-31T03:11:07Z,2017-03-31T03:11:07Z,CONTRIBUTOR,,0,pydata/xarray/pulls/1038,"Ensures that attrs for open_mfdataset are now retained

cc @shoyer 
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1038/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
218013400,MDU6SXNzdWUyMTgwMTM0MDA=,1338,Chunking and dask memory errors,4295853,closed,0,,,2,2017-03-29T21:22:49Z,2017-03-29T22:56:45Z,2017-03-29T22:56:45Z,CONTRIBUTOR,,,,"What is the standard way of sub-chunking to prevent dask memory errors?   For large dataset files there could be a dimension, say `nCells`, that is large enough to fill RAM.  If this occurs, is there an automatic mechanism to prevent out-of-memory errors in dask or is the user's responsibility to specify maximum chunk sizes on their own?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1338/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
199900056,MDExOlB1bGxSZXF1ZXN0MTAwOTI5NTYz,1198,Fixes OS error arising from too many files open,4295853,closed,0,,,54,2017-01-10T18:37:41Z,2017-03-23T19:21:27Z,2017-03-23T19:20:03Z,CONTRIBUTOR,,0,pydata/xarray/pulls/1198,"Previously, DataStore  did not judiciously close files, resulting in opening a large number of files that
could result in an OSError related to too many files being open.  This merge provides a solution for the netCDF, scipy, and h5netcdf backends.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1198/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
180678466,MDExOlB1bGxSZXF1ZXN0ODc3NDI5NjI=,1031,Fixes doc formating error,4295853,closed,0,,,1,2016-10-03T16:00:29Z,2017-03-22T17:27:10Z,2016-10-03T16:15:19Z,CONTRIBUTOR,,0,pydata/xarray/pulls/1031,"This fixes a typo in the what's new documentation.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1031/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
180717915,MDExOlB1bGxSZXF1ZXN0ODc3NzA5NTk=,1032,Fixes documentation typo,4295853,closed,0,,,0,2016-10-03T18:57:21Z,2017-03-22T17:27:10Z,2016-10-03T21:03:07Z,CONTRIBUTOR,,0,pydata/xarray/pulls/1032,"This is another really minor typo...
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1032/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
140214928,MDU6SXNzdWUxNDAyMTQ5Mjg=,791,Adding cumsum / cumprod reduction operators,4295853,closed,0,,,12,2016-03-11T15:36:41Z,2016-10-04T22:16:26Z,2016-10-04T22:16:26Z,CONTRIBUTOR,,,,"It would be useful to have the cumsum / cumprod reduction operator for DataArray and Dataset, analagous to http://xarray.pydata.org/en/stable/generated/xarray.DataArray.sum.html?highlight=sum#xarray.DataArray.sum
and http://xarray.pydata.org/en/stable/generated/xarray.Dataset.sum.html?highlight=sum#xarray.Dataset.sum

I notice this is on the TODO at https://github.com/pydata/xarray/blob/master/xarray/core/ops.py#L54 and am assuming there is something subtle here about the implementation.  I believe the issue was probably with dask, but the issue / PR at https://github.com/dask/dask/issues/923 & https://github.com/dask/dask/pull/925 may have removed the roadblock.  
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/791/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
144920646,MDExOlB1bGxSZXF1ZXN0NjQ4MDEwMTc=,812,Adds cummulative operators to API ,4295853,closed,0,,,14,2016-03-31T14:37:50Z,2016-10-03T21:06:26Z,2016-10-03T21:05:33Z,CONTRIBUTOR,,0,pydata/xarray/pulls/812,"This PR will add cumsum and cumprod as discussed in https://github.com/pydata/xarray/issues/791 as well ensuring `cumprod` works for the API, resolving issues discussed at https://github.com/pydata/xarray/issues/807.  

TO DO (dependencies)
- [x] Add `nancumprod` and `nancumsum` to numpy (https://github.com/numpy/numpy/pull/7421)
- [x] Add `nancumprod` and `nancumsum` to dask (https://github.com/dask/dask/pull/1077)

This PR extends infrastructure to support `cumsum` and `cumprod` (https://github.com/pydata/xarray/issues/791).

References:
- https://github.com/numpy/numpy/pull/7421

cc @shoyer, @jhamman 
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/812/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
178111738,MDU6SXNzdWUxNzgxMTE3Mzg=,1010,py.test fails on master,4295853,closed,0,,,5,2016-09-20T16:36:51Z,2016-09-20T17:13:14Z,2016-09-20T17:12:09Z,CONTRIBUTOR,,,,"Following creation of a new conda environment: 

``` bash
cd /tmp
conda create -n test_xarray27 python=2.7 -y
source activate test_xarray27
conda install matplotlib dask bottleneck pytest -y
git clone git@github.com:pydata/xarray.git 
cd xarray
git co master
python setup.py develop
py.test
```

returns

``` bash
┌─[pwolfram][shapiro][/tmp/xarray][10:34][±][master ✓]
└─▪ py.test
=========================================================================================================== test session starts ============================================================================================================
platform darwin -- Python 2.7.10 -- py-1.4.27 -- pytest-2.7.1
rootdir: /private/tmp/xarray, inifile: setup.cfg
collected 1028 items 

xarray/test/test_backends.py ............................................................................................................................................................................................sssssssssssssssssssssssssssssssssssss.........sssssssssssssssssssssssss......
xarray/test/test_combine.py ..............
xarray/test/test_conventions.py .............................................s............
xarray/test/test_dask.py ..........F....................
xarray/test/test_dataarray.py ...........................................................s...................................................s.............
xarray/test/test_dataset.py .............................................................................................................................................
xarray/test/test_extensions.py ....
xarray/test/test_formatting.py .........
xarray/test/test_groupby.py ...
xarray/test/test_indexing.py .........
xarray/test/test_merge.py ..............
xarray/test/test_ops.py .............
xarray/test/test_plot.py ..............................................................................................................................................................................................
xarray/test/test_tutorial.py s
xarray/test/test_ufuncs.py ....
xarray/test/test_utils.py ...................
xarray/test/test_variable.py ...............................................................................................................................
xarray/test/test_xray.py .

================================================================================================================= FAILURES =================================================================================================================
_________________________________________________________________________________________________________ TestVariable.test_reduce _________________________________________________________________________________________________________

self = <xarray.test.test_dask.TestVariable testMethod=test_reduce>

    def test_reduce(self):
        u = self.eager_var
        v = self.lazy_var
        self.assertLazyAndAllClose(u.mean(), v.mean())
        self.assertLazyAndAllClose(u.std(), v.std())
>       self.assertLazyAndAllClose(u.argmax(dim='x'), v.argmax(dim='x'))

xarray/test/test_dask.py:145: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
xarray/core/common.py:16: in wrapped_func
    skipna=skipna, allow_lazy=True, **kwargs)
xarray/core/variable.py:899: in reduce
    axis=axis, **kwargs)
xarray/core/ops.py:308: in f
    return func(values, axis=axis, **kwargs)
xarray/core/ops.py:64: in f
    return getattr(module, name)(*args, **kwargs)
/Users/pwolfram/anaconda/lib/python2.7/site-packages/dask/array/reductions.py:542: in _
    return arg_reduction(x, chunk, combine, agg, axis, split_every)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

x = dask.array<from-ar..., shape=(4, 6), dtype=float64, chunksize=(2, 2)>, chunk = <functools.partial object at 0x115f3ef70>, combine = <functools.partial object at 0x115f3efc8>, agg = <functools.partial object at 0x115f4b050>
axis = (0,), split_every = None

    def arg_reduction(x, chunk, combine, agg, axis=None, split_every=None):
        """"""Generic function for argreduction.

        Parameters
        ----------
        x : Array
        chunk : callable
            Partialed ``arg_chunk``.
        combine : callable
            Partialed ``arg_combine``.
        agg : callable
            Partialed ``arg_agg``.
        axis : int, optional
        split_every : int or dict, optional
        """"""
        if axis is None:
            axis = tuple(range(x.ndim))
            ravel = True
        elif isinstance(axis, int):
            if axis < 0:
                axis += x.ndim
            if axis < 0 or axis >= x.ndim:
                raise ValueError(""axis entry is out of bounds"")
            axis = (axis,)
            ravel = x.ndim == 1
        else:
            raise TypeError(""axis must be either `None` or int, ""
                            ""got '{0}'"".format(axis))

        # Map chunk across all blocks
        name = 'arg-reduce-chunk-{0}'.format(tokenize(chunk, axis))
        old = x.name
        keys = list(product(*map(range, x.numblocks)))
        offsets = list(product(*(accumulate(operator.add, bd[:-1], 0)
>                                for bd in x.chunks)))
E       TypeError: type object argument after * must be a sequence, not generator

/Users/pwolfram/anaconda/lib/python2.7/site-packages/dask/array/reductions.py:510: TypeError
============================================================================================ 1 failed, 961 passed, 66 skipped in 52.49 seconds =============================================================================================

```

cc @shoyer 
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1010/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
153484311,MDExOlB1bGxSZXF1ZXN0NjkxNzI3NTU=,845,Fixes doc typo,4295853,closed,0,,,2,2016-05-06T16:04:18Z,2016-05-07T00:35:10Z,2016-05-06T16:35:05Z,CONTRIBUTOR,,0,pydata/xarray/pulls/845,"To entirely add or removing coordinate arrays, you can use dictionary like

to 

To entirely add or remove coordinate arrays, you can use dictionary like
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/845/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
144683276,MDU6SXNzdWUxNDQ2ODMyNzY=,811,Selection based on boolean DataArray,4295853,closed,0,,,17,2016-03-30T18:38:34Z,2016-04-15T20:30:03Z,2016-04-15T20:30:03Z,CONTRIBUTOR,,,,"Should xarray indexing account for boolean values without resorting to a call to `np.where`?  For example, `acase.sel(Np=np.where(idx)[0])` works but `acase.sel(Np=idx)` does not.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/811/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
145243134,MDExOlB1bGxSZXF1ZXN0NjQ5ODEzMzc=,815,Add drop=True option for where on Dataset and DataArray,4295853,closed,0,,,21,2016-04-01T17:55:55Z,2016-04-10T00:55:47Z,2016-04-10T00:33:00Z,CONTRIBUTOR,,0,pydata/xarray/pulls/815,"Addresses #811 to provide a Dataset and DataArray `sel_where` which returns a Dataset or DataArray of minimal coordinate size.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/815/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
144037264,MDU6SXNzdWUxNDQwMzcyNjQ=,807,cumprod returns errors,4295853,closed,0,,,3,2016-03-28T17:50:13Z,2016-03-31T23:39:55Z,2016-03-31T23:39:55Z,CONTRIBUTOR,,,,"The xarray implementation of `cumprod` returns an assertion error, presumably because of bottleneck, e.g., https://github.com/pydata/xarray/blob/master/xarray/core/ops.py#L333.  The error is 

```
└─▪ ./test_cumprod.py 
[ 0.8841785   0.54181236  0.29075258  0.28883015  0.1137352   0.09909713
  0.03570122  0.0304542   0.01578143  0.01496195  0.01442681  0.00980845]
Traceback (most recent call last):
  File ""./test_cumprod.py"", line 13, in <module>
    foo.cumprod()
  File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/common.py"", line 16, in wrapped_func
    skipna=skipna, allow_lazy=True, **kwargs)
  File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/dataarray.py"", line 991, in reduce
    var = self.variable.reduce(func, dim, axis, keep_attrs, **kwargs)
  File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/variable.py"", line 871, in reduce
    axis=axis, **kwargs)
  File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/ops.py"", line 346, in f
    assert using_numpy_nan_func
AssertionError

```

If bottleneck is uninstalled then a value error is returned:

```
└─▪ ./test_cumprod.py 
[  2.99508768e-01   2.80142920e-01   1.56389242e-01   1.10791301e-01
   4.58372649e-02   4.10865622e-02   9.91362500e-03   6.76033435e-03
   3.83574249e-03   9.54972340e-04   1.56846616e-04   6.44088547e-05]
Traceback (most recent call last):
  File ""./test_cumprod.py"", line 13, in <module>
    foo.cumprod()
  File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/common.py"", line 16, in wrapped_func
    skipna=skipna, allow_lazy=True, **kwargs)
  File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/dataarray.py"", line 991, in reduce
    var = self.variable.reduce(func, dim, axis, keep_attrs, **kwargs)
  File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/variable.py"", line 880, in reduce
    return Variable(dims, data, attrs=attrs)
  File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/variable.py"", line 213, in __init__
    self._dims = self._parse_dimensions(dims)
  File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/variable.py"", line 321, in _parse_dimensions
    % (dims, self.ndim))
ValueError: dimensions () must have the same length as the number of data dimensions, ndim=1
```

No error occurs if the data array is converted to a numpy array prior to use of `cumprod`.  

This can easily be reproduced by https://gist.github.com/c32f231b773ecc4b0ccf, excerpted below

```
import numpy as np
import pandas as pd
import xarray as xr                                                                      


data = np.random.rand(4, 3)
locs = ['IA', 'IL', 'IN']
times = pd.date_range('2000-01-01', periods=4)
foo = xr.DataArray(data, coords=[times, locs], dims=['time', 'space'])
print foo.values.cumprod()
foo.cumprod()
```
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/807/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
140291221,MDU6SXNzdWUxNDAyOTEyMjE=,793,dask.async.RuntimeError: NetCDF: HDF error on xarray to_netcdf,4295853,closed,0,,,21,2016-03-11T21:04:36Z,2016-03-24T02:49:26Z,2016-03-24T02:49:13Z,CONTRIBUTOR,,,,"Dask appears to be failing on serialization following a ds.to_netcdef() via a NETCDF: HDF error.  
Excerpted error below:

```
Traceback (most recent call last):
  File ""reduce_dispersion_file.py"", line 40, in <module>
    if __name__ == ""__main__"":
  File ""reduce_dispersion_file.py"", line 36, in reduce_dispersion_file
    with timeit_context('output to disk'):
  File ""/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/core/dataset.py"", line 791, in to_netcdf
    engine=engine, encoding=encoding)
  File ""/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/backends/api.py"", line 356, in to_netcdf
    dataset.dump_to_store(store, sync=sync, encoding=encoding)
  File ""/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/core/dataset.py"", line 739, in dump_to_store
    store.sync()
  File ""/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/backends/netCDF4_.py"", line 283, in sync
    super(NetCDF4DataStore, self).sync()
  File ""/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/backends/common.py"", line 186, in sync
    self.writer.sync()
  File ""/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/backends/common.py"", line 165, in sync
    da.store(self.sources, self.targets)
  File ""/users/pwolfram/lib/python2.7/site-packages/dask/array/core.py"", line 712, in store
    Array._get(dsk, keys, **kwargs)
  File ""/users/pwolfram/lib/python2.7/site-packages/dask/base.py"", line 43, in _get
    return get(dsk2, keys, **kwargs)
  File ""/users/pwolfram/lib/python2.7/site-packages/dask/threaded.py"", line 57, in get 
    **kwargs)
  File ""/users/pwolfram/lib/python2.7/site-packages/dask/async.py"", line 481, in get_async
    raise(remote_exception(res, tb))
dask.async.RuntimeError: NetCDF: HDF error

Traceback
---------
  File ""/users/pwolfram/lib/python2.7/site-packages/dask/async.py"", line 264, in execute_task
    result = _execute_task(task, data)
  File ""/users/pwolfram/lib/python2.7/site-packages/dask/async.py"", line 246, in _execute_task
    return func(*args2)
  File ""/users/pwolfram/lib/python2.7/site-packages/dask/array/core.py"", line 1954, in store
    out[index] = np.asanyarray(x)
  File ""netCDF4/_netCDF4.pyx"", line 3678, in netCDF4._netCDF4.Variable.__setitem__ (netCDF4/_netCDF4.c:37215)
  File ""netCDF4/_netCDF4.pyx"", line 3887, in netCDF4._netCDF4.Variable._put (netCDF4/_netCDF4.c:38907)
```

Script used: https://gist.github.com/98acaa31a4533b490f78 
Full output: https://gist.github.com/248efce774ad08cb1dd6
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/793/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
142676995,MDExOlB1bGxSZXF1ZXN0NjM3MzQxODU=,800,Adds dask lock capability for backend writes,4295853,closed,0,,,3,2016-03-22T14:59:56Z,2016-03-22T22:32:37Z,2016-03-22T22:32:30Z,CONTRIBUTOR,,0,pydata/xarray/pulls/800,"This fixes an error on an asynchronous write for `to_netcdf`
resulting in an `dask.async.RuntimeError: NetCDF: HDF error`

Resolves issue https://github.com/pydata/xarray/issues/793
following dask improvement at https://github.com/dask/dask/pull/1053
following advice of @shoyer.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/800/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
138332032,MDU6SXNzdWUxMzgzMzIwMzI=,783,Array size changes following loading of numpy array,4295853,closed,0,,,19,2016-03-03T23:44:39Z,2016-03-08T23:41:38Z,2016-03-08T23:37:16Z,CONTRIBUTOR,,,,"The issue in a nutshell is that 

```
(Pdb) rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].shape                                                                    
(30, 1012000)                                                         
(Pdb) rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].values.shape    
(29, 1012000)                                                  
(Pdb) rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].data.shape                   
(30, 1012000)                                                                      
(Pdb) rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].data                                  
dask.array<getitem..., shape=(30, 1012000), dtype=float64, chunksize=(23, 1012000)>  
```

It seems to me that for some reason when the array is loaded via values that it is no longer the same size.  The dask shape appears to be correct.

I previously do a filter on time via `rlzns = rlzns.isel(Time=np.where(reset > 0)[0])`
and do some commands like  `np.reshape(rlzns.Time[rnum*Ntr:(rnum+1)*Ntr].values,(1,Ntr)),axis=1)`
but it seems unlikely that this would be causing the problem.                     

Has anyone had an issue like this?  Any ideas on what could be causing the problem would be greatly appreciated because this behavior is very strange.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/783/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue