id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
181017850,MDU6SXNzdWUxODEwMTc4NTA=,1037,attrs empty for open_mfdataset vs populated for open_dataset,4295853,closed,0,,,4,2016-10-04T22:08:54Z,2019-02-02T06:30:20Z,2019-02-02T06:30:20Z,CONTRIBUTOR,,,,"Previously, a dataset would store `attrs` corresponding to netCDF global attributes. For some reason, this behavior does not appear to be supported anymore. Using this dataset: https://github.com/pydata/xarray-data/raw/master/rasm.nc ``` python In [1]: import xarray as xr In [2]: ds = xr.open_dataset('rasm.nc') /Users/pwolfram/src/xarray/xarray/conventions.py:386: RuntimeWarning: Unable to decode time axis into full numpy.datetime64 objects, continuing using dummy netCDF4.datetime objects instead, reason: dates out of range result = decode_cf_datetime(example_value, units, calendar) In [3]: ds Out[3]: Dimensions: (time: 36, x: 275, y: 205) Coordinates: * time (time) object 1980-09-16T12:00:00 1980-10-17 ... yc (y, x) float64 16.53 16.78 17.02 17.27 17.51 17.76 18.0 18.25 ... xc (y, x) float64 189.2 189.4 189.6 189.7 189.9 190.1 190.2 190.4 ... * y (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ... * x (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ... Data variables: Tair (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan ... Attributes: title: /workspace/jhamman/processed/R1002RBRxaaa01a/lnd/temp/R1002RBRxaaa01a.vic.ha.1979-09-01.nc institution: U.W. source: RACM R1002RBRxaaa01a output_frequency: daily output_mode: averaged convention: CF-1.4 references: Based on the initial model of Liang et al., 1994, JGR, 99, 14,415- 14,429. comment: Output from the Variable Infiltration Capacity (VIC) model. nco_openmp_thread_number: 1 NCO: 4.3.7 history: history deleted for brevity In [4]: ds = xr.open_mfdataset('rasm.nc') In [5]: ds Out[5]: Dimensions: (time: 36, x: 275, y: 205) Coordinates: * time (time) object 1980-09-16T12:00:00 1980-10-17 ... yc (y, x) float64 16.53 16.78 17.02 17.27 17.51 17.76 18.0 18.25 ... xc (y, x) float64 189.2 189.4 189.6 189.7 189.9 190.1 190.2 190.4 ... * y (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ... * x (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ... Data variables: Tair (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan ... ``` The attributes for `open_mfdataset` are missing, whereas I do not believe this was the case in previous versions of xarray; one of my scripts is now failing because it does not obtain attributes when opening via `open_mfdataset`. @shoyer and @jhamman, is this the expected behavior, and was the prior behavior simply an unspecified side effect of the code vs a design decision? My preference would be to keep as many attributes as possible when using `open_mfdataset` to best preserve the provenance of the dataset, i.e., `ds.attrs` should not be empty following initialization.
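For reference, a minimal workaround sketch I am using in the meantime (it assumes the first file's global attributes are representative of the whole set):

``` python
import xarray as xr

# open_mfdataset currently drops the global attributes, so as a stopgap
# copy them over from a plain open_dataset on one of the same files
ds = xr.open_mfdataset('rasm.nc')
attrs_src = xr.open_dataset('rasm.nc')
ds.attrs.update(attrs_src.attrs)
attrs_src.close()
```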
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1037/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 180729538,MDU6SXNzdWUxODA3Mjk1Mzg=,1033,Extra arguments in templated doc strings are not being replaced properly,4295853,closed,0,,,3,2016-10-03T19:48:32Z,2019-01-26T15:08:30Z,2019-01-26T15:08:30Z,CONTRIBUTOR,,,,"For example, at http://xarray.pydata.org/en/stable/generated/xarray.Dataset.prod.html?highlight=prod, _func_ should actually be _prod_: ![screenshot 2016-10-03 13 48 07](https://cloud.githubusercontent.com/assets/4295853/19051376/09bf41fa-8970-11e6-903d-f10530aef61d.png) ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1033/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 144957100,MDU6SXNzdWUxNDQ5NTcxMDA=,813,Load fails following squeeze,4295853,closed,0,,,2,2016-03-31T16:57:13Z,2019-01-23T00:58:00Z,2019-01-23T00:58:00Z,CONTRIBUTOR,,,,"A `load` that follows a `squeeze` returns an error whereas a `squeeze` following a `load` does not. For example, ``` python test = acase.isel(Nb=layernum).sel(Np=np.where(idx)[1]) test = test.squeeze('Nr') test.load() ``` produces the error ``` --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () 1 test = acase.isel(Nb=layernum).sel(Np=np.where(idx)[1]) 2 test = test.squeeze('Nr') ----> 3 test.load() 4 test = test.squeeze('Nr') /users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/core/dataset.pyc in load(self) 355 356 for k, data in zip(lazy_data, evaluated_data): --> 357 self.variables[k].data = data 358 359 return self /users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/core/variable.pyc in data(self, data) 247 if data.shape != self.shape: 248 raise ValueError( --> 249 ""replacement data must match the Variable's shape"") 250 self._data = data 251 ValueError: replacement data must match the Variable's shape ``` whereas ``` python test = acase.isel(Nb=layernum).sel(Np=np.where(idx)[1]) test.load() test = test.squeeze('Nr') test.load() ``` works without error. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/813/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 142498006,MDU6SXNzdWUxNDI0OTgwMDY=,798,Integration with dask/distributed (xarray backend design),4295853,closed,0,,,59,2016-03-21T23:18:02Z,2019-01-13T04:12:32Z,2019-01-13T04:12:32Z,CONTRIBUTOR,,,,"Dask (https://github.com/dask/dask) currently provides on-node parallelism for medium-size data problems. However, large climate data sets will require multiple-node parallelism to analyze large climate data sets because this constitutes a big data problem. A likely solution to this issue is integration of distributed (https://github.com/dask/distributed) with dask. Distributed is now integrated with dask and its benefits are already starting to be realized, e.g., see http://matthewrocklin.com/blog/work/2016/02/26/dask-distributed-part-3. Thus, this issue is designed to identify the steps needed to perform this integration, at a high-level. 
As stated by @shoyer, it will

> definitely require some refactoring of the xarray backend system to make this work cleanly, but that's
> OK -- the xarray backend system is indicated as experimental/internal API precisely because we
> hadn't figured out all the use cases yet.
>
> To be honest, I've never been entirely happy with the design we took there (we use inheritance rather
> than composition for backend classes), but we did get it to work for our use cases. Some refactoring
> with an eye towards compatibility with dask distributed seems like a very worthwhile endeavor. We
> do have the benefit of a pretty large test suite covering existing use cases.

Thus, we have the chance to make xarray big-data capable as well as provide improvements to the backend. To this end, I'm starting this issue to help begin the design process following the xarray mailing list discussion some of us have been having (@shoyer, @mrocklin, @rabernat). Task To Do List:
- [x] Verify asynchronous access error for `to_netcdf` output is resolved (e.g., https://github.com/pydata/xarray/issues/793)
- [x] LRU-cached file IO supporting serialization to robustly support HDF/NetCDF reads ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/798/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
387892184,MDU6SXNzdWUzODc4OTIxODQ=,2592,Deprecated autoclose option,4295853,closed,0,,,4,2018-12-05T18:41:38Z,2018-12-05T18:54:28Z,2018-12-05T18:54:28Z,CONTRIBUTOR,,,,"In updated versions of xarray we are getting a deprecation error for `autoclose`, e.g., at https://github.com/MPAS-Dev/MPAS-Analysis/pull/501/. A look through the issue tracker does not make the reason for this change transparent, so this issue is meant to collect the high-level information on it. Is there an alternative that should be used instead? ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2592/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
148771214,MDU6SXNzdWUxNDg3NzEyMTQ=,826,Storing history of xarray operations,4295853,closed,0,,,5,2016-04-15T21:15:10Z,2018-09-26T16:28:08Z,2016-06-23T14:27:21Z,CONTRIBUTOR,,,,"It may be useful to keep track of operations applied to DataArrays and Datasets in order to enhance the provenance of output netCDF datasets, particularly for scientific applications where it improves reproducibility. However, this would essentially require keeping track of all the operations that were used to produce a given DataArray or Dataset. Ideally, we would want this to eventually result in appending data to the 'history' attribute for calls to `*.to_netcdf(...)`. This would keep track of data manipulation similarly to nco/ncks/etc. operations.
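In the interim, a minimal hand-rolled sketch of the idea (the `append_history` helper is purely illustrative, not an existing xarray API):

``` python
import datetime
import xarray as xr

def append_history(ds, message):
    # mimic the nco convention of prepending a timestamped entry
    # to the global 'history' attribute
    entry = '%s: %s' % (datetime.datetime.now().isoformat(), message)
    previous = ds.attrs.get('history', '')
    ds.attrs['history'] = (entry + '\n' + previous).strip()
    return ds

ds = append_history(xr.open_dataset('in.nc').mean('time'), 'time mean via xarray')
ds.to_netcdf('out.nc')
```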
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/826/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 217385961,MDU6SXNzdWUyMTczODU5NjE=,1332,Shape preserving `diff` via new keywords,4295853,closed,0,,,10,2017-03-27T21:49:52Z,2018-09-21T20:02:43Z,2018-09-21T20:02:43Z,CONTRIBUTOR,,,,"Currently, an operation such as `ds.diff('x')` will result in a smaller size dimension, e.g., ```python In [1]: import xarray as xr In [2]: ds = xr.Dataset({'foo': (('x',), [1, 2, 3])}, {'x': [1, 2, 3]}) In [3]: ds Out[3]: Dimensions: (x: 3) Coordinates: * x (x) int64 1 2 3 Data variables: foo (x) int64 1 2 3 In [4]: ds.diff('x') Out[4]: Dimensions: (x: 2) Coordinates: * x (x) int64 2 3 Data variables: foo (x) int64 1 1 ``` However, there are cases where the same size would be beneficial to keep so that you would get ```python In [1]: import xarray as xr In [2]: ds = xr.Dataset({'foo': (('x',), [1, 2, 3])}, {'x': [1, 2, 3]}) In [3]: ds.diff('x', preserve_shape=True, empty_value=0) Out[3]: Dimensions: (x: 3) Coordinates: * x (x) int64 1 2 3 Data variables: foo (x) int64 0 1 1 ``` Is there interest in addition of a `preserve_shape=True` keyword such that it results in this shape-preserving behavior? I'm proposing you could use this with `label='upper'` and `label='lower'`. `empty_value` could be a value or `empty_index` could be an index for the fill value. If `empty_value=None` and `empty_index=None`, it would produce a `nan`. The reason I'm asking the community is because this is at least the second time I've encountered an application where this behavior would be helpful, e.g., computing ocean layer thicknesses from bottom depths. A previous application was computation of a time step from time slice output and the desire to use this product in an approximated integral, e.g., ```python y*diff(t, label='lower', preserve_shape=True) ``` where `y` and `t` are both of size `n`, which is effectively a left-sided Riemann sum.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1332/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 139956689,MDU6SXNzdWUxMzk5NTY2ODk=,789,Time limitation (between years 1678 and 2262) restrictive to climate community,4295853,closed,0,,,13,2016-03-10T17:21:17Z,2018-05-14T22:42:09Z,2018-05-14T22:42:09Z,CONTRIBUTOR,,,,"The restriction of > One unfortunate limitation of using datetime64[ns] is that it limits the native representation of dates to > those that fall between the years 1678 and 2262. When a netCDF file contains dates outside of these > bounds, dates will be returned as arrays of netcdftime.datetime objects. is a potential roadblock inhibiting easy adoption of this library in the climate community. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/789/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 235278888,MDU6SXNzdWUyMzUyNzg4ODg=,1450,Should an `apply` method exist for `DataArray` similar to the definition for `Dataset`?,4295853,closed,0,,,3,2017-06-12T15:51:52Z,2017-06-13T14:14:27Z,2017-06-13T00:35:40Z,CONTRIBUTOR,,,,The method `apply` is defined for `Dataset`. Is there a design reason why it is not defined for `DataArray`? 
In general I think it would be good to have calculation methods apply to both `Dataset` and `DataArray` where possible but I suspect I'm missing a key design element here.,"{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1450/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
219043002,MDU6SXNzdWUyMTkwNDMwMDI=,1350,"where(..., drop=True) error",4295853,closed,0,,2444330,4,2017-04-03T19:53:33Z,2017-04-14T03:50:53Z,2017-04-14T03:50:53Z,CONTRIBUTOR,,,,"These results appear to be incorrect unless I'm missing something: ```python In [1]: import xarray as xr In [2]: import numpy as np In [3]: array = xr.DataArray(np.zeros((1,2,3)), dims=['time','x','y'], coords={'x':np.arange(2)}) In [4]: array[0,1,1] = 1 In [5]: array.where(array !=0, drop=True) Out[5]: array([[[ 0.]]]) Coordinates: * x (x) int64 1 Dimensions without coordinates: time, y In [5]: array.where(array !=0, drop=True).values Out[5]: array([[[ 0.]]]) In [7]: array.values[array.values !=0] Out[7]: array([ 1.]) ``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1350/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
217660739,MDExOlB1bGxSZXF1ZXN0MTEzMDQzNDI1,1336,"Marks slow, flaky, and failing tests",4295853,closed,0,,,13,2017-03-28T19:03:20Z,2017-04-07T04:36:08Z,2017-04-03T05:30:16Z,CONTRIBUTOR,,0,pydata/xarray/pulls/1336,"Closes #1309 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1336/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
218291642,MDExOlB1bGxSZXF1ZXN0MTEzNDk3NzQ1,1342,Ensures drop=True case works with empty mask,4295853,closed,0,,,7,2017-03-30T18:45:34Z,2017-04-02T22:45:01Z,2017-04-02T22:43:53Z,CONTRIBUTOR,,0,pydata/xarray/pulls/1342,"Resolves an error occurring on Python 2.7 for the case of `where(mask, drop=True)` where the mask is empty.
- [x] closes #1341
- [x] tests added
- [x] tests passed
- [x] passes ``git diff upstream/master | flake8 --diff``
- [x] whatsnew entry ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1342/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
218277814,MDU6SXNzdWUyMTgyNzc4MTQ=,1341,"where(..., drop=True) failure for empty mask on python 2.7",4295853,closed,0,,,4,2017-03-30T17:55:38Z,2017-04-02T22:43:53Z,2017-04-02T22:43:53Z,CONTRIBUTOR,,,,"The following fails for 2.7 but not 3.5 (reproducible script at https://gist.github.com/89bd5bd62a475510b2611cbff8d5c67a): ```python In [1]: import xarray as xr In [2]: import numpy as np In [3]: da = xr.DataArray(np.random.rand(100,10), dims=['nCells','nVertLevels']) In [4]: mask = xr.DataArray(np.zeros((100,), dtype='bool'), dims='nCells') In [5]: da.where(mask, drop=True) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () ----> 1 da.where(mask, drop=True) /Users/pwolfram/src/xarray/xarray/core/common.pyc in where(self, cond, other, drop) 681 outcond = cond.isel(**clip) 682 indexers = {dim: outcond.get_index(dim) for dim in outcond.dims} --> 683 outobj = self.sel(**indexers) 684 else: 685 outobj = self /Users/pwolfram/src/xarray/xarray/core/dataarray.pyc in sel(self, method, tolerance, drop, **indexers) 670 self, indexers, method=method, tolerance=tolerance 671 ) --> 672 result = self.isel(drop=drop, **pos_indexers) 673 return result._replace_indexes(new_indexes) 674 /Users/pwolfram/src/xarray/xarray/core/dataarray.pyc in isel(self, drop, **indexers) 655 DataArray.sel 656 """""" --> 657 ds = self._to_temp_dataset().isel(drop=drop, **indexers) 658 return self._from_temp_dataset(ds) 659 /Users/pwolfram/src/xarray/xarray/core/dataset.pyc in isel(self, drop, **indexers) 1115 for name, var in iteritems(self._variables): 1116 var_indexers = dict((k, v) for k, v in indexers if k in var.dims) -> 1117 new_var = var.isel(**var_indexers) 1118 if not (drop and name in var_indexers): 1119 variables[name] = new_var /Users/pwolfram/src/xarray/xarray/core/variable.pyc in isel(self, **indexers) 545 if dim in indexers: 546 key[i] = indexers[dim] --> 547 return self[tuple(key)] 548 549 def squeeze(self, dim=None): /Users/pwolfram/src/xarray/xarray/core/variable.pyc in __getitem__(self, key) 375 dims = tuple(dim for k, dim in zip(key, self.dims) 376 if not isinstance(k, (int, np.integer))) --> 377 values = self._indexable_data[key] 378 # orthogonal indexing should ensure the dimensionality is consistent 379 if hasattr(values, 'ndim'): /Users/pwolfram/src/xarray/xarray/core/indexing.pyc in __getitem__(self, key) 465 466 def __getitem__(self, key): --> 467 key = self._convert_key(key) 468 return self._ensure_ndarray(self.array[key]) 469 /Users/pwolfram/src/xarray/xarray/core/indexing.pyc in _convert_key(self, key) 452 if any(not isinstance(k, (int, np.integer, slice)) for k in key): 453 # key would trigger fancy indexing --> 454 key = orthogonal_indexer(key, self.shape) 455 return key 456 /Users/pwolfram/src/xarray/xarray/core/indexing.pyc in orthogonal_indexer(key, shape) 77 """""" 78 # replace Ellipsis objects with slices ---> 79 key = list(canonicalize_indexer(key, len(shape))) 80 # replace 1d arrays and slices with broadcast compatible arrays 81 # note: we treat integers separately (instead of turning them into 1d /Users/pwolfram/src/xarray/xarray/core/indexing.pyc in
canonicalize_indexer(key, ndim) 65 return indexer 66 ---> 67 return tuple(canonicalize(k) for k in expanded_indexer(key, ndim)) 68 69 /Users/pwolfram/src/xarray/xarray/core/indexing.pyc in ((k,)) 65 return indexer 66 ---> 67 return tuple(canonicalize(k) for k in expanded_indexer(key, ndim)) 68 69 /Users/pwolfram/src/xarray/xarray/core/indexing.pyc in canonicalize(indexer) 62 'array indexing; all subkeys must be ' 63 'slices, integers or sequences of ' ---> 64 'integers or Booleans' % indexer) 65 return indexer 66 ValueError: invalid subkey array([], dtype=object) for integer based array indexing; all subkeys must be slices, integers or sequences of integers or Booleans ``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1341/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
181033674,MDExOlB1bGxSZXF1ZXN0ODc5OTUzNzg=,1038,Attributes from netCDF4 initialization retained,4295853,closed,0,,,26,2016-10-04T23:51:48Z,2017-03-31T03:11:07Z,2017-03-31T03:11:07Z,CONTRIBUTOR,,0,pydata/xarray/pulls/1038,"Ensures that attrs for open_mfdataset are now retained cc @shoyer ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1038/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
218013400,MDU6SXNzdWUyMTgwMTM0MDA=,1338,Chunking and dask memory errors,4295853,closed,0,,,2,2017-03-29T21:22:49Z,2017-03-29T22:56:45Z,2017-03-29T22:56:45Z,CONTRIBUTOR,,,,"What is the standard way of sub-chunking to prevent dask memory errors? For large dataset files there could be a dimension, say `nCells`, that is large enough to fill RAM. If this occurs, is there an automatic mechanism to prevent out-of-memory errors in dask, or is it the user's responsibility to specify maximum chunk sizes on their own?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1338/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
199900056,MDExOlB1bGxSZXF1ZXN0MTAwOTI5NTYz,1198,Fixes OS error arising from too many files open,4295853,closed,0,,,54,2017-01-10T18:37:41Z,2017-03-23T19:21:27Z,2017-03-23T19:20:03Z,CONTRIBUTOR,,0,pydata/xarray/pulls/1198,"Previously, DataStore did not judiciously close files, so a large number of files could be left open and eventually raise an OSError related to too many open files. This merge provides a solution for the netCDF, scipy, and h5netcdf backends.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1198/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
180678466,MDExOlB1bGxSZXF1ZXN0ODc3NDI5NjI=,1031,Fixes doc formatting error,4295853,closed,0,,,1,2016-10-03T16:00:29Z,2017-03-22T17:27:10Z,2016-10-03T16:15:19Z,CONTRIBUTOR,,0,pydata/xarray/pulls/1031,"This fixes a typo in the what's new documentation.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1031/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 180717915,MDExOlB1bGxSZXF1ZXN0ODc3NzA5NTk=,1032,Fixes documentation typo,4295853,closed,0,,,0,2016-10-03T18:57:21Z,2017-03-22T17:27:10Z,2016-10-03T21:03:07Z,CONTRIBUTOR,,0,pydata/xarray/pulls/1032,"This is another really minor typo... ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1032/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 140214928,MDU6SXNzdWUxNDAyMTQ5Mjg=,791,Adding cumsum / cumprod reduction operators,4295853,closed,0,,,12,2016-03-11T15:36:41Z,2016-10-04T22:16:26Z,2016-10-04T22:16:26Z,CONTRIBUTOR,,,,"It would be useful to have the cumsum / cumprod reduction operator for DataArray and Dataset, analagous to http://xarray.pydata.org/en/stable/generated/xarray.DataArray.sum.html?highlight=sum#xarray.DataArray.sum and http://xarray.pydata.org/en/stable/generated/xarray.Dataset.sum.html?highlight=sum#xarray.Dataset.sum I notice this is on the TODO at https://github.com/pydata/xarray/blob/master/xarray/core/ops.py#L54 and am assuming there is something subtle here about the implementation. I believe the issue was probably with dask, but the issue / PR at https://github.com/dask/dask/issues/923 & https://github.com/dask/dask/pull/925 may have removed the roadblock. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/791/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 144920646,MDExOlB1bGxSZXF1ZXN0NjQ4MDEwMTc=,812,Adds cummulative operators to API ,4295853,closed,0,,,14,2016-03-31T14:37:50Z,2016-10-03T21:06:26Z,2016-10-03T21:05:33Z,CONTRIBUTOR,,0,pydata/xarray/pulls/812,"This PR will add cumsum and cumprod as discussed in https://github.com/pydata/xarray/issues/791 as well ensuring `cumprod` works for the API, resolving issues discussed at https://github.com/pydata/xarray/issues/807. TO DO (dependencies) - [x] Add `nancumprod` and `nancumsum` to numpy (https://github.com/numpy/numpy/pull/7421) - [x] Add `nancumprod` and `nancumsum` to dask (https://github.com/dask/dask/pull/1077) This PR extends infrastructure to support `cumsum` and `cumprod` (https://github.com/pydata/xarray/issues/791). 
References:
- https://github.com/numpy/numpy/pull/7421

cc @shoyer, @jhamman ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/812/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
178111738,MDU6SXNzdWUxNzgxMTE3Mzg=,1010,py.test fails on master,4295853,closed,0,,,5,2016-09-20T16:36:51Z,2016-09-20T17:13:14Z,2016-09-20T17:12:09Z,CONTRIBUTOR,,,,"Following creation of a new conda environment: ``` bash cd /tmp conda create -n test_xarray27 python=2.7 -y source activate test_xarray27 conda install matplotlib dask bottleneck pytest -y git clone git@github.com:pydata/xarray.git cd xarray git co master python setup.py develop py.test ``` returns ``` bash ┌─[pwolfram][shapiro][/tmp/xarray][10:34][±][master ✓] └─▪ py.test =========================================================================================================== test session starts ============================================================================================================ platform darwin -- Python 2.7.10 -- py-1.4.27 -- pytest-2.7.1 rootdir: /private/tmp/xarray, inifile: setup.cfg collected 1028 items xarray/test/test_backends.py ............................................................................................................................................................................................sssssssssssssssssssssssssssssssssssss.........sssssssssssssssssssssssss...... xarray/test/test_combine.py .............. xarray/test/test_conventions.py .............................................s............ xarray/test/test_dask.py ..........F.................... xarray/test/test_dataarray.py ...........................................................s...................................................s............. xarray/test/test_dataset.py ............................................................................................................................................. xarray/test/test_extensions.py .... xarray/test/test_formatting.py ......... xarray/test/test_groupby.py ... xarray/test/test_indexing.py ......... xarray/test/test_merge.py .............. xarray/test/test_ops.py ............. xarray/test/test_plot.py .............................................................................................................................................................................................. xarray/test/test_tutorial.py s xarray/test/test_ufuncs.py .... xarray/test/test_utils.py ................... xarray/test/test_variable.py ............................................................................................................................... xarray/test/test_xray.py .
================================================================================================================= FAILURES ================================================================================================================= _________________________________________________________________________________________________________ TestVariable.test_reduce _________________________________________________________________________________________________________ self = def test_reduce(self): u = self.eager_var v = self.lazy_var self.assertLazyAndAllClose(u.mean(), v.mean()) self.assertLazyAndAllClose(u.std(), v.std()) > self.assertLazyAndAllClose(u.argmax(dim='x'), v.argmax(dim='x')) xarray/test/test_dask.py:145: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ xarray/core/common.py:16: in wrapped_func skipna=skipna, allow_lazy=True, **kwargs) xarray/core/variable.py:899: in reduce axis=axis, **kwargs) xarray/core/ops.py:308: in f return func(values, axis=axis, **kwargs) xarray/core/ops.py:64: in f return getattr(module, name)(*args, **kwargs) /Users/pwolfram/anaconda/lib/python2.7/site-packages/dask/array/reductions.py:542: in _ return arg_reduction(x, chunk, combine, agg, axis, split_every) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ x = dask.array, chunk = , combine = , agg = axis = (0,), split_every = None def arg_reduction(x, chunk, combine, agg, axis=None, split_every=None): """"""Generic function for argreduction. Parameters ---------- x : Array chunk : callable Partialed ``arg_chunk``. combine : callable Partialed ``arg_combine``. agg : callable Partialed ``arg_agg``. 
axis : int, optional split_every : int or dict, optional """""" if axis is None: axis = tuple(range(x.ndim)) ravel = True elif isinstance(axis, int): if axis < 0: axis += x.ndim if axis < 0 or axis >= x.ndim: raise ValueError(""axis entry is out of bounds"") axis = (axis,) ravel = x.ndim == 1 else: raise TypeError(""axis must be either `None` or int, "" ""got '{0}'"".format(axis)) # Map chunk across all blocks name = 'arg-reduce-chunk-{0}'.format(tokenize(chunk, axis)) old = x.name keys = list(product(*map(range, x.numblocks))) offsets = list(product(*(accumulate(operator.add, bd[:-1], 0) > for bd in x.chunks))) E TypeError: type object argument after * must be a sequence, not generator /Users/pwolfram/anaconda/lib/python2.7/site-packages/dask/array/reductions.py:510: TypeError ============================================================================================ 1 failed, 961 passed, 66 skipped in 52.49 seconds ============================================================================================= ``` cc @shoyer ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1010/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
153484311,MDExOlB1bGxSZXF1ZXN0NjkxNzI3NTU=,845,Fixes doc typo,4295853,closed,0,,,2,2016-05-06T16:04:18Z,2016-05-07T00:35:10Z,2016-05-06T16:35:05Z,CONTRIBUTOR,,0,pydata/xarray/pulls/845,"To entirely add or removing coordinate arrays, you can use dictionary like

to

To entirely add or remove coordinate arrays, you can use dictionary like ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/845/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
144683276,MDU6SXNzdWUxNDQ2ODMyNzY=,811,Selection based on boolean DataArray,4295853,closed,0,,,17,2016-03-30T18:38:34Z,2016-04-15T20:30:03Z,2016-04-15T20:30:03Z,CONTRIBUTOR,,,,"Should xarray indexing account for boolean values without resorting to a call to `np.where`? For example, `acase.sel(Np=np.where(idx)[0])` works but `acase.sel(Np=idx)` does not. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/811/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
145243134,MDExOlB1bGxSZXF1ZXN0NjQ5ODEzMzc=,815,Add drop=True option for where on Dataset and DataArray,4295853,closed,0,,,21,2016-04-01T17:55:55Z,2016-04-10T00:55:47Z,2016-04-10T00:33:00Z,CONTRIBUTOR,,0,pydata/xarray/pulls/815,"Addresses #811 to provide a Dataset and DataArray `sel_where` which returns a Dataset or DataArray of minimal coordinate size. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/815/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
144037264,MDU6SXNzdWUxNDQwMzcyNjQ=,807,cumprod returns errors,4295853,closed,0,,,3,2016-03-28T17:50:13Z,2016-03-31T23:39:55Z,2016-03-31T23:39:55Z,CONTRIBUTOR,,,,"The xarray implementation of `cumprod` returns an assertion error, presumably because of bottleneck, e.g., https://github.com/pydata/xarray/blob/master/xarray/core/ops.py#L333.
The error is ``` └─▪ ./test_cumprod.py [ 0.8841785 0.54181236 0.29075258 0.28883015 0.1137352 0.09909713 0.03570122 0.0304542 0.01578143 0.01496195 0.01442681 0.00980845] Traceback (most recent call last): File ""./test_cumprod.py"", line 13, in foo.cumprod() File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/common.py"", line 16, in wrapped_func skipna=skipna, allow_lazy=True, **kwargs) File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/dataarray.py"", line 991, in reduce var = self.variable.reduce(func, dim, axis, keep_attrs, **kwargs) File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/variable.py"", line 871, in reduce axis=axis, **kwargs) File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/ops.py"", line 346, in f assert using_numpy_nan_func AssertionError ``` If bottleneck is uninstalled, then a `ValueError` is returned: ``` └─▪ ./test_cumprod.py [ 2.99508768e-01 2.80142920e-01 1.56389242e-01 1.10791301e-01 4.58372649e-02 4.10865622e-02 9.91362500e-03 6.76033435e-03 3.83574249e-03 9.54972340e-04 1.56846616e-04 6.44088547e-05] Traceback (most recent call last): File ""./test_cumprod.py"", line 13, in foo.cumprod() File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/common.py"", line 16, in wrapped_func skipna=skipna, allow_lazy=True, **kwargs) File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/dataarray.py"", line 991, in reduce var = self.variable.reduce(func, dim, axis, keep_attrs, **kwargs) File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/variable.py"", line 880, in reduce return Variable(dims, data, attrs=attrs) File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/variable.py"", line 213, in __init__ self._dims = self._parse_dimensions(dims) File ""/Users/pwolfram/anaconda/lib/python2.7/site-packages/xarray/core/variable.py"", line 321, in _parse_dimensions % (dims, self.ndim)) ValueError: dimensions () must have the same length as the number of data dimensions, ndim=1 ``` No error occurs if the data array is converted to a numpy array prior to use of `cumprod`. This can easily be reproduced via https://gist.github.com/c32f231b773ecc4b0ccf, excerpted below: ``` import numpy as np import pandas as pd import xarray as xr data = np.random.rand(4, 3) locs = ['IA', 'IL', 'IN'] times = pd.date_range('2000-01-01', periods=4) foo = xr.DataArray(data, coords=[times, locs], dims=['time', 'space']) print foo.values.cumprod() foo.cumprod() ``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/807/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
140291221,MDU6SXNzdWUxNDAyOTEyMjE=,793,dask.async.RuntimeError: NetCDF: HDF error on xarray to_netcdf,4295853,closed,0,,,21,2016-03-11T21:04:36Z,2016-03-24T02:49:26Z,2016-03-24T02:49:13Z,CONTRIBUTOR,,,,"Dask appears to be failing on serialization following a `ds.to_netcdf()` via a `NetCDF: HDF error`.
Excerpted error below: ``` Traceback (most recent call last): File ""reduce_dispersion_file.py"", line 40, in if __name__ == ""__main__"": File ""reduce_dispersion_file.py"", line 36, in reduce_dispersion_file with timeit_context('output to disk'): File ""/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/core/dataset.py"", line 791, in to_netcdf engine=engine, encoding=encoding) File ""/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/backends/api.py"", line 356, in to_netcdf dataset.dump_to_store(store, sync=sync, encoding=encoding) File ""/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/core/dataset.py"", line 739, in dump_to_store store.sync() File ""/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/backends/netCDF4_.py"", line 283, in sync super(NetCDF4DataStore, self).sync() File ""/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/backends/common.py"", line 186, in sync self.writer.sync() File ""/users/pwolfram/envs/LIGHT_analysis/lib/python2.7/site-packages/xarray/backends/common.py"", line 165, in sync da.store(self.sources, self.targets) File ""/users/pwolfram/lib/python2.7/site-packages/dask/array/core.py"", line 712, in store Array._get(dsk, keys, **kwargs) File ""/users/pwolfram/lib/python2.7/site-packages/dask/base.py"", line 43, in _get return get(dsk2, keys, **kwargs) File ""/users/pwolfram/lib/python2.7/site-packages/dask/threaded.py"", line 57, in get **kwargs) File ""/users/pwolfram/lib/python2.7/site-packages/dask/async.py"", line 481, in get_async raise(remote_exception(res, tb)) dask.async.RuntimeError: NetCDF: HDF error Traceback --------- File ""/users/pwolfram/lib/python2.7/site-packages/dask/async.py"", line 264, in execute_task result = _execute_task(task, data) File ""/users/pwolfram/lib/python2.7/site-packages/dask/async.py"", line 246, in _execute_task return func(*args2) File ""/users/pwolfram/lib/python2.7/site-packages/dask/array/core.py"", line 1954, in store out[index] = np.asanyarray(x) File ""netCDF4/_netCDF4.pyx"", line 3678, in netCDF4._netCDF4.Variable.__setitem__ (netCDF4/_netCDF4.c:37215) File ""netCDF4/_netCDF4.pyx"", line 3887, in netCDF4._netCDF4.Variable._put (netCDF4/_netCDF4.c:38907) ``` Script used: https://gist.github.com/98acaa31a4533b490f78 Full output: https://gist.github.com/248efce774ad08cb1dd6 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/793/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
142676995,MDExOlB1bGxSZXF1ZXN0NjM3MzQxODU=,800,Adds dask lock capability for backend writes,4295853,closed,0,,,3,2016-03-22T14:59:56Z,2016-03-22T22:32:37Z,2016-03-22T22:32:30Z,CONTRIBUTOR,,0,pydata/xarray/pulls/800,"This fixes an error on an asynchronous write for `to_netcdf` resulting in a `dask.async.RuntimeError: NetCDF: HDF error`. Resolves issue https://github.com/pydata/xarray/issues/793 via the dask improvement at https://github.com/dask/dask/pull/1053, following the advice of @shoyer.
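To illustrate the mechanism (a simplified sketch, not the exact code in this PR): dask's `store` accepts a `lock` argument so that writes are serialized while the computation itself stays parallel.

``` python
import threading
import numpy as np
import dask.array as da

source = da.from_array(np.random.rand(100, 100), chunks=(25, 25))
target = np.empty((100, 100))
# a single shared lock forces one-at-a-time writes into the target,
# which is what non-threadsafe HDF5/netCDF file handles require
da.store(source, target, lock=threading.Lock())
```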
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/800/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 138332032,MDU6SXNzdWUxMzgzMzIwMzI=,783,Array size changes following loading of numpy array,4295853,closed,0,,,19,2016-03-03T23:44:39Z,2016-03-08T23:41:38Z,2016-03-08T23:37:16Z,CONTRIBUTOR,,,,"The issue in a nutshell is that ``` (Pdb) rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].shape (30, 1012000) (Pdb) rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].values.shape (29, 1012000) (Pdb) rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].data.shape (30, 1012000) (Pdb) rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:].data dask.array ``` It seems to me that for some reason when the array is loaded via values that it is no longer the same size. The dask shape appears to be correct. I previously do a filter on time via `rlzns = rlzns.isel(Time=np.where(reset > 0)[0])` and do some commands like `np.reshape(rlzns.Time[rnum*Ntr:(rnum+1)*Ntr].values,(1,Ntr)),axis=1)` but it seems unlikely that this would be causing the problem. Has anyone had an issue like this? Any ideas on what could be causing the problem would be greatly appreciated because this behavior is very strange. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/783/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue