id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
636493109,MDU6SXNzdWU2MzY0OTMxMDk=,4142,"Should we make ""rasterio"" an engine option?",291576,closed,0,,,6,2020-06-10T19:28:49Z,2021-05-27T16:17:53Z,2021-05-27T16:17:53Z,CONTRIBUTOR,,,,"In a similar vein to how #4003 is going for zarr files, I would like to see if a rasterio engine could be created so that geotiff files could get opened through open_mfdataset() and friends. I am willing to put some cycles to putting this together. It has been a *long* time since I did the initial prototype for the pynio backend back in the xray days. Does anyone see any immediate pitfalls or gotchas for doing this?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4142/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
307318224,MDU6SXNzdWUzMDczMTgyMjQ=,2004,Slicing DataArray can take longer than not slicing,291576,closed,0,,,14,2018-03-21T16:20:49Z,2020-12-03T18:15:35Z,2020-12-03T18:15:35Z,CONTRIBUTOR,,,,"#### Code Sample, a copy-pastable example if possible

```ipython
In [1]: import xarray as xr

In [2]: radmax_ds = xr.open_dataset('tests/radmax_baseline.nc')

In [3]: radmax_ds
Out[3]:
Dimensions: (latitude: 5650, longitude: 12050, time: 3)
Coordinates:
  * latitude (latitude) float32 13.505002 13.515002 13.525002 13.535002 ...
  * longitude (longitude) float32 -170.495 -170.485 -170.475 -170.465 ...
  * time (time) datetime64[ns] 2017-03-07T01:00:00 2017-03-07T02:00:00 ...
Data variables:
    RadarMax (time, latitude, longitude) float32 ...
Attributes:
    start_date: 03/07/2017 01:00
    end_date: 03/07/2017 01:55
    elapsed: 60
    data_rights: Respond (TM) Confidential Data. (c) Insurance Services Offi...

In [4]: %timeit foo = radmax_ds.RadarMax.load()
The slowest run took 35509.20 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 3: 216 µs per loop

In [5]: 216 * 35509.2
Out[5]: 7669987.199999999
```

So, without any slicing, it takes approximately 7.5 seconds for me to load this complete file into memory. Now, let's see what happens when I slice the DataArray and load it:

``` ipython
In [1]: import xarray as xr

In [2]: radmax_ds = xr.open_dataset('tests/radmax_baseline.nc')

In [3]: %timeit foo = radmax_ds.RadarMax[::1, ::1, ::1].load()
1 loop, best of 3: 7.56 s per loop

In [4]: radmax_ds.close()

In [5]: radmax_ds = xr.open_dataset('tests/radmax_baseline.nc')

In [6]: %timeit foo = radmax_ds.RadarMax[::1, ::10, ::10].load()
```

I killed this session after 17 minutes. `top` did not report any unusual io wait, and memory usage was not out of control. I am using v0.10.2 of xarray. My suspicion is that there is something wrong with the indexing system that is causing xarray to read in the data in a bad order. Notice that if I slice all the data, then the timing works out the same as reading it all in straight-up. Not shown here is a run where if I slice every 100 lats and 100 longitudes, then the timing is shorter again, but not to the same amount of time as reading it all in at once. Let me know if you want a copy of the file. It is a compressed netcdf4, taking up only 1.7MB.

I wonder if this is related to #1985?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2004/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
306067267,MDU6SXNzdWUzMDYwNjcyNjc=,1997,can't do in-place clip() with DataArrays.,291576,closed,0,,,4,2018-03-16T20:31:03Z,2020-02-19T22:59:08Z,2020-02-19T22:59:08Z,CONTRIBUTOR,,,,"#### Code Sample, a copy-pastable example if possible

Where `foo` is a DataArray, there doesn't seem to be a nice way to use `clip()` in-place.

```python
foo.clip(0, None, out=foo)
Traceback (most recent call last):

  File """", line 1, in
    foo.clip(0, None, out=foo)

  File ""/rd22/scratch/broot/Programs/xarray/xarray/core/dataarray.py"", line 1726, in func
    **kwargs))

  File ""/rd22/scratch/broot/Programs/xarray/xarray/core/ops.py"", line 205, in func
    return _call_possibly_missing_method(self, name, args, kwargs)

  File ""/rd22/scratch/broot/Programs/xarray/xarray/core/ops.py"", line 192, in _call_possibly_missing_method
    return method(*args, **kwargs)

TypeError: output must be an array
```

You get a similar exception if you do `np.clip(foo, ..., out=foo)`.

#### Problem description

Note the docstring for `DataArray.clip()`:

```
Help on method clip in module xarray.core.ops:

clip(self, *args, **kwargs) method of xarray.core.dataarray.DataArray instance
    a.clip(min=None, max=None, out=None)

    Return an array whose values are limited to ``[min, max]``.
    One of max or min must be given.

    Refer to `numpy.clip` for full documentation.

    See Also
    --------
    numpy.clip : equivalent function
```

So, the docstring advertises support for `out`.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1997/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 395994055,MDU6SXNzdWUzOTU5OTQwNTU=,2647,"getting a ""truth value of an array"" error when supplying my own `concat_dim`.",291576,closed,0,,,7,2019-01-04T16:52:00Z,2019-01-06T14:32:08Z,2019-01-05T06:46:34Z,CONTRIBUTOR,,,,"This bug was introduced sometime after v0.11.0 and has turned up in my test suite using v0.11.2. I'll pass a `DataArray()` as my `concat_dim`, and the failure will happen at line 609 in backends/api.py because of: `if concat_dim is None or concat_dim == _CONCAT_DIM_DEFAULT:` I am not sure how this change got through. In #2048, I added a unit test that passes a DataArray as a concat_dim argument. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2647/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 305327479,MDU6SXNzdWUzMDUzMjc0Nzk=,1988,open_mfdataset() on a single file drops the concat_dim,291576,closed,0,,,6,2018-03-14T21:02:39Z,2018-04-10T20:48:59Z,2018-04-10T20:48:59Z,CONTRIBUTOR,,,,"When calling `xr.open_mfdataset()` on a 1 element list of filenames, the concat dimension is never added. This isn't a MWE at the moment (will make one soon enough), just wanted to get my thoughts down. ```python from datetime import datetime import xarray as xr time_coord = xr.DataArray([datetime.utcnow()], name='time', dims='time') radmax_ds = xr.open_mfdataset(['foobar.nc'], concat_dim=time_coord) print(radmax_ds) ``` ``` Dimensions: (latitude: 5650, longitude: 12050) Coordinates: * latitude (latitude) float32 13.505002 13.515002 13.525002 13.535002 ... * longitude (longitude) float32 -170.495 -170.485 -170.475 -170.465 ... 
Data variables:
    RadarMax (latitude, longitude) float32 dask.array
Attributes:
    start_date: 03/07/2017 01:00
    end_date: 03/07/2017 01:55
    elapsed: 60
```

#### Problem description

If there are two files, then there is a `time` coordinate, and the data array becomes 3D.

#### Output of ``xr.show_versions()``

I am currently on a recent-ish master of xarray.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1988/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
110726841,MDU6SXNzdWUxMTA3MjY4NDE=,615,operations with pd.to_timedelta() now fails,291576,closed,0,,,3,2015-10-09T20:01:00Z,2015-10-09T21:21:41Z,2015-10-09T20:15:49Z,CONTRIBUTOR,,,,"Not exactly sure when this started to fail, but I recently upgraded my pandas install and a script of mine started to fail.

The SSCCE:

```
from datetime import datetime, timedelta
import xray
import pandas as pd

a = xray.Dataset({'time': [datetime(2000, 1, 1)]})
a['time'] -= pd.to_timedelta(timedelta(hours=6))
Traceback (most recent call last):
  File """", line 1, in
  File ""/nas/home/broot/centos6/lib/python2.7/site-packages/xray-0.6.0_154_gf270b9f-py2.7.egg/xray/core/dataarray.py"", line 1091, in func
    f(self.variable, other_variable)
  File ""/nas/home/broot/centos6/lib/python2.7/site-packages/xray-0.6.0_154_gf270b9f-py2.7.egg/xray/core/variable.py"", line 799, in func
    self.values = f(self_data, other_data)
TypeError: ufunc subtract cannot use operands with types dtype('>> x1 = xray.DataArray(np.arange(0, 10, 0.2), name='x')
>>> a = xray.DataArray(np.zeros(x1.shape), {'dim_0': x1}, name='foo')
>>> a.to_dataset().groupby(np.round(x1)).reduce(np.min)
Dimensions: (x: 11)
Coordinates:
  * x (x) float64 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0
Variables:
    foo float64 0.0
>>> a.groupby(np.round(x1)).reduce(np.min)
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
Coordinates:
  * x (x) float64 0.0 1.0 2.0 3.0 4.0 
5.0 6.0 7.0 8.0 9.0 10.0
```
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/268/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
46745063,MDU6SXNzdWU0Njc0NTA2Mw==,264,align silently upcasts data arrays when NaNs are inserted,291576,closed,0,,,2,2014-10-24T14:36:20Z,2014-10-28T06:47:38Z,2014-10-28T06:47:38Z,CONTRIBUTOR,,,,"The NaNs being inserted during the join is irrespective of the dtype of the array.

```
import numpy as np
import xray

x1 = np.arange(30)
x2 = np.arange(5, 35)
a = xray.DataArray(np.random.random((30,)).astype('f32'), {'x': x1})
b = xray.DataArray(np.random.random((30,)).astype('f32'), {'x': x2})
c, d = xray.align(a, b, join='outer')
print c.dtype
```

The output is float64.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/264/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
46756880,MDU6SXNzdWU0Njc1Njg4MA==,267,can't use datetime or pandas datetime to index time dimension,291576,closed,0,,,6,2014-10-24T16:28:49Z,2014-10-28T04:15:04Z,2014-10-28T04:15:04Z,CONTRIBUTOR,,,,"Consider the following:

```
>>> c
array([ 9., 6., 6., ..., 10., 5., 3.], dtype=float32)
Coordinates:
    latitude float32 64.833
    elevation float32 137.5
    longitude float32 -147.6
  * time (time) datetime64[ns] 2013-01-01T11:15:00 ...
Attributes:
    units: miles per hour
>>> c.sel(time='2013-01-01')
array([ 9., 6., 6., 1., nan, 1., nan, 2., 1., 1., 1., 1., 2.], dtype=float32)
Coordinates:
    latitude float32 64.833
    elevation float32 137.5
    longitude float32 -147.6
  * time (time) datetime64[ns] 2013-01-01T11:15:00 ...
Attributes:
    units: miles per hour
>>> c.sel(time=datetime(2013, 1, 1))
Traceback (most recent call last):
  File """", line 1, in
  File ""/nas/home/broot/centos6/lib/python2.7/site-packages/xray-0.3.1.dev_ad43f0b-py2.7.egg/xray/core/dataarray.py"", line 495, in sel
    return self.isel(**indexing.remap_label_indexers(self, indexers))
  File ""/nas/home/broot/centos6/lib/python2.7/site-packages/xray-0.3.1.dev_ad43f0b-py2.7.egg/xray/core/indexing.py"", line 145, in remap_label_indexers
    for dim, label in iteritems(indexers))
  File ""/nas/home/broot/centos6/lib/python2.7/site-packages/xray-0.3.1.dev_ad43f0b-py2.7.egg/xray/core/indexing.py"", line 145, in
    for dim, label in iteritems(indexers))
  File ""/nas/home/broot/centos6/lib/python2.7/site-packages/xray-0.3.1.dev_ad43f0b-py2.7.egg/xray/core/indexing.py"", line 129, in convert_label_indexer
    indexer = index.get_loc(np.asscalar(label))
  File ""/nas/home/broot/centos6/lib/python2.7/site-packages/pandas/tseries/index.py"", line 1280, in get_loc
    return self._engine.get_loc(stamp)
  File ""index.pyx"", line 519, in pandas.index.DatetimeEngine.get_loc (pandas/index.c:9229)
  File ""index.pyx"", line 544, in pandas.index.DatetimeEngine.get_loc (pandas/index.c:9018)
KeyError: Timestamp('2013-01-01 00:00:00+0000', tz='UTC')
```

The same thing happens if I do a pd.to_datetime() first.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/267/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
46022646,MDU6SXNzdWU0NjAyMjY0Ng==,254,order matters when doing comparisons against scalar xray objects,291576,closed,0,,799012,2,2014-10-16T19:03:11Z,2014-10-23T06:43:30Z,2014-10-23T06:43:23Z,CONTRIBUTOR,,,,"Working on some bounding box extraction code, I computed a bounding box by taking mins and maxes of the coordinates from an xray object resulting in a dictionary of scalar xray objects. 
When comparing an xray DataArray against this scalar xray object, the order seems to matter. This results in problems down the road that wouldn't happen if I just had a scalar value instead of a scalar xray object.

```
>>> bbox
{'longitude': ( array(-102.8782), array(-94.6244))}
>>> a = bbox['longitude'][0] <= mod['longitude']
>>> b = mod['longitude'] <= bbox['longitude'][1]
>>> c = mod['longitude'] >= bbox['longitude'][0]
>>> a
array([False, False, False, ..., True, True, True], dtype=bool)
Coordinates:
  * longitude (longitude) bool False False False False False False False False False ...
>>> b
array([ True, True, True, ..., False, False, False], dtype=bool)
Coordinates:
  * longitude (longitude) float32 -129.995 -129.985 -129.975 -129.965 -129.955 ...
```

See that the ""a"" object has a name ""longitude"" while the ""b"" object does not. Therefore...

```
>>> a & b
Traceback (most recent call last):
  File """", line 1, in
  File ""/nas/home/broot/centos6/lib/python2.7/site-packages/xray/core/dataarray.py"", line 850, in func
    ds = self.coords.merge(other_coords)
  File ""/nas/home/broot/centos6/lib/python2.7/site-packages/xray/core/coordinates.py"", line 122, in merge
    conflicts = self._merge_validate(other)
  File ""/nas/home/broot/centos6/lib/python2.7/site-packages/xray/core/coordinates.py"", line 80, in _merge_validate
    raise ValueError('index %r not aligned' % k)
ValueError: index 'longitude' not aligned
```

But, if I use the ""c"" object instead which was created flipping the comparison around:

```
>>> c * b
array([False, False, False, ..., False, False, False], dtype=bool)
Coordinates:
  * longitude (longitude) float32 -129.995 -129.985 -129.975 -129.965 -129.955 ...
>>>
```

everything works as expected. I have a vague idea of why this is happening, but I am not exactly sure how one should go about dealing with this. It is a similar problem elsewhere with subclassed numpy arrays. 
For now, I am going to have to go with the rule of keeping the xray DataArray object first, but that really isn't going to work in other places where I may not know that I am passing xray objects. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/254/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue