
issues: 108769226


id: 108769226
node_id: MDU6SXNzdWUxMDg3NjkyMjY=
number: 593
title: Bug when accessing sorted dataset before loading
user: 2443309
state: closed
locked: 0
assignee:
milestone:
comments: 6
created_at: 2015-09-28T23:58:29Z
updated_at: 2016-01-04T23:11:55Z
closed_at: 2015-10-02T21:41:11Z
author_association: MEMBER
active_lock_reason:
draft:
pull_request:
body:

I ran into this bug this afternoon. If I sort a Dataset using isel before loading the data, I end up with an error in the netCDF4 backend. If I call Dataset.load() before sorting the Dataset, I get the expected behavior.

First some info on my environment (everything should be fresh):

Python version : 3.4.3 |Anaconda 2.3.0 (x86_64)| (default, Mar 6 2015, 12:07:41) [GCC 4.2.1 (Apple Inc. build 5577)]
xray version : 0.6.0
numpy version : 1.9.3
netCDF4 version : 1.1.9

Now for a simplified example that reproduces the bug:

``` Python

In [1]: import xray
   ...: import numpy as np
   ...: import netCDF4

In [2]: random_data = np.random.random(size=(4, 6))
   ...: dim0 = [0, 1, 2, 3]
   ...: dim1 = [0, 2, 1, 3, 5, 4]  # We will sort this in a later step
   ...: da = xray.DataArray(data=random_data, dims=('dim0', 'dim1'),
   ...:                     coords={'dim0': dim0, 'dim1': dim1}, name='randovar')
   ...: ds = da.to_dataset()
   ...: ds.to_netcdf('rando.nc')

In [3]: ds2 = xray.open_dataset('rando.nc')
   ...: # ds2.load()  # work around to prevent IndexError
   ...: inds = np.argsort(ds2.dim1.values)
   ...: ds2 = ds2.isel(dim1=inds)
   ...: print(ds2.randovar)

Out[3]:

IndexError                                Traceback (most recent call last)
<ipython-input-3-9b4ab63c0fd2> in <module>()
      2 inds = np.argsort(ds2.dim1.values)
      3 ds2 = ds2.isel(dim1=inds)
----> 4 print(ds2.randovar)

...

/Users/jhamman/anaconda/lib/python3.4/site-packages/xray/backends/netCDF4_.py in __getitem__(self, key)
     43         else:
     44             getitem = operator.getitem
---> 45         data = getitem(self.array, key)
     46         if self.ndim == 0:
     47             # work around for netCDF4-python's broken handling of 0-d

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.__getitem__ (netCDF4/_netCDF4.c:30994)()

/Users/jhamman/anaconda/lib/python3.4/site-packages/netCDF4/utils.py in _StartCountStride(elem, shape, dimensions, grp, datashape, put)
    220         # duplicate indices in the sequence)
    221         msg = "integer sequences in slices must be sorted and cannot have duplicates"
--> 222         raise IndexError(msg)
    223         # convert to boolean array.
    224         # if unlim, let boolean array be longer than current dimension

IndexError: integer sequences in slices must be sorted and cannot have duplicates
```
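The traceback shows the backend rejecting an integer indexer that is not sorted. One possible way around that kind of restriction, sketched below with NumPy only, is to request the sorted, de-duplicated indices from the strict reader and then restore the requested order in memory. Both `strict_read` and `take_any_order` are hypothetical names introduced for illustration; `strict_read` is a stand-in that imitates the check in the traceback, not netCDF4 code, and this is not necessarily how the issue was fixed in xray.

```python
import numpy as np


def strict_read(array, inds, axis=-1):
    """Hypothetical stand-in for a backend that only accepts strictly
    increasing integer indices, imitating the check in the traceback."""
    inds = np.asarray(inds)
    if np.any(np.diff(inds) <= 0):
        raise IndexError(
            "integer sequences in slices must be sorted and cannot have duplicates"
        )
    return np.take(array, inds, axis=axis)


def take_any_order(array, inds, axis=-1):
    """Read with sorted, unique indices, then reorder in memory."""
    inds = np.asarray(inds)
    # uniq is sorted and duplicate-free; uniq[inverse] reproduces inds.
    uniq, inverse = np.unique(inds, return_inverse=True)
    block = strict_read(array, uniq, axis=axis)
    # Reapply the requested order (and any repeats) on the in-memory block.
    return np.take(block, inverse, axis=axis)


data = np.arange(24).reshape(4, 6)
dim1 = [0, 2, 1, 3, 5, 4]
inds = np.argsort(dim1)  # unsorted integer indexer, as in the report

result = take_any_order(data, inds, axis=1)
```

Here `result` matches `data[:, inds]`, while passing `inds` directly to `strict_read` raises the same kind of `IndexError` as in the report.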

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/593/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
performed_via_github_app:
state_reason: completed
repo: 13221727
type: issue
