
issues


2 rows where comments = 14, type = "issue" and user = 35968931 sorted by updated_at descending

Issue 8298: cftime.DatetimeNoLeap incorrectly decoded from netCDF file

  • id: 1940536602 · node_id: I_kwDOAMm_X85zqj0a · number: 8298
  • user: TomNicholas (35968931) · author_association: MEMBER
  • state: open · locked: 0 · comments: 14
  • created_at: 2023-10-12T18:13:53Z · updated_at: 2024-01-08T01:01:53Z

What happened?

I have been given a netCDF file (I think it's netCDF3) which, when I open it, does not decode the time variable the way I expected. The time coordinate created is a numpy object array.

What did you expect to happen?

I expected it to automatically create a coordinate backed by a CFTimeIndex object, not one wrapped inside another array type.

Minimal Complete Verifiable Example

The original problematic file is 455MB (I can share it if necessary), but I can create a small netCDF file that displays the same issue.

```python
import cftime
import xarray as xr  # needed for xr.Dataset / xr.open_dataset below

time_values = [cftime.DatetimeNoLeap(347, 2, 1, 0, 0, 0, 0, has_year_zero=True)]
time_ds = xr.Dataset(coords={'time': (['time'], time_values)})
print(time_ds)
time_ds.to_netcdf('time_mwe.nc')
```

```
<xarray.Dataset>
Dimensions:  (time: 1)
Coordinates:
  * time     (time) object 0347-02-01 00:00:00
Data variables:
    *empty*
```

```python
ds = xr.open_dataset('time_mwe.nc', engine='netcdf4', decode_times=True, use_cftime=True)
print(ds)
```

```
<xarray.Dataset>
Dimensions:  (time: 1)
Coordinates:
  * time     (time) object 0347-02-01 00:00:00
Data variables:
    *empty*
```
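As a minimal sketch (not part of the original report), one way to pin down what actually backs the decoded coordinate is to inspect the index class after the round trip above; this assumes the `time_mwe.nc` file created by the example:

```python
import xarray as xr

# Reopen the file written above and check which index class backs the
# decoded time coordinate. The dtype is object either way; the index
# class is the tell-tale difference.
ds = xr.open_dataset("time_mwe.nc", engine="netcdf4", decode_times=True, use_cftime=True)
print(type(ds.indexes["time"]))  # hoped-for result: xarray.coding.cftimeindex.CFTimeIndex
print(ds["time"].dtype)
```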

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

  • cftime 1.6.2
  • netcdf4 1.6.4
  • xarray 2023.8.0
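A version listing like the one above can be produced with xarray's built-in environment report (standard API, shown here only as a convenience):

```python
import xarray as xr

# Prints the versions of xarray and its dependencies
# (including cftime and netCDF4) to stdout.
xr.show_versions()
```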

reactions:

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8298/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue
Issue 7031: Periodic Boundary Index

  • id: 1372035441 · node_id: I_kwDOAMm_X85Rx5lx · number: 7031
  • user: TomNicholas (35968931) · author_association: MEMBER
  • state: open · locked: 0 · comments: 14
  • created_at: 2022-09-13T21:39:40Z · updated_at: 2022-09-16T10:50:10Z

What is your issue?

I would like to create a PeriodicBoundaryIndex using the Explicit Indexes refactor. I want to do it first in 1D, then 2D, then maybe ND.

I'm thinking this would be useful for:

  1. Geoscientists with periodic longitudes
  2. Any scientists with periodic domains
  3. Road-testing the refactor + how easy the documentation is to follow

Eventually I think this index should perhaps live in xarray itself, as it's domain-agnostic, doesn't introduce extra dependencies, and could be a conceptually simple example of a custom index.

I had a first go, using the `benbovy:add-set-xindex-and-drop-indexes` branch and reading the in-progress docs page, but I got a bit stuck early on.

@benbovy here's what I have so far:

```python
import numpy as np
import pandas as pd
import xarray as xr
from xarray.core.variable import Variable
from xarray.core.indexes import PandasIndex, is_scalar

from typing import Union, Mapping, Any


class PeriodicBoundaryIndex(PandasIndex):
    """
    An index representing any 1D periodic numberline.

    Implementation subclasses a normal xarray PandasIndex object but
    intercepts indexer queries.
    """

    def _periodic_subset(self, indxr: Union[int, slice, np.ndarray]) -> pd.Index:
        """Equivalent of __getitem__ for a pd.Index, but respects periodicity."""

        length = len(self)

        if isinstance(indxr, int):
            return self.index[indxr % length]
        elif isinstance(indxr, slice):
            raise NotImplementedError()
        elif isinstance(indxr, np.ndarray):
            raise NotImplementedError()
        else:
            raise TypeError

    def isel(
        self, indexers: Mapping[Any, Union[int, slice, np.ndarray, Variable]]
    ) -> Union["PeriodicBoundaryIndex", None]:

        print("isel called")

        indxr = indexers[self.dim]
        if isinstance(indxr, Variable):
            if indxr.dims != (self.dim,):
                # can't preserve an index if the result has new dimensions
                return None
            else:
                indxr = indxr.data
        if not isinstance(indxr, slice) and is_scalar(indxr):
            # scalar indexer: drop index
            return None

        # _periodic_subset is a method, so call it rather than subscripting it
        subsetted_index = self._periodic_subset(indxr)
        return self._replace(subsetted_index)
```
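For illustration only (this snippet is not from the issue), the modulo-wrapping idea that `_periodic_subset` implements can be shown in isolation with a plain `pandas.Index`; the 53-point axis below merely mimics the length of the tutorial dataset's `lon` coordinate:

```python
import numpy as np
import pandas as pd

def periodic_take(index: pd.Index, i: int):
    # Wrap an integer positional indexer so out-of-range positions re-enter
    # the domain, e.g. position 55 on a length-53 axis maps to position 2.
    return index[i % len(index)]

lons = pd.Index(np.linspace(200.0, 330.0, 53))  # 53 points, like the tutorial lon axis
print(periodic_take(lons, 45))  # in range: identical to ordinary positional indexing
print(periodic_take(lons, 55))  # out of range: wraps around instead of raising IndexError
```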

```python
airtemps = xr.tutorial.open_dataset("air_temperature")['air']

da = airtemps.drop_indexes("lon")

world = da.set_xindex("lon", index_cls=PeriodicBoundaryIndex)
```

Now selecting a value with isel inside the range works fine, giving the same result as without my custom index. (The length of the example dataset along lon is 53.)

```python
world.isel(lon=45)
```

```
isel called
<xarray.DataArray 'air' (time: 2920, lat: 25)>
...
```

But indexing with a lon value outside the range of the index data gives an IndexError, seemingly without consulting my new index object. It didn't even print "isel called" :confused:. What should I have implemented that I didn't implement?

```python
world.isel(lon=55)
```

```python
IndexError                                Traceback (most recent call last)
Input In [35], in <cell line: 1>()
----> 1 world.isel(lon=55)

File ~/Documents/Work/Code/xarray/xarray/core/dataarray.py:1297, in DataArray.isel(self, indexers, drop, missing_dims, **indexers_kwargs)
   1292     return self._from_temp_dataset(ds)
   1294 # Much faster algorithm for when all indexers are ints, slices, one-dimensional
   1295 # lists, or zero or one-dimensional np.ndarray's
-> 1297 variable = self._variable.isel(indexers, missing_dims=missing_dims)
   1298 indexes, index_variables = isel_indexes(self.xindexes, indexers)
   1300 coords = {}

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:1233, in Variable.isel(self, indexers, missing_dims, **indexers_kwargs)
   1230 indexers = drop_dims_from_indexers(indexers, self.dims, missing_dims)
   1232 key = tuple(indexers.get(dim, slice(None)) for dim in self.dims)
-> 1233 return self[key]

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:793, in Variable.__getitem__(self, key)
    780 """Return a new Variable object whose contents are consistent with
    781 getting the provided key from the underlying data.
    782 (...)
    790 array x.values directly.
    791 """
    792 dims, indexer, new_order = self._broadcast_indexes(key)
--> 793 data = as_indexable(self._data)[indexer]
    794 if new_order:
    795     data = np.moveaxis(data, range(len(new_order)), new_order)

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:657, in MemoryCachedArray.__getitem__(self, key)
    656 def __getitem__(self, key):
--> 657     return type(self)(_wrap_numpy_scalars(self.array[key]))

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:626, in CopyOnWriteArray.__getitem__(self, key)
    625 def __getitem__(self, key):
--> 626     return type(self)(_wrap_numpy_scalars(self.array[key]))

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:533, in LazilyIndexedArray.__getitem__(self, indexer)
    531     array = LazilyVectorizedIndexedArray(self.array, self.key)
    532     return array[indexer]
--> 533 return type(self)(self.array, self._updated_key(indexer))

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:505, in LazilyIndexedArray._updated_key(self, new_key)
    503     full_key.append(k)
    504 else:
--> 505     full_key.append(_index_indexer_1d(k, next(iter_new_key), size))
    506 full_key = tuple(full_key)
    508 if all(isinstance(k, integer_types + (slice,)) for k in full_key):

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:278, in _index_indexer_1d(old_indexer, applied_indexer, size)
    276     indexer = slice_slice(old_indexer, applied_indexer, size)
    277 else:
--> 278     indexer = _expand_slice(old_indexer, size)[applied_indexer]
    279 else:
    280     indexer = old_indexer[applied_indexer]

IndexError: index 55 is out of bounds for axis 0 with size 53
```

reactions:

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7031/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
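As a hedged sketch (the local database filename `github.db` is an assumption, not something given on this page), the row selection described above — comments = 14, type = "issue", user = 35968931, sorted by updated_at descending — can be reproduced against this schema with a plain SQLite query:

```python
import sqlite3

# Reproduce the page's filter against a hypothetical local copy of the database.
conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT id, number, title, updated_at
    FROM issues
    WHERE comments = 14 AND type = 'issue' AND [user] = 35968931
    ORDER BY updated_at DESC
    """
).fetchall()
for row in rows:
    print(row)
```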