issues: 962467654

id: 962467654
node_id: MDU6SXNzdWU5NjI0Njc2NTQ=
number: 5677
title: sel slice fails with cftime index when using dask.distributed client
user: 6063709
state: closed
locked: 0
comments: 2
created_at: 2021-08-06T07:16:20Z
updated_at: 2021-08-09T06:30:26Z
closed_at: 2021-08-09T06:30:26Z
author_association: CONTRIBUTOR
assignee, milestone, active_lock_reason, draft, pull_request, performed_via_github_app: (empty)
body, reactions, state_reason, repo, type: see below

What happened: I tried to .sel() a time slice from a multi-file dataset while a dask.distributed client was active, and got this error:

```python
KeyError                                  Traceback (most recent call last)
/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3360 try:
-> 3361     return self._engine.get_loc(casted_key)
   3362 except KeyError as err:

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: cftime.datetime(2086, 1, 1, 0, 0, 0, 0, calendar='gregorian', has_year_zero=False)

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_slice_bound(self, label, side, kind)
   5801 try:
-> 5802     slc = self.get_loc(label)
   5803 except KeyError as err:

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in get_loc(self, key, method, tolerance)
    465 else:
--> 466     return pd.Index.get_loc(self, key, method=method, tolerance=tolerance)
    467

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3362 except KeyError as err:
-> 3363     raise KeyError(key) from err
   3364

KeyError: cftime.datetime(2086, 1, 1, 0, 0, 0, 0, calendar='gregorian', has_year_zero=False)

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
src/cftime/_cftime.pyx in cftime._cftime.datetime.richcmp()

src/cftime/_cftime.pyx in cftime._cftime.datetime.change_calendar()

ValueError: change_calendar only works for real-world calendars

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
/local/v45/aph502/tmp/ipykernel_108691/1049912036.py in <module>
----> 1 u.sel(time=slice(start_time,end_time))

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/dataarray.py in sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   1313 Dimensions without coordinates: points
   1314 """
-> 1315 ds = self._to_temp_dataset().sel(
   1316     indexers=indexers,
   1317     drop=drop,

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/dataset.py in sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   2472 """
   2473 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "sel")
-> 2474 pos_indexers, new_indexes = remap_label_indexers(
   2475     self, indexers=indexers, method=method, tolerance=tolerance
   2476 )

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/coordinates.py in remap_label_indexers(obj, indexers, method, tolerance, **indexers_kwargs)
    419 }
    420
--> 421 pos_indexers, new_indexes = indexing.remap_label_indexers(
    422     obj, v_indexers, method=method, tolerance=tolerance
    423 )

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/indexing.py in remap_label_indexers(data_obj, indexers, method, tolerance)
    115 for dim, index in indexes.items():
    116     labels = grouped_indexers[dim]
--> 117     idxr, new_idx = index.query(labels, method=method, tolerance=tolerance)
    118     pos_indexers[dim] = idxr
    119     if new_idx is not None:

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/indexes.py in query(self, labels, method, tolerance)
    196
    197 if isinstance(label, slice):
--> 198     indexer = _query_slice(index, label, coord_name, method, tolerance)
    199 elif is_dict_like(label):
    200     raise ValueError(

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/indexes.py in _query_slice(index, label, coord_name, method, tolerance)
     89     "cannot use method argument if any indexers are slice objects"
     90 )
---> 91 indexer = index.slice_indexer(
     92     _sanitize_slice_element(label.start),
     93     _sanitize_slice_element(label.stop),

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in slice_indexer(self, start, end, step, kind)
   5684 slice(1, 3, None)
   5685 """
-> 5686 start_slice, end_slice = self.slice_locs(start, end, step=step)
   5687
   5688 # return a slice

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in slice_locs(self, start, end, step, kind)
   5886 start_slice = None
   5887 if start is not None:
-> 5888     start_slice = self.get_slice_bound(start, "left")
   5889 if start_slice is None:
   5890     start_slice = 0

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_slice_bound(self, label, side, kind)
   5803 except KeyError as err:
   5804     try:
-> 5805         return self._searchsorted_monotonic(label, side)
   5806     except ValueError:
   5807         # raise the original KeyError

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in _searchsorted_monotonic(self, label, side)
   5754 def _searchsorted_monotonic(self, label, side: str_t = "left"):
   5755     if self.is_monotonic_increasing:
-> 5756         return self.searchsorted(label, side=side)
   5757     elif self.is_monotonic_decreasing:
   5758         # np.searchsorted expects ascending sort order, have to reverse

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/base.py in searchsorted(self, value, side, sorter)
   1219 @doc(_shared_docs["searchsorted"], klass="Index")
   1220 def searchsorted(self, value, side="left", sorter=None) -> np.ndarray:
-> 1221     return algorithms.searchsorted(self._values, value, side=side, sorter=sorter)
   1222
   1223 def drop_duplicates(self, keep="first"):

/g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/algorithms.py in searchsorted(arr, value, side, sorter)
   1583 arr = ensure_wrapped_if_datetimelike(arr)
   1584
-> 1585 return arr.searchsorted(value, side=side, sorter=sorter)
   1586
   1587

src/cftime/_cftime.pyx in cftime._cftime.datetime.richcmp()

TypeError: cannot compare cftime.datetime(2086, 5, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True) and cftime.datetime(2086, 1, 1, 0, 0, 0, 0, calendar='gregorian', has_year_zero=False)
```

So the slice indexing has created a bounding value with the wrong calendar: it should be noleap (365_day) but is gregorian, as the first KeyError shows: `KeyError: cftime.datetime(2086, 1, 1, 0, 0, 0, 0, calendar='gregorian', has_year_zero=False)`. Note that this only happens when a dask.distributed client is loaded.
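To make the failure mode concrete, here is a toy model of it using only the standard library. It is a sketch, not cftime's or pandas' implementation: `CalDate` and `slice_bounds` are hypothetical names, the dates are simplified to the 16th of each month, and the only behaviour borrowed from the traceback is that comparing dates from different calendars raises `TypeError`. A sorted-index search with a bound in the wrong calendar then fails on its first comparison.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class CalDate:
    """Hypothetical stand-in for a calendar-aware date (not cftime's API)."""
    year: int
    month: int
    day: int
    calendar: str

    def __lt__(self, other):
        # Mirror the behaviour seen in the traceback: dates from
        # different calendars cannot be ordered.
        if self.calendar != other.calendar:
            raise TypeError(f"cannot compare {self} and {other}")
        return (self.year, self.month, self.day) < (other.year, other.month, other.day)


def slice_bounds(index, start, stop):
    """Label-based slice on a sorted index via binary search (inclusive stop)."""
    lo = bisect.bisect_left(index, start)
    hi = bisect.bisect_right(index, stop)
    return slice(lo, hi)


# Monthly 'noleap' index, Oct 2085 to Dec 2086 (15 entries, simplified days).
index = [CalDate(2085, m, 16, "noleap") for m in (10, 11, 12)]
index += [CalDate(2086, m, 16, "noleap") for m in range(1, 13)]

# Bounds in the index's own calendar: the search succeeds.
good = slice_bounds(index, CalDate(2086, 1, 1, "noleap"), CalDate(2086, 12, 31, "noleap"))
print(good, len(index[good]))

# A start bound in the wrong calendar: the first comparison raises.
try:
    slice_bounds(index, CalDate(2086, 1, 1, "gregorian"), CalDate(2086, 12, 31, "noleap"))
except TypeError as err:
    print("TypeError:", err)
```

In this model the index itself is fine; the error comes entirely from how the bound was constructed, which matches the symptom reported here but says nothing about why the wrong calendar was chosen.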

What you expected to happen: I expected it to return the same slice, without error, that it returns when the client is not active.

Minimal Complete Verifiable Example: I tried really really hard to create a synthetic example but I couldn't make one that would fail, but loading the mfdataset from disk will make it fail reliably. I have tested multiple times.

The dataset:

xarray.DataArray
'u'
  • time: 15
  • st_ocean: 75
  • yu_ocean: 2700
  • xu_ocean: 3600
  • Data (dask array):

    |       | Array | Chunk |
    | -- | -- | -- |
    | Bytes | 40.74 GiB | 3.20 MiB |
    | Shape | (15, 75, 2700, 3600) | (1, 7, 300, 400) |
    | Count | 26735 Tasks | 13365 Chunks |
    | Type | float32 | numpy.ndarray |

  • Coordinates:
    • st_ocean
      (st_ocean)
      float64
      0.5413 1.681 ... 5.709e+03
    • time
      (time)
      object
      2085-10-16 12:00:00 ... 2086-12-...
      array([cftime.datetime(2085, 10, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2085, 11, 16, 0, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2085, 12, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2086, 1, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2086, 2, 15, 0, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2086, 3, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2086, 4, 16, 0, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2086, 5, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2086, 6, 16, 0, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2086, 7, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2086, 8, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2086, 9, 16, 0, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2086, 10, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2086, 11, 16, 0, 0, 0, 0, calendar='noleap', has_year_zero=True),
             cftime.datetime(2086, 12, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True)],
            dtype=object)
    • xu_ocean
      (xu_ocean)
      float64
      -279.9 -279.8 -279.7 ... 79.9 80.0
    • yu_ocean
      (yu_ocean)
      float64
      -81.09 -81.05 -81.0 ... 89.96 90.0
  • Attributes:
    • long_name : i-current
    • units : m/sec
    • valid_range : [-10. 10.]
    • cell_methods : time: mean
    • time_avg_info : average_T1,average_T2,average_DT
    • coordinates : geolon_c geolat_c
    • standard_name : sea_water_x_velocity
    • time_bounds : <xarray.DataArray 'time_bounds' (time: 15, nv: 2)> dask.array<concatenate, shape=(15, 2), dtype=timedelta64[ns], chunksize=(1, 2), chunktype=numpy.ndarray> Coordinates: * time (time) object 2085-10-16 12:00:00 ... 2086-12-16 12:00:00 * nv (nv) float64 1.0 2.0 Attributes: long_name: time axis boundaries calendar: NOLEAP

FWIW

```python
start_time = '2086-01-01'
end_time = '2086-12-31'
u.sel(time=slice(start_time, end_time))
```

Anything else we need to know?: I tried following the code execution through with pdb and it seems to start going wrong here

https://github.com/pydata/xarray/blob/eea76733770be03e78a0834803291659136bca31/xarray/core/indexing.py#L55

By line 63, data_obj.xindexes is already in a bad state:

https://github.com/pydata/xarray/blob/eea76733770be03e78a0834803291659136bca31/xarray/core/indexing.py#L63

    (Pdb) data_obj.xindexes
    *** TypeError: cannot compute the time difference between dates with different calendars

It is called here

https://github.com/pydata/xarray/blob/eea76733770be03e78a0834803291659136bca31/xarray/core/indexing.py#L106-L108

but it isn't obvious to me how that bad state is generated.
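One way to narrow this down might be to check whether the time coordinate itself mixes calendars before any slicing happens. The helper below is a hypothetical diagnostic, not part of xarray or cftime: `calendar_signatures` is an invented name, and it only assumes that each date object exposes `calendar` and `has_year_zero` attributes, as the reprs in the traceback suggest. The demonstration uses `SimpleNamespace` stand-ins so it runs without the dataset.

```python
from collections import Counter
from types import SimpleNamespace


def calendar_signatures(values):
    """Count (calendar, has_year_zero) pairs among date-like objects.

    Objects lacking either attribute are counted under None for that field.
    """
    return Counter(
        (getattr(v, "calendar", None), getattr(v, "has_year_zero", None))
        for v in values
    )


# Stand-ins for date objects (assumption: real ones expose the same attributes).
consistent = [SimpleNamespace(calendar="noleap", has_year_zero=True) for _ in range(15)]
mixed = consistent + [SimpleNamespace(calendar="gregorian", has_year_zero=False)]

print(calendar_signatures(consistent))  # a single key means one calendar throughout
print(calendar_signatures(mixed))       # more than one key means calendars are mixed
```

In the session described here, something like `calendar_signatures(u.time.values)` could be compared with and without the dask.distributed client loaded; a difference between the two runs would point at how the index is built rather than at the slice bounds.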

Environment:

Output of xr.show_versions():

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:39:48) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-326.el8.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_AU.utf8
LANG: en_US.UTF-8
LOCALE: ('en_AU', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.19.0
pandas: 1.3.1
numpy: 1.21.1
scipy: 1.7.0
netCDF4: 1.5.6
pydap: installed
h5netcdf: 0.11.0
h5py: 2.10.0
Nio: None
zarr: 2.8.3
cftime: 1.5.0
nc_time_axis: 1.3.1
PseudoNetCDF: None
rasterio: 1.2.6
cfgrib: 0.9.9.0
iris: 3.0.4
bottleneck: 1.3.2
dask: 2021.07.2
distributed: 2021.07.2
matplotlib: 3.4.2
cartopy: 0.19.0.post1
seaborn: 0.11.1
numbagg: None
pint: 0.17
setuptools: 52.0.0.post20210125
pip: 21.1.3
conda: 4.10.3
pytest: 6.2.4
IPython: 7.26.0
sphinx: 4.1.2

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5677/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
