issues: 963688125

  • id: 963688125
  • node_id: MDU6SXNzdWU5NjM2ODgxMjU=
  • number: 5686
  • title: xindexes set incorrectly for mfdataset with dask client and parallel=True
  • user: 6063709
  • state: closed
  • locked: 0
  • comments: 8
  • created_at: 2021-08-09T06:29:41Z
  • updated_at: 2021-08-09T23:44:10Z
  • closed_at: 2021-08-09T22:36:53Z
  • author_association: CONTRIBUTOR

What happened: Using open_mfdataset with parallel=True while a dask.distributed client is active fails to set .xindexes correctly.

What you expected to happen: The indexes should contain an index that can be printed correctly. Instead, calling repr on .xindexes fails with TypeError: cannot compute the time difference between dates with different calendars, which is raised from .asi8.

Minimal Complete Verifiable Example:

```python
import xarray as xr
import numpy as np
from dask.distributed import Client

# Need a main routine for dask.distributed if run as script
if __name__ == "__main__":

    client = Client(n_workers=1)

    # Create some synthetic data
    time_365_decade = xr.cftime_range(start="2100", periods=120, freq="1MS", calendar="noleap")

    ds = xr.Dataset(
        {"a": ("time", np.arange(time_365_decade.size))},
        coords={"time": time_365_decade},
    )

    index_microseconds = ds.xindexes['time'].array.asi8

    # Save to a file per year
    years, datasets = zip(*ds.groupby("time.year"))
    xr.save_mfdataset(datasets, [f"{y}.nc" for y in years])

    # Open saved files, parallel=False and asi8 ok
    assert (index_microseconds == xr.open_mfdataset('2???.nc', parallel=False).xindexes['time'].array.asi8).all()

    # Open saved files, parallel=True and asi8 fails
    assert (index_microseconds == xr.open_mfdataset('2???.nc', parallel=True).xindexes['time'].array.asi8).all()
```

Anything else we need to know?: The asi8 function fails at

https://github.com/pydata/xarray/blob/main/xarray/coding/cftimeindex.py#L677

because epoch = self.date_type(1970, 1, 1) returns a cftime.datetime whose calendar and has_year_zero attributes do not match the index:

(Pdb) p epoch
cftime.datetime(1970, 1, 1, 0, 0, 0, 0, calendar='gregorian', has_year_zero=False)
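
To make the failure mode concrete, here is a standard-library-only toy model of it. It is a sketch, not xarray's or cftime's code: ToyDate and toy_asi8 are hypothetical names, and the class only imitates the one behaviour described above, namely that subtracting two dates raises TypeError when their calendars differ. The "microseconds since 1970-01-01" arithmetic assumes a 365-day (noleap) calendar.

```python
from dataclasses import dataclass

MICROSECONDS_PER_DAY = 86_400 * 1_000_000
# Cumulative days before each month in a 365-day ("noleap") year.
DAYS_BEFORE_MONTH = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334)


@dataclass(frozen=True)
class ToyDate:
    """Hypothetical stand-in for a calendar-aware date (not cftime's class)."""

    year: int
    month: int
    day: int
    # The default mimics the mismatched epoch shown in the pdb output.
    calendar: str = "gregorian"

    def _ordinal(self):
        # Day count in a noleap calendar: every year has exactly 365 days.
        return self.year * 365 + DAYS_BEFORE_MONTH[self.month - 1] + self.day - 1

    def __sub__(self, other):
        # Refuse to mix calendars, like the TypeError in the report.
        if self.calendar != other.calendar:
            raise TypeError(
                "cannot compute the time difference between dates "
                "with different calendars"
            )
        if self.calendar != "noleap":
            raise NotImplementedError("this sketch only models 'noleap'")
        return self._ordinal() - other._ordinal()  # difference in days


def toy_asi8(dates, make_epoch):
    """Microseconds since 1970-01-01, with the epoch built by make_epoch."""
    epoch = make_epoch(1970, 1, 1)
    return [(d - epoch) * MICROSECONDS_PER_DAY for d in dates]


dates = [ToyDate(2100, m, 1, calendar="noleap") for m in (1, 2, 3)]

# Epoch built in the same calendar as the dates: subtraction works.
good = toy_asi8(dates, lambda y, m, d: ToyDate(y, m, d, calendar="noleap"))

# Epoch built with the default calendar: the subtraction is rejected.
try:
    toy_asi8(dates, ToyDate)
    error = None
except TypeError as exc:
    error = str(exc)
```

In this model the computation itself is sound; what matters is which constructor builds the epoch. The values are correct whenever the epoch shares the calendar of the dates, and the error appears as soon as the constructor falls back to a different default calendar.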

Previously reported this as https://github.com/pydata/xarray/issues/5677

Environment:

Output of xr.show_versions():

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:39:48) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-305.7.1.el8.nci.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_AU.utf8
LANG: en_AU.ISO8859-1
LOCALE: ('en_AU', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.19.0
pandas: 1.3.1
numpy: 1.21.1
scipy: 1.7.1
netCDF4: 1.5.6
pydap: installed
h5netcdf: 0.11.0
h5py: 2.10.0
Nio: None
zarr: 2.8.3
cftime: 1.5.0
nc_time_axis: 1.3.1
PseudoNetCDF: None
rasterio: 1.2.6
cfgrib: 0.9.9.0
iris: 3.0.4
bottleneck: 1.3.2
dask: 2021.07.2
distributed: 2021.07.2
matplotlib: 3.4.2
cartopy: 0.19.0.post1
seaborn: 0.11.1
numbagg: None
pint: 0.17
setuptools: 52.0.0.post20210125
pip: 21.1.3
conda: 4.10.3
pytest: 6.2.4
IPython: 7.26.0
sphinx: 4.1.2

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5686/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  • state_reason: completed
  • repo: 13221727
  • type: issue
