github: issues: 10 rows where user = 6063709 sorted by updated

10 rows where user = 6063709 sorted by updated_at descending

id	node_id	number	title	user	state	comments	created_at	updated_at	closed_at	author_association	draft	pull_request	body	reactions	state_reason	repo	type
677296128	MDU6SXNzdWU2NzcyOTYxMjg=	4336	cftime_range fails for base cftime.datetime object	aidanheerdegen 6063709	open	8	2020-08-12T00:56:18Z	2023-11-26T22:40:03Z		CONTRIBUTOR			What happened: `xarray.cftime_range` does not accept dates that use base class`cftime.datetime` objects. What you expected to happen: I expected `xarray.cftime_range` to raise an exception that this is an unsupported `cftime.datetime` type and for the documentation to reflect this. Minimal Complete Verifiable Example: `python import cftime import xarray date = cftime.datetime(10,1,1) xarray.cftime_range(date, periods=3, freq='Y')` Anything else we need to know?: Returns this error: ```python TypeError Traceback (most recent call last) <ipython-input-29-d090ea15e436> in <module> 2 import xarray 3 date = cftime.datetime(10,1,1) ----> 4 xarray.cftime_range(date, periods=3, freq='Y') /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.07/lib/python3.7/site-packages/xarray/coding/cftime_offsets.py in cftime_range(start, end, periods, freq, normalize, name, closed, calendar) 973 else: 974 offset = to_offset(freq) --> 975 dates = np.array(list(_generate_range(start, end, periods, offset))) 976 977 left_closed = False /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.07/lib/python3.7/site-packages/xarray/coding/cftime_offsets.py in _generate_range(start, end, periods, offset) 744 """ 745 if start: --> 746 start = offset.rollforward(start) 747 748 if end: /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.07/lib/python3.7/site-packages/xarray/coding/cftime_offsets.py in rollforward(self, date) 526 def rollforward(self, date): 527 """Roll date forward to nearest end of year""" --> 528 if self.onOffset(date): 529 return date 530 else: /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.07/lib/python3.7/site-packages/xarray/coding/cftime_offsets.py in onOffset(self, date) 522 """Check if the given date is in the set of possible dates created 523 using a length-one version of this offset 
class.""" --> 524 return date.day == _days_in_month(date) and date.month == self.month 525 526 def rollforward(self, date): /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.07/lib/python3.7/site-packages/xarray/coding/cftime_offsets.py in _days_in_month(date) 195 else: 196 reference = type(date)(date.year, date.month + 1, 1) --> 197 return (reference - timedelta(days=1)).day 198 199 TypeError: unsupported operand type(s) for -: 'cftime._cftime.datetime' and 'datetime.timedelta' Works if a `datetime` object with a calendar is used: import cftime import xarray date = cftime.DatetimeGregorian(10,1,1) xarray.cftime_range(date, periods=3, freq='Y') `Returns:`python CFTimeIndex([0010-12-31 00:00:00, 0011-12-31 00:00:00, 0012-12-31 00:00:00], dtype='object') ``` as expected. The error occurs here https://github.com/pydata/xarray/blob/master/xarray/coding/cftime_offsets.py#L197 because this operation is not defined for the base class https://github.com/Unidata/cftime/blob/master/cftime/_cftime.pyx#L1054 The relevant tests all seem to use datetime strings which are by default `standard` calendar: https://github.com/pydata/xarray/blob/master/xarray/coding/cftime_offsets.py#L788 Environment: Output of <tt>xr.show_versions()</tt> INSTALLED VERSIONS ------------------ commit: None python: 3.7.8 \| packaged by conda-forge \| (default, Jul 31 2020, 02:25:08) [GCC 7.5.0] python-bits: 64 OS: Linux OS-release: 2.6.32-754.18.2.el6.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_AU.utf8 LANG: en_AU.UTF-8 LOCALE: en_AU.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.4 xarray: 0.16.0 pandas: 1.1.0 numpy: 1.19.1 scipy: 1.5.2 netCDF4: 1.5.3 pydap: installed h5netcdf: 0.8.1 h5py: 2.10.0 Nio: 1.5.5 zarr: 2.4.0 cftime: 1.0.3.4 nc_time_axis: 1.2.0 PseudoNetCDF: None rasterio: 1.1.5 cfgrib: 0.9.8.4 iris: 2.4.0 bottleneck: 1.3.2 dask: 2.22.0 distributed: 2.22.0 matplotlib: 3.3.0 cartopy: 0.18.0 seaborn: 0.10.1 numbagg: None pint: 0.14 setuptools: 49.2.0.post20200712 pip: 20.1.1 
conda: installed pytest: 6.0.1 IPython: 7.17.0 sphinx: 3.2.0	{ "url": "https://api.github.com/repos/pydata/xarray/issues/4336/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
553930127	MDU6SXNzdWU1NTM5MzAxMjc=	3717	reduce on groupby auto-adds axis argument and complains when axis argument is specified	aidanheerdegen 6063709	open	3	2020-01-23T04:29:58Z	2022-04-06T15:38:59Z		CONTRIBUTOR			The behaviour of `reduce` appears to have changed in recent versions of `xarray` such that previous code that worked now throws errors. MCVE Code Sample I have repurposed someone else's nice code sample for this, thanks! ```python import pandas as pd import xarray as xr import numpy as np s_date = '1990-01-01' e_date = '2019-05-01' days = pd.date_range(start=s_date, end=e_date, freq='B', name='time') items = pd.Index([str(i) for i in range(300)], name = 'item') dat = xr.DataArray(np.random.rand(len(days), len(items)), coords=[days, items]) print(dat) def simplesum(array, axis): print(axis) return np.sum(array, axis) dat.groupby('time.month').reduce(simplesum) dat.groupby('time.month').reduce(simplesum, axis=0) ``` The `reduce` appears to insert an `axis` argument if none is specified. This is the output of the first `groupby` operations with no axis argument: python 0 0 0 0 0 0 0 0 0 0 0 0 Out[41]: <xarray.DataArray (month: 12, item: 300)> array([[330.18949303, 336.97901528, 337.80472647, ..., 322.37053342, 326.84789948, 342.22782336], [300.3301059 , 307.79967902, 322.53148357, ..., 310.20975273, 291.04344738, 310.56010997], [325.71587689, 337.25153307, 331.35493521, ..., 332.43547569, 328.23330226, 326.43909063], ..., [322.96255713, 321.44723754, 312.59983716, ..., 318.79682437, 315.81592617, 314.27316547], [294.29894222, 291.77253983, 310.85452639, ..., 314.0461447 , 298.99012623, 326.08321702], [323.6778518 , 332.71638634, 324.47244831, ..., 326.82774826, 322.09233181, 327.6385762 ]]) Coordinates: * item (item) object '0' '1' '2' '3' '4' ... 
'295' '296' '297' '298' '299' * month (month) int64 1 2 3 4 5 6 7 8 9 10 11 12 The second `groupby` with `axis=0` argument throws an error: ```python ValueError Traceback (most recent call last) <ipython-input-42-381dec6862e6> in <module> ----> 1 dat.groupby('time.month').reduce(simplesum, axis=0) /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/groupby.py in reduce(self, func, dim, axis, keep_attrs, shortcut, kwargs) 836 check_reduce_dims(dim, self.dims) 837 --> 838 return self.map(reduce_array, shortcut=shortcut) 839 840 /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/groupby.py in map(self, func, shortcut, args, kwargs) 755 grouped = self._iter_grouped() 756 applied = (maybe_wrap_array(arr, func(arr, args,* kwargs)) for arr in grouped) --> 757 return self._combine(applied, shortcut=shortcut) 758 759 def apply(self, func, shortcut=False, args=(), *kwargs): /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/groupby.py in _combine(self, applied, restore_coord_dims, shortcut) 774 def _combine(self, applied, restore_coord_dims=False, shortcut=False): 775 """Recombine the applied objects like the original.""" --> 776 applied_example, applied = peek_at(applied) 777 coord, dim, positions = self._infer_concat_args(applied_example) 778 if shortcut: /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/utils.py in peek_at(iterable) 180 """ 181 gen = iter(iterable) --> 182 peek = next(gen) 183 return peek, itertools.chain([peek], gen) 184 /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/groupby.py in <genexpr>(.0) 754 else: 755 grouped = self._iter_grouped() --> 756 applied = (maybe_wrap_array(arr, func(arr, args,* kwargs)) for arr in grouped) 757 return self._combine(applied, shortcut=shortcut) 758 
/g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/groupby.py in reduce_array(ar) 832 833 def reduce_array(ar): --> 834 return ar.reduce(func, dim, axis, keep_attrs=keep_attrs, kwargs) 835 836 check_reduce_dims(dim, self.dims) /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/variable.py in reduce(self, func, dim, axis, keep_attrs, keepdims, allow_lazy, *kwargs) 1511 dim = None 1512 if dim is not None and axis is not None: -> 1513 raise ValueError("cannot supply both 'axis' and 'dim' arguments") 1514 1515 if dim is not None: ValueError: cannot supply both 'axis' and 'dim' arguments ``` Expected Output I would expect the output of both `groupby` operations to be the same, though `reduce` says it should flatten the input if there is no `dim` or `axis` argument supplied, it doesn't seem to do this. The second `groupby`, with `axis=0` argument works with older versions of `xarray`(0.13.0). Problem Description It is impossible to specify a `dim` argument to `reduce`. It defaults to `axis=0` and when a different axis is specified it throws an error. 
Output of `xr.show_versions()` Version used and produces error: INSTALLED VERSIONS ------------------ commit: None python: 3.7.6 \| packaged by conda-forge \| (default, Jan 7 2020, 22:33:48) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 4.18.0-80.11.2.el8_0.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_AU.utf8 LANG: en_AU.ISO8859-1 LOCALE: en_AU.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.1 xarray: 0.14.1 pandas: 0.25.3 numpy: 1.17.5 scipy: 1.4.1 netCDF4: 1.5.3 pydap: installed h5netcdf: 0.7.4 h5py: 2.10.0 Nio: 1.5.5 zarr: 2.4.0 cftime: 1.0.3.4 nc_time_axis: 1.2.0 PseudoNetCDF: None rasterio: 1.1.1 cfgrib: 0.9.7.6 iris: 2.3.0 bottleneck: 1.3.1 dask: 2.9.2 distributed: 2.9.3 matplotlib: 2.2.4 cartopy: 0.17.0 seaborn: 0.9.0 numbagg: None setuptools: 45.0.0.post20200113 pip: 19.3.1 conda: None pytest: 5.3.4 IPython: 7.11.1 sphinx: None None The version of `xarray` does not throw an error when `axis` argument is supplied: INSTALLED VERSIONS ------------------ commit: None python: 3.6.7 \| packaged by conda-forge \| (default, Jul 2 2019, 02:18:42) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 4.18.0-80.11.2.el8_0.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_AU.utf8 LANG: en_AU.ISO8859-1 LOCALE: en_AU.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.2 xarray: 0.13.0 pandas: 0.25.1 numpy: 1.17.2 scipy: 1.2.1 netCDF4: 1.5.1.2 pydap: installed h5netcdf: 0.7.4 h5py: 2.9.0 Nio: 1.5.5 zarr: 2.3.2 cftime: 1.0.3.4 nc_time_axis: 1.2.0 PseudoNetCDF: None rasterio: None cfgrib: 0.9.7.2 iris: 2.2.1dev0 bottleneck: 1.2.1 dask: 2.4.0 distributed: 2.4.0 matplotlib: 2.2.4 cartopy: 0.17.0 seaborn: 0.9.0 numbagg: None setuptools: 41.2.0 pip: 19.2.3 conda: None pytest: 5.1.2 IPython: 7.8.0 sphinx: None None	{ "url": "https://api.github.com/repos/pydata/xarray/issues/3717/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1120276279	I_kwDOAMm_X85Cxg83	6226	open_mfdataset fails with cftime index when using parallel and dask delayed client	aidanheerdegen 6063709	closed	6	2022-02-01T06:14:07Z	2022-02-10T22:37:37Z	2022-02-10T22:37:37Z	CONTRIBUTOR			What happened? A call to `open_mfdataset` with `parallel=True` fails when using a dask delayed client with newer versions of `cftime` and `xarray`. This happens with `cftime==1.5.2` and `xarray==0.20.2` but not `cftime==1.5.1` and `xarray==0.20.2`. What did you expect to happen? I expected the call to `open_mfdataset` to work without error with `parallel=True` as it does with `parallel=False` and a previous version of `cftime`. Minimal Complete Verifiable Example ```python import xarray as xr import numpy as np from dask.distributed import Client # Need a main routine for dask.distributed if run as script if __name__ == "__main__": client = Client(n_workers=1) t = xr.cftime_range('20010101','20010501', closed='left', calendar='noleap') x = np.arange(100) v = np.random.random((t.size,x.size)) da = xr.DataArray(v, coords=[('time',t), ('x',x)]) da.to_netcdf('sample.nc') # Works xr.open_mfdataset('sample.nc', parallel=False) # Throws TypeError exception xr.open_mfdataset('sample.nc', parallel=True) ``` Relevant log output python distributed.protocol.core - CRITICAL - Failed to deserialize [32/525] Traceback (most recent call last): File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/protocol/core.py", line 111, in loads return msgpack.loads( File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/protocol/core.py", line 103, in _decode_default return merge_and_deserialize( File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/protocol/serialize.py", line 488, in merge_and_deserialize return deserialize(header, merged_frames, 
deserializers=deserializers) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/protocol/serialize.py", line 417, in deserialize return loads(header, frames) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/protocol/serialize.py", line 96, in pickle_loads return pickle.loads(x, buffers=new) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/protocol/pickle.py", line 75, in loads return pickle.loads(x) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 255, in _new_Index return cls.__new__(cls, d) TypeError: __new__() got an unexpected keyword argument 'dtype' Traceback (most recent call last): File "/g/data/v45/aph502/notebooks/test_pickle.py", line 21, in <module> xr.open_mfdataset('sample.nc', parallel=True) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/xarray/backends/api.py", line 916, in open_mfdataset datasets, closers = dask.compute(datasets, closers) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/dask/base.py", line 571, in compute results = schedule(dsk, keys, kwargs) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/client.py", line 2746, in get results = self.gather(packed, asynchronous=asynchronous, direct=direct) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/client.py", line 1946, in gather return self.sync( File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/utils.py", line 310, in sync return sync( File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/utils.py", line 364, in sync raise exc.with_traceback(tb) File 
"/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/utils.py", line 349, in f result[0] = yield future File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/tornado/gen.py", line 762, in run value = future.result() File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/client.py", line 1840, in _gather response = await future File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/client.py", line 1891, in _gather_remote response = await retry_operation(self.scheduler.gather, keys=keys) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/utils_comm.py", line 385, in retry_operation return await retry( File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/utils_comm.py", line 370, in retry return await coro() File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/core.py", line 900, in send_recv_from_rpc return await send_recv(comm=comm, op=key, kwargs) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/core.py", line 669, in send_recv response = await comm.read(deserializers=deserializers) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/comm/tcp.py", line 232, in read msg = await from_frames( File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/comm/utils.py", line 78, in from_frames res = _from_frames() File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/comm/utils.py", line 61, in _from_frames return protocol.loads( File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/protocol/core.py", 
line 111, in loads return msgpack.loads( File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/protocol/core.py", line 103, in _decode_default return merge_and_deserialize( File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/protocol/serialize.py", line 488, in merge_and_deserialize return deserialize(header, merged_frames, deserializers=deserializers) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/protocol/serialize.py", line 417, in deserialize return loads(header, frames) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/protocol/serialize.py", line 96, in pickle_loads return pickle.loads(x) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/distributed/protocol/pickle.py", line 75, in loads return pickle.loads(x) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-22.01/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 255, in _new_Index return cls.__new__(cls, d) TypeError: __new__() got an unexpected keyword argument 'dtype' Anything else we need to know? It seems similar to previous issues with pickling https://github.com/pydata/xarray/issues/5686 which was fixed in `cftime` https://github.com/Unidata/cftime/pull/252 but the tests in previous issues still work, so it isn't exactly the same. 
Environment ``` INSTALLED VERSIONS commit: None python: 3.9.9 \| packaged by conda-forge \| (main, Dec 20 2021, 02:41:03) [GCC 9.4.0] python-bits: 64 OS: Linux OS-release: 4.18.0-348.2.1.el8.nci.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_AU.utf8 LANG: en_AU.ISO8859-1 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.10.6 libnetcdf: 4.7.4 xarray: 0.20.2 pandas: 1.4.0 numpy: 1.22.1 scipy: 1.7.3 netCDF4: 1.5.6 pydap: installed h5netcdf: 0.13.1 h5py: 3.6.0 Nio: None zarr: 2.10.3 cftime: 1.5.2 nc_time_axis: 1.4.0 PseudoNetCDF: None rasterio: 1.2.6 cfgrib: 0.9.9.1 iris: 3.1.0 bottleneck: 1.3.2 dask: 2022.01.0 distributed: 2022.01.0 matplotlib: 3.5.1 cartopy: 0.19.0.post1 seaborn: 0.11.2 numbagg: None fsspec: 2022.01.0 cupy: 10.1.0 pint: 0.18 sparse: 0.13.0 setuptools: 59.8.0 pip: 21.3.1 conda: 4.11.0 pytest: 6.2.5 IPython: 8.0.1 sphinx: 4.4.0 ```	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6226/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
963688125	MDU6SXNzdWU5NjM2ODgxMjU=	5686	xindexes set incorrectly for mfdataset with dask client and parallel=True	aidanheerdegen 6063709	closed	8	2021-08-09T06:29:41Z	2021-08-09T23:44:10Z	2021-08-09T22:36:53Z	CONTRIBUTOR			What happened: Using `open_mfdataset` with `parallel=True` while a `dask.distributed` client is active fails to set `.xindexes` correctly. What you expected to happen: The `indexes` should contain an index that can be printed correctly. When using `repr` the `.xindexes` fails with `TypeError: cannot compute the time difference between dates with different calendars` due to an error in `.asi8`. Minimal Complete Verifiable Example: ```python import xarray as xr import numpy as np from dask.distributed import Client # Need a main routine for dask.distributed if run as script if __name__ == "__main__": client = Client(n_workers=1) # Create some synthetic data time_365_decade = xr.cftime_range(start="2100", periods=120, freq="1MS", calendar="noleap") ds = xr.Dataset( {"a": ("time", np.arange(time_365_decade.size))}, coords={"time": time_365_decade}, ) index_microseconds = ds.xindexes['time'].array.asi8 # Save to a file per year years, datasets = zip(*ds.groupby("time.year")) xr.save_mfdataset(datasets, [f"{y}.nc" for y in years]) # Open saved files, parallel=False and asi8 ok assert (index_microseconds == xr.open_mfdataset('2???.nc', parallel=False).xindexes['time'].array.asi8).all() # Open saved files, parallel=True and asi8 fails assert (index_microseconds == xr.open_mfdataset('2???.nc', parallel=True).xindexes['time'].array.asi8).all() ``` Anything else we need to know?: The `asi8` function fails at https://github.com/pydata/xarray/blob/main/xarray/coding/cftimeindex.py#L677 because `epoch = self.date_type(1970, 1, 1)` returns a `cftime.datetime` with a calendar and `has_year_zero` attribute that do not match the index: `(Pdb) p epoch cftime.datetime(1970, 1, 1, 0, 0, 0, 0, calendar='gregorian', has_year_zero=False)` Previously reported this as 
https://github.com/pydata/xarray/issues/5677 Environment: Output of <tt>xr.show_versions()</tt> INSTALLED VERSIONS ------------------ commit: None python: 3.9.6 \| packaged by conda-forge \| (default, Jul 11 2021, 03:39:48) [GCC 9.3.0] python-bits: 64 OS: Linux OS-release: 4.18.0-305.7.1.el8.nci.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_AU.utf8 LANG: en_AU.ISO8859-1 LOCALE: ('en_AU', 'UTF-8') libhdf5: 1.10.6 libnetcdf: 4.7.4 xarray: 0.19.0 pandas: 1.3.1 numpy: 1.21.1 scipy: 1.7.1 netCDF4: 1.5.6 pydap: installed h5netcdf: 0.11.0 h5py: 2.10.0 Nio: None zarr: 2.8.3 cftime: 1.5.0 nc_time_axis: 1.3.1 PseudoNetCDF: None rasterio: 1.2.6 cfgrib: 0.9.9.0 iris: 3.0.4 bottleneck: 1.3.2 dask: 2021.07.2 distributed: 2021.07.2 matplotlib: 3.4.2 cartopy: 0.19.0.post1 seaborn: 0.11.1 numbagg: None pint: 0.17 setuptools: 52.0.0.post20210125 pip: 21.1.3 conda: 4.10.3 pytest: 6.2.4 IPython: 7.26.0 sphinx: 4.1.2	{ "url": "https://api.github.com/repos/pydata/xarray/issues/5686/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
962467654	MDU6SXNzdWU5NjI0Njc2NTQ=	5677	sel slice fails with cftime index when using dask.distributed client	aidanheerdegen 6063709	closed	2	2021-08-06T07:16:20Z	2021-08-09T06:30:26Z	2021-08-09T06:30:26Z	CONTRIBUTOR			What happened: Tried to `.sel()` a time slice from a multi-file dataset when `dask.distributed` client active. Got this error: ```python KeyError Traceback (most recent call last) /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance) 3360 try: -> 3361 return self._engine.get_loc(casted_key) 3362 except KeyError as err: /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc() /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc() pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item() pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item() KeyError: cftime.datetime(2086, 1, 1, 0, 0, 0, 0, calendar='gregorian', has_year_zero=False) The above exception was the direct cause of the following exception: KeyError Traceback (most recent call last) /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_slice_bound(self, label, side, kind) 5801 try: -> 5802 slc = self.get_loc(label) 5803 except KeyError as err: /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in get_loc(self, key, method, tolerance) 465 else: --> 466 return pd.Index.get_loc(self, key, method=method, tolerance=tolerance) 467 /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance) 3362 except KeyError as err: -> 
3363 raise KeyError(key) from err 3364 KeyError: cftime.datetime(2086, 1, 1, 0, 0, 0, 0, calendar='gregorian', has_year_zero=False) During handling of the above exception, another exception occurred: ValueError Traceback (most recent call last) src/cftime/_cftime.pyx in cftime._cftime.datetime.richcmp() src/cftime/_cftime.pyx in cftime._cftime.datetime.change_calendar() ValueError: change_calendar only works for real-world calendars During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) /local/v45/aph502/tmp/ipykernel_108691/1049912036.py in <module> ----> 1 u.sel(time=slice(start_time,end_time)) /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/dataarray.py in sel(self, indexers, method, tolerance, drop, indexers_kwargs) 1313 Dimensions without coordinates: points 1314 """ -> 1315 ds = self._to_temp_dataset().sel( 1316 indexers=indexers, 1317 drop=drop, /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/dataset.py in sel(self, indexers, method, tolerance, drop, indexers_kwargs) 2472 """ 2473 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "sel") -> 2474 pos_indexers, new_indexes = remap_label_indexers( 2475 self, indexers=indexers, method=method, tolerance=tolerance 2476 ) /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/coordinates.py in remap_label_indexers(obj, indexers, method, tolerance, indexers_kwargs) 419 } 420 --> 421 pos_indexers, new_indexes = indexing.remap_label_indexers( 422 obj, v_indexers, method=method, tolerance=tolerance 423 ) /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/indexing.py in remap_label_indexers(data_obj, indexers, method, tolerance) 115 for dim, index in indexes.items(): 116 labels = grouped_indexers[dim] --> 117 idxr, new_idx = index.query(labels, method=method, 
tolerance=tolerance) 118 pos_indexers[dim] = idxr 119 if new_idx is not None: /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/indexes.py in query(self, labels, method, tolerance) 196 197 if isinstance(label, slice): --> 198 indexer = _query_slice(index, label, coord_name, method, tolerance) 199 elif is_dict_like(label): 200 raise ValueError( /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/indexes.py in _query_slice(index, label, coord_name, method, tolerance) 89 "cannot use `method` argument if any indexers are slice objects" 90 ) ---> 91 indexer = index.slice_indexer( 92 _sanitize_slice_element(label.start), 93 _sanitize_slice_element(label.stop), /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in slice_indexer(self, start, end, step, kind) 5684 slice(1, 3, None) 5685 """ -> 5686 start_slice, end_slice = self.slice_locs(start, end, step=step) 5687 5688 # return a slice /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in slice_locs(self, start, end, step, kind) 5886 start_slice = None 5887 if start is not None: -> 5888 start_slice = self.get_slice_bound(start, "left") 5889 if start_slice is None: 5890 start_slice = 0 /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_slice_bound(self, label, side, kind) 5803 except KeyError as err: 5804 try: -> 5805 return self._searchsorted_monotonic(label, side) 5806 except ValueError: 5807 # raise the original KeyError /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/indexes/base.py in _searchsorted_monotonic(self, label, side) 5754 def _searchsorted_monotonic(self, label, side: str_t = "left"): 5755 if self.is_monotonic_increasing: -> 5756 return self.searchsorted(label, side=side) 5757 elif 
self.is_monotonic_decreasing: 5758 # np.searchsorted expects ascending sort order, have to reverse /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/base.py in searchsorted(self, value, side, sorter) 1219 @doc(_shared_docs["searchsorted"], klass="Index") 1220 def searchsorted(self, value, side="left", sorter=None) -> np.ndarray: -> 1221 return algorithms.searchsorted(self._values, value, side=side, sorter=sorter) 1222 1223 def drop_duplicates(self, keep="first"): /g/data/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/pandas/core/algorithms.py in searchsorted(arr, value, side, sorter) 1583 arr = ensure_wrapped_if_datetimelike(arr) 1584 -> 1585 return arr.searchsorted(value, side=side, sorter=sorter) 1586 1587 src/cftime/_cftime.pyx in cftime._cftime.datetime.richcmp() TypeError: cannot compare cftime.datetime(2086, 5, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True) and cftime.datetime(2086, 1, 1, 0, 0, 0, 0, calendar='gregorian', has_year_zero=False) ``` So the slice indexing has created a bounding value with the wrong calendar, should be `365_year` but is `gregorian`. `python KeyError: cftime.datetime(2086, 1, 1, 0, 0, 0, 0, calendar='gregorian', has_year_zero=False)` Note that this only happens when a `dask.distributed` client is loaded What you expected to happen: expected it to return the same slice it does without error if the client is not active. Minimal Complete Verifiable Example*: I tried really really hard to create a synthetic example but I couldn't make one that would fail, but loading the `mfdataset` from disk will make it fail reliably. I have tested multiple times. 
The dataset: xarray.DataArray 'u' (time: 15, st_ocean: 75, yu_ocean: 2700, xu_ocean: 3600). Array: 40.74 GiB, shape (15, 75, 2700, 3600), 26735 tasks, type float32. Chunk: 3.20 MiB, shape (1, 7, 300, 400), 13365 chunks, type numpy.ndarray. Coordinates: st_ocean (st_ocean) float64 0.5413 1.681 ... 5.709e+03 time (time) object 2085-10-16 12:00:00 ... 2086-12-... 
<label for="attrs-5c3c11ea-3616-4e6c-8da5-d90a3de74cc8" title="Show/Hide attributes" style="box-sizing: unset; background-color: var(--xr-background-color-row-even); margin-bottom: 0px; color: var(--xr-font-color2); cursor: pointer;"><svg class="icon xr-icon-file-text2"><use xlink:href="#icon-file-text2"></use></svg></label><label for="data-c74ab087-7010-4076-9e77-fe8556853756" title="Show/Hide data repr" style="box-sizing: unset; background-color: var(--xr-background-color-row-even); margin-bottom: 0px; color: var(--xr-font-color2); cursor: pointer;"><svg class="icon xr-icon-database"><use xlink:href="#icon-database"></use></svg></label> array([cftime.datetime(2085, 10, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2085, 11, 16, 0, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2085, 12, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2086, 1, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2086, 2, 15, 0, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2086, 3, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2086, 4, 16, 0, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2086, 5, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2086, 6, 16, 0, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2086, 7, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2086, 8, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2086, 9, 16, 0, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2086, 10, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2086, 11, 16, 0, 0, 0, 0, calendar='noleap', has_year_zero=True), cftime.datetime(2086, 12, 16, 12, 0, 0, 0, calendar='noleap', has_year_zero=True)], dtype=object) xu_ocean (xu_ocean) float64 -279.9 -279.8 -279.7 ... 
79.9 80.0 <label for="attrs-deb0e0ca-d92a-4695-8544-a9985caa3df3" title="Show/Hide attributes" style="box-sizing: unset; background-color: var(--xr-background-color-row-odd); margin-bottom: 0px; color: var(--xr-font-color2); cursor: pointer;"><svg class="icon xr-icon-file-text2"><use xlink:href="#icon-file-text2"></use></svg></label><label for="data-aafd5159-4edd-4505-a77a-687ba340da33" title="Show/Hide data repr" style="box-sizing: unset; background-color: var(--xr-background-color-row-odd); margin-bottom: 0px; color: var(--xr-font-color2); cursor: pointer;"><svg class="icon xr-icon-database"><use xlink:href="#icon-database"></use></svg></label> yu_ocean (yu_ocean) float64 -81.09 -81.05 -81.0 ... 89.96 90.0 <label for="attrs-0cea6a87-ca0c-47ab-a25c-5784ea14a5ba" title="Show/Hide attributes" style="box-sizing: unset; background-color: var(--xr-background-color-row-even); margin-bottom: 0px; color: var(--xr-font-color2); cursor: pointer;"><svg class="icon xr-icon-file-text2"><use xlink:href="#icon-file-text2"></use></svg></label><label for="data-282162ad-9547-401b-976f-a22fa5efeae9" title="Show/Hide data repr" style="box-sizing: unset; background-color: var(--xr-background-color-row-even); margin-bottom: 0px; color: var(--xr-font-color2); cursor: pointer;"><svg class="icon xr-icon-database"><use xlink:href="#icon-database"></use></svg></label> <label for="section-c71a9525-5800-445c-b401-78088cfc4247" class="xr-section-summary" style="box-sizing: unset; grid-column-start: 1; grid-column-end: auto; color: var(--xr-font-color2); font-weight: 500; padding-top: 4px; padding-bottom: 4px; cursor: pointer;">Attributes: <dl class="xr-attrs" style="box-sizing: unset; padding: 0px; grid-column-start: 1; grid-column-end: -1; display: grid; width: 700px; overflow: hidden; margin: 0px; grid-template-columns: 125px auto;"><dt style="box-sizing: unset; display: block; font-weight: normal; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; white-space: nowrap; 
overflow: hidden; text-overflow: ellipsis; grid-column-start: 1; grid-column-end: auto;">long_name :</dt><dd style="box-sizing: unset; display: block; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; grid-column-start: 2; grid-column-end: auto; white-space: pre-wrap; word-break: break-all;">i-current</dd><dt style="box-sizing: unset; display: block; font-weight: normal; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; grid-column-start: 1; grid-column-end: auto;">units :</dt><dd style="box-sizing: unset; display: block; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; grid-column-start: 2; grid-column-end: auto; white-space: pre-wrap; word-break: break-all;">m/sec</dd><dt style="box-sizing: unset; display: block; font-weight: normal; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; grid-column-start: 1; grid-column-end: auto;">valid_range :</dt><dd style="box-sizing: unset; display: block; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; grid-column-start: 2; grid-column-end: auto; white-space: pre-wrap; word-break: break-all;">[-10. 
10.]</dd><dt style="box-sizing: unset; display: block; font-weight: normal; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; grid-column-start: 1; grid-column-end: auto;">cell_methods :</dt><dd style="box-sizing: unset; display: block; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; grid-column-start: 2; grid-column-end: auto; white-space: pre-wrap; word-break: break-all;">time: mean</dd><dt style="box-sizing: unset; display: block; font-weight: normal; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; grid-column-start: 1; grid-column-end: auto;">time_avg_info :</dt><dd style="box-sizing: unset; display: block; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; grid-column-start: 2; grid-column-end: auto; white-space: pre-wrap; word-break: break-all;">average_T1,average_T2,average_DT</dd><dt style="box-sizing: unset; display: block; font-weight: normal; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; grid-column-start: 1; grid-column-end: auto;">coordinates :</dt><dd style="box-sizing: unset; display: block; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; grid-column-start: 2; grid-column-end: auto; white-space: pre-wrap; word-break: break-all;">geolon_c geolat_c</dd><dt style="box-sizing: unset; display: block; font-weight: normal; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; grid-column-start: 1; grid-column-end: auto;">standard_name :</dt><dd style="box-sizing: unset; display: block; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; grid-column-start: 2; grid-column-end: auto; white-space: pre-wrap; word-break: break-all;">sea_water_x_velocity</dd><dt style="box-sizing: 
unset; display: block; font-weight: normal; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; grid-column-start: 1; grid-column-end: auto;">time_bounds :</dt><dd style="box-sizing: unset; display: block; float: left; width: auto; padding: 0px 10px 0px 0px; margin: 0px; grid-column-start: 2; grid-column-end: auto; white-space: pre-wrap; word-break: break-all;"><xarray.DataArray 'time_bounds' (time: 15, nv: 2)> dask.array<concatenate, shape=(15, 2), dtype=timedelta64[ns], chunksize=(1, 2), chunktype=numpy.ndarray> Coordinates: time (time) object 2085-10-16 12:00:00 ... 2086-12-16 12:00:00 * nv (nv) float64 1.0 2.0 Attributes: long_name: time axis boundaries calendar: NOLEAP</dd></dl> </label> </label> ```python FWIW start_time = '2086-01-01' end_time = '2086-12-31' u.sel(time=slice(start_time,end_time)) ``` Anything else we need to know?: I tried following the code execution through with `pdb` and it seems to start going wrong here https://github.com/pydata/xarray/blob/eea76733770be03e78a0834803291659136bca31/xarray/core/indexing.py#L55 by line 63 `data_obj.xindexes` is already in a bad state https://github.com/pydata/xarray/blob/eea76733770be03e78a0834803291659136bca31/xarray/core/indexing.py#L63 `python (Pdb) data_obj.xindexes * TypeError: cannot compute the time difference between dates with different calendars` It is called here https://github.com/pydata/xarray/blob/eea76733770be03e78a0834803291659136bca31/xarray/core/indexing.py#L106-L108 but it isn't obvious to me how that bad state is generated. 
Environment: Output of `xr.show_versions()` INSTALLED VERSIONS ------------------ commit: None python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:39:48) [GCC 9.3.0] python-bits: 64 OS: Linux OS-release: 4.18.0-326.el8.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_AU.utf8 LANG: en_US.UTF-8 LOCALE: ('en_AU', 'UTF-8') libhdf5: 1.10.6 libnetcdf: 4.7.4 xarray: 0.19.0 pandas: 1.3.1 numpy: 1.21.1 scipy: 1.7.0 netCDF4: 1.5.6 pydap: installed h5netcdf: 0.11.0 h5py: 2.10.0 Nio: None zarr: 2.8.3 cftime: 1.5.0 nc_time_axis: 1.3.1 PseudoNetCDF: None rasterio: 1.2.6 cfgrib: 0.9.9.0 iris: 3.0.4 bottleneck: 1.3.2 dask: 2021.07.2 distributed: 2021.07.2 matplotlib: 3.4.2 cartopy: 0.19.0.post1 seaborn: 0.11.1 numbagg: None pint: 0.17 setuptools: 52.0.0.post20210125 pip: 21.1.3 conda: 4.10.3 pytest: 6.2.4 IPython: 7.26.0 sphinx: 4.1.2	{ "url": "https://api.github.com/repos/pydata/xarray/issues/5677/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
677307460	MDU6SXNzdWU2NzczMDc0NjA=	4337	cftime_range does not support default cftime.datetime formatted output strings	aidanheerdegen 6063709	closed	5	2020-08-12T01:28:30Z	2020-08-17T23:27:07Z	2020-08-17T23:27:07Z	CONTRIBUTOR			Is your feature request related to a problem? Please describe. The `xarray.cftime_range` does not support datetime strings that are the default output from `cftime.datetime.strftime()` which are the format which `cftime_range` itself uses internally. `python import cftime import xarray date = cftime.datetime(10,1,1).strftime() print(date) xarray.cftime_range(date, periods=3, freq='Y')` outputs ``` 10-01-01 00:00:00 ValueError Traceback (most recent call last) <ipython-input-70-a16c1fcab8d6> in <module> 3 date = cftime.datetime(10,1,1).strftime() 4 print(date) ----> 5 xarray.cftime_range(date, periods=3, freq='Y') /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.07/lib/python3.7/site-packages/xarray/coding/cftime_offsets.py in cftime_range(start, end, periods, freq, normalize, name, closed, calendar) 963 964 if start is not None: --> 965 start = to_cftime_datetime(start, calendar) 966 start = _maybe_normalize_date(start, normalize) 967 if end is not None: /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.07/lib/python3.7/site-packages/xarray/coding/cftime_offsets.py in to_cftime_datetime(date_str_or_date, calendar) 683 "a calendar type must be provided" 684 ) --> 685 date, _ = _parse_iso8601_with_reso(get_date_type(calendar), date_str_or_date) 686 return date 687 elif isinstance(date_str_or_date, cftime.datetime): /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.07/lib/python3.7/site-packages/xarray/coding/cftimeindex.py in _parse_iso8601_with_reso(date_type, timestr) 101 102 default = date_type(1, 1, 1) --> 103 result = parse_iso8601(timestr) 104 replace = {} 105 /g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.07/lib/python3.7/site-packages/xarray/coding/cftimeindex.py in parse_iso8601(datetime_string) 94 if match: 95 
return match.groupdict() ---> 96 raise ValueError("no ISO-8601 match for string: %s" % datetime_string) 97 98 ValueError: no ISO-8601 match for string: 10-01-01 00:00:00 ``` Describe the solution you'd like It would be good if `xarray.cftime_range` supported the default `strftime` format output from cftime.datetime objects. It is confusing that it uses this format with `repr` but explicitly does not support it. Describe alternatives you've considered Specifying an ISO-8601 compatible format (using `T` separator) isn't general as it doesn't work for years < 1000 because the year field is not zero padded. `python import cftime import xarray date = cftime.datetime(10,1,1).strftime('%Y-%m-%dT%H:%M:%S') print('|{}|'.format(date)) xarray.cftime_range(date, periods=3, freq='Y')` produces `| 10-01-01T00:00:00|` and the error as above. A work-around is to zero-pad manually `python import cftime import xarray date = '{:0>19}'.format(cftime.datetime(10,1,1).strftime('%Y-%m-%dT%H:%M:%S').lstrip()) print(date) xarray.cftime_range(date, periods=3, freq='Y')` produces `0010-01-01T00:00:00 CFTimeIndex([0010-12-31 00:00:00, 0011-12-31 00:00:00, 0012-12-31 00:00:00], dtype='object')` Additional context I think this is a relatively small addition to the codebase but would make it easier and less confusing to use the default format that is also used by the function itself. It is easy to support as it is consistent and uniform.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/4337/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
481005183	MDExOlB1bGxSZXF1ZXN0MzA3NTkwNDYw	3220	BUG: Fixes GH3215	aidanheerdegen 6063709	closed	7	2019-08-15T05:55:36Z	2019-08-28T06:45:42Z	2019-08-28T06:45:35Z	CONTRIBUTOR	0	pydata/xarray/pulls/3220	Explicit cast to numpy array to avoid np.ravel calling out to dask [x] Closes #3215	{ "url": "https://api.github.com/repos/pydata/xarray/issues/3220/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	pull
480512400	MDU6SXNzdWU0ODA1MTI0MDA=	3215	decode_cf called on mfdataset throws error: 'Array' object has no attribute 'tolist'	aidanheerdegen 6063709	closed	9	2019-08-14T06:56:35Z	2019-08-28T06:45:35Z	2019-08-28T06:45:35Z	CONTRIBUTOR			MCVE Code Sample
```python
import xarray

file = 'temp_048.nc'

# Works ok with open_dataset
ds = xarray.open_dataset(file, decode_cf=True)
ds = xarray.open_dataset(file, decode_cf=False)
ds = xarray.decode_cf(ds)

# Fails with open_mfdataset
ds = xarray.open_mfdataset(file, decode_cf=True)
ds = xarray.open_mfdataset(file, decode_cf=False)
# This line throws an exception
ds = xarray.decode_cf(ds)
```
Expected Output Nothing Problem Description When opening data with `open_mfdataset`, calling `decode_cf` as a separate step throws an error, but decoding works as part of the `open_mfdataset` call. Error is: Traceback (most recent call last): File "tmp.py", line 11, in <module> ds = xarray.decode_cf(ds) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/conventions.py", line 479, in decode_cf decode_coords, drop_variables=drop_variables, use_cftime=use_cftime) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/conventions.py", line 401, in decode_cf_variables stack_char_dim=stack_char_dim, use_cftime=use_cftime) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/conventions.py", line 306, in decode_cf_variable var = coder.decode(var, name=name) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/coding/times.py", line 419, in decode self.use_cftime) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/coding/times.py", line 90, in _decode_cf_datetime_dtype last_item(values) or [0]]) File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.07/lib/python3.6/site-packages/xarray/core/formatting.py", line 99, in 
last_item return np.ravel(array[indexer]).tolist() AttributeError: 'Array' object has no attribute 'tolist' Output of `xr.show_versions()` INSTALLED VERSIONS ------------------ commit: None python: 3.6.7 | packaged by conda-forge | (default, Jul 2 2019, 02:18:42) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-957.21.3.el6.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_AU.utf8 LANG: C LOCALE: en_AU.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.2 xarray: 0.12.1 pandas: 0.25.0 numpy: 1.17.0 scipy: 1.2.1 netCDF4: 1.5.1.2 pydap: installed h5netcdf: 0.7.4 h5py: 2.9.0 Nio: 1.5.5 zarr: 2.3.2 cftime: 1.0.3.4 nc_time_axis: 1.2.0 PseudonetCDF: None rasterio: None cfgrib: 0.9.7.1 iris: 2.2.1dev0 bottleneck: 1.2.1 dask: 2.2.0 distributed: 2.2.0 matplotlib: 2.2.4 cartopy: 0.17.0 seaborn: 0.9.0 setuptools: 41.0.1 pip: 19.1.1 conda: installed pytest: 5.0.1 IPython: 7.7.0 sphinx: None There is no error using an older version of `numpy` with the same `xarray` version: INSTALLED VERSIONS ------------------ commit: None python: 3.6.7 | packaged by conda-forge | (default, Feb 28 2019, 09:07:38) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-957.21.3.el6.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_AU.utf8 LANG: C LOCALE: en_AU.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.2 xarray: 0.12.1 pandas: 0.24.2 numpy: 1.16.4 scipy: 1.2.1 netCDF4: 1.5.1.2 pydap: installed h5netcdf: 0.7.4 h5py: 2.9.0 Nio: None zarr: 2.3.2 cftime: 1.0.3.4 nc_time_axis: 1.2.0 PseudonetCDF: None rasterio: None cfgrib: 0.9.7 iris: 2.2.1dev0 bottleneck: 1.2.1 dask: 1.2.2 distributed: 1.28.1 matplotlib: 2.2.3 cartopy: 0.17.0 seaborn: 0.9.0 setuptools: 41.0.1 pip: 19.1.1 conda: installed pytest: 4.6.3 IPython: 7.5.0 sphinx: None Looks like the `tolist()` method has disappeared from something, but even in the debugger it isn't obvious to me exactly why this is happening. 
I can call `list` on `np.ravel(array[indexer])` at the same point and it works. The netcdf file I am using can be recreated from this CDL dump ``` netcdf temp_048 { dimensions: time = UNLIMITED ; // (5 currently) nv = 2 ; variables: double average_T1(time) ; average_T1:long_name = "Start time for average period" ; average_T1:units = "days since 1958-01-01 00:00:00" ; average_T1:missing_value = 1.e+20 ; average_T1:_FillValue = 1.e+20 ; double time(time) ; time:long_name = "time" ; time:units = "days since 1958-01-01 00:00:00" ; time:cartesian_axis = "T" ; time:calendar_type = "GREGORIAN" ; time:calendar = "GREGORIAN" ; time:bounds = "time_bounds" ; double time_bounds(time, nv) ; time_bounds:long_name = "time axis boundaries" ; time_bounds:units = "days" ; time_bounds:missing_value = 1.e+20 ; time_bounds:_FillValue = 1.e+20 ; // global attributes: :filename = "ocean.nc" ; :title = "MOM5" ; :grid_type = "mosaic" ; :grid_tile = "1" ; :history = "Wed Aug 14 16:38:53 2019: ncks -O -v average_T1 /g/data3/hh5/tmp/cosima/access-om2/1deg_jra55v13_iaf_spinup1_B1_lastcycle/output048/ocean/ocean.nc temp_048.nc" ; :NCO = "netCDF Operators version 4.7.7 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco)" ; data: average_T1 = 87659, 88024, 88389, 88754, 89119 ; time = 87841.5, 88206.5, 88571.5, 88936.5, 89301.5 ; time_bounds = 87659, 88024, 88024, 88389, 88389, 88754, 88754, 89119, 89119, 89484 ; } ```	{ "url": "https://api.github.com/repos/pydata/xarray/issues/3215/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
334778045	MDU6SXNzdWUzMzQ3NzgwNDU=	2244	Implement shift for CFTimeIndex	aidanheerdegen 6063709	closed	3	2018-06-22T07:42:16Z	2018-10-02T14:44:30Z	2018-10-02T14:44:30Z	CONTRIBUTOR			Code Sample
```python
import numpy as np
import xarray as xr
import pandas as pd
from cftime import num2date, DatetimeNoLeap

times = num2date(np.arange(730), calendar='noleap', units='days since 0001-01-01')
da = xr.DataArray(np.arange(730), coords=[times], dims=['time'])
```
Problem description I am trying to shift a time index as I need to align datasets to a common start point. Directly incrementing one of the `CFTimeIndex` values works:
```python
da.time.get_index('time')[0] + pd.Timedelta('365 days')
# Output: cftime.DatetimeNoLeap(2, 1, 1, 0, 0, 0, 0, -1, 1)
```
Trying to use `shift` does not:
```python
da.time.get_index('time').shift(1,'Y')
Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-18.04/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 2629, in shift type(self).__name__) NotImplementedError: Not supported for type CFTimeIndex
```
If I want to shift a time index, the only way currently seems to be to loop over all the individual elements of the index and add a time offset to each. Expected Output I would expect to have the CFTimeIndex shifted by the desired time delta. 
Output of `xr.show_versions()` INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Linux OS-release: 3.10.0-693.17.1.el6.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: C LOCALE: None.None xarray: 0.10.7 pandas: 0.23.1 numpy: 1.14.5 scipy: 1.1.0 netCDF4: 1.3.1 h5netcdf: 0.5.1 h5py: 2.8.0 Nio: None zarr: 2.2.0 bottleneck: 1.2.1 cyordereddict: None dask: 0.17.5 distributed: 1.21.8 matplotlib: 1.5.3 cartopy: 0.16.0 seaborn: 0.8.1 setuptools: 39.2.0 pip: 9.0.3 conda: None pytest: 3.6.1 IPython: 6.4.0 sphinx: None	{ "url": "https://api.github.com/repos/pydata/xarray/issues/2244/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
102703065	MDU6SXNzdWUxMDI3MDMwNjU=	548	Support for netcdf4/hdf5 compression	aidanheerdegen 6063709	closed	4	2015-08-24T04:22:07Z	2015-10-08T01:08:51Z	2015-10-08T01:08:51Z	CONTRIBUTOR			It would be great to be able to specify netCDF4 compression parameters when saving datasets. If this is unlikely to be supported, can you suggest a reasonable work-around? I am assuming it would involve directly accessing a backend?	{ "url": "https://api.github.com/repos/pydata/xarray/issues/548/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
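
The zero-padding work-around described in issue 4337 above can be sketched with the standard library alone. This is a minimal sketch, assuming the timestamp is already a string in `%Y-%m-%dT%H:%M:%S` form whose year field may be unpadded or space-padded; the helper name `zero_pad_iso8601` is hypothetical and is not part of xarray or cftime.

```python
def zero_pad_iso8601(timestamp: str, width: int = 19) -> str:
    """Left-pad a 'Y-mm-ddTHH:MM:SS' string with zeros (hypothetical helper)."""
    # Drop any space padding, then right-align the text in a field of
    # `width` characters filled with '0'; 19 == len('YYYY-mm-ddTHH:MM:SS').
    return "{:0>{width}}".format(timestamp.strip(), width=width)


padded = zero_pad_iso8601(" 10-01-01T00:00:00")
print(padded)  # 0010-01-01T00:00:00
```

A string that already has a four-digit year is returned unchanged, so the helper can be applied unconditionally before passing the string to a parser that expects a zero-padded year.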

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
