id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1063046540,I_kwDOAMm_X84_XM2M,6026,Delaying open produces different type of `cftime` object,42455466,closed,0,,,3,2021-11-25T00:47:22Z,2022-01-13T13:49:27Z,2022-01-13T13:49:27Z,NONE,,,,"**What happened**: The task is opening a dataset (e.g. a netcdf or zarr file) with a time coordinate using `use_cftime=True`. Delaying the task with dask results in the time coordinate being represented as base-class `cftime.datetime` objects, whereas when the task is not delayed the calendar-specific subclasses (e.g. `cftime.DatetimeJulian`) are used.

**What you expected to happen**: The same type of `cftime` object to be used, regardless of whether the opening task is delayed or not.

**Minimal Complete Verifiable Example**:

```python
import dask
import numpy as np
import xarray as xr
from dask.distributed import LocalCluster, Client

cluster = LocalCluster()
client = Client(cluster)

# Write some data
var = np.random.random(4)
time = xr.cftime_range('2000-01-01', periods=4, calendar='julian')
ds = xr.Dataset(data_vars={'var': ('time', var)}, coords={'time': time})
ds.to_netcdf('test.nc', mode='w')

# Open written data
ds1 = xr.open_dataset('test.nc', use_cftime=True)
print(f'ds1: {ds1.time} \n')

# Delayed open written data
ds2 = dask.delayed(xr.open_dataset)('test.nc', use_cftime=True)
ds2 = dask.compute(ds2)[0]
print(f'ds2: {ds2.time} \n')

# Operations like xr.open_mfdataset, which (I think) use dask.delayed
# internally when parallel=True, produce the same result as ds2
ds3 = xr.open_mfdataset('test.nc', use_cftime=True, parallel=True)
print(f'ds3: {ds3.time}')
```

returns

```
ds1: array([cftime.DatetimeJulian(2000, 1, 1, 0, 0, 0, 0, has_year_zero=False),
       cftime.DatetimeJulian(2000, 1, 2, 0, 0, 0, 0, has_year_zero=False),
       cftime.DatetimeJulian(2000, 1, 3, 0, 0, 0, 0, has_year_zero=False),
       cftime.DatetimeJulian(2000, 1, 4, 0, 0, 0, 0, has_year_zero=False)], dtype=object)
Coordinates:
  * time     (time) object 2000-01-01 00:00:00 ... 2000-01-04 00:00:00

ds2: array([cftime.datetime(2000, 1, 1, 0, 0, 0, 0, calendar='julian', has_year_zero=False),
       cftime.datetime(2000, 1, 2, 0, 0, 0, 0, calendar='julian', has_year_zero=False),
       cftime.datetime(2000, 1, 3, 0, 0, 0, 0, calendar='julian', has_year_zero=False),
       cftime.datetime(2000, 1, 4, 0, 0, 0, 0, calendar='julian', has_year_zero=False)], dtype=object)
Coordinates:
  * time     (time) object 2000-01-01 00:00:00 ... 2000-01-04 00:00:00

ds3: array([cftime.datetime(2000, 1, 1, 0, 0, 0, 0, calendar='julian', has_year_zero=False),
       cftime.datetime(2000, 1, 2, 0, 0, 0, 0, calendar='julian', has_year_zero=False),
       cftime.datetime(2000, 1, 3, 0, 0, 0, 0, calendar='julian', has_year_zero=False),
       cftime.datetime(2000, 1, 4, 0, 0, 0, 0, calendar='julian', has_year_zero=False)], dtype=object)
Coordinates:
  * time     (time) object 2000-01-01 00:00:00 ... 2000-01-04 00:00:00
```

**Anything else we need to know?**: I noticed this because the DatetimeAccessor `ceil`, `floor` and `round` methods raise errors for base-class `cftime.datetime` objects (but not for the calendar-specific subclasses) for all calendar types other than 'gregorian'.
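As a minimal check (not part of the original report), the concrete class produced by each open path in the example above can be confirmed directly:

```python
# Illustrative check (not part of the original report): inspect the concrete
# cftime class each open path produced for the decoded time coordinate.
print(type(ds1.time.values[0]))  # calendar-specific subclass, e.g. cftime.DatetimeJulian
print(type(ds2.time.values[0]))  # base class cftime.datetime in the delayed case
print(type(ds3.time.values[0]))  # base class cftime.datetime via parallel=True
```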
For example, on the delayed-open dataset `ds3`,

```python
ds3.time.dt.floor('D')
```

returns the following traceback:

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
in
----> 1 ds3.time.dt.floor('D')

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/accessor_dt.py in floor(self, freq)
    220         """"""
    221 
--> 222         return self._tslib_round_accessor(""floor"", freq)
    223 
    224     def ceil(self, freq):

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/accessor_dt.py in _tslib_round_accessor(self, name, freq)
    202     def _tslib_round_accessor(self, name, freq):
    203         obj_type = type(self._obj)
--> 204         result = _round_field(self._obj.data, name, freq)
    205         return obj_type(result, name=name, coords=self._obj.coords, dims=self._obj.dims)
    206 

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/accessor_dt.py in _round_field(values, name, freq)
    142         )
    143     else:
--> 144         return _round_through_series_or_index(values, name, freq)
    145 
    146 

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/accessor_dt.py in _round_through_series_or_index(values, name, freq)
    110     method = getattr(values_as_cftimeindex, name)
    111 
--> 112     field_values = method(freq=freq).values
    113 
    114     return field_values.reshape(values.shape)

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in floor(self, freq)
    733         CFTimeIndex
    734         """"""
--> 735         return self._round_via_method(freq, _floor_int)
    736 
    737     def ceil(self, freq):

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in _round_via_method(self, freq, method)
    714 
    715         unit = _total_microseconds(offset.as_timedelta())
--> 716         values = self.asi8
    717         rounded = method(values, unit)
    718         return _cftimeindex_from_i8(rounded, self.date_type, self.name)

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in asi8(self)
    684         epoch = self.date_type(1970, 1, 1)
    685         return np.array(
--> 686             [
    687                 _total_microseconds(exact_cftime_datetime_difference(epoch, date))
    688                 for date in self.values

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in (.0)
    685         return np.array(
    686             [
--> 687                 _total_microseconds(exact_cftime_datetime_difference(epoch, date))
    688                 for date in self.values
    689             ],

/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/resample_cftime.py in exact_cftime_datetime_difference(a, b)
    356     datetime.timedelta
    357     """"""
--> 358     seconds = b.replace(microsecond=0) - a.replace(microsecond=0)
    359     seconds = int(round(seconds.total_seconds()))
    360     microseconds = b.microsecond - a.microsecond

src/cftime/_cftime.pyx in cftime._cftime.datetime.__sub__()

TypeError: cannot compute the time difference between dates with different calendars
```

My apologies for conflating two issues here. I'm happy to open a separate issue for this if that's preferred.
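A minimal workaround sketch (not part of the original report, and assuming a dataset like `ds3` above whose time coordinate holds base-class `cftime.datetime` objects): rebuilding the time coordinate with the calendar-specific subclass keeps all of the internal date arithmetic in a single calendar, so the rounding accessors work again:

```python
# Hedged workaround sketch (not from the original report): rebuild the time
# coordinate with the calendar-specific subclass (DatetimeJulian here, since
# the file was written with a Julian calendar) so that .dt.floor/.ceil/.round
# only ever compare dates from a single calendar.
import cftime

julian_times = [
    cftime.DatetimeJulian(t.year, t.month, t.day, t.hour, t.minute, t.second, t.microsecond)
    for t in ds3.time.values
]
ds3 = ds3.assign_coords(time=julian_times)
ds3.time.dt.floor('D')  # should now floor to day precision without a TypeError
```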
**Environment**:

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.4 | packaged by conda-forge | (default, May 10 2021, 22:13:33) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-305.19.1.el8.nci.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.20.1
pandas: 1.3.4
numpy: 1.21.4
scipy: 1.6.3
netCDF4: 1.5.6
pydap: None
h5netcdf: 0.11.0
h5py: 3.3.0
Nio: None
zarr: 2.9.5
cftime: 1.5.0
nc_time_axis: 1.4.0
PseudoNetCDF: None
rasterio: 1.2.4
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.11.2
distributed: 2021.11.2
matplotlib: 3.4.2
cartopy: 0.19.0.post1
seaborn: None
numbagg: None
fsspec: 2021.05.0
cupy: None
pint: 0.18
sparse: None
setuptools: 49.6.0.post20210108
pip: 21.1.2
conda: 4.10.1
pytest: None
IPython: 7.24.0
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6026/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
789755611,MDU6SXNzdWU3ODk3NTU2MTE=,4833,Strange behaviour when overwriting files with to_netcdf and html repr,42455466,closed,0,,,2,2021-01-20T08:28:35Z,2021-01-20T20:00:23Z,2021-01-20T20:00:23Z,NONE,,,,"**What happened**: I'm experiencing some strange behaviour when overwriting netcdf files using `to_netcdf` in a Jupyter notebook. The issue is a bit quirky and convoluted and only seems to come about when using xarray's html repr in Jupyter. I've tried to put together a reproducible example that demonstrates the issue (it's still quite convoluted, sorry).

I can generate some data, save it to a netcdf file, reopen it and everything works as expected:

```python
import numpy as np
import xarray as xr

ones = xr.DataArray(np.ones(5), coords=[range(5)], dims=['x']).to_dataset(name='a')
ones.to_netcdf('./a.nc')
print(xr.open_dataset('./a.nc')['a'])
```

```
array([1., 1., 1., 1., 1.])
Coordinates:
  * x        (x) int64 0 1 2 3 4
```

I can overwrite `a.nc` with a modified dataset and everything still works as expected:

```python
twos = 2 * ones
twos.to_netcdf('./a.nc')
print(xr.open_dataset('./a.nc', cache=False)['a'])
```

```
array([2., 2., 2., 2., 2.])
Coordinates:
  * x        (x) int64 0 1 2 3 4
```

I can run the above cell as many times as I like and always get the expected behaviour. However, if instead of `print`ing the `open_dataset` line, I allow it to be rendered by the xarray html repr, the cell runs once and then fails with a `Permission denied` error the second time it is run:

```python
twos.to_netcdf('./a.nc')
xr.open_dataset('./a.nc', cache=False)['a']
```

```
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
.../lib/python3.8/site-packages/xarray/backends/file_manager.py in _acquire_with_cache_info(self, needs_lock)
    198         try:
--> 199             file = self._cache[self._key]
    200         except KeyError:

.../lib/python3.8/site-packages/xarray/backends/lru_cache.py in __getitem__(self, key)
     52         with self._lock:
---> 53             value = self._cache[key]
     54             self._cache.move_to_end(key)

KeyError: [, ('.../a.nc',), 'a', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False))]

During handling of the above exception, another exception occurred:
.
.
.
PermissionError: [Errno 13] Permission denied: b'.../a.nc'
```

If I manually remove the file in question, I can resave it, but from then on xarray seems to have its wires crossed somehow and will present the `twos` data for `a.nc` regardless of what the file actually contains:

```python
!rm ./a.nc
ones.to_netcdf('./a.nc')
print(xr.open_dataset('./a.nc')['a'])
```

```
array([2., 2., 2., 2., 2.])
Coordinates:
  * x        (x) int64 0 1 2 3 4
```

Note that in the last example the data saved on disk is correct (i.e. it contains ones), but xarray is still somehow linked to the `twos` data.

**Anything else we need to know?**: I've come across this unexpected behaviour a few times. In the above example, I had to add `cache=False` to consistently produce the behaviour, but in the past I've managed to produce these symptoms _without_ `cache=False` (I'm just not exactly sure how).
Anecdotally, the behaviour always seems to occur after having rendered the xarray object in Jupyter using the html repr.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4833/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue