id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1474785646,I_kwDOAMm_X85X53Fu,7354,'open_mfdataset' zarr zip timestamp issue,34686298,open,0,,,5,2022-12-04T12:45:12Z,2022-12-16T00:29:54Z,,NONE,,,,"### What happened?

We have been collecting satellite data, and we save each image as one `{time}.zarr.zip` file. We then collate the images using `xr.open_mfdataset` and save them to a `large.zarr.zip` file. When loading this file, the timestamps are all the same. This bug did not appear in `2022.3.0` but it did in `2022.6.0`.

I tried to keep this as minimal as possible, but it's a bit of a long example. Hopefully the comments help.

Sorry if this has already been reported, but I could not find it in the `issue` list.

### What did you expect to happen?

Expected the timestamps to reflect the data that went in.

### Minimal Complete Verifiable Example

```Python
import pandas as pd
import xarray as xr
import numpy as np
from datetime import datetime, timedelta
import zarr
import os
import glob

# ids and times
path = ""tmp.zarr.zip""
ids = np.array(range(0, 10))
times = [datetime(2022, 9, 1) + timedelta(minutes=60 * i) for i in range(0, 10)]

# make 10 random zip files
for time in times:
    dataset = xr.DataArray(
        np.random.uniform(size=(1, len(ids))),
        coords=((""time"", [time]), (""id"", ids)),
        name=""data"",
    ).to_dataset(name=""data"")
    file_name = f""tmp_dir/{time.isoformat()}.zarr.zip""
    if os.path.exists(file_name):
        os.remove(file_name)
    with zarr.ZipStore(file_name) as store:
        dataset.to_zarr(store)

# load them all together
files = list(glob.glob(f""tmp_dir/*.zarr.zip""))
dataset = xr.open_mfdataset(files, engine=""zarr"").sortby(""time"")

# this is fine!
assert pd.to_datetime(dataset.time.values[0]) == times[0]
assert pd.to_datetime(dataset.time.values[1]) == times[1]

# save to file
if os.path.exists(path):
    os.remove(path)
with zarr.ZipStore(path) as store:
    dataset.to_zarr(store)

# read the file
dataset_read = xr.open_dataset(path, engine=""zarr"")
print(dataset_read)

# this causes an error
assert pd.to_datetime(dataset_read.time.values[0]) == times[0]
assert pd.to_datetime(dataset_read.time.values[1]) == times[1]
```

### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

```Python
/Users/peterdudfield/Documents/Github/nwp/venv/lib/python3.8/site-packages/xarray/core/dataset.py:2060: SerializationWarning: saving variable None with floating point data as an integer dtype without any _FillValue to use for NaNs
  return to_zarr(  # type: ignore
Dimensions:  (time: 10, id: 10)
Coordinates:
  * id       (id) int64 0 1 2 3 4 5 6 7 8 9
  * time     (time) datetime64[ns] 2022-09-01 2022-09-01 ... 2022-09-01
Data variables:
    data     (time, id) float64 ...
Traceback (most recent call last):
  File ""/Users/peterdudfield/Documents/Github/nwp/venv/lib/python3.8/site-packages/IPython/core/interactiveshell.py"", line 3251, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File """", line 36, in
    assert pd.to_datetime(dataset_read.time.values[1]) == times[1]
AssertionError
```

### Anything else we need to know?

_No response_

### Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.2 (default, Jun 8 2021, 11:59:35)
[Clang 12.0.5 (clang-1205.0.22.11)]
python-bits: 64
OS: Darwin
OS-release: 20.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('en_GB', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.7.4

xarray: 2022.6.0
pandas: 1.4.2
numpy: 1.22.0
scipy: 1.7.3
netCDF4: 1.5.8
pydap: None
h5netcdf: 0.13.1
h5py: 3.6.0
Nio: None
zarr: 2.10.3
cftime: 1.6.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: 0.9.9.1
iris: None
bottleneck: 1.3.4
dask: 2022.01.0
distributed: None
matplotlib: 3.5.1
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.11.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 57.0.0
pip: 21.1.2
conda: None
pytest: 6.2.5
IPython: 8.0.1
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7354/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue