issue_comments: 1529894939
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/pydata/xarray/issues/7790#issuecomment-1529894939 | https://api.github.com/repos/pydata/xarray/issues/7790 | 1529894939 | IC_kwDOAMm_X85bMFgb | 5821660 | 2023-05-01T16:05:19Z | 2023-05-01T16:05:19Z | MEMBER | So, after some debugging I think I've found two issues with the current code. First, we need to give the fill value at a resolution that fits the data. Second, we have an issue with inferring the units from the data (if they are not given).

Here is some workaround code which (finally, :crossed_fingers:) should at least write and read correct data (comments added below):

```python
import numpy as np
import xarray as xr
import zarr

# Create a numpy array of type np.datetime64 with one fill value and one date
# FIRST ISSUE WITH _FillValue
# we need to provide ns resolution here too, otherwise we get wrong fillvalues (day-reference)
time_fill_value = np.datetime64("1900-01-01 00:00:00.00000000", "ns")
time = np.array([np.datetime64("NaT", "ns"), "2023-01-02 00:00:00.00000000"], dtype="M8[ns]")

# Create a dataset with this one array
xr_time_array = xr.DataArray(data=time, dims=["time"], name="time")
xr_ds = xr.Dataset(dict(time=xr_time_array))

print("******")
print("Created with fill value 1900-01-01")
print(xr_ds["time"])

# Save the dataset to zarr
location_new_fill = "from_xarray_new_fill.zarr"

# SECOND ISSUE with inferring units from data
# We need to specify "dtype" and "units" which fit our data
# Note: as we provide a _FillValue with a reference to unix-epoch
# we need to provide a fitting units too
encoding = {
    "time": {"_FillValue": time_fill_value, "dtype": np.int64, "units": "nanoseconds since 1970-01-01"}
}
xr_ds.to_zarr(location_new_fill, mode="w", encoding=encoding)

xr_read = xr.open_zarr(location_new_fill)
print("******")
print("Read back out of the zarr store with xarray")
print(xr_read["time"])
print(xr_read["time"].attrs)
print(xr_read["time"].encoding)

z_new_fill = zarr.open("from_xarray_new_fill.zarr", "r")
print("******")
print("Read back out of the zarr store with zarr")
print(z_new_fill["time"])
print(z_new_fill["time"].attrs)
print(z_new_fill["time"][:])
```

```python
******
Created with fill value 1900-01-01
<xarray.DataArray 'time' (time: 2)>
array([ 'NaT', '2023-01-02T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] NaT 2023-01-02
******
Read back out of the zarr store with xarray
<xarray.DataArray 'time' (time: 2)>
array([ 'NaT', '2023-01-02T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] NaT 2023-01-02
{}
{'chunks': (2,), 'preferred_chunks': {'time': 2}, 'compressor': Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0), 'filters': None, '_FillValue': -2208988800000000000, 'units': 'nanoseconds since 1970-01-01', 'calendar': 'proleptic_gregorian', 'dtype': dtype('int64')}
******
Read back out of the zarr store with zarr
<zarr.core.Array '/time' (2,) int64 read-only>
<zarr.attrs.Attributes object at 0x7f086ab8e710>
[-2208988800000000000 1672617600000000000]
```

@christine-e-smit Please let me know if the above workaround gives you correct results in your workflow. If so, we can think about how to automatically align the fill value resolution with the data resolution, and about what needs to be done to deduce the units correctly. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
1685803922 |
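
To make the "day-reference" remark in the comment above concrete, here is a minimal standard-library sketch of the arithmetic involved. It does not call xarray, zarr, or numpy; the helper names `days_since_epoch` and `ns_since_epoch` are hypothetical and exist only for this illustration. It assumes the Unix epoch (1970-01-01) as the reference, as in the `units` string of the workaround, and shows how the same fill date maps to very different integers depending on whether it is counted in days or in nanoseconds.

```python
from datetime import date

# Reference date assumed here: the Unix epoch, matching
# "nanoseconds since 1970-01-01" in the workaround's encoding.
EPOCH = date(1970, 1, 1)
NS_PER_DAY = 86_400 * 10**9  # seconds per day times nanoseconds per second


def days_since_epoch(d: date) -> int:
    """Integer a day-resolution timestamp would be stored as (hypothetical helper)."""
    return (d - EPOCH).days


def ns_since_epoch(d: date) -> int:
    """Integer a nanosecond-resolution timestamp at midnight would be stored as."""
    return days_since_epoch(d) * NS_PER_DAY


fill = date(1900, 1, 1)    # the fill value date used in the workaround
sample = date(2023, 1, 2)  # the single valid date in the example array

print(days_since_epoch(fill))  # -25567
print(ns_since_epoch(fill))    # -2208988800000000000
print(ns_since_epoch(sample))  # 1672617600000000000
```

The two nanosecond values agree with the `_FillValue` and the raw integers read back with zarr in the output above. A fill value counted in days (`-25567`) would not line up with data stored as nanoseconds, which is presumably why the workaround passes the fill value at `ns` resolution.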