issue_comments
9 rows where user = 14983768 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1532152709 | https://github.com/pydata/xarray/issues/7790#issuecomment-1532152709 | https://api.github.com/repos/pydata/xarray/issues/7790 | IC_kwDOAMm_X85bUsuF | christine-e-smit 14983768 | 2023-05-02T21:07:27Z | 2023-05-02T21:09:10Z | NONE | @kmuehlbauer - genius! Yes. That pull request should fix this issue exactly! And it explains why I see this issue and you don't - with undefined behavior anything can happen. Since we are on different OSes, our systems behave differently. I just double checked with pandas and this fix will do the right thing:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fill values in time arrays (numpy.datetime64) are lost in zarr 1685803922 | |
1530347592 | https://github.com/pydata/xarray/issues/7790#issuecomment-1530347592 | https://api.github.com/repos/pydata/xarray/issues/7790 | IC_kwDOAMm_X85bN0BI | christine-e-smit 14983768 | 2023-05-01T21:43:08Z | 2023-05-01T21:43:56Z | NONE | Ah hah! Well, I don't know why this is working for you @kmuehlbauer, but I can see why it is not working for me. I've been debugging through the code and it looks like the problem is the It all starts in the There's a bunch of stuff that gets called, but eventually we get to the function and then, in In line 254, |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fill values in time arrays (numpy.datetime64) are lost in zarr 1685803922 | |
1530186148 | https://github.com/pydata/xarray/issues/7790#issuecomment-1530186148 | https://api.github.com/repos/pydata/xarray/issues/7790 | IC_kwDOAMm_X85bNMmk | christine-e-smit 14983768 | 2023-05-01T20:25:34Z | 2023-05-01T20:25:34Z | NONE | @kmuehlbauer - I ran https://github.com/pydata/xarray/issues/7790#issuecomment-1529894939 and I get an incorrect fill value:
```
Created with fill value 1900-01-01
<xarray.DataArray 'time' (time: 2)>
array([ 'NaT', '2023-01-02T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] NaT 2023-01-02
Read back out of the zarr store with xarray
<xarray.DataArray 'time' (time: 2)>
array(['1970-01-01T00:00:00.000000000', '2023-01-02T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 1970-01-01 2023-01-02
{}
{'chunks': (2,), 'preferred_chunks': {'time': 2}, 'compressor': Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0), 'filters': None, '_FillValue': -2208988800000000000, 'units': 'nanoseconds since 1970-01-01', 'calendar': 'proleptic_gregorian', 'dtype': dtype('int64')}
Read back out of the zarr store with zarr
<zarr.core.Array '/time' (2,) int64 read-only>
<zarr.attrs.Attributes object at 0x132802a50>
[-2208988800000000000 1672617600000000000]

commit: None
python: 3.11.3 | packaged by conda-forge | (main, Apr 6 2023, 08:58:31) [Clang 14.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 22.4.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2023.4.2
pandas: 2.0.1
numpy: 1.24.3
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.14.2
cftime: None
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 67.7.2
pip: 23.1.2
conda: None
pytest: None
mypy: None
IPython: 8.13.1
sphinx: None
``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fill values in time arrays (numpy.datetime64) are lost in zarr 1685803922 | |
1530056660 | https://github.com/pydata/xarray/issues/7790#issuecomment-1530056660 | https://api.github.com/repos/pydata/xarray/issues/7790 | IC_kwDOAMm_X85bMs_U | christine-e-smit 14983768 | 2023-05-01T18:37:47Z | 2023-05-01T18:39:21Z | NONE | Oops! Yes. You are right. I had some cross-wording on the variable names. So I started a new notebook. Unfortunately, I think you may have also gotten some wires crossed? You set the time fill value to 1900-01-01, but then use NaT in the actual array? Here is a fresh notebook with a stand-alone cell with everything that I think you were doing, but I'm not 100%. The fill value is still wrong when it gets read out, but it is at least different? The fill value is now set to the units for some reason. This seems like progress?
```python
import numpy as np
import xarray as xr
import zarr

# Create a time array with one fill value, NaT
time = np.array([np.datetime64("NaT", "ns"), '2023-01-02 00:00:00.00000000'], dtype='M8[ns]')

# Create xarray with this fill value
xr_time_array = xr.DataArray(data=time,dims=['time'],name='time')
xr_ds = xr.Dataset(dict(time=xr_time_array))

print("****")
print("xarray created with NaT fill value")
print("----------------------")
print(xr_ds["time"])

# Save as zarr
location_with_units = "xarray_and_units.zarr"
encoding = {
    "time":{"_FillValue":np.datetime64("NaT","ns"),"dtype":np.int64,"units":"nanoseconds since 1970-01-01"}
}
xr_ds.to_zarr(location_with_units,mode="w",encoding=encoding)

# Read it back out again
xr_read = xr.open_zarr(location_with_units)
print("****")
print("xarray created read with NaT fill value")
print("----------------------")
print(xr_read["time"])
print(xr_read["time"].attrs)
print(xr_read["time"].encoding)
```
```
xarray created with NaT fill value
<xarray.DataArray 'time' (time: 2)>
array([ 'NaT', '2023-01-02T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] NaT 2023-01-02
xarray created read with NaT fill value
<xarray.DataArray 'time' (time: 2)>
array(['1970-01-01T00:00:00.000000000', '2023-01-02T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 1970-01-01 2023-01-02
{}
{'chunks': (2,), 'preferred_chunks': {'time': 2}, 'compressor': Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0), 'filters': None, '_FillValue': -9223372036854775808, 'units': 'nanoseconds since 1970-01-01', 'calendar': 'proleptic_gregorian', 'dtype': dtype('int64')}
``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fill values in time arrays (numpy.datetime64) are lost in zarr 1685803922 | |
1527948787 | https://github.com/pydata/xarray/issues/7790#issuecomment-1527948787 | https://api.github.com/repos/pydata/xarray/issues/7790 | IC_kwDOAMm_X85bEqXz | christine-e-smit 14983768 | 2023-04-28T18:39:01Z | 2023-04-28T18:39:01Z | NONE | Where in the code is the time array being decoded? That seems to be where a lot of the issue is? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fill values in time arrays (numpy.datetime64) are lost in zarr 1685803922 | |
1527918654 | https://github.com/pydata/xarray/issues/7790#issuecomment-1527918654 | https://api.github.com/repos/pydata/xarray/issues/7790 | IC_kwDOAMm_X85bEjA- | christine-e-smit 14983768 | 2023-04-28T18:08:16Z | 2023-04-28T18:08:16Z | NONE | The zarr store does indeed use an integer in this case according to the .zmetadata file:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fill values in time arrays (numpy.datetime64) are lost in zarr 1685803922 | |
1527917772 | https://github.com/pydata/xarray/issues/7790#issuecomment-1527917772 | https://api.github.com/repos/pydata/xarray/issues/7790 | IC_kwDOAMm_X85bEizM | christine-e-smit 14983768 | 2023-04-28T18:07:40Z | 2023-04-28T18:07:40Z | NONE | @kmuehlbauer - I think I'm not understanding what you are suggesting because the zarr store is still not being read correctly when I switch the fill value to a different date:
```python
# Create a numpy array of type np.datetime64 with one fill value and one date
time_fill_value = np.datetime64("1900-01-01")
time = np.array([time_fill_value,'2023-01-02'],dtype='M8[ns]')

# Create a dataset with this one array
xr_time_array = xr.DataArray(data=time,dims=['time'],name='time')
xr_ds = xr.Dataset(dict(time=xr_time_array))
print("******")
print("Created with fill value 1900-01-01")
print(xr_ds["time"])

# Save the dataset to zarr
location_new_fill = "from_xarray_new_fill.zarr"
encoding = {
    "time":{"_FillValue":time_fill_value,"dtype":np.int64}
}
xr_ds.to_zarr(location_new_fill,encoding=encoding)

xr_read = xr.open_zarr(location)
print("******")
print("Read back out of the zarr store with xarray")
print(xr_read["time"])
print(xr_read["time"].encoding)
```
```
Created with fill value 1900-01-01
<xarray.DataArray 'time' (time: 2)>
array(['1900-01-01T00:00:00.000000000', '2023-01-02T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 1900-01-01 2023-01-02
<xarray.DataArray 'time' (time: 2)>
array(['2023-01-02T00:00:00.000000000', '2023-01-02T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2023-01-02 2023-01-02
{'chunks': (2,), 'preferred_chunks': {'time': 2}, 'compressor': Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0), 'filters': None, '_FillValue': -9.223372036854776e+18, 'units': 'days since 2023-01-02 00:00:00', 'calendar': 'proleptic_gregorian', 'dtype': dtype('float64')}
``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fill values in time arrays (numpy.datetime64) are lost in zarr 1685803922 | |
1525774670 | https://github.com/pydata/xarray/issues/7790#issuecomment-1525774670 | https://api.github.com/repos/pydata/xarray/issues/7790 | IC_kwDOAMm_X85a8XlO | christine-e-smit 14983768 | 2023-04-27T14:13:58Z | 2023-04-27T14:13:58Z | NONE | Interestingly, xarray is also perfectly happy to read a numpy.datetime64 array out of a zarr store as long as the xarray metadata is present. xarray even helpfully creates a '_FillValue' attribute for the array so there is no confusion:
```
# Create a zarr store directly with numpy.datetime64 type
location_zarr_direct = "from_zarr.zarr"
root = zarr.open(location_zarr_direct,mode='w')
z_time_array = root.create_dataset(
    "time",data=time,shape=time.shape,chunks=time.shape,dtype=time.dtype,
    fill_value=time_fill_value
)

# Add xarray metadata
z_time_array.attrs["_ARRAY_DIMENSIONS"] = ["time"]
zarr.convenience.consolidate_metadata(location_zarr_direct)

# Use xarray to read this data out
xr_read_from_zarr = xr.open_zarr(location_zarr_direct)
print(xr_read_from_zarr["time"])
```
So I am extremely confused as to why xarray encodes time arrays so strangely when it creates the zarr store itself! (Hence https://github.com/pydata/xarray/discussions/7776) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fill values in time arrays (numpy.datetime64) are lost in zarr 1685803922 | |
1525766244 | https://github.com/pydata/xarray/issues/7790#issuecomment-1525766244 | https://api.github.com/repos/pydata/xarray/issues/7790 | IC_kwDOAMm_X85a8Vhk | christine-e-smit 14983768 | 2023-04-27T14:08:37Z | 2023-04-27T14:08:37Z | NONE | Ah! Okay. I did not know about the Interestingly, -9.223372036854776e+18 is just the float equivalent of numpy.datetime64('NaT'):
And I know this isn't an issue with zarr and NaT because I can create the zarr store directly with the zarr library and it's perfectly happy:
```python
# Create a zarr store directly with numpy.datetime64 type
location_zarr_direct = "from_zarr.zarr"
root = zarr.open(location_zarr_direct,mode='w')
z_time_array = root.create_dataset(
    "time",data=time,shape=time.shape,chunks=time.shape,dtype=time.dtype,
    fill_value=time_fill_value
)
zarr.convenience.consolidate_metadata(location_zarr_direct)

# Read it back out again
read_zarr = zarr.open(location_zarr_direct,mode='r')
print(read_zarr["time"][:])
``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Fill values in time arrays (numpy.datetime64) are lost in zarr 1685803922 |
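Several comments in the table above quote raw integers and floats that stand in for `numpy.datetime64` values: `-9223372036854775808` and `-9.223372036854776e+18` for NaT, and `-2208988800000000000` for 1900-01-01 in nanoseconds since 1970-01-01. The following is a minimal sketch, using only NumPy, that checks those numbers; the variable names are illustrative and do not come from the thread.

```python
import numpy as np

# NaT is represented by the smallest int64 value.
nat_as_int = np.datetime64("NaT", "ns").astype("int64")
int64_min = np.iinfo(np.int64).min

# Converting that integer to a Python float gives the value that shows up
# as '_FillValue': -9.223372036854776e+18 in the float64 encoding.
nat_as_float = float(nat_as_int)

# 1900-01-01 expressed as nanoseconds since 1970-01-01.
fill_1900 = np.datetime64("1900-01-01", "ns").astype("int64")

# Decode the raw integers that the thread reads back with zarr.
raw = np.array([-2208988800000000000, 1672617600000000000], dtype="int64")
decoded = raw.astype("datetime64[ns]")

print(nat_as_int == int64_min)
print(nat_as_float)
print(fill_1900)
print(decoded)
```

The raw integers stored in the zarr array therefore correspond to 1900-01-01 and 2023-01-02, which suggests the stored data is intact and the discrepancy arises when the values are decoded.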
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
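The view at the top of this page ("rows where user = 14983768 sorted by updated_at descending") can be reproduced against this schema. Below is a rough sketch using Python's built-in `sqlite3` module with an in-memory database. Two of the inserted rows reuse ids and timestamps from the table above; the third row, with user `999`, is hypothetical and only there to show the filter working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema copied from the export. Foreign keys are not enforced by default
# in SQLite, so the referenced [users] and [issues] tables are not needed.
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY,
   [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT,
   [body] TEXT, [reactions] TEXT, [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
""")

rows = [
    (1525766244, 14983768, "2023-04-27T14:08:37Z", 1685803922),
    (1532152709, 14983768, "2023-05-02T21:09:10Z", 1685803922),
    (1, 999, "2023-05-03T00:00:00Z", 1685803922),  # hypothetical other user
]
conn.executemany(
    "INSERT INTO [issue_comments] ([id], [user], [updated_at], [issue]) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# ISO 8601 timestamps in the same timezone sort correctly as plain text.
result = conn.execute(
    "SELECT [id], [updated_at] FROM [issue_comments] "
    "WHERE [user] = ? ORDER BY [updated_at] DESC",
    (14983768,),
).fetchall()
print(result)
```

Because `updated_at` is stored as TEXT, the descending sort relies on the timestamps sharing one format and timezone, which holds for the rows shown here.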