html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/7146#issuecomment-1272560073,https://api.github.com/repos/pydata/xarray/issues/7146,1272560073,IC_kwDOAMm_X85L2bnJ,14808389,2022-10-09T14:56:28Z,2022-10-09T14:57:44Z,MEMBER,"Since we have eliminated `xarray` with this, you should be able to submit an issue to the [`h5py`](https://github.com/h5py/h5py) issue tracker while mentioning that this is probably a bug in `libhdf5` since `netcdf4` also fails with the same error (and you can also link this issue for more information)","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272558504,https://api.github.com/repos/pydata/xarray/issues/7146,1272558504,IC_kwDOAMm_X85L2bOo,11075246,2022-10-09T14:49:33Z,2022-10-09T14:49:33Z,NONE,"I had to change ints and floats to doubles to reproduce the issue.
```python
import h5py
N_TIMES = 48
with h5py.File(""/my_s3_fs/test.nc"", mode=""w"") as f:
    time = f.create_dataset(""time"", (N_TIMES,), dtype=""d"")
    time[:] = 0
    d1 = f.create_dataset(""d1"", (N_TIMES, 201, 201), dtype=""d"")
    d1[:] = 0
```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272555653,https://api.github.com/repos/pydata/xarray/issues/7146,1272555653,IC_kwDOAMm_X85L2aiF,14808389,2022-10-09T14:36:13Z,2022-10-09T14:36:13Z,MEMBER,"great, good to know. Can you try this with `h5py`:
```python
import h5py
N_TIMES = 48
with h5py.File(""test.nc"", mode=""w"") as f:
    time = f.create_dataset(""time"", (N_TIMES,), dtype=""i"")
    time[:] = 0
    d1 = f.create_dataset(""d1"", (N_TIMES, 201, 201), dtype=""f"")
    d1[:] = 0
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272553921,https://api.github.com/repos/pydata/xarray/issues/7146,1272553921,IC_kwDOAMm_X85L2aHB,11075246,2022-10-09T14:27:06Z,2022-10-09T14:27:06Z,NONE,"The datatype does not seem to matter, but both variables are required to get a segfault. The following, with just floats, produces a segfault:
```
import numpy as np
import xarray as xr
N_TIMES=48
ds = xr.Dataset({""time"": (""T"", np.zeros((N_TIMES))), 'd1': ([""T"", ""x"", ""y""], np.zeros((N_TIMES, 201,201)))})
ds.to_netcdf(path=""/my_s3_fs/test_netcdf.nc"", format=""NETCDF4"", mode=""w"")
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272544819,https://api.github.com/repos/pydata/xarray/issues/7146,1272544819,IC_kwDOAMm_X85L2X4z,11075246,2022-10-09T13:37:57Z,2022-10-09T14:25:51Z,NONE,"It seems that we need the time variable to reproduce the problem. The following code does not fail:
```
import numpy as np
import xarray as xr
import pandas as pd
N_TIMES=64
ds = xr.Dataset({'d1': ([""T"", ""x"", ""y""], np.zeros((N_TIMES, 201,201)))})
ds.to_netcdf(path=""/my_s3_fs/test_netcdf.nc"", format=""NETCDF4"", mode=""w"")
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272550986,https://api.github.com/repos/pydata/xarray/issues/7146,1272550986,IC_kwDOAMm_X85L2ZZK,14808389,2022-10-09T14:09:44Z,2022-10-09T14:09:44Z,MEMBER,"okay, then does changing the dtype do anything? I.e. does this only happen with `datetime64` / bytes, or do int / float / str also fail?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272542780,https://api.github.com/repos/pydata/xarray/issues/7146,1272542780,IC_kwDOAMm_X85L2XY8,14808389,2022-10-09T13:26:25Z,2022-10-09T13:26:25Z,MEMBER,"with this:
```python
ds2 = ds.time.dt.strftime(""%Y%m%d%H%M%S"").str.encode(""utf-8"").to_dataset().assign(d1=ds.d1)
```
but we don't really need to check if the first dataset already fails.
Now I'd probably check if it's just the size that makes it fail (i.e. remove `""time""` from `ds` and keep just `d1` while maybe increasing it by one if it does not fail as-is), or if it depends on the `dtype` (i.e. set `time_vals` to `np.arange(N_TIMES, dtype=int)`).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272541759,https://api.github.com/repos/pydata/xarray/issues/7146,1272541759,IC_kwDOAMm_X85L2XI_,11075246,2022-10-09T13:21:22Z,2022-10-09T13:21:40Z,NONE,"The first one results in a segfault:
```python
import numpy as np
import xarray as xr
import pandas as pd
N_TIMES = 48
time_vals = pd.date_range(""2022-10-06"", freq=""20 min"", periods=N_TIMES)
ds = xr.Dataset({""time"": (""T"", time_vals), 'd1': ([""T"", ""x"", ""y""], np.zeros((len(time_vals), 201,201)))})
ds.to_netcdf(path=""/my_s3_fs/test_netcdf.nc"", format=""NETCDF4"", mode=""w"")
```
Not sure how to add the 3D var to the second dataset.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272539394,https://api.github.com/repos/pydata/xarray/issues/7146,1272539394,IC_kwDOAMm_X85L2WkC,14808389,2022-10-09T13:10:12Z,2022-10-09T13:10:25Z,MEMBER,which ones fail if you add the 3D variable?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272539102,https://api.github.com/repos/pydata/xarray/issues/7146,1272539102,IC_kwDOAMm_X85L2Wfe,11075246,2022-10-09T13:08:26Z,2022-10-09T13:08:26Z,NONE,"Will try to reproduce this with h5py. For the bug to show up the file has to be large enough. That is why my example has a 2D array variable alongside the time dimension. With just the time dimension the script completes without an error. All three cases work without an error: `ds.to_netcdf()`, `ds2.to_netcdf()`, and `ds3.to_netcdf()`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272535683,https://api.github.com/repos/pydata/xarray/issues/7146,1272535683,IC_kwDOAMm_X85L2VqD,14808389,2022-10-09T12:48:51Z,2022-10-09T12:49:28Z,MEMBER,"if this crashes with both `netcdf4` and `h5netcdf` this might be a bug in the `libhdf5` library. If we can manage to reduce this to use just `h5py` (or `netCDF4`), it should be suitable for reporting on their issue tracker, and those libraries can then push it further to `libhdf5` (otherwise, if you're up for investigating / debugging the C library, you could also report to `libhdf5` directly).
As for the MCVE: I wonder if we can trim it a bit. Can you reproduce with
```python
import xarray as xr
import pandas as pd
N_TIMES = 48
time_vals = pd.date_range(""2022-10-06"", freq=""20 min"", periods=N_TIMES)
ds = xr.Dataset({""time"": (""T"", time_vals)})
ds.to_netcdf(path=""/my_s3_fs/test_netcdf.nc"", format=""NETCDF4"", mode=""w"")
```
or, if it is important to have bytes:
```python
ds2 = ds.time.dt.strftime(""%Y%m%d%H%M%S"").str.encode(""utf-8"").to_dataset()
ds2.to_netcdf(...)
```
also it would be interesting to know if this happens only for data variables, or if coordinates have the same effect (use `ds2` instead of `ds` if bytes are important):
```python
ds3 = ds.set_coords(""time"")
ds3.to_netcdf(...)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272514059,https://api.github.com/repos/pydata/xarray/issues/7146,1272514059,IC_kwDOAMm_X85L2QYL,11075246,2022-10-09T10:48:35Z,2022-10-09T10:48:35Z,NONE,"Adding a gdb [stackrace.txt](https://github.com/pydata/xarray/files/9741351/stackrace.txt) from corefile obtained with
```
docker run -v /mnt/fs:/my_s3_fs -it --rm --ulimit core=-1 --privileged netcdf:latest /bin/bash
```
and
```
sudo sysctl -w kernel.core_pattern=/tmp/core-%e.%p.%h.%t
python mcve.py
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272366287,https://api.github.com/repos/pydata/xarray/issues/7146,1272366287,IC_kwDOAMm_X85L1sTP,5635139,2022-10-08T17:43:18Z,2022-10-08T17:43:18Z,MEMBER,Thanks @d1mach . Could it be related to https://github.com/pydata/xarray/issues/7136 ?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272365368,https://api.github.com/repos/pydata/xarray/issues/7146,1272365368,IC_kwDOAMm_X85L1sE4,11075246,2022-10-08T17:37:24Z,2022-10-08T17:37:24Z,NONE,"libnetcdf, netcdf4 and hdf5 are at their latest versions available on conda-forge","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272364031,https://api.github.com/repos/pydata/xarray/issues/7146,1272364031,IC_kwDOAMm_X85L1rv_,11075246,2022-10-08T17:30:41Z,2022-10-08T17:30:41Z,NONE,Can confirm the issue with xarray 2022.6.0 and dask 2022.9.2 (the latest versions available on conda-forge). The issue might be related to the netcdf4 and hdf5 libraries. Will try to update these as well.,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645
https://github.com/pydata/xarray/issues/7146#issuecomment-1272362257,https://api.github.com/repos/pydata/xarray/issues/7146,1272362257,IC_kwDOAMm_X85L1rUR,5635139,2022-10-08T17:20:00Z,2022-10-08T17:20:00Z,MEMBER,That's quite an old version of xarray! Could we confirm it has similar results on a more recent version?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1402002645