
issue_comments


8 rows where user = 11075246 sorted by updated_at descending


Facets:
  • user: d1mach (8)
  • issue: Segfault writing large netcdf files to s3fs (8)
  • author_association: NONE (8)

All 8 comments are by d1mach (user id 11075246) on the issue "Segfault writing large netcdf files to s3fs" (issue id 1402002645, https://api.github.com/repos/pydata/xarray/issues/7146).

id 1272558504 · node_id IC_kwDOAMm_X85L2bOo · user d1mach (11075246) · author_association NONE
created 2022-10-09T14:49:33Z · updated 2022-10-09T14:49:33Z
https://github.com/pydata/xarray/issues/7146#issuecomment-1272558504

I had to change ints and floats to doubles to reproduce the issue.

```python
import h5py

N_TIMES = 48
with h5py.File("/my_s3_fs/test.nc", mode="w") as f:
    time = f.create_dataset("time", (N_TIMES,), dtype="d")
    time[:] = 0

    d1 = f.create_dataset("d1", (N_TIMES, 201, 201), dtype="d")
    d1[:] = 0
```

Reactions: 1 (+1)

id 1272553921 · node_id IC_kwDOAMm_X85L2aHB · user d1mach (11075246) · author_association NONE
created 2022-10-09T14:27:06Z · updated 2022-10-09T14:27:06Z
https://github.com/pydata/xarray/issues/7146#issuecomment-1272553921

The datatype seems not to matter, but the two variables are required to get a segfault. The following, with just floats, produces a segfault:

```python
import numpy as np
import xarray as xr

N_TIMES = 48
ds = xr.Dataset({"time": ("T", np.zeros((N_TIMES,))),
                 "d1": (["T", "x", "y"], np.zeros((N_TIMES, 201, 201)))})
ds.to_netcdf(path="/my_s3_fs/test_netcdf.nc", format="NETCDF4", mode="w")
```

Reactions: none

id 1272544819 · node_id IC_kwDOAMm_X85L2X4z · user d1mach (11075246) · author_association NONE
created 2022-10-09T13:37:57Z · updated 2022-10-09T14:25:51Z
https://github.com/pydata/xarray/issues/7146#issuecomment-1272544819

It seems that we need the time variable to reproduce the problem. The following code does not fail:

```python
import numpy as np
import xarray as xr

N_TIMES = 64
ds = xr.Dataset({"d1": (["T", "x", "y"], np.zeros((N_TIMES, 201, 201)))})
ds.to_netcdf(path="/my_s3_fs/test_netcdf.nc", format="NETCDF4", mode="w")
```

Reactions: none

id 1272541759 · node_id IC_kwDOAMm_X85L2XI_ · user d1mach (11075246) · author_association NONE
created 2022-10-09T13:21:22Z · updated 2022-10-09T13:21:40Z
https://github.com/pydata/xarray/issues/7146#issuecomment-1272541759

The first one results in a segfault:

```python
import numpy as np
import pandas as pd
import xarray as xr

N_TIMES = 48
time_vals = pd.date_range("2022-10-06", freq="20min", periods=N_TIMES)
ds = xr.Dataset({"time": ("T", time_vals),
                 "d1": (["T", "x", "y"], np.zeros((len(time_vals), 201, 201)))})
ds.to_netcdf(path="/my_s3_fs/test_netcdf.nc", format="NETCDF4", mode="w")
```

Not sure how to add the 3D var to the second dataset.
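The "second dataset" mentioned above comes from earlier in the thread and is not shown on this page. As a hedged sketch addressing the question (the name ds2 and the shapes are assumptions, not the original snippet), a 3D variable can be attached to an existing Dataset by assigning a (dims, values) tuple:

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the "second dataset" from earlier in the
# thread: a Dataset that so far has only a time variable.
ds2 = xr.Dataset({"time": ("T", np.zeros((48,)))})

# Assigning a (dims, values) tuple adds a 3D variable to the Dataset.
ds2["d1"] = (["T", "x", "y"], np.zeros((48, 201, 201)))
```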

Reactions: none

id 1272539102 · node_id IC_kwDOAMm_X85L2Wfe · user d1mach (11075246) · author_association NONE
created 2022-10-09T13:08:26Z · updated 2022-10-09T13:08:26Z
https://github.com/pydata/xarray/issues/7146#issuecomment-1272539102

Will try to reproduce this with h5py. For the bug to show up, the file has to be large enough; that is why my example has a 2D array variable alongside the time dimension. With just the time dimension, the script completes without an error. All three cases work without an error: ds.to_netcdf(), ds2.to_netcdf(), and ds3.to_netcdf().
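As a hedged illustration of the time-only case described above (a sketch, not the exact script from the thread):

```python
import numpy as np
import xarray as xr

# Time dimension only: the resulting file is small, and the write
# completes without a segfault, per the comment above.
ds = xr.Dataset({"time": ("T", np.zeros((48,)))})
ds.to_netcdf(path="/my_s3_fs/test_netcdf.nc", format="NETCDF4", mode="w")
```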

Reactions: none

id 1272514059 · node_id IC_kwDOAMm_X85L2QYL · user d1mach (11075246) · author_association NONE
created 2022-10-09T10:48:35Z · updated 2022-10-09T10:48:35Z
https://github.com/pydata/xarray/issues/7146#issuecomment-1272514059

Adding a gdb stackrace.txt obtained from a core file. The container was started with `docker run -v /mnt/fs:/my_s3_fs -it --rm --ulimit core=-1 --privileged netcdf:latest /bin/bash`, and the core dump was produced after `sudo sysctl -w kernel.core_pattern=/tmp/core-%e.%p.%h.%t` by running `python mcve.py`.
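As an aside not taken from the thread: when a C extension such as libnetcdf/hdf5 segfaults, Python's standard-library faulthandler module can complement a gdb core-file analysis by dumping the Python-level stack at the moment of the crash. A minimal sketch:

```python
import faulthandler

# Dump the Python traceback to stderr if the interpreter receives a
# fatal signal (e.g. SIGSEGV raised inside a C extension).
faulthandler.enable()

# ...run the failing to_netcdf() reproduction here...
```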

Reactions: none

id 1272365368 · node_id IC_kwDOAMm_X85L1sE4 · user d1mach (11075246) · author_association NONE
created 2022-10-08T17:37:24Z · updated 2022-10-08T17:37:24Z
https://github.com/pydata/xarray/issues/7146#issuecomment-1272365368

libnetcdf, netcdf4, and hdf5 are at their latest versions available on conda-forge.

Reactions: none

id 1272364031 · node_id IC_kwDOAMm_X85L1rv_ · user d1mach (11075246) · author_association NONE
created 2022-10-08T17:30:41Z · updated 2022-10-08T17:30:41Z
https://github.com/pydata/xarray/issues/7146#issuecomment-1272364031

Can confirm the issue with xarray 2022.6.0 and dask 2022.9.2, the latest versions available on conda-forge. The issue might be related to the netcdf4 and hdf5 libraries; will try to update those as well.
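As a hedged sketch (not part of the original comment): xarray can print the exact versions of its I/O stack, which is a convenient way to capture the library combination being tested here:

```python
import xarray as xr

# Prints versions of xarray, netCDF4, h5py, libnetcdf, libhdf5, dask, etc.
xr.show_versions()
```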

Reactions: 1 (+1)

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
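
Given the schema above, the query behind this page ("8 rows where user = 11075246 sorted by updated_at descending") can be reproduced against a local copy of the database. A minimal sketch, assuming a hypothetical local SQLite file named github.db:

```python
import sqlite3

# github.db is a hypothetical local copy of the database behind this page.
conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT id, created_at, updated_at, body
    FROM issue_comments
    WHERE [user] = 11075246      -- d1mach
    ORDER BY updated_at DESC     -- the page's sort order
    """
).fetchall()
for comment_id, created_at, updated_at, body in rows:
    print(comment_id, updated_at)
```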