issue_comments

8 rows where author_association = "MEMBER" and issue = 1402002645 sorted by updated_at descending

user 2

  • keewis 6
  • max-sixty 2

issue 1

  • Segfault writing large netcdf files to s3fs · 8 ✖

author_association 1

  • MEMBER · 8 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1272560073 https://github.com/pydata/xarray/issues/7146#issuecomment-1272560073 https://api.github.com/repos/pydata/xarray/issues/7146 IC_kwDOAMm_X85L2bnJ keewis 14808389 2022-10-09T14:56:28Z 2022-10-09T14:57:44Z MEMBER

Since we have eliminated xarray with this, you should be able to submit an issue to the h5py issue tracker, mentioning that this is probably a bug in libhdf5 since netcdf4 also fails with the same error (you can also link this issue for more information).

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segfault writing large netcdf files to s3fs 1402002645
1272555653 https://github.com/pydata/xarray/issues/7146#issuecomment-1272555653 https://api.github.com/repos/pydata/xarray/issues/7146 IC_kwDOAMm_X85L2aiF keewis 14808389 2022-10-09T14:36:13Z 2022-10-09T14:36:13Z MEMBER

great, good to know. Can you try this with h5py:

```python
import h5py

N_TIMES = 48
with h5py.File("test.nc", mode="w") as f:
    time = f.create_dataset("time", (N_TIMES,), dtype="i")
    time[:] = 0

    d1 = f.create_dataset("d1", (N_TIMES, 201, 201), dtype="f")
    d1[:] = 0
```
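For a sense of scale, the raw payload of the two datasets in the snippet above can be estimated with plain arithmetic. This is only a rough sketch: the helper `raw_nbytes` and the `ITEMSIZE` table are illustrative names, not part of h5py, and the estimate ignores HDF5 metadata, chunking, and compression.

```python
from math import prod

# Item sizes in bytes for the two dtype codes used above. These follow NumPy's
# meaning of "i" (C int, 4 bytes on common platforms) and "f" (float32).
ITEMSIZE = {"i": 4, "f": 4}


def raw_nbytes(shape, dtype_code):
    """Uncompressed payload size of one dataset, ignoring HDF5 metadata."""
    return prod(shape) * ITEMSIZE[dtype_code]


N_TIMES = 48
sizes = {
    "time": raw_nbytes((N_TIMES,), "i"),
    "d1": raw_nbytes((N_TIMES, 201, 201), "f"),
}

for name, size in sizes.items():
    print(f"{name}: {size} bytes ({size / 2**20:.2f} MiB)")
```

Under these assumptions `d1` comes to roughly 7.4 MiB, so the reproducer exercises a fairly modest write rather than a multi-gigabyte one.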

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segfault writing large netcdf files to s3fs 1402002645
1272550986 https://github.com/pydata/xarray/issues/7146#issuecomment-1272550986 https://api.github.com/repos/pydata/xarray/issues/7146 IC_kwDOAMm_X85L2ZZK keewis 14808389 2022-10-09T14:09:44Z 2022-10-09T14:09:44Z MEMBER

okay, then does changing the dtype do anything? I.e. does this only happen with datetime64 / bytes, or do int / float / str also fail?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segfault writing large netcdf files to s3fs 1402002645
1272542780 https://github.com/pydata/xarray/issues/7146#issuecomment-1272542780 https://api.github.com/repos/pydata/xarray/issues/7146 IC_kwDOAMm_X85L2XY8 keewis 14808389 2022-10-09T13:26:25Z 2022-10-09T13:26:25Z MEMBER

with this:

    ds2 = ds.time.dt.strftime("%Y%m%d%H%M%S").str.encode("utf-8").to_dataset().assign(d1=ds.d1)

but we don't really need to check if the first dataset already fails.

Now I'd probably check if it's just the size that makes it fail (i.e. remove `"time"` from `ds` and keep just `d1`, maybe increasing its size by one if it does not fail as-is), or if it depends on the dtype (i.e. set `time_vals` to `np.arange(N_TIMES, dtype=int)`).
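The two checks described above could be set up along these lines. This is a minimal sketch, not the reporter's actual MCVE: the function name `make_variants`, the dimension names `"x"` and `"y"`, the 201 x 201 shape, and the `float32` dtype of `d1` are assumptions borrowed from the h5py snippet elsewhere in this thread.

```python
import numpy as np
import pandas as pd
import xarray as xr

N_TIMES = 48


def make_variants(n_times=N_TIMES, shape=(201, 201)):
    """Build test datasets that each isolate one suspected cause."""
    dims = ("T", "x", "y")  # "x" and "y" are assumed names
    d1 = np.zeros((n_times, *shape), dtype="float32")
    time_vals = pd.date_range("2022-10-06", freq="20min", periods=n_times)

    return {
        # Size check: no "time" variable at all, only the 3D array.
        "size_only": xr.Dataset({"d1": (dims, d1)}),
        # Dtype check: same layout, but "time" holds plain integers.
        "int_time": xr.Dataset(
            {"time": ("T", np.arange(n_times, dtype=int)), "d1": (dims, d1)}
        ),
        # Baseline: "time" holds datetime64 values.
        "datetime_time": xr.Dataset(
            {"time": ("T", time_vals), "d1": (dims, d1)}
        ),
    }


variants = make_variants()
for name, ds in variants.items():
    print(name, dict(ds.sizes), {k: str(v.dtype) for k, v in ds.data_vars.items()})
```

Each variant would then be written with `to_netcdf(..., format="NETCDF4", mode="w")` to the affected filesystem, one at a time, to see which ones crash. If `size_only` writes cleanly, calling `make_variants(n_times=N_TIMES + 1)` is one way to grow `d1` by a single time step.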

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segfault writing large netcdf files to s3fs 1402002645
1272539394 https://github.com/pydata/xarray/issues/7146#issuecomment-1272539394 https://api.github.com/repos/pydata/xarray/issues/7146 IC_kwDOAMm_X85L2WkC keewis 14808389 2022-10-09T13:10:12Z 2022-10-09T13:10:25Z MEMBER

which ones fail if you add the 3D variable?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segfault writing large netcdf files to s3fs 1402002645
1272535683 https://github.com/pydata/xarray/issues/7146#issuecomment-1272535683 https://api.github.com/repos/pydata/xarray/issues/7146 IC_kwDOAMm_X85L2VqD keewis 14808389 2022-10-09T12:48:51Z 2022-10-09T12:49:28Z MEMBER

if this crashes with both netcdf4 and h5netcdf this might be a bug in the libhdf5 library. If we can manage to reduce this to use just h5py (or netCDF4), it should be suitable for reporting on their issue tracker, and those libraries can then push it further to libhdf5 (otherwise, if you're up for investigating / debugging the C library, you could also report to libhdf5 directly).

As for the MCVE: I wonder if we can trim it a bit. Can you reproduce with

```python
import xarray as xr
import pandas as pd

N_TIMES = 48
time_vals = pd.date_range("2022-10-06", freq="20 min", periods=N_TIMES)
ds = xr.Dataset({"time": ("T", time_vals)})
ds.to_netcdf(path="/my_s3_fs/test_netcdf.nc", format="NETCDF4", mode="w")
```

or, if it is important to have bytes:

```python
ds2 = ds.time.dt.strftime("%Y%m%d%H%M%S").str.encode("utf-8").to_dataset()
ds2.to_netcdf(...)
```

also it would be interesting to know if this happens only for data variables, or if coordinates have the same effect (use `ds2` instead of `ds` if bytes are important):

```python
ds3 = ds.set_coords("time")
ds3.to_netcdf(...)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segfault writing large netcdf files to s3fs 1402002645
1272366287 https://github.com/pydata/xarray/issues/7146#issuecomment-1272366287 https://api.github.com/repos/pydata/xarray/issues/7146 IC_kwDOAMm_X85L1sTP max-sixty 5635139 2022-10-08T17:43:18Z 2022-10-08T17:43:18Z MEMBER

Thanks @d1mach . Could it be related to https://github.com/pydata/xarray/issues/7136 ?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segfault writing large netcdf files to s3fs 1402002645
1272362257 https://github.com/pydata/xarray/issues/7146#issuecomment-1272362257 https://api.github.com/repos/pydata/xarray/issues/7146 IC_kwDOAMm_X85L1rUR max-sixty 5635139 2022-10-08T17:20:00Z 2022-10-08T17:20:00Z MEMBER

That's quite an old version of xarray! Could we confirm it has similar results on a more recent version?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Segfault writing large netcdf files to s3fs 1402002645

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
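
The filter shown at the top of this page (`author_association = "MEMBER"`, a single issue, sorted by `updated_at` descending) can be reproduced against this schema with Python's built-in `sqlite3` module. The sketch below is illustrative only: it uses an in-memory database, omits the `REFERENCES` clauses so that the `users` and `issues` tables are not needed, loads three rows copied from this page, and adds one made-up non-member row (id `1`) to show the filter doing something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
       [html_url] TEXT,
       [issue_url] TEXT,
       [id] INTEGER PRIMARY KEY,
       [node_id] TEXT,
       [user] INTEGER,
       [created_at] TEXT,
       [updated_at] TEXT,
       [author_association] TEXT,
       [body] TEXT,
       [reactions] TEXT,
       [performed_via_github_app] TEXT,
       [issue] INTEGER
    );
    CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
    CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
    """
)

ISSUE_ID = 1402002645
sample_rows = [
    # (id, user, created_at, updated_at, author_association, issue)
    (1272362257, 5635139, "2022-10-08T17:20:00Z", "2022-10-08T17:20:00Z", "MEMBER", ISSUE_ID),
    (1272560073, 14808389, "2022-10-09T14:56:28Z", "2022-10-09T14:57:44Z", "MEMBER", ISSUE_ID),
    (1272535683, 14808389, "2022-10-09T12:48:51Z", "2022-10-09T12:49:28Z", "MEMBER", ISSUE_ID),
    # Hypothetical row, not from the page: a non-member comment to be filtered out.
    (1, 0, "2022-10-08T00:00:00Z", "2022-10-08T00:00:00Z", "NONE", ISSUE_ID),
]
conn.executemany(
    "INSERT INTO [issue_comments] "
    "([id], [user], [created_at], [updated_at], [author_association], [issue]) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    sample_rows,
)

# ISO 8601 timestamps in a single timezone sort correctly as plain text.
rows = conn.execute(
    "SELECT [id], [user], [updated_at] FROM [issue_comments] "
    "WHERE [author_association] = ? AND [issue] = ? "
    "ORDER BY [updated_at] DESC",
    ("MEMBER", ISSUE_ID),
).fetchall()

for row in rows:
    print(row)
```

Because `created_at` and `updated_at` are stored as `TEXT`, the ordering relies on the timestamps sharing one format and the `Z` suffix; mixed formats would need parsing before sorting.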