issue_comments

Table actions
  • GraphQL API for issue_comments

10 rows where issue = 1596115847 sorted by updated_at descending


user (6)

  • kmuehlbauer 3
  • pp-mo 2
  • trexfeathers 2
  • dcherian 1
  • Mikejmnez 1
  • gewitterblitz 1

author_association (3)

  • NONE 5
  • MEMBER 4
  • CONTRIBUTOR 1

issue (1)

  • HDF5-DIAG warnings calling `open_mfdataset` with more than `file_cache_maxsize` datasets (hdf5 1.12.2) · 10
Columns: id · html_url · issue_url · node_id · user · created_at · updated_at (sorted descending) · author_association · body · reactions · performed_via_github_app · issue
id 1483958731 · html_url https://github.com/pydata/xarray/issues/7549#issuecomment-1483958731 · issue_url https://api.github.com/repos/pydata/xarray/issues/7549 · node_id IC_kwDOAMm_X85Yc2nL · user Mikejmnez (8241481) · created_at 2023-03-26T00:41:10Z · updated_at 2023-03-26T00:41:10Z · author_association CONTRIBUTOR

Thanks, everybody. Similar to @gewitterblitz, and based on https://github.com/SciTools/iris/issues/5187, pinning libnetcdf to v4.8.1 did the trick.
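As an illustration of the pin described above, a minimal sketch assuming a conda-forge-based environment:

```
$ conda install -c conda-forge "libnetcdf=4.8.1"
```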

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  HDF5-DIAG warnings calling `open_mfdataset` with more than `file_cache_maxsize` datasets (hdf5 1.12.2) 1596115847
id 1483473963 · html_url https://github.com/pydata/xarray/issues/7549#issuecomment-1483473963 · issue_url https://api.github.com/repos/pydata/xarray/issues/7549 · node_id IC_kwDOAMm_X85YbAQr · user gewitterblitz (13985417) · created_at 2023-03-24T21:58:38Z · updated_at 2023-03-24T21:58:38Z · author_association NONE

Thanks, @trexfeathers. I had the same problem with HDF5-DIAG warnings after upgrading to xarray v2023.3.0 yesterday. Your diagnosis in SciTools/iris#5187 helped isolate the issue to libnetcdf v4.9.1. Downgrading libnetcdf to v4.8.1 resulted in no HDF5 warnings.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  HDF5-DIAG warnings calling `open_mfdataset` with more than `file_cache_maxsize` datasets (hdf5 1.12.2) 1596115847
id 1461777692 · html_url https://github.com/pydata/xarray/issues/7549#issuecomment-1461777692 · issue_url https://api.github.com/repos/pydata/xarray/issues/7549 · node_id IC_kwDOAMm_X85XIPUc · user pp-mo (2089069) · created_at 2023-03-09T10:43:50Z · updated_at 2023-03-09T10:43:50Z · author_association NONE

@trexfeathers

> Before v1.6.1, I believe

Oops, fixed.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  HDF5-DIAG warnings calling `open_mfdataset` with more than `file_cache_maxsize` datasets (hdf5 1.12.2) 1596115847
id 1461751910 · html_url https://github.com/pydata/xarray/issues/7549#issuecomment-1461751910 · issue_url https://api.github.com/repos/pydata/xarray/issues/7549 · node_id IC_kwDOAMm_X85XIJBm · user pp-mo (2089069) · created_at 2023-03-09T10:28:24Z · updated_at 2023-03-09T10:43:22Z · author_association NONE

> @pp-mo 👀

See our issues here: https://github.com/SciTools/iris/issues/5187. It does look a lot like the "same" problem. It is specifically related to the use of dask and multi-threading. Here in python-netCDF4 land, we were protected from all of that before netCDF4 v1.6.1, since before that all netCDF operations held the Python GIL and so could not be re-entered by threads.
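To make the threading point concrete: a minimal sketch, under the assumption of an h5netcdf/dask setup like the one in this thread (file names are hypothetical), of two ways to keep HDF5 calls serialized again, roughly what the pre-1.6.1 GIL behaviour gave for free. Whether the `lock=` keyword is still honoured depends on the xarray version.

```python
import threading

import dask
import xarray

# Option 1: run the dask graph single-threaded so HDF5 is never re-entered.
with dask.config.set(scheduler="synchronous"):
    ds = xarray.open_mfdataset("var_*.nc", engine="h5netcdf")
    ds.to_netcdf("combined.nc", engine="h5netcdf")

# Option 2: hand xarray an explicit lock instead of lock=False.
ds = xarray.open_mfdataset("var_*.nc", engine="h5netcdf", lock=threading.Lock())
```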

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  HDF5-DIAG warnings calling `open_mfdataset` with more than `file_cache_maxsize` datasets (hdf5 1.12.2) 1596115847
id 1461755281 · html_url https://github.com/pydata/xarray/issues/7549#issuecomment-1461755281 · issue_url https://api.github.com/repos/pydata/xarray/issues/7549 · node_id IC_kwDOAMm_X85XIJ2R · user trexfeathers (40734014) · created_at 2023-03-09T10:30:28Z · updated_at 2023-03-09T10:30:28Z · author_association NONE

> before netCDF4 v1.6.2

Before v1.6.1, I believe

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  HDF5-DIAG warnings calling `open_mfdataset` with more than `file_cache_maxsize` datasets (hdf5 1.12.2) 1596115847
id 1460606256 · html_url https://github.com/pydata/xarray/issues/7549#issuecomment-1460606256 · issue_url https://api.github.com/repos/pydata/xarray/issues/7549 · node_id IC_kwDOAMm_X85XDxUw · user trexfeathers (40734014) · created_at 2023-03-08T18:00:35Z · updated_at 2023-03-08T18:00:35Z · author_association NONE

@pp-mo 👀

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  HDF5-DIAG warnings calling `open_mfdataset` with more than `file_cache_maxsize` datasets (hdf5 1.12.2) 1596115847
id 1449702243 · html_url https://github.com/pydata/xarray/issues/7549#issuecomment-1449702243 · issue_url https://api.github.com/repos/pydata/xarray/issues/7549 · node_id IC_kwDOAMm_X85WaLNj · user kmuehlbauer (5821660) · created_at 2023-03-01T09:37:43Z · updated_at 2023-03-01T09:37:43Z · author_association MEMBER

This is as far as I can get for the moment. @mx-moth I'd suggest taking this upstream (netCDF4/netcdf-c) with details about this issue. At least we can rule out an issue related only to hdf5=1.12.2.

Maybe @DennisHeimbigner can shed more light here? (`#005: H5Oattribute.c line 494 in H5O__attr_open_by_name(): can't locate attribute: '_QuantizeBitRoundNumberOfSignificantBits' major: Attribute minor: Object not found`)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  HDF5-DIAG warnings calling `open_mfdataset` with more than `file_cache_maxsize` datasets (hdf5 1.12.2) 1596115847
id 1449662753 · html_url https://github.com/pydata/xarray/issues/7549#issuecomment-1449662753 · issue_url https://api.github.com/repos/pydata/xarray/issues/7549 · node_id IC_kwDOAMm_X85WaBkh · user kmuehlbauer (5821660) · created_at 2023-03-01T09:19:26Z · updated_at 2023-03-01T09:31:35Z · author_association MEMBER

I just tested this with netcdf-c 4.9.1, but these errors still show up, also with a conda-forge-only install.

To make this even weirder, I've checked creation/reading with only hdf5/h5py/h5netcdf in the environment. Everything seems to be working well.

```python
import argparse
import pathlib
import tempfile
from typing import List

import h5netcdf.legacyapi as nc
import xarray

HERE = pathlib.Path(__file__).parent


def add_arguments(parser: argparse.ArgumentParser):
    parser.add_argument('count', type=int, default=200, nargs='?')
    parser.add_argument('--file-cache-maxsize', type=int, required=False)


def main():
    parser = argparse.ArgumentParser()
    add_arguments(parser)
    opts = parser.parse_args()

    if opts.file_cache_maxsize is not None:
        xarray.set_options(file_cache_maxsize=opts.file_cache_maxsize)

    temp_dir = tempfile.mkdtemp(dir=HERE, prefix='work-dir-')
    work_dir = pathlib.Path(temp_dir)
    print("Working in", work_dir.name)

    print("Making", opts.count, "datasets")
    dataset_paths = make_many_datasets(work_dir, count=opts.count)

    print("Combining", len(dataset_paths), "datasets")
    dataset = xarray.open_mfdataset(dataset_paths, lock=False, engine="h5netcdf")
    dataset.to_netcdf(work_dir / 'combined.nc', engine="h5netcdf")


def make_many_datasets(
    work_dir: pathlib.Path, count: int = 200
) -> List[pathlib.Path]:
    dataset_paths = []
    for i in range(count):
        variable = f'var_{i}'
        path = work_dir / f'{variable}.nc'
        dataset_paths.append(path)
        make_dataset(path, variable)

    return dataset_paths


def make_dataset(
    path: pathlib.Path,
    variable: str,
) -> None:
    ds = nc.Dataset(path, "w")
    ds.createDimension("x", 1)
    var = ds.createVariable(variable, "i8", ("x",))
    var[:] = 1
    ds.close()


if __name__ == '__main__':
    main()
```

```
This will show no error. Defaults to making 200 files:
$ python3 ./test.py

This will also not show the error - the number of files is less than file_cache_maxsize:
$ python3 ./test.py 127

This will adjust file_cache_maxsize to show the error again, despite the lower number of files; here we see another error issued by h5py:
$ python3 ./test.py 11 --file-cache-maxsize=10
```

```python
Working in work-dir-40mwn69y
Making 11 datasets
Combining 11 datasets
Traceback (most recent call last):
  File "/home/kai/python/gists/xarray/7549.py", line 65, in <module>
    main()
  File "/home/kai/python/gists/xarray/7549.py", line 36, in main
    dataset.to_netcdf(work_dir / 'combined.nc', engine="h5netcdf")
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/xarray/core/dataset.py", line 1911, in to_netcdf
    return to_netcdf(  # type: ignore  # mypy cannot resolve the overloads:(
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/xarray/backends/api.py", line 1226, in to_netcdf
    writes = writer.sync(compute=compute)
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/xarray/backends/common.py", line 172, in sync
    delayed_store = da.store(
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/dask/array/core.py", line 1236, in store
    compute_as_if_collection(Array, store_dsk, map_keys, **kwargs)
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/dask/base.py", line 341, in compute_as_if_collection
    return schedule(dsk2, keys, **kwargs)
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/dask/threaded.py", line 89, in get
    results = get_async(
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/dask/local.py", line 511, in get_async
    raise_exception(exc, tb)
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/dask/local.py", line 319, in reraise
    raise exc
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/dask/local.py", line 224, in execute_task
    result = _execute_task(task, data)
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/dask/array/core.py", line 126, in getter
    c = np.asarray(c)
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/xarray/core/indexing.py", line 459, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/xarray/core/indexing.py", line 623, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/xarray/core/indexing.py", line 524, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/xarray/backends/h5netcdf_.py", line 43, in __getitem__
    return indexing.explicit_indexing_adapter(
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/xarray/core/indexing.py", line 815, in explicit_indexing_adapter
    result = raw_indexing_method(raw_key.tuple)
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/xarray/backends/h5netcdf_.py", line 50, in _getitem
    return array[key]
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/h5netcdf/core.py", line 337, in __getitem__
    padding = self._get_padding(key)
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/h5netcdf/core.py", line 291, in _get_padding
    shape = self.shape
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/h5netcdf/core.py", line 268, in shape
    return tuple([self._parent._all_dimensions[d].size for d in self.dimensions])
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/h5netcdf/core.py", line 268, in <listcomp>
    return tuple([self._parent._all_dimensions[d].size for d in self.dimensions])
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/h5netcdf/dimensions.py", line 113, in size
    if self.isunlimited():
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/h5netcdf/dimensions.py", line 133, in isunlimited
    return self._h5ds.maxshape == (None,)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/kai/miniconda/envs/test-netcdf4/lib/python3.9/site-packages/h5py/_hl/dataset.py", line 588, in maxshape
    space = self.id.get_space()
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5d.pyx", line 299, in h5py.h5d.DatasetID.get_space
ValueError: Invalid dataset identifier (invalid dataset identifier)
```

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.16 | packaged by conda-forge | (main, Feb  1 2023, 21:39:03) 
[GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 5.14.21-150400.24.46-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: ('de_DE', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None

xarray: 2023.2.0
pandas: 1.5.3
numpy: 1.24.2
scipy: None
netCDF4: None
pydap: None
h5netcdf: 1.1.0
h5py: 3.7.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2023.2.1
distributed: 2023.2.1
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2023.1.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 67.4.0
pip: 23.0.1
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None
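
The version report above is the kind of output `xarray.show_versions()` produces; a minimal sketch of capturing it for a report:

```python
import io

import xarray

buf = io.StringIO()
xarray.show_versions(file=buf)  # writes the INSTALLED VERSIONS report to buf
print(buf.getvalue())
```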

Update: added `engine="h5netcdf"` to the `open_mfdataset` call.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  HDF5-DIAG warnings calling `open_mfdataset` with more than `file_cache_maxsize` datasets (hdf5 1.12.2) 1596115847
id 1449454232 · html_url https://github.com/pydata/xarray/issues/7549#issuecomment-1449454232 · issue_url https://api.github.com/repos/pydata/xarray/issues/7549 · node_id IC_kwDOAMm_X85WZOqY · user kmuehlbauer (5821660) · created_at 2023-03-01T07:01:40Z · updated_at 2023-03-01T07:01:40Z · author_association MEMBER

@dcherian Thanks for the ping.

I can reproduce this in a fresh conda-forge env with pip-installed netcdf4, xarray, and dask.

@mx-moth A search brought up this likely related issue over at netcdf-c: https://github.com/Unidata/netcdf-c/issues/2458. The corresponding PR with a fix, https://github.com/Unidata/netcdf-c/pull/2461, is milestoned for netcdf-c 4.9.1.
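
A quick way to confirm which netcdf-c and HDF5 builds an environment actually links against, as a minimal sketch using module attributes the netCDF4 bindings expose:

```python
import netCDF4

# Versions of the underlying C libraries the netCDF4 wheel/package was built against.
print("libnetcdf:", netCDF4.__netcdf4libversion__)
print("libhdf5:", netCDF4.__hdf5libversion__)
```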

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  HDF5-DIAG warnings calling `open_mfdataset` with more than `file_cache_maxsize` datasets (hdf5 1.12.2) 1596115847
id 1449072125 · html_url https://github.com/pydata/xarray/issues/7549#issuecomment-1449072125 · issue_url https://api.github.com/repos/pydata/xarray/issues/7549 · node_id IC_kwDOAMm_X85WXxX9 · user dcherian (2448579) · created_at 2023-02-28T23:14:44Z · updated_at 2023-02-28T23:14:44Z · author_association MEMBER

It's not clear to me whether this is a bug or just some verbose warnings.

@kmuehlbauer do you have any thoughts?

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
  HDF5-DIAG warnings calling `open_mfdataset` with more than `file_cache_maxsize` datasets (hdf5 1.12.2) 1596115847


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
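
Given the schema above, a minimal sketch of reproducing this page's query with Python's sqlite3 module (the database filename "github.db" is an assumption):

```python
import sqlite3

con = sqlite3.connect("github.db")  # assumed filename for the underlying Datasette database
con.row_factory = sqlite3.Row

# Same selection as this page: comments on issue 1596115847, newest update first.
rows = con.execute(
    "select * from issue_comments where issue = ? order by updated_at desc",
    (1596115847,),
).fetchall()

for row in rows:
    print(row["updated_at"], row["author_association"], row["body"][:60])
```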