issues

3 rows where state = "closed" and user = 57914115 sorted by updated_at descending

#6781: Cannot open dataset with empty list units
id: 1303371718 · node_id: I_kwDOAMm_X85Nr9_G · user: antscloud (57914115) · state: closed · locked: 0 · comments: 6 · created_at: 2022-07-13T12:33:11Z · updated_at: 2022-10-03T20:32:06Z · closed_at: 2022-10-03T20:32:05Z · author_association: CONTRIBUTOR

What happened?

I found myself using a netCDF file with empty units, and xarray's open_dataset failed on it while parsing CF conventions. I reproduced the bug; it happens in the particular situation where the units attribute is an empty list (see the Minimal Complete Verifiable Example).

What did you expect to happen?

I expected the units attribute to be parsed as an empty string.

Minimal Complete Verifiable Example

```Python
import numpy as np
import pandas as pd
import xarray as xr

temp = 15 + 8 * np.random.randn(2, 2, 3)
precip = 10 * np.random.rand(2, 2, 3)
lon = [[-99.83, -99.32], [-99.79, -99.23]]
lat = [[42.25, 42.21], [42.63, 42.59]]

# For real use cases, it is good practice to supply array attributes such as
# units, but we won't bother here for the sake of brevity.
ds = xr.Dataset(
    {
        "temperature": (["x", "y", "time"], temp),
        "precipitation": (["x", "y", "time"], precip),
    },
    coords={
        "lon": (["x", "y"], lon),
        "lat": (["x", "y"], lat),
        "time": pd.date_range("2014-09-06", periods=3),
        "reference_time": pd.Timestamp("2014-09-05"),
    },
)
ds.temperature.attrs["units"] = []

ds.to_netcdf("test.nc")

ds = xr.open_dataset("test.nc")
ds.close()
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```Python
TypeError                                 Traceback (most recent call last)
Input In [3], in <cell line: 1>()
----> 1 ds = xr.open_dataset("test.nc")
      2 print(ds["temperature"].attrs)
      3 ds.close()

File ~/.local/src/miniconda/envs/uptodatexarray/lib/python3.10/site-packages/xarray/backends/api.py:495, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
    483 decoders = _resolve_decoders_kwargs(
    484     decode_cf,
    485     open_backend_dataset_parameters=backend.open_dataset_parameters,
   (...)
    491     decode_coords=decode_coords,
    492 )
    494 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 495 backend_ds = backend.open_dataset(
    496     filename_or_obj,
    497     drop_variables=drop_variables,
    498     **decoders,
    499     **kwargs,
    500 )
    501 ds = _dataset_from_backend_dataset(
    502     backend_ds,
    503     filename_or_obj,
   (...)
    510     **kwargs,
    511 )
    512 return ds

File ~/.local/src/miniconda/envs/uptodatexarray/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:564, in NetCDF4BackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, format, clobber, diskless, persist, lock, autoclose)
    562 store_entrypoint = StoreBackendEntrypoint()
    563 with close_on_error(store):
--> 564     ds = store_entrypoint.open_dataset(
    565         store,
    566         mask_and_scale=mask_and_scale,
    567         decode_times=decode_times,
    568         concat_characters=concat_characters,
    569         decode_coords=decode_coords,
    570         drop_variables=drop_variables,
    571         use_cftime=use_cftime,
    572         decode_timedelta=decode_timedelta,
    573     )
    574 return ds

File ~/.local/src/miniconda/envs/uptodatexarray/lib/python3.10/site-packages/xarray/backends/store.py:27, in StoreBackendEntrypoint.open_dataset(self, store, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta)
     24 vars, attrs = store.load()
     25 encoding = store.get_encoding()
---> 27 vars, attrs, coord_names = conventions.decode_cf_variables(
     28     vars,
     29     attrs,
     30     mask_and_scale=mask_and_scale,
     31     decode_times=decode_times,
     32     concat_characters=concat_characters,
     33     decode_coords=decode_coords,
     34     drop_variables=drop_variables,
     35     use_cftime=use_cftime,
     36     decode_timedelta=decode_timedelta,
     37 )
     39 ds = Dataset(vars, attrs=attrs)
     40 ds = ds.set_coords(coord_names.intersection(vars))

File ~/.local/src/miniconda/envs/uptodatexarray/lib/python3.10/site-packages/xarray/conventions.py:503, in decode_cf_variables(variables, attributes, concat_characters, mask_and_scale, decode_times, decode_coords, drop_variables, use_cftime, decode_timedelta)
    499     continue
    500 stack_char_dim = (
    501     concat_characters and v.dtype == "S1" and v.ndim > 0 and stackable(v.dims[-1])
    502 )
--> 503 new_vars[k] = decode_cf_variable(
    504     k,
    505     v,
    506     concat_characters=concat_characters,
    507     mask_and_scale=mask_and_scale,
    508     decode_times=decode_times,
    509     stack_char_dim=stack_char_dim,
    510     use_cftime=use_cftime,
    511     decode_timedelta=decode_timedelta,
    512 )
    513 if decode_coords in [True, "coordinates", "all"]:
    514     var_attrs = new_vars[k].attrs

File ~/.local/src/miniconda/envs/uptodatexarray/lib/python3.10/site-packages/xarray/conventions.py:354, in decode_cf_variable(name, var, concat_characters, mask_and_scale, decode_times, decode_endianness, stack_char_dim, use_cftime, decode_timedelta)
    351     var = coder.decode(var, name=name)
    353 if decode_timedelta:
--> 354     var = times.CFTimedeltaCoder().decode(var, name=name)
    355 if decode_times:
    356     var = times.CFDatetimeCoder(use_cftime=use_cftime).decode(var, name=name)

File ~/.local/src/miniconda/envs/uptodatexarray/lib/python3.10/site-packages/xarray/coding/times.py:537, in CFTimedeltaCoder.decode(self, variable, name)
    534 def decode(self, variable, name=None):
    535     dims, data, attrs, encoding = unpack_for_decoding(variable)
--> 537     if "units" in attrs and attrs["units"] in TIME_UNITS:
    538         units = pop_to(attrs, encoding, "units")
    539         transform = partial(decode_cf_timedelta, units=units)

TypeError: unhashable type: 'numpy.ndarray'
```
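The final frame shows the failure mechanism: the attribute is read back as a numpy array, and the membership test attrs["units"] in TIME_UNITS has to hash its left operand, which numpy.ndarray does not support. A minimal sketch of that mechanism, using a simplified stand-in for xarray's TIME_UNITS set:

```python
import numpy as np

# Simplified stand-in for xarray's TIME_UNITS set of CF time-unit strings.
TIME_UNITS = {"days", "hours", "minutes", "seconds"}

units = np.array([])  # an empty-list attribute comes back as an empty array

try:
    units in TIME_UNITS  # set membership hashes the left operand first
except TypeError as exc:
    print(exc)  # -> unhashable type: 'numpy.ndarray'
```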

Anything else we need to know?

The following assignment produces the bug:

```python
ds.temperature.attrs["units"] = []
```

But these do not:

```python
ds.temperature.attrs["units"] = "[]"
ds.temperature.attrs["units"] = ""
```

Also, I don't know how the units attribute gets encoded for writing, but I see no difference between ds.temperature.attrs["units"] = "" and ds.temperature.attrs["units"] = [] when running ncdump on the resulting file.
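That observation can also be checked programmatically rather than with ncdump. A small round-trip sketch: the file name check.nc is arbitrary, netCDF4 is used directly on the read side so that xarray's CF decoding (which triggers the bug) is bypassed, and it assumes the empty-list attribute writes successfully, as reported above:

```python
import netCDF4
import xarray as xr

for value in ("", []):
    ds = xr.Dataset({"t": ("x", [1.0])})
    ds.t.attrs["units"] = value
    ds.to_netcdf("check.nc")  # assumes the write succeeds, as reported above
    # Read the raw attribute back with netCDF4, bypassing xarray's decoding.
    with netCDF4.Dataset("check.nc") as nc:
        print(repr(value), "->", repr(nc["t"].getncattr("units")))
```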

Environment

This bug was encountered with the versions listed below.

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.13.0-52-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: ('fr_FR', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.6.1
xarray: 0.20.1
pandas: 1.4.3
numpy: 1.22.3
scipy: None
netCDF4: 1.5.7
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.5.1.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.5
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
setuptools: 61.2.0
pip: 22.1.2
conda: None
pytest: None
IPython: 8.4.0
sphinx: None
```
reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6781/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue

#6037: Fix wrong typing for tolerance in reindex
id: 1068680815 · node_id: PR_kwDOAMm_X84vQ2hE · user: antscloud (57914115) · state: closed · locked: 0 · comments: 6 · created_at: 2021-12-01T17:19:08Z · updated_at: 2022-01-15T17:28:08Z · closed_at: 2022-01-15T17:27:56Z · author_association: CONTRIBUTOR · draft: 0 · pull_request: pydata/xarray/pulls/6037

In the xarray/core/dataset.py module, more particularly in the reindex method, the tolerance argument is typed as a Number:

https://github.com/pydata/xarray/blob/f08672847abec18f46df75e2f620646d27fa41a2/xarray/core/dataset.py#L2743

But _reindex calls the reindex_variables function, where the type of tolerance is Any. That function ends up calling get_indexer_nd, which calls the pandas get_indexer function:

https://github.com/pydata/xarray/blob/f08672847abec18f46df75e2f620646d27fa41a2/xarray/core/indexes.py#L137

In pandas, according to the docs, tolerance can be a scalar or a list-like object.
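A quick illustration of the pandas behavior this relies on (the index and target values below are invented for the example):

```python
import pandas as pd

idx = pd.Index([0.0, 1.0, 2.0])
target = [0.1, 0.95, 2.5]

# Scalar tolerance: the same bound applies to every target label.
print(idx.get_indexer(target, method="nearest", tolerance=0.2))
# -> [ 0  1 -1]  (2.5 is farther than 0.2 from every index label)

# List-like tolerance: one bound per target label.
print(idx.get_indexer(target, method="nearest", tolerance=[0.2, 0.1, 0.6]))
# -> [0 1 2]
```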

  • [x] Tests added
  • [x] Passes pre-commit run --all-files
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6037/reactions",
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 2,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: pull

#5230: Same files in open_mfdataset() unclear error message
id: 869948050 · node_id: MDU6SXNzdWU4Njk5NDgwNTA= · user: antscloud (57914115) · state: closed · locked: 0 · comments: 3 · created_at: 2021-04-28T13:26:59Z · updated_at: 2021-04-30T12:41:17Z · closed_at: 2021-04-30T12:41:17Z · author_association: CONTRIBUTOR

Using xr.open_mfdataset() with the exact same file twice by mistake produces an unclear error message.

What happened:

With, of course, the time dimension existing:

```python
ds = xr.open_mfdataset(["some_file.nc", "some_file.nc"], concat_dim="time", engine="netcdf4")
```

```python
ValueError                                Traceback (most recent call last)

~/.local/src/miniconda/envs/minireobs/lib/python3.8/site-packages/xarray/backends/api.py in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, lock, data_vars, coords, combine, parallel, join, attrs_file, **kwargs)
    966 # Redo ordering from coordinates, ignoring how they were ordered
    967 # previously
--> 968 combined = combine_by_coords(
    969     datasets,
    970     compat=compat,

~/.local/src/miniconda/envs/minireobs/lib/python3.8/site-packages/xarray/core/combine.py in combine_by_coords(datasets, compat, data_vars, coords, fill_value, join, combine_attrs)
    762 concatenated_grouped_by_data_vars = []
    763 for vars, datasets_with_same_vars in grouped_by_vars:
--> 764     combined_ids, concat_dims = _infer_concat_order_from_coords(
    765         list(datasets_with_same_vars)
    766     )

~/.local/src/miniconda/envs/minireobs/lib/python3.8/site-packages/xarray/core/combine.py in _infer_concat_order_from_coords(datasets)
    106
    107 if len(datasets) > 1 and not concat_dims:
--> 108     raise ValueError(
    109         "Could not find any dimension coordinates to use to "
    110         "order the datasets for concatenation"

ValueError: Could not find any dimension coordinates to use to order the datasets for concatenation
```

What you expected to happen:

  • A warning saying that the same dataset is being used twice?
  • A more explicit error message (e.g. pointing out the exact same dimensions)?
  • No error and no concatenation, with duplicated datasets removed?
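Until the message improves, a possible user-side workaround (a sketch, not an xarray feature) is to drop duplicate paths before the call; combine="nested" is added here because recent xarray versions require it whenever concat_dim is given:

```python
import xarray as xr

paths = ["some_file.nc", "some_file.nc"]
# dict.fromkeys de-duplicates while preserving order.
unique_paths = list(dict.fromkeys(paths))

ds = xr.open_mfdataset(unique_paths, concat_dim="time", combine="nested", engine="netcdf4")
```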

Environment:

Output of xr.show_versions():

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.5 (default, Sep 4 2020, 07:30:14) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-72-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.7.3
xarray: 0.17.0
pandas: 1.1.1
numpy: 1.19.2
scipy: 1.5.2
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.04.0
distributed: 2021.04.0
matplotlib: 3.3.1
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20200814
pip: 20.2.2
conda: None
pytest: 6.1.1
IPython: 7.18.1
sphinx: None
```
reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5230/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue

Table schema:
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
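For reference, the row filter shown at the top of this page corresponds to a straightforward query against this schema. A sketch using Python's sqlite3 module, assuming the underlying SQLite file is named github.db (the actual file name is not shown on this page):

```python
import sqlite3

# "github.db" is an assumed name for the SQLite database behind this Datasette.
conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT number, title, state, updated_at
    FROM issues
    WHERE state = 'closed' AND user = 57914115
    ORDER BY updated_at DESC
    """
).fetchall()
for number, title, state, updated_at in rows:
    print(number, title, state, updated_at)
conn.close()
```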