issue_comments

13 rows where issue = 710876876 sorted by updated_at descending

user 4

  • gerritholl 9
  • dcherian 2
  • shoyer 1
  • djhoese 1

author_association 2

  • CONTRIBUTOR 10
  • MEMBER 3

issue 1

  • Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements · 13
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
703435472 https://github.com/pydata/xarray/issues/4471#issuecomment-703435472 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMzQzNTQ3Mg== shoyer 1217238 2020-10-05T06:49:52Z 2020-10-05T06:49:52Z MEMBER

I agree, xarray's decoding should be robust as to whether these attributes are scalars or vectors of length one.

This should probably be considered a bug in h5netcdf, which I guess should adopt the assumption from netCDF4-python that vector attributes of length 1 are scalars.

(h5netcdf can store true scalar attributes in HDF5 files, but it's probably better to be consistent with netCDF)
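
One way to make decoding tolerate both forms is to collapse length-1 vectors to scalars before they are applied. The following is a minimal sketch of that idea, not xarray's actual fix; the helper name `attr_as_scalar` and the sample values are hypothetical.

```python
import numpy as np


def attr_as_scalar(value):
    """Return a scalar for attribute values that are 0-d or have exactly one element.

    Hypothetical helper for illustration; it is not part of xarray or h5netcdf.
    """
    arr = np.asarray(value)
    if arr.ndim == 0:
        # Already a scalar (Python number, NumPy scalar, or 0-d array).
        return value
    if arr.size != 1:
        raise ValueError(f"expected a scalar or length-1 attribute, got shape {arr.shape}")
    # Collapse a length-1 vector to a plain Python scalar.
    return arr.item()


# Decode a single packed value in place, as happens when one element is loaded.
raw = np.array(10, dtype="int16")            # 0-d packed value (made-up number)
decoded = np.array(raw, dtype="float64")     # 0-d float copy
decoded *= attr_as_scalar(np.array([0.5]))   # length-1 scale_factor, h5netcdf style
decoded += attr_as_scalar(2.0)               # scalar add_offset, netCDF4 style
print(decoded)
```

Because both attributes are reduced to scalars first, the in-place operations keep the 0-d shape of the output regardless of which engine supplied the attributes.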

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
703286087 https://github.com/pydata/xarray/issues/4471#issuecomment-703286087 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMzI4NjA4Nw== dcherian 2448579 2020-10-04T17:12:32Z 2020-10-04T17:12:32Z MEMBER

> adapt other components to handle either scalars or length 1 arrays

I think we can make this change safely in the decoding machinery. As you point out, it will be backwards compatible.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
703058067 https://github.com/pydata/xarray/issues/4471#issuecomment-703058067 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMzA1ODA2Nw== gerritholl 500246 2020-10-03T06:59:07Z 2020-10-03T06:59:07Z CONTRIBUTOR

I can try to fix this in a PR; I just need to be sure what the fix should look like. One option is to change the dimensionality of the attributes, which has the potential to break backward compatibility. The other is to adapt other components to handle either scalars or length 1 arrays. That is the safer alternative, but the problem may occur in more locations both inside and outside xarray, so in this case a note in the documentation could be in order as well. I don't know if xarray strives for consistency between what the different engines expose on opening the same file.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702872015 https://github.com/pydata/xarray/issues/4471#issuecomment-702872015 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjg3MjAxNQ== dcherian 2448579 2020-10-02T17:50:36Z 2020-10-02T17:50:36Z MEMBER

> other components (such as where the scale factor and add_offset are applied) need to be adapted to handle arrays of length 1 for those values.

Great diagnosis @gerritholl. This could be fixed here (I think): https://github.com/pydata/xarray/blob/333e8dba55f0165ccadf18f2aaaee9257a4d716b/xarray/coding/variables.py#L245-L263 Are you up for sending in a PR?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702708138 https://github.com/pydata/xarray/issues/4471#issuecomment-702708138 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjcwODEzOA== gerritholl 500246 2020-10-02T12:32:40Z 2020-10-02T12:32:40Z CONTRIBUTOR

According to The NetCDF User's Guide, attributes are supposed to be vectors:

> The current version treats all attributes as vectors; scalar values are treated as single-element vectors.

That suggests that, strictly speaking, the h5netcdf engine is right and the netcdf4 engine is wrong, and that other components (such as where the scale factor and add_offset are applied) need to be adapted to handle arrays of length 1 for those values.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702671253 https://github.com/pydata/xarray/issues/4471#issuecomment-702671253 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjY3MTI1Mw== gerritholl 500246 2020-10-02T11:07:33Z 2020-10-02T11:07:33Z CONTRIBUTOR

Calling ds.load() prevents the traceback because the entire n-d data variable is then multiplied by the 1-d scale factor. Similarly, requesting a slice (ds["Rad"][400:402, 300:302]) also prevents a traceback. The traceback occurs only if a single value is requested, because the 0-d value is then multiplied in place by a 1-d array and NumPy rejects that. I'm not entirely sure why, but it looks like a numpy issue:

```
In [7]: a = np.array(0)

In [8]: b = np.array([0])

In [9]: a * b
Out[9]: array([0])

In [10]: a *= b

ValueError                                Traceback (most recent call last)
<ipython-input-10-0d04f348f081> in <module>
----> 1 a *= b

ValueError: non-broadcastable output operand with shape () doesn't match the broadcast shape (1,)
```
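
The difference between the cases above comes down to where NumPy writes the result: `a * b` allocates a new array with the broadcast shape, while `a *= b` must store the result in `a`, whose shape cannot change. A small sketch with made-up values (the names `scale`, `single`, and `block` are illustrative only) reproduces all three situations:

```python
import numpy as np

scale = np.array([0.5])    # length-1 vector, like the h5netcdf attributes in this thread
single = np.array(4.0)     # 0-d array, like a single loaded element
block = np.full((2, 2), 4.0)  # n-d array, like a slice or the fully loaded variable

# Out of place: the result is a new array with the broadcast shape (1,).
out_of_place = single * scale
print(out_of_place.shape)  # (1,)

# In place on an n-d block: the broadcast shape equals the block's shape, so it works.
block *= scale
print(block.shape)  # (2, 2)

# In place on a 0-d array: the broadcast shape (1,) does not fit into shape ().
try:
    single *= scale
    failed = False
except ValueError as exc:
    failed = True
    print("ValueError:", exc)
```

This matches the observations in the thread: loading everything or taking a slice succeeds, and only the single-element request fails.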

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702645270 https://github.com/pydata/xarray/issues/4471#issuecomment-702645270 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjY0NTI3MA== gerritholl 500246 2020-10-02T10:10:45Z 2020-10-02T10:10:45Z CONTRIBUTOR

Interestingly, the problem is prevented if one adds

ds.load()

before the print statement.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702643539 https://github.com/pydata/xarray/issues/4471#issuecomment-702643539 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjY0MzUzOQ== gerritholl 500246 2020-10-02T10:07:17Z 2020-10-02T10:07:34Z CONTRIBUTOR

My last comment was inaccurate. Although the open succeeds, the non-scalar scale factor does trigger failure upon accessing data (due to lazy loading) even without any open file:

    import xarray
    fn = "OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc"
    with xarray.open_dataset(fn, engine="h5netcdf") as ds:
        # Requesting a single element triggers the lazy decoding that fails.
        print(ds["Rad"][400, 300])

The data file is publicly available at:

s3://noaa-goes16/ABI-L1b-RadF/2017/073/20/OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702021297 https://github.com/pydata/xarray/issues/4471#issuecomment-702021297 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjAyMTI5Nw== gerritholl 500246 2020-10-01T09:47:00Z 2020-10-01T09:47:00Z CONTRIBUTOR

However, a simple `xarray.open_dataset(fn, engine="h5netcdf")` still fails with ValueError only if passed an open file, so there still appear to be other differences between backends apart from the dimensionality of the variable attributes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702018925 https://github.com/pydata/xarray/issues/4471#issuecomment-702018925 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjAxODkyNQ== gerritholl 500246 2020-10-01T09:42:35Z 2020-10-01T09:42:35Z CONTRIBUTOR

Some further digging shows it's due to differences between the h5netcdf and netcdf4 backends:

    import xarray
    fn = "/data/gholl/cache/fogtools/abi/2017/03/14/20/06/7/OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc"
    # Same file, decoding disabled, read once with each engine.
    with xarray.open_dataset(fn, decode_cf=False, mask_and_scale=False, engine="netcdf4") as ds:
        print(ds["esun"].attrs["_FillValue"])
        print(ds["Rad"].attrs["scale_factor"])
    with xarray.open_dataset(fn, decode_cf=False, mask_and_scale=False, engine="h5netcdf") as ds:
        print(ds["esun"].attrs["_FillValue"])
        print(ds["Rad"].attrs["scale_factor"])

Results in:

    -999.0
    0.001564351
    [-999.]
    [0.00156435]
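
The printed values only hint at the difference through the square brackets. A short sketch that reports the dimensionality directly may make the comparison easier to read. The helper `attr_ndims` is hypothetical, and the dictionaries below are stand-ins built from the printed values rather than attributes read from the file; the float32 dtype is an assumption.

```python
import numpy as np


def attr_ndims(attrs, names=("_FillValue", "scale_factor", "add_offset")):
    """Map each attribute name that is present to its number of dimensions."""
    return {name: np.ndim(attrs[name]) for name in names if name in attrs}


# Stand-ins for the attributes each engine exposed (dtype float32 is assumed).
netcdf4_like = {"scale_factor": np.float32(0.001564351)}
h5netcdf_like = {"scale_factor": np.array([0.001564351], dtype="float32")}

print(attr_ndims(netcdf4_like))   # {'scale_factor': 0}
print(attr_ndims(h5netcdf_like))  # {'scale_factor': 1}
```

With a real dataset, the same helper could be called on `ds["Rad"].attrs` after opening with each engine.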

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
701973665 https://github.com/pydata/xarray/issues/4471#issuecomment-701973665 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMTk3MzY2NQ== gerritholl 500246 2020-10-01T08:20:20Z 2020-10-01T08:20:20Z CONTRIBUTOR

Probably related: when reading an open file through a file system instance, the _FillValue, scale_factor, and add_offset are arrays of length one. When opening by passing a filename, those are all scalar (as expected):

    import xarray
    from fsspec.implementations.local import LocalFileSystem
    fn = "/data/gholl/cache/fogtools/abi/2017/03/14/20/06/7/OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc"
    # Open by filename.
    ds1 = xarray.open_dataset(fn, decode_cf=True, mask_and_scale=False)
    print(ds1["esun"].attrs["_FillValue"])
    print(ds1["Rad"].attrs["scale_factor"])
    # Open the same file through a file system instance.
    with LocalFileSystem().open(fn) as of:
        ds2 = xarray.open_dataset(of, decode_cf=True, mask_and_scale=False)
        print(ds2["esun"].attrs["_FillValue"])
        print(ds2["Rad"].attrs["scale_factor"])

Result:

    -999.0
    0.001564351
    [-999.]
    [0.00156435]

I strongly suspect that this is what causes the ValueError, and in any case it also causes downstream problems even if opening succeeds as per the previous comment.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
701948369 https://github.com/pydata/xarray/issues/4471#issuecomment-701948369 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMTk0ODM2OQ== gerritholl 500246 2020-10-01T07:33:11Z 2020-10-01T07:33:11Z CONTRIBUTOR

I just tested this with some more combinations:

  • decode_cf=True, mask_and_scale=False, everything seems fine.
  • decode_cf=False, mask_and_scale=True, everything seems fine.
  • decode_cf=True, mask_and_scale=True results in the ValueError and associated traceback.
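
The pattern in this list is consistent with scaling being applied only when both flags are on. The sketch below spells out that truth table. It rests on an assumption not stated in the thread, namely that decode_cf=False switches off every CF decoder, including mask_and_scale; the function name `scaling_applied` is hypothetical.

```python
from itertools import product


def scaling_applied(decode_cf, mask_and_scale):
    """Return True if scale_factor/add_offset would be applied.

    Assumption: decode_cf=False disables all CF decoding, so mask_and_scale
    only has an effect when decode_cf is True.
    """
    return decode_cf and mask_and_scale


# Evaluate all four flag combinations.
results = {
    (decode_cf, mask_and_scale): scaling_applied(decode_cf, mask_and_scale)
    for decode_cf, mask_and_scale in product([True, False], repeat=2)
}

for (decode_cf, mask_and_scale), applied in results.items():
    print(f"decode_cf={decode_cf}, mask_and_scale={mask_and_scale}: scaling applied = {applied}")
```

Under that assumption, only the combination that applies the scale factor can hit the in-place multiplication, which would explain why the other combinations seem fine.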
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
700696450 https://github.com/pydata/xarray/issues/4471#issuecomment-700696450 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMDY5NjQ1MA== djhoese 1828519 2020-09-29T13:18:16Z 2020-09-29T13:18:16Z CONTRIBUTOR

Just tested this with decode_cf=False and the rest of the loading process seems fine (note: I used engine='h5netcdf' too).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 25.775ms · About: xarray-datasette