issue_comments


9 rows where author_association = "CONTRIBUTOR", issue = 710876876 and user = 500246 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
703058067 https://github.com/pydata/xarray/issues/4471#issuecomment-703058067 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMzA1ODA2Nw== gerritholl 500246 2020-10-03T06:59:07Z 2020-10-03T06:59:07Z CONTRIBUTOR

I can try to fix this in a PR; I just need to be sure what the fix should look like: change the dimensionality of the attributes (which has the potential to break backward compatibility), or adapt other components to handle either scalars or length-1 arrays (the safer alternative, but the problem may occur in more locations both inside and outside xarray, so in that case a note in the documentation may be in order as well). I don't know whether xarray strives for consistency between what the different engines expose when opening the same file.
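
For concreteness, a minimal sketch of the first option, converting length-1 attribute arrays back to scalars at read time; the helper is hypothetical and not part of xarray:

```python
import numpy as np

def squeeze_scalar_attrs(attrs):
    # Hypothetical helper: turn length-1 array attributes (as returned by the
    # h5netcdf engine) into plain scalars, matching what netcdf4 returns.
    out = {}
    for key, value in attrs.items():
        arr = np.asarray(value)
        if arr.ndim == 1 and arr.size == 1:
            out[key] = arr.item()  # e.g. array([0.00156435]) -> 0.00156435
        else:
            out[key] = value
    return out

# e.g. ds["Rad"].attrs = squeeze_scalar_attrs(ds["Rad"].attrs)
```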

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702708138 https://github.com/pydata/xarray/issues/4471#issuecomment-702708138 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjcwODEzOA== gerritholl 500246 2020-10-02T12:32:40Z 2020-10-02T12:32:40Z CONTRIBUTOR

According to The NetCDF User's Guide, attributes are supposed to be vectors:

The current version treats all attributes as vectors; scalar values are treated as single-element vectors.

That suggests that, strictly speaking, the h5netcdf engine is right and the netcdf4 engine is wrong, and that other components (such as where the scale factor and add_offset are applied) need to be adapted to handle arrays of length 1 for those values.
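
As an illustration of the other direction, a sketch (not xarray's actual decoding code) of how the place where scale_factor and add_offset are applied could accept either scalars or length-1 vectors:

```python
import numpy as np

def apply_scale_and_offset(raw, scale_factor=1.0, add_offset=0.0):
    # Sketch only: coerce scalar-or-length-1 attribute values to 0-d arrays so
    # the result broadcasts cleanly even when `raw` is a single (0-d) element.
    scale = np.asarray(scale_factor).reshape(())
    offset = np.asarray(add_offset).reshape(())
    return raw * scale + offset

# Both attribute styles give the same result:
print(apply_scale_and_offset(np.array(1000), 0.001564351, 0.0))
print(apply_scale_and_offset(np.array(1000), np.array([0.001564351]), np.array([0.0])))
```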

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702671253 https://github.com/pydata/xarray/issues/4471#issuecomment-702671253 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjY3MTI1Mw== gerritholl 500246 2020-10-02T11:07:33Z 2020-10-02T11:07:33Z CONTRIBUTOR

The ds.load() prevents the traceback because it means the entire n-d data variable is multiplied by the 1-d scale factor. Similarly, requesting a slice (ds["Rad"][400:402, 300:302]) also prevents the traceback. The traceback occurs only if a single value is requested, because then numpy complains about multiplying a scalar in place with a 1-d array. I'm not entirely sure why, but it appears to be a numpy issue:

```
In [7]: a = np.array(0)

In [8]: b = np.array([0])

In [9]: a * b
Out[9]: array([0])

In [10]: a *= b
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-0d04f348f081> in <module>
----> 1 a *= b

ValueError: non-broadcastable output operand with shape () doesn't match the broadcast shape (1,)
```
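
To spell out the difference (a sketch, not the proposed fix): the out-of-place product is allowed to allocate a new (1,) result, while the in-place version has to write that (1,) result back into the 0-d operand, which numpy refuses to do. Extracting a Python scalar from the length-1 array sidesteps it:

```python
import numpy as np

a = np.array(0)    # 0-d array, like a single element loaded from the variable
b = np.array([0])  # 1-d length-1 array, like the h5netcdf scale factor

c = a * b          # fine: broadcasts to shape (1,), giving array([0])
a *= b.item()      # fine: the right-hand side is now a scalar, so the 0-d
                   # output shape of `a` is preserved
# a *= b           # would raise the non-broadcastable ValueError shown above
```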

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702645270 https://github.com/pydata/xarray/issues/4471#issuecomment-702645270 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjY0NTI3MA== gerritholl 500246 2020-10-02T10:10:45Z 2020-10-02T10:10:45Z CONTRIBUTOR

Interestingly, the problem is prevented if one adds

```python
ds.load()
```

before the print statement.
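
In terms of the reproduction in the comment below, the workaround looks roughly like this (a sketch; the file name is the GOES-16 granule used throughout this thread):

```python
import xarray

fn = "OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc"
with xarray.open_dataset(fn, engine="h5netcdf") as ds:
    ds.load()                   # eagerly load and decode the whole variable ...
    print(ds["Rad"][400, 300])  # ... so the single-element access no longer fails
```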

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702643539 https://github.com/pydata/xarray/issues/4471#issuecomment-702643539 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjY0MzUzOQ== gerritholl 500246 2020-10-02T10:07:17Z 2020-10-02T10:07:34Z CONTRIBUTOR

My last comment was inaccurate. Although the open succeeds, the non-scalar scale factor does trigger a failure upon accessing the data (due to lazy loading), even without passing an open file:

```python
import xarray
fn = "OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc"
with xarray.open_dataset(fn, engine="h5netcdf") as ds:
    print(ds["Rad"][400, 300])
```

The data file is publicly available at:

s3://noaa-goes16/ABI-L1b-RadF/2017/073/20/OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc
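
For reference, a possible way to open that public object directly (assuming fsspec and s3fs are installed; anonymous access via anon=True is an assumption, not something stated in the thread):

```python
import fsspec
import xarray

url = ("s3://noaa-goes16/ABI-L1b-RadF/2017/073/20/"
       "OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc")

with fsspec.open(url, anon=True) as of:
    ds = xarray.open_dataset(of, engine="h5netcdf")
    print(ds["Rad"][400, 300])  # expected to reproduce the ValueError described here
```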

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702021297 https://github.com/pydata/xarray/issues/4471#issuecomment-702021297 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjAyMTI5Nw== gerritholl 500246 2020-10-01T09:47:00Z 2020-10-01T09:47:00Z CONTRIBUTOR

However, a simple `xarray.open_dataset(fn, engine="h5netcdf")` still fails with the ValueError only if passed an open file, so there appear to be other differences, besides the dimensionality of the variable attributes, that depend on the backend.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
702018925 https://github.com/pydata/xarray/issues/4471#issuecomment-702018925 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMjAxODkyNQ== gerritholl 500246 2020-10-01T09:42:35Z 2020-10-01T09:42:35Z CONTRIBUTOR

Some further digging shows it's due to differences between the h5netcdf and netcdf4 backends:

```python
import xarray
fn = "/data/gholl/cache/fogtools/abi/2017/03/14/20/06/7/OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc"
with xarray.open_dataset(fn, decode_cf=False, mask_and_scale=False, engine="netcdf4") as ds:
    print(ds["esun"].attrs["_FillValue"])
    print(ds["Rad"].attrs["scale_factor"])
with xarray.open_dataset(fn, decode_cf=False, mask_and_scale=False, engine="h5netcdf") as ds:
    print(ds["esun"].attrs["_FillValue"])
    print(ds["Rad"].attrs["scale_factor"])
```

Results in:

```
-999.0
0.001564351
[-999.]
[0.00156435]
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
701973665 https://github.com/pydata/xarray/issues/4471#issuecomment-701973665 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMTk3MzY2NQ== gerritholl 500246 2020-10-01T08:20:20Z 2020-10-01T08:20:20Z CONTRIBUTOR

Probably related: when reading an open file through a file system instance, the _FillValue, scale_factor, and add_offset are arrays of length one. When opening by passing a filename, those are all scalar (as expected):

```python
import xarray
from fsspec.implementations.local import LocalFileSystem
fn = "/data/gholl/cache/fogtools/abi/2017/03/14/20/06/7/OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc"
ds1 = xarray.open_dataset(fn, decode_cf=True, mask_and_scale=False)
print(ds1["esun"].attrs["_FillValue"])
print(ds1["Rad"].attrs["scale_factor"])
with LocalFileSystem().open(fn) as of:
    ds2 = xarray.open_dataset(of, decode_cf=True, mask_and_scale=False)
    print(ds2["esun"].attrs["_FillValue"])
    print(ds2["Rad"].attrs["scale_factor"])
```

Result:

```
-999.0
0.001564351
[-999.]
[0.00156435]
```

I strongly suspect that this is what causes the ValueError, and in any case it also causes downstream problems even if opening succeeds as per the previous comment.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876
701948369 https://github.com/pydata/xarray/issues/4471#issuecomment-701948369 https://api.github.com/repos/pydata/xarray/issues/4471 MDEyOklzc3VlQ29tbWVudDcwMTk0ODM2OQ== gerritholl 500246 2020-10-01T07:33:11Z 2020-10-01T07:33:11Z CONTRIBUTOR

I just tested this with some more combinations:

  • decode_cf=True, mask_and_scale=False: everything seems fine.
  • decode_cf=False, mask_and_scale=True: everything seems fine.
  • decode_cf=True, mask_and_scale=True: results in the ValueError and associated traceback (see the sketch after this list).
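
A sketch of the three calls, reusing the h5netcdf reproduction from the later comments (the file path and engine choice come from elsewhere in the thread, not from this comment):

```python
import xarray

fn = "OR_ABI-L1b-RadF-M3C07_G16_s20170732006100_e20170732016478_c20170732016514.nc"

# reported fine: CF decoding without masking/scaling
ds = xarray.open_dataset(fn, engine="h5netcdf", decode_cf=True, mask_and_scale=False)

# reported fine: masking/scaling requested, but decode_cf disabled
ds = xarray.open_dataset(fn, engine="h5netcdf", decode_cf=False, mask_and_scale=True)

# reported to raise the ValueError once data is accessed
ds = xarray.open_dataset(fn, engine="h5netcdf", decode_cf=True, mask_and_scale=True)
```
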
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Numeric scalar variable attributes (including fill_value, scale_factor, add_offset) are 1-d instead of 0-d with h5netcdf engine, triggering ValueError: non-broadcastable output on application when loading single elements 710876876


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);