Comment 762725640 on pydata/xarray#4822 · user 5821660 (MEMBER) · 2021-01-19T09:43:53Z
https://github.com/pydata/xarray/issues/4822#issuecomment-762725640

Possibly related issue: https://github.com/h5netcdf/h5netcdf/issues/27

Anyway, I think I've found the root cause of the issue. With netcdf4-python it is possible to request that attributes be written as NC_STRING (variable-length string) attributes.

The following will create a netCDF file which is readable with current xarray and the h5netcdf backend.

```python
import netCDF4 as nc
import xarray as xr

rootgrp = nc.Dataset("test_nc4_attrs.nc", mode="w", format="NETCDF4")

# uncomment to write attributes as NC_STRING
# rootgrp.set_ncstring_attrs(True)

x = rootgrp.createDimension("x", 1)
y = rootgrp.createDimension("y", 1)
foo = rootgrp.createVariable("foo", "i4", ("y", "x"))
foo.coordinates = "y x"
foo[0, 0] = 0
rootgrp.close()

ds = xr.open_dataset("test_nc4_attrs.nc", engine="h5netcdf")
print(ds)
print(ds.foo)
ds.close()
```

If you uncomment that one line, the resulting netCDF file breaks with the error @yt87 showed above. If you open the file with decode_cf=False and write it back using to_netcdf, the same error occurs.
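For reference, a minimal sketch of that roundtrip (assuming the file above was written with `set_ncstring_attrs(True)` enabled; the output filename is arbitrary):

```python
import xarray as xr

# Open without CF decoding; the variable-length "coordinates" attribute
# then arrives as a 1-element array instead of a plain string.
ds = xr.open_dataset("test_nc4_attrs.nc", engine="h5netcdf", decode_cf=False)

# Per the observation above, writing it back triggers the same error,
# since the encoding path assumes a scalar string attribute.
ds.to_netcdf("test_nc4_attrs_roundtrip.nc")
ds.close()
```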

h5dump before uncommenting:

```
ATTRIBUTE "coordinates" {
   DATATYPE  H5T_STRING {
      STRSIZE 3;
      STRPAD H5T_STR_NULLTERM;
      CSET H5T_CSET_ASCII;
      CTYPE H5T_C_S1;
   }
   DATASPACE  SCALAR
   DATA {
   (0): "y x"
   }
}
```

h5dump after uncommenting:

```
ATTRIBUTE "coordinates" {
   DATATYPE  H5T_STRING {
      STRSIZE H5T_VARIABLE;
      STRPAD H5T_STR_NULLTERM;
      CSET H5T_CSET_UTF8;
      CTYPE H5T_C_S1;
   }
   DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
   DATA {
   (0): "y x"
   }
}
```
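The difference is also visible from Python when inspecting the attribute directly with h5py (a small sketch, separate from the reproducer above; the exact return types can vary with the h5py version):

```python
import h5py

with h5py.File("test_nc4_attrs.nc", "r") as f:
    attr = f["foo"].attrs["coordinates"]
    # Default (fixed-size, scalar) attribute: a plain bytes/str scalar,
    # e.g. b'y x'.
    # With set_ncstring_attrs(True): a 1-element numpy array of
    # variable-length strings, e.g. array(['y x'], dtype=object).
    print(type(attr), attr)
```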

Another thing I found: this only affects the coordinates attribute. All other attributes are read correctly, AFAICT. So with this in mind I would consider it a bug: the assumption that these attributes are always scalar doesn't hold.

And one more thing: roundtripping through the netcdf4 backend does not keep the NC_STRING type either:

```python
import netCDF4 as nc
import xarray as xr

rootgrp = nc.Dataset("test_nc4_attrs.nc", mode="w", format="NETCDF4")
rootgrp.set_ncstring_attrs(True)
x = rootgrp.createDimension("x", 1)
y = rootgrp.createDimension("y", 1)
foo = rootgrp.createVariable("foo", "i4", ("y", "x"))
rootgrp.coordinates = "z y x"
foo.coordinates = "y x"
foo[0, 0] = 0
rootgrp.close()

ds = xr.open_dataset("test_nc4_attrs.nc", engine="netcdf4")
ds.to_netcdf("test_nc4_attrs_out.nc")
ds.close()
```

```
ncdump test_nc4_attrs.nc

netcdf test_nc4_attrs {
dimensions:
        x = 1 ;
        y = 1 ;
variables:
        int foo(y, x) ;
                string foo:coordinates = "y x" ;

// global attributes:
                string :coordinates = "z y x" ;
data:

 foo = 0 ;
}
```

```
ncdump test_nc4_attrs_out.nc

netcdf test_nc4_attrs_out {
dimensions:
        y = 1 ;
        x = 1 ;
variables:
        int foo(y, x) ;
                foo:coordinates = "y x" ;
data:

 foo = 0 ;
}
```

So changes are needed in several places to check whether an attribute is scalar or an array, not only for the coordinates attribute. Thoughts?
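As a rough illustration of the kind of check I have in mind (a sketch only, not an actual patch; the helper name is made up):

```python
import numpy as np

def attr_to_str(value):
    """Return an attribute value as a plain str, whether it was stored
    as a scalar or as a length-1 array of (byte) strings."""
    if isinstance(value, np.ndarray):
        if value.size != 1:
            raise ValueError(f"expected a scalar-like attribute, got {value!r}")
        value = value.item()
    if isinstance(value, bytes):
        value = value.decode("utf-8")
    return value

# attr_to_str(b"y x")                          -> "y x"
# attr_to_str(np.array(["y x"], dtype=object)) -> "y x"
```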
