issue_comments: 827675422

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/5223#issuecomment-827675422 https://api.github.com/repos/pydata/xarray/issues/5223 827675422 MDEyOklzc3VlQ29tbWVudDgyNzY3NTQyMg== 14808389 2021-04-27T15:00:40Z 2021-04-27T18:15:02Z MEMBER

> [xarray-simlab] stores the `_FillValue` as an attribute which in turn is used by netcdf

That might be a bug in xarray-simlab (cc @benbovy). Usually, the fill value is used to replace missing values on disk. For example, `np.array([0, np.nan, 2, np.nan, np.nan, 5])` with a fill value of `-1` could be encoded as `[0, -1, 2, -1, -1, 5]` before writing to disk, which can then be saved as an int (even `int8`) instead of a float. The same goes for datetimes: `["2020-01-01", "NaT", "2020-12-01"]` with a fill value of `-1` can be encoded as `[0, -1, 11]` with `units = "months since 2020-01-01"` and the standard calendar. As far as I understand it, using `np.datetime64("NaT")` as the fill value does not make much sense because netCDF does not support datetime dtypes:

traceback when trying to save a datetime array attribute

```pytb
TypeError                                 Traceback (most recent call last)
<ipython-input-1-9d07cb2115e9> in <module>
      1 import numpy as np
      2 import xarray as xr
----> 3 xr.Dataset(attrs={"_FillValue": np.array("NaT", dtype="M")}).to_netcdf("test.nc")

.../xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf)
   1752         from ..backends.api import to_netcdf
   1753
-> 1754         return to_netcdf(
   1755             self,
   1756             path,

.../xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf)
   1066         # TODO: allow this work (setting up the file for writing array data)
   1067         # to be parallelized with dask
-> 1068         dump_to_store(
   1069             dataset, store, writer, encoding=encoding, unlimited_dims=unlimited_dims
   1070         )

.../xarray/backends/api.py in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
   1113         variables, attrs = encoder(variables, attrs)
   1114
-> 1115     store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
   1116
   1117

.../xarray/backends/common.py in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    263         variables, attributes = self.encode(variables, attributes)
    264
--> 265         self.set_attributes(attributes)
    266         self.set_dimensions(variables, unlimited_dims=unlimited_dims)
    267         self.set_variables(

.../xarray/backends/common.py in set_attributes(self, attributes)
    280         """
    281         for k, v in attributes.items():
--> 282             self.set_attribute(k, v)
    283
    284     def set_variables(self, variables, check_encoding_set, writer, unlimited_dims=None):

.../xarray/backends/netCDF4_.py in set_attribute(self, key, value)
    449             self.ds.setncattr_string(key, value)
    450         else:
--> 451             self.ds.setncattr(key, value)
    452
    453     def encode_variable(self, variable):

src/netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.setncattr()

src/netCDF4/_netCDF4.pyx in netCDF4._netCDF4._set_att()

TypeError: illegal data type for attribute b'_FillValue', must be one of dict_keys(['S1', 'i1', 'u1', 'i2', 'u2', 'i4', 'u4', 'i8', 'u8', 'f4', 'f8']), got M8
```
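To make the fill-value idea from the paragraph before the traceback concrete, here is a rough NumPy-only sketch of what such an encoding step could look like. The helper names (`encode_with_fill`, `decode_with_fill`, `encode_months_since`) are made up for illustration and are not xarray or netCDF4 functions; real CF encoding also handles calendars, scaling and more.

```python
import numpy as np


def encode_with_fill(values, fill_value, dtype):
    """Replace NaN with fill_value, then cast to the (integer) on-disk dtype."""
    values = np.asarray(values, dtype="float64")
    filled = np.where(np.isnan(values), fill_value, values)
    return filled.astype(dtype)


def decode_with_fill(encoded, fill_value):
    """Inverse step: turn fill_value back into NaN (result is float64)."""
    encoded = np.asarray(encoded)
    decoded = encoded.astype("float64")
    decoded[encoded == fill_value] = np.nan
    return decoded


def encode_months_since(dates, reference, fill_value):
    """Encode dates as whole months since `reference`; NaT becomes fill_value."""
    # Parse at day resolution first, then truncate to months.
    dates = np.asarray(dates, dtype="datetime64[D]").astype("datetime64[M]")
    ref = np.datetime64(reference, "D").astype("datetime64[M]")
    offsets = (dates - ref).astype("int64")
    return np.where(np.isnat(dates), fill_value, offsets)


floats = np.array([0, np.nan, 2, np.nan, np.nan, 5])
print(encode_with_fill(floats, -1, "int8").tolist())  # [0, -1, 2, -1, -1, 5]

dates = ["2020-01-01", "NaT", "2020-12-01"]
print(encode_months_since(dates, "2020-01-01", -1).tolist())  # [0, -1, 11]
```

In both cases the value written as `_FillValue` is a plain integer, which is one of the types netCDF accepts for attributes.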

Also, it's strange that _FillValue is saved to attrs and not encoding (which means xarray won't actually use it to encode the arrays).
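A small sketch of the difference, assuming `xarray.conventions.encode_cf_variable` behaves the way I expect (it is closer to an internal helper than a stable public API, so details may differ between xarray versions):

```python
import numpy as np
import xarray as xr
from xarray.conventions import encode_cf_variable

data = np.array([0, np.nan, 2, np.nan, np.nan, 5])

# _FillValue only in attrs: treated as plain metadata, the data is left alone.
as_attr = xr.Variable("x", data, attrs={"_FillValue": -1})

# _FillValue in encoding: used to replace NaN before casting to the on-disk dtype.
as_encoding = xr.Variable("x", data, encoding={"_FillValue": -1, "dtype": "int8"})

encoded_attr = encode_cf_variable(as_attr)
encoded_enc = encode_cf_variable(as_encoding)

print(encoded_attr.dtype, encoded_attr.values)  # still float, NaN still present
print(encoded_enc.dtype, encoded_enc.values)    # integer data, NaN replaced
```

With the `encoding` variant the NaN values are replaced and the fill value ends up in the encoded variable's attributes, which is what gets written to disk.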

In summary, I think you should open this issue on the xarray-simlab issue tracker.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  868907284