Allow appending non-numerical types to zarr arrays.

pydata/xarray issue #3480 · state: closed · comments: 0 · created: 2019-11-02T21:20:53Z · updated: 2019-11-13T15:55:33Z · closed: 2019-11-13T15:55:33Z · author association: CONTRIBUTOR

MCVE Code Sample

Zarr itself allows appending to arrays of np.datetime64 and np.bool dtype:

```python
>>> import numpy as np
>>> import zarr
>>> path = 'tmp/test.zarr'
>>> z1 = zarr.open(path, mode='w', shape=(10,), chunks=(10,), dtype='M8[D]')
>>> z1[:] = '1990-01-01'
>>> z2 = zarr.open(path, mode='a')
>>> a = np.array(['1992-01-01'] * 10, dtype='datetime64[D]')
>>> z2.append(a)
(20,)
>>> z2
<zarr.core.Array (20,) datetime64[D]>
```

But its equivalent in xarray raises an error:

```
>>> import xarray as xr
>>> ds = xr.Dataset(
...     {'y': (('x',), np.array(['1991-01-01'] * 10, dtype='datetime64[D]'))}
... )
>>> ds.to_zarr('tmp/test_xr.zarr', mode='w')
<xarray.backends.zarr.ZarrStore object at 0x31f403170>
>>> ds2 = xr.Dataset(
...     {'y': (('x',), np.array(['1992-01-01'] * 10, dtype='datetime64[D]'))}
... )
>>> ds2.to_zarr('tmp/test_xr.zarr', mode='a', append_dim='x')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/personal/opt/anaconda3/lib/python3.7/site-packages/xarray/core/dataset.py", line 1616, in to_zarr
    append_dim=append_dim,
  File "/Users/personal/opt/anaconda3/lib/python3.7/site-packages/xarray/backends/api.py", line 1304, in to_zarr
    _validate_datatypes_for_zarr_append(dataset)
  File "/Users/personal/opt/anaconda3/lib/python3.7/site-packages/xarray/backends/api.py", line 1249, in _validate_datatypes_for_zarr_append
    check_dtype(k)
  File "/Users/personal/opt/anaconda3/lib/python3.7/site-packages/xarray/backends/api.py", line 1245, in check_dtype
    "unicode string or an object".format(var)
ValueError: Invalid dtype for data variable: <xarray.DataArray 'y' (x: 10)>
array(['1992-01-01T00:00:00.000000000', '1992-01-01T00:00:00.000000000',
       '1992-01-01T00:00:00.000000000', '1992-01-01T00:00:00.000000000',
       '1992-01-01T00:00:00.000000000', '1992-01-01T00:00:00.000000000',
       '1992-01-01T00:00:00.000000000', '1992-01-01T00:00:00.000000000',
       '1992-01-01T00:00:00.000000000', '1992-01-01T00:00:00.000000000'],
      dtype='datetime64[ns]')
Dimensions without coordinates: x dtype must be a subtype of number, a fixed sized string, a fixed size unicode string or an object
```

Expected Output

The append should succeed.

Problem Description

This function in xarray/backends/api.py is too strict on types:

```python
def _validate_datatypes_for_zarr_append(dataset):
    """DataArray.name and Dataset keys must be a string or None"""

    def check_dtype(var):
        if (
            not np.issubdtype(var.dtype, np.number)
            and not coding.strings.is_unicode_dtype(var.dtype)
            and not var.dtype == object
        ):
            # and not re.match('^bytes[1-9]+$', var.dtype.name)):
            raise ValueError(
                "Invalid dtype for data variable: {} "
                "dtype must be a subtype of number, "
                "a fixed sized string, a fixed size "
                "unicode string or an object".format(var)
            )

    for k in dataset.data_vars.values():
        check_dtype(k)
```

np.datetime64[.] and np.bool are not numbers:

```
>>> np.issubdtype(np.dtype('datetime64[D]'), np.number)
False
>>> np.issubdtype(np.dtype('bool'), np.number)
False
```
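One possible direction, sketched here only with NumPy, is to extend the check so that datetime64 and boolean dtypes pass as well. The helper name `is_appendable_dtype` is hypothetical and not part of xarray, and the `dtype.kind == "U"` test is an assumed stand-in for `coding.strings.is_unicode_dtype`; this is not necessarily how the check was or should be changed in xarray itself.

```python
import numpy as np


def is_appendable_dtype(dtype):
    """Hypothetical relaxed dtype check (illustration only, not xarray API).

    Accepts everything the existing check accepts (numbers, fixed-size
    unicode strings, objects) plus datetime64 and boolean dtypes.
    """
    dtype = np.dtype(dtype)
    return bool(
        np.issubdtype(dtype, np.number)
        or np.issubdtype(dtype, np.datetime64)  # newly allowed
        or np.issubdtype(dtype, np.bool_)       # newly allowed
        or dtype.kind == "U"                    # assumed stand-in for is_unicode_dtype
        or dtype == object
    )
```

With this sketch, `is_appendable_dtype('datetime64[D]')` and `is_appendable_dtype(bool)` are accepted, while a dtype outside the listed families, such as the void dtype `'V8'`, is still rejected.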

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.4 (default, Aug 13 2019, 15:17:50) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: None

xarray: 0.14.0
pandas: 0.25.1
numpy: 1.17.2
scipy: 1.3.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: 2.3.2
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.5.2
distributed: 2.5.2
matplotlib: 3.1.1
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 41.4.0
pip: 19.2.3
conda: 4.7.12
pytest: 5.2.1
IPython: 7.8.0
sphinx: 2.2.0