issues

2 rows where user = 62192187 sorted by updated_at descending


id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1200716594 PR_kwDOAMm_X842CkkU 6476 Fix zarr append dtype checks cisaacstern 62192187 closed 0     7 2022-04-12T00:30:34Z 2022-05-11T17:39:42Z 2022-05-11T17:35:10Z CONTRIBUTOR   0 pydata/xarray/pulls/6476
  • [x] Closes #6345
  • [x] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
    • If this is deemed a "notable bug fix" I can add a note here prior to merge.
  • [ ] New functions/methods are listed in api.rst
    • N/A
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6476/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1164454058 I_kwDOAMm_X85FaCiq 6345 `to_zarr` raises `ValueError: Invalid dtype` with `mode='a'` (but not with `mode='w'`) cisaacstern 62192187 closed 0     6 2022-03-09T21:21:26Z 2022-05-11T17:35:10Z 2022-05-11T17:35:10Z CONTRIBUTOR      

What happened?

A dataset in which a data variable has `dtype='|S35'` can be written to zarr without error as follows:

```python
import xarray as xr
import numpy as np

data = np.zeros((2, 3), dtype='|S35')
ds = xr.DataArray(data, name='foo').to_dataset()
ds.to_zarr('test.zarr', mode='w')
```

Changing the value of `mode` from `'w'` to `'a'`, raises `ValueError: Invalid dtype for data variable`:

```python
!rm -rf test.zarr
ds.to_zarr('test.zarr', mode='a')
```

Full Traceback

```python-traceback
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 ds.to_zarr('test.zarr', mode='a')

File ~/miniconda3/envs/pangeo-forge-recipes/lib/python3.9/site-packages/xarray/core/dataset.py:2036, in Dataset.to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options)
   2033 if encoding is None:
   2034     encoding = {}
-> 2036 return to_zarr(
   2037     self,
   2038     store=store,
   2039     chunk_store=chunk_store,
   2040     storage_options=storage_options,
   2041     mode=mode,
   2042     synchronizer=synchronizer,
   2043     group=group,
   2044     encoding=encoding,
   2045     compute=compute,
   2046     consolidated=consolidated,
   2047     append_dim=append_dim,
   2048     region=region,
   2049     safe_chunks=safe_chunks,
   2050 )

File ~/miniconda3/envs/pangeo-forge-recipes/lib/python3.9/site-packages/xarray/backends/api.py:1406, in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options)
   1391 zstore = backends.ZarrStore.open_group(
   1392     store=mapper,
   1393     mode=mode,
   (...)
   1402     stacklevel=4,  # for Dataset.to_zarr()
   1403 )
   1405 if mode in ["a", "r+"]:
-> 1406     _validate_datatypes_for_zarr_append(dataset)
   1407 if append_dim is not None:
   1408     existing_dims = zstore.get_dimensions()

File ~/miniconda3/envs/pangeo-forge-recipes/lib/python3.9/site-packages/xarray/backends/api.py:1301, in _validate_datatypes_for_zarr_append(dataset)
   1292         raise ValueError(
   1293             "Invalid dtype for data variable: {} "
   1294             "dtype must be a subtype of number, "
   (...)
   1297             "object".format(var)
   1298         )
   1300 for k in dataset.data_vars.values():
-> 1301     check_dtype(k)

File ~/miniconda3/envs/pangeo-forge-recipes/lib/python3.9/site-packages/xarray/backends/api.py:1292, in _validate_datatypes_for_zarr_append.<locals>.check_dtype(var)
   1283 def check_dtype(var):
   1284     if (
   1285         not np.issubdtype(var.dtype, np.number)
   1286         and not np.issubdtype(var.dtype, np.datetime64)
   (...)
   1290     ):
   1291         # and not re.match('^bytes[1-9]+$', var.dtype.name)):
-> 1292         raise ValueError(
   1293             "Invalid dtype for data variable: {} "
   1294             "dtype must be a subtype of number, "
   1295             "datetime, bool, a fixed sized string, "
   1296             "a fixed size unicode string or an "
   1297             "object".format(var)
   1298         )

ValueError: Invalid dtype for data variable: <xarray.DataArray 'foo' (dim_0: 2, dim_1: 3)>
array([[b'', b'', b''],
       [b'', b'', b'']], dtype='|S35')
Dimensions without coordinates: dim_0, dim_1 dtype must be a subtype of number, datetime, bool, a fixed sized string, a fixed size unicode string or an object
```
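To illustrate why a `'|S35'` variable can trip a check of this shape, here is a minimal NumPy-only sketch. It is not the xarray implementation: the traceback elides lines 1287-1289, so the `bool`, unicode, and `object` tests below are assumptions inferred from the error message, and the names `sketch_dtype_checks` and `is_accepted_by_sketch` are hypothetical.

```python
import numpy as np


def sketch_dtype_checks(dtype):
    """Return the result of each individual dtype test for `dtype`.

    Only the `number` and `datetime64` tests are visible in the traceback;
    the remaining three are assumed from the wording of the error message.
    """
    dtype = np.dtype(dtype)
    return {
        "number": np.issubdtype(dtype, np.number),
        "datetime64": np.issubdtype(dtype, np.datetime64),
        "bool": np.issubdtype(dtype, np.bool_),    # assumption
        "unicode": np.issubdtype(dtype, np.str_),  # assumption
        "object": dtype == object,                 # assumption
    }


def is_accepted_by_sketch(dtype):
    """A dtype is accepted if at least one of the tests above passes."""
    return any(sketch_dtype_checks(dtype).values())


if __name__ == "__main__":
    for candidate in ["float64", "datetime64[ns]", "bool", "<U35", "O", "|S35"]:
        print(candidate, is_accepted_by_sketch(candidate))
    # A fixed-size bytes dtype is a subtype of np.bytes_, which none of the
    # tests in this sketch look for.
    print(np.issubdtype(np.dtype("|S35"), np.bytes_))
```

Under these assumptions, `'|S35'` passes none of the tests, even though the error message lists "a fixed sized string" as allowed. The commented-out `re.match('^bytes[1-9]+$', ...)` line in the traceback suggests a bytes test was once considered but is not active.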

What did you expect to happen?

I would expect the behavior of mode='w' and mode='a' to be consistent as regards dtypes of data variables.

Minimal Complete Verifiable Example

See What Happened? section above

Relevant log output

See What Happened? section above

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS

commit: None
python: 3.9.10 | packaged by conda-forge | (main, Feb 1 2022, 21:28:27) [Clang 11.1.0 ]
python-bits: 64
OS: Darwin
OS-release: 21.0.1
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 2022.3.0
pandas: 1.4.1
numpy: 1.22.2
scipy: 1.8.0
netCDF4: 1.5.8
pydap: installed
h5netcdf: 999
h5py: 3.6.0
Nio: None
zarr: 2.11.0
cftime: 1.6.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: 0.9.8.5
iris: None
bottleneck: None
dask: 2022.02.1
distributed: 2022.2.1
matplotlib: 3.5.1
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.02.0
cupy: None
pint: None
sparse: None
setuptools: 59.8.0
pip: 22.0.4
conda: None
pytest: 6.2.5
IPython: 8.1.1
sphinx: None
```

cc @rabernat

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6345/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
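
As a rough illustration of how the schema above supports the "2 rows where user = 62192187 sorted by updated_at descending" view, the following sketch uses Python's standard `sqlite3` module with an in-memory database. It keeps only a subset of the columns and omits the foreign-key references; the helper names `build_demo_db` and `issues_for_user` are hypothetical, and the two rows are copied from the table on this page.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding a reduced version of [issues]."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE [issues] (
           [id] INTEGER PRIMARY KEY,
           [number] INTEGER,
           [title] TEXT,
           [user] INTEGER,
           [state] TEXT,
           [updated_at] TEXT,
           [type] TEXT
        )
        """
    )
    # Same index as in the full schema; it serves the WHERE [user] = ? filter.
    conn.execute("CREATE INDEX [idx_issues_user] ON [issues] ([user])")
    conn.executemany(
        "INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1164454058, 6345,
             "`to_zarr` raises `ValueError: Invalid dtype` with `mode='a'` "
             "(but not with `mode='w'`)",
             62192187, "closed", "2022-05-11T17:35:10Z", "issue"),
            (1200716594, 6476, "Fix zarr append dtype checks",
             62192187, "closed", "2022-05-11T17:39:42Z", "pull"),
        ],
    )
    return conn


def issues_for_user(conn, user_id):
    """Return (number, type, updated_at) rows for one user, newest first."""
    # ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
    cursor = conn.execute(
        "SELECT [number], [type], [updated_at] FROM [issues] "
        "WHERE [user] = ? ORDER BY [updated_at] DESC",
        (user_id,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    for row in issues_for_user(build_demo_db(), 62192187):
        print(row)
```

The pull request (6476) sorts first because its `updated_at` value is a few minutes later than that of the issue it closes (6345).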
Powered by Datasette · About: xarray-datasette