issue_comments

6 rows where issue = 1164454058 sorted by updated_at descending

user 4

  • rabernat 3
  • shoyer 1
  • kmsampson 1
  • cisaacstern 1

author_association 3

  • MEMBER 4
  • CONTRIBUTOR 1
  • NONE 1

issue 1

  • `to_zarr` raises `ValueError: Invalid dtype` with `mode='a'` (but not with `mode='w'`) · 6
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1065424051 https://github.com/pydata/xarray/issues/6345#issuecomment-1065424051 https://api.github.com/repos/pydata/xarray/issues/6345 IC_kwDOAMm_X84_gRSz cisaacstern 62192187 2022-03-11T19:28:18Z 2022-03-11T19:31:18Z CONTRIBUTOR

So it looks like changing

https://github.com/pydata/xarray/blob/d293f50f9590251ce09543319d1f0dc760466f1b/xarray/backends/api.py#L1280-L1301

to

```python
def _validate_datatypes_for_zarr_append(store, dataset):
    """DataArray.name and Dataset keys must be a string or None"""

    def check_dtype(vname):
        store_dtype = store.get_variables()[vname].dtype
        dataset_dtype = dataset[vname].dtype
        if not store_dtype == dataset_dtype:
            raise ValueError(
                f"Mismatched dtypes for variable {vname} between Zarr store on disk "
                f"and dataset to append. Store has dtype {store_dtype} but dataset to "
                f"append has dtype {dataset_dtype}."
            )

    for vname in dataset.data_vars:
        check_dtype(vname)
```

could work?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `to_zarr` raises `ValueError: Invalid dtype` with `mode='a'` (but not with `mode='w'`) 1164454058
1065385198 https://github.com/pydata/xarray/issues/6345#issuecomment-1065385198 https://api.github.com/repos/pydata/xarray/issues/6345 IC_kwDOAMm_X84_gHzu rabernat 1197350 2022-03-11T18:41:11Z 2022-03-11T18:41:11Z MEMBER

It seems like what we really want to do is verify that the datatype of the appended data matches the data type on disk.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `to_zarr` raises `ValueError: Invalid dtype` with `mode='a'` (but not with `mode='w'`) 1164454058
1065381346 https://github.com/pydata/xarray/issues/6345#issuecomment-1065381346 https://api.github.com/repos/pydata/xarray/issues/6345 IC_kwDOAMm_X84_gG3i shoyer 1217238 2022-03-11T18:38:42Z 2022-03-11T18:38:42Z MEMBER

The data type restriction here seems to date back to the original PR adding support for appending. I turned up this comment that seems to summarize the motivation for this check: https://github.com/pydata/xarray/pull/2706#issuecomment-502481584

I think the original issue was that appending a fixed-width string could be a problem if the fixed-width does not match the width of the existing string dtype stored in Zarr.

This obviously doesn't apply in this case, because you are adding an entirely new variable. So I guess the check could be removed in that case.
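
To make the distinction above concrete, here is a minimal sketch of a check that compares dtypes only for variables that already exist in the store and skips entirely new ones. It is not xarray's implementation: `find_dtype_mismatches` is a hypothetical helper, and plain dicts mapping variable names to dtype strings stand in for the Zarr store and the dataset being appended.

```python
def find_dtype_mismatches(store_dtypes, dataset_dtypes):
    """Return names of variables whose dtype differs from the one already stored.

    Both arguments are plain dicts of {variable name: dtype string}; they are
    stand-ins (an assumption of this sketch) for the on-disk store and the
    dataset to append.
    """
    mismatched = []
    for name, dtype in dataset_dtypes.items():
        if name not in store_dtypes:
            # A brand-new variable has nothing on disk to conflict with.
            continue
        if store_dtypes[name] != dtype:
            # Example: a fixed-width string whose width differs from the stored one.
            mismatched.append(name)
    return mismatched


# Hypothetical dtypes: "station" changes width, "crs" is a new variable.
store = {"temperature": "<f8", "station": "|S4"}
to_append = {"temperature": "<f8", "station": "|S8", "crs": "|S1"}
print(find_dtype_mismatches(store, to_append))  # prints ['station']
```

In this sketch the new `crs` variable passes untouched, while the width change on `station` is still flagged.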

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `to_zarr` raises `ValueError: Invalid dtype` with `mode='a'` (but not with `mode='w'`) 1164454058
1065350469 https://github.com/pydata/xarray/issues/6345#issuecomment-1065350469 https://api.github.com/repos/pydata/xarray/issues/6345 IC_kwDOAMm_X84_f_VF rabernat 1197350 2022-03-11T17:58:28Z 2022-03-11T17:58:28Z MEMBER

Thanks for reporting this @kmsampson. My feeling is that it is a bug...which we can hopefully fix pretty easily!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `to_zarr` raises `ValueError: Invalid dtype` with `mode='a'` (but not with `mode='w'`) 1164454058
1065343999 https://github.com/pydata/xarray/issues/6345#issuecomment-1065343999 https://api.github.com/repos/pydata/xarray/issues/6345 IC_kwDOAMm_X84_f9v_ kmsampson 9536301 2022-03-11T17:50:21Z 2022-03-11T17:50:21Z NONE

I just ran into this today as well. I am trying to add a dimensionless variable to an existing Zarr store, to help with CF-compliance (if exporting to netCDF), and I ran into this issue. The dtype of my variable is '|S1', and the error message is printed below:

ValueError: Invalid dtype for data variable: <xarray.DataArray 'crs' ()> array(b'', dtype='|S1')

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `to_zarr` raises `ValueError: Invalid dtype` with `mode='a'` (but not with `mode='w'`) 1164454058
1063401936 https://github.com/pydata/xarray/issues/6345#issuecomment-1063401936 https://api.github.com/repos/pydata/xarray/issues/6345 IC_kwDOAMm_X84_YjnQ rabernat 1197350 2022-03-09T21:43:49Z 2022-03-09T21:43:49Z MEMBER

The relevant code is here

https://github.com/pydata/xarray/blob/d293f50f9590251ce09543319d1f0dc760466f1b/xarray/backends/api.py#L1405-L1406

and here

https://github.com/pydata/xarray/blob/d293f50f9590251ce09543319d1f0dc760466f1b/xarray/backends/api.py#L1280-L1298

What I don't understand is why different validation is needed for the append scenario than for the write scenario. @shoyer worked on this in #5252, so maybe he has some ideas.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `to_zarr` raises `ValueError: Invalid dtype` with `mode='a'` (but not with `mode='w'`) 1164454058

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 11.695ms · About: xarray-datasette