issue_comments

2 rows where author_association = "MEMBER" and issue = 608974755 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
622599058 https://github.com/pydata/xarray/issues/4015#issuecomment-622599058 https://api.github.com/repos/pydata/xarray/issues/4015 MDEyOklzc3VlQ29tbWVudDYyMjU5OTA1OA== dcherian 2448579 2020-05-01T22:47:36Z 2020-05-01T22:47:36Z MEMBER

@mathause is right.

The solution is to use dtype when we create meta for vectorized functions here: https://github.com/pydata/xarray/blob/3820fb77256682d909c1e41d962e29bec0edd62d/xarray/core/computation.py#L1008-L1011

@ulijh or @mathause, is either of you up for sending in a PR?

For now, the workaround is to pass your own meta with the right dtype: `meta=np.ndarray((0, 0), dtype=da.dtype)`
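A minimal sketch of that workaround, using NumPy only. The helper name `empty_meta_like` is hypothetical (it is not part of xarray or dask); it builds a size-0 array whose number of dimensions and dtype match the input, which could then be passed as `meta`. The sample array stands in for the data from the report.

```python
import numpy as np


def empty_meta_like(arr):
    """Return a size-0 ndarray with the same ndim and dtype as `arr`.

    Hypothetical helper for illustration; not an xarray or dask function.
    """
    # One zero-length axis per dimension of the input.
    return np.ndarray((0,) * arr.ndim, dtype=arr.dtype)


# Stand-in integer data; any object with .ndim and .dtype would do.
data = np.arange(2 * 3 * 4).reshape(2, 3, 4)
meta = empty_meta_like(data)
```

The comment above hard-codes a shape of `(0, 0)`; deriving the shape from `arr.ndim` is a design choice made here so the same helper can be reused for arrays of other ranks.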

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc gives wrong dtype with dask=parallelized and vectorized=True 608974755
622374824 https://github.com/pydata/xarray/issues/4015#issuecomment-622374824 https://api.github.com/repos/pydata/xarray/issues/4015 MDEyOklzc3VlQ29tbWVudDYyMjM3NDgyNA== mathause 10194086 2020-05-01T12:49:42Z 2020-05-01T12:49:42Z MEMBER

It works when you set `vectorize=False`:

```python
da2 = xr.apply_ufunc(
    func,
    da,
    vectorize=False,
    dask="parallelized",
    output_dtypes=[da.dtype],
)
assert da2.dtype == da.dtype, "wrong dtype"
```

or when you pass your own meta:

```python
da2 = xr.apply_ufunc(
    func,
    da,
    vectorize=True,
    dask="parallelized",
    output_dtypes=[da.dtype],
    meta=da,
)

assert da2.dtype == da.dtype, "wrong dtype"
```

This also goes wrong if the DataArray has another dtype, e.g. int:

```python
# Integer input; `func` is the function from the original report.
da = xr.DataArray(np.arange(2 * 3 * 4).reshape(2, 3, 4))
da = da.chunk(dict(dim_1=1))
da2 = xr.apply_ufunc(
    func,
    da,
    vectorize=True,
    dask="parallelized",
    output_dtypes=[da.dtype],
)

assert da2.dtype == da.dtype, "wrong dtype"
```

Indeed, the dtype of `meta` takes precedence over the `dtype` argument, as the dask docstring states (https://github.com/dask/dask/blob/25005e19cc30e8b2877d4dadbaef378ee912bdc0/dask/array/core.py#L1022):

meta : empty ndarray
    empty ndarray created with same NumPy backend, ndim and dtype as the
    Dask Array being created (overrides dtype)
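
To make the precedence rule concrete, here is a toy model in plain NumPy. It is not dask's implementation: `reported_dtype` is a hypothetical function that only encodes the quoted sentence ("overrides dtype"), and it assumes the meta was created without an explicit dtype, which is what the thread suggests happens for vectorized functions.

```python
import numpy as np


def reported_dtype(dtype=None, meta=None):
    """Toy model of the quoted rule: if meta is given, its dtype wins.

    Hypothetical function for illustration; not dask code.
    """
    if meta is not None:
        return np.dtype(meta.dtype)
    if dtype is None:
        raise ValueError("either dtype or meta is required")
    return np.dtype(dtype)


int_dtype = np.arange(3).dtype  # platform default integer dtype

# Meta built without a dtype: NumPy defaults to float64.
float_meta = np.ndarray((0, 0))
mismatch = reported_dtype(dtype=int_dtype, meta=float_meta)

# Meta built with the input dtype: the reported dtype matches again.
typed_meta = np.ndarray((0, 0), dtype=int_dtype)
fixed = reported_dtype(dtype=int_dtype, meta=typed_meta)
```

Under this model, `mismatch` is a float dtype even though an integer dtype was requested, while `fixed` agrees with the requested dtype.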
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc gives wrong dtype with dask=parallelized and vectorized=True 608974755


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
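
The filter in the page heading (author_association = "MEMBER", issue = 608974755, sorted by updated_at descending) can be reproduced against this schema roughly as follows. This is a sketch using Python's standard `sqlite3` module with an in-memory database; the table is trimmed to the columns the query needs, and the two rows reuse the ids, user ids, and timestamps shown in the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed version of the issue_comments table (foreign keys omitted).
conn.execute(
    """
    CREATE TABLE [issue_comments] (
       [id] INTEGER PRIMARY KEY,
       [user] INTEGER,
       [created_at] TEXT,
       [updated_at] TEXT,
       [author_association] TEXT,
       [issue] INTEGER
    )
    """
)
conn.execute(
    "CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue])"
)

rows = [
    (622374824, 10194086, "2020-05-01T12:49:42Z", "2020-05-01T12:49:42Z",
     "MEMBER", 608974755),
    (622599058, 2448579, "2020-05-01T22:47:36Z", "2020-05-01T22:47:36Z",
     "MEMBER", 608974755),
]
conn.executemany("INSERT INTO [issue_comments] VALUES (?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
query = """
    SELECT [id], [user], [updated_at]
    FROM [issue_comments]
    WHERE [author_association] = ? AND [issue] = ?
    ORDER BY [updated_at] DESC
"""
result = conn.execute(query, ("MEMBER", 608974755)).fetchall()
```

Parameter placeholders (`?`) are used instead of string formatting so the filter values are never interpolated into the SQL text.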