home / github / issues

Menu
  • GraphQL API
  • Search all tables

issues: 272004812

This data as json

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
272004812 MDU6SXNzdWUyNzIwMDQ4MTI= 1699 apply_ufunc(dask='parallelized') output_dtypes for datasets 6213168 open 0     8 2017-11-07T22:18:23Z 2020-04-06T15:31:17Z   MEMBER      

When a Dataset has variables with different dtypes, there's no way to tell apply_ufunc that the same function applied to different variables will produce different dtypes:

``` ds1 = xarray.Dataset(data_vars={'a': ('x', [1, 2]), 'b': ('x', [3.0, 4.5])}).chunk() ds2 = xarray.apply_ufunc(lambda x: x + 1, ds1, dask='parallelized', output_dtypes=[float]) ds2

<xarray.Dataset> Dimensions: (x: 2) Dimensions without coordinates: x Data variables: a (x) float64 dask.array<shape=(2,), chunksize=(2,)> b (x) float64 dask.array<shape=(2,), chunksize=(2,)>

ds2.compute()

<xarray.Dataset> Dimensions: (x: 2) Dimensions without coordinates: x Data variables: a (x) int64 2 3 b (x) float64 4.0 5.5 ```

Proposed solution

When the output is a dataset, apply_ufunc could accept either output_dtypes=[t] (if all output variables will have the same dtype) or output_dtypes=[{var1: t1, var2: t2, ...}]. In the example above, it would be output_dtypes=[{'a': int, 'b': float}].

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1699/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    13221727 issue

Links from other tables

  • 3 rows from issues_id in issues_labels
  • 8 rows from issue in issue_comments
Powered by Datasette · Queries took 0.574ms · About: xarray-datasette