issue_comments: 872441814

html_url: https://github.com/pydata/xarray/issues/5559#issuecomment-872441814
issue_url: https://api.github.com/repos/pydata/xarray/issues/5559
id: 872441814
node_id: MDEyOklzc3VlQ29tbWVudDg3MjQ0MTgxNA==
user: 3460034
created_at: 2021-07-01T17:57:32Z
updated_at: 2021-07-01T18:05:19Z
author_association: CONTRIBUTOR
body:

Is it correct that xarray ends up calling dask.array.mean() on the pint.Quantity(dask.Array) object inside the DataArray? I took a guess at that since I can replicate what is happening inside the DataArray with

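```python
import dask.array as da
import pint_xarray  # noqa: F401 (registers the .pint accessor)
import xarray as xr

# Named `arr` so it does not shadow the `da` (dask.array) module alias
arr = xr.DataArray([1, 2, 3], attrs={'units': 'metres'})
chunked = arr.chunk(1).pint.quantify()

# Call dask.array.mean directly on the pint.Quantity(dask.Array) inside the DataArray
da.mean(chunked.variable._data)
```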

Also, the Dask warning "Passing an object to dask.array.from_array which is already a Dask collection. This can lead to unexpected behavior." is a big red flag that the Pint Quantity is making its way into Dask internals where it should not end up.

If so, I think this gets into a thorny issue with duck array handling in Dask. It was decided in https://github.com/dask/dask/pull/6393 that deliberately calling Dask array operations like elemwise (so, presumably by extension, blockwise and the reductions in dask.array.reductions like mean()) on a non-Dask array implies that the user wants to turn that array into a Dask array. This gets problematic, however, for upcast types like Pint Quantities that wrap Dask Arrays, since then you can get dask.Array(pint.Quantity(dask.Array)), which is what I think is going on here?
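
To make the nesting concrete, here is a toy model in plain Python. Everything in it is hypothetical: `LazyArray` and `Quantity` are stand-ins for dask.Array and pint.Quantity, and `coerce_always` / `coerce_deferring` are invented names that only mimic the input-coercion step. None of this is Dask or Pint code, and the "deferring" policy is just one possible alternative for comparison, not something the linked PR decided on.

```python
# Toy model only: hypothetical stand-ins, not the Dask or Pint APIs.

class LazyArray:
    """Plays the role of dask.Array (a lazy, chunked array)."""

    def __init__(self, data):
        self.data = data

    def __repr__(self):
        return f"LazyArray({self.data!r})"


class Quantity:
    """Plays the role of pint.Quantity (an upcast type wrapping another array)."""

    def __init__(self, magnitude, units):
        self.magnitude = magnitude
        self.units = units

    def __repr__(self):
        return f"Quantity({self.magnitude!r})"


def coerce_always(x):
    """'Always convert' policy: wrap anything that is not already a LazyArray."""
    return x if isinstance(x, LazyArray) else LazyArray(x)


def coerce_deferring(x):
    """Assumed alternative: unwrap the upcast type, coerce the inside, rewrap."""
    if isinstance(x, Quantity):
        return Quantity(coerce_deferring(x.magnitude), x.units)
    return coerce_always(x)


q = Quantity(LazyArray([1, 2, 3]), "metres")

nested = coerce_always(q)       # the Quantity ends up buried inside a LazyArray
deferred = coerce_deferring(q)  # the Quantity stays the outermost layer

print(nested)    # LazyArray(Quantity(LazyArray([1, 2, 3])))
print(deferred)  # Quantity(LazyArray([1, 2, 3]))
```

In the toy model, the first policy reproduces the dask.Array(pint.Quantity(dask.Array)) shape described above, while the second keeps the wrapper on the outside.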

If this all checks out, I believe this becomes a Dask issue to improve upcast type/duck Dask array handling.

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: 935062144