
issue_comments: 1460980509


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/pull/7019#issuecomment-1460980509 https://api.github.com/repos/pydata/xarray/issues/7019 1460980509 IC_kwDOAMm_X85XFMsd 35968931 2023-03-08T22:37:21Z 2023-03-08T22:39:46Z MEMBER

I'm making progress with this PR, and now that @tomwhite implemented cubed.apply_gufunc I've re-routed xarray.apply_ufunc to use whatever version of apply_gufunc is defined by the chosen ChunkManager. This means many basic operations should now just work:

```python
In [1]: import xarray as xr

In [2]: da = xr.DataArray([1, 2, 3], dims='x')

In [3]: da_chunked = da.chunk(from_array_kwargs={'manager': 'cubed'})

In [4]: da_chunked
Out[4]:
<xarray.DataArray (x: 3)>
cubed.Array<array-003, shape=(3,), dtype=int64, chunks=((3,),)>
Dimensions without coordinates: x

In [5]: da_chunked.mean()
Out[5]:
<xarray.DataArray ()>
cubed.Array<array-006, shape=(), dtype=int64, chunks=()>

In [6]: da_chunked.mean().compute()
[cubed.Array<array-009, shape=(), dtype=int64, chunks=()>]
Out[6]:
<xarray.DataArray ()>
array(2)
```

(You need to install both cubed>0.5.0 and the main branch of rechunker for this to work.)
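
To make the re-routing idea concrete, here is a minimal, self-contained sketch of that kind of dispatch. It is not the code in this PR: `ToyChunkManager`, `EagerManager`, `CountingManager`, `MANAGERS` and `apply_ufunc_sketch` are all hypothetical names, and a "chunked array" here is just a list of lists standing in for a dask or cubed array.

```python
from typing import Callable, Dict, List

Chunk = List[float]
Chunks = List[Chunk]  # toy "chunked array": a list of chunks


class ToyChunkManager:
    """Hypothetical base class; a real backend would wrap dask or cubed."""

    def apply_gufunc(self, func: Callable[[Chunk], Chunk], chunks: Chunks) -> Chunks:
        raise NotImplementedError


class EagerManager(ToyChunkManager):
    def apply_gufunc(self, func, chunks):
        # Apply func to every chunk immediately.
        return [func(chunk) for chunk in chunks]


class CountingManager(ToyChunkManager):
    """Same result as EagerManager, but records how many chunks it processed."""

    def __init__(self) -> None:
        self.calls = 0

    def apply_gufunc(self, func, chunks):
        self.calls += len(chunks)
        return [func(chunk) for chunk in chunks]


# Hypothetical registry mapping a manager name to a manager instance.
MANAGERS: Dict[str, ToyChunkManager] = {
    "eager": EagerManager(),
    "counting": CountingManager(),
}


def apply_ufunc_sketch(func, chunks, manager="eager"):
    """Delegate to whichever apply_gufunc the chosen manager defines."""
    try:
        chunk_manager = MANAGERS[manager]
    except KeyError:
        raise ValueError(f"unrecognized chunk manager: {manager!r}") from None
    return chunk_manager.apply_gufunc(func, chunks)


def double(chunk):
    return [2 * x for x in chunk]


result = apply_ufunc_sketch(double, [[1, 2], [3]], manager="counting")
print(result)  # [[2, 4], [6]]
```

The caller only names a manager; everything backend-specific lives behind `apply_gufunc`, which is why swapping one backend for another needs no change at the call site.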

I still have a fair bit more to do on this PR (see checklist at top), but for testing should I:

  1. Start making a test_cubed.py file in xarray as part of this PR with bespoke tests,
  2. Put bespoke tests for xarray wrapping cubed somewhere else (e.g. the cubed repo or a new cubed-xarray repo),
  3. Merge this PR without cubed-specific tests and concentrate on finishing the general duck-array testing framework in #6908 so we can implement option 2 in the way we actually eventually want things to work for 3rd-party duck array libraries?

I would prefer not to let this PR grow to thousands of lines by including tests in it, but waiting for #6908 might also take a while, because that is a fairly ambitious PR too.

The fact that the tests are currently green for this PR (ignoring some mypy stuff) is evidence that the decoupling of dask from xarray is working so far.

(I have already added some tests for the ability to register custom ChunkManagers though.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1368740629