issue_comments: 1287134964

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/6807#issuecomment-1287134964 https://api.github.com/repos/pydata/xarray/issues/6807 1287134964 IC_kwDOAMm_X85MuB70 2448579 2022-10-21T15:38:27Z 2022-10-21T18:08:49Z MEMBER

IIUC the issue Ryan & Tom are talking about is tied to reading from files.

For example, we read from a zarr store using zarr, then wrap that zarr.Array (or h5py Dataset) with a large number of ExplicitlyIndexed classes that enable more complicated indexing, lazy decoding, etc.

IIUC #4628 is about concatenating such arrays, i.e. neither zarr.Array nor the ExplicitlyIndexed wrappers support concatenation, so we end up calling np.array and forcing a disk read.

With dask or cubed we would have dask(ExplicitlyIndexed(zarr)) or cubed(ExplicitlyIndexed(zarr)) so as long as dask and cubed define concat and we dispatch to them, everything is 👍🏾
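
A rough, library-free sketch of the dispatch idea, in case it helps. Everything here is hypothetical: `FakeStore` stands in for a zarr.Array or h5py Dataset, `LazyWrapper` for an ExplicitlyIndexed-style wrapper, `ChunkedArray` for a dask or cubed array, and `concat` for the dispatching function. None of these are real xarray, dask, or cubed APIs; the point is only to show when the read happens.

```python
class FakeStore:
    """Hypothetical stand-in for an on-disk array; counts simulated reads."""

    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def __getitem__(self, key):
        self.reads += 1  # every index operation counts as a disk read
        return self._values[key]


class LazyWrapper:
    """Hypothetical indexing-only wrapper: it defines no concat of its own."""

    def __init__(self, store):
        self.store = store

    def __getitem__(self, key):
        return self.store[key]


class ChunkedArray:
    """Hypothetical dask/cubed-like container holding lazy chunks."""

    def __init__(self, chunks):
        self.chunks = list(chunks)

    def concat(self, other):
        # Only the chunk lists are joined; no chunk is read here.
        return ChunkedArray(self.chunks + other.chunks)

    def compute(self):
        out = []
        for chunk in self.chunks:
            out.extend(chunk[:])  # the read happens only now
        return out


def concat(a, b):
    """Dispatch to the array's own concat if it has one, else materialize."""
    if hasattr(a, "concat"):
        return a.concat(b)
    # Fallback, analogous to calling np.array: forces a read of both inputs.
    return list(a[:]) + list(b[:])


# Bare wrappers: concat has to load the data immediately.
store_a, store_b = FakeStore([1, 2]), FakeStore([3, 4])
eager = concat(LazyWrapper(store_a), LazyWrapper(store_b))

# Wrappers inside a chunked container: concat stays lazy until compute().
store_c, store_d = FakeStore([1, 2]), FakeStore([3, 4])
lazy = concat(
    ChunkedArray([LazyWrapper(store_c)]),
    ChunkedArray([LazyWrapper(store_d)]),
)
reads_before_compute = store_c.reads + store_d.reads
result = lazy.compute()
```

In this toy version the bare-wrapper path reads both stores as soon as `concat` is called, while the chunked path does not touch either store until `compute()`.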

PS: This is what I was attempting to explain (not very clearly) in the distributed arrays meeting. We don't ever use dask.array.from_zarr, for example. We use zarr to read, then wrap in ExplicitlyIndexed, and then pass to dask.array.from_array.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1308715638