issue_comments: 1099265057

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/pull/6475#issuecomment-1099265057 https://api.github.com/repos/pydata/xarray/issues/6475 1099265057 IC_kwDOAMm_X85BhXQh 6528957 2022-04-14T14:48:39Z 2022-04-14T14:55:07Z CONTRIBUTOR

Does Zarr v3 have a notion of a "root" group? That feels like a more sensible default to me, both for Xarray and Zarr-Python.

I think we likely need to introduce a separate Hierarchy class as in the early zarrita python prototype and the xtensor-zarr C++ implementation to be able to access the root via public API. The concept of "hierarchy" as a collection of Nodes which are either Arrays or Groups is mentioned in the spec, but there is no corresponding class for this in zarr-python currently.

One issue with relying only on Array and Group as currently implemented in Zarr-Python is that we can create array nodes outside of any group subfolder. For example, one can currently create an Array directly at path 'array1'; this puts the chunks under 'data/root/array1/' and the metadata at 'meta/root/array1.array.json'. However, the root itself is not a Group. A group is basically a subfolder under root (e.g., open_group with path = group1 creates a '/meta/root/group1/' folder and 'meta/root/group1.group.json' metadata). There is no mechanism in the spec to open root directly as a Group!
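To make the layout above concrete, here is a minimal sketch of how node paths map to store keys. The helper names (array_meta_key, group_meta_key, chunk_prefix) are hypothetical and not part of zarr-python; the key patterns are taken only from the examples in this comment, and root handling is deliberately left out.

```python
# Hypothetical helpers illustrating the key layout described above.
# They are NOT part of zarr-python; the patterns come from the examples
# in this comment only.

META_ROOT = "meta/root"
DATA_ROOT = "data/root"


def _clean(path: str) -> str:
    """Normalize a node path and reject the root, which this sketch skips."""
    cleaned = path.strip("/")
    if not cleaned:
        raise ValueError("root handling is intentionally left out of this sketch")
    return cleaned


def array_meta_key(path: str) -> str:
    """Metadata key for an array node at ``path``."""
    return f"{META_ROOT}/{_clean(path)}.array.json"


def group_meta_key(path: str) -> str:
    """Metadata key for a group node at ``path``."""
    return f"{META_ROOT}/{_clean(path)}.group.json"


def chunk_prefix(path: str) -> str:
    """Key prefix under which the chunks of an array at ``path`` are stored."""
    return f"{DATA_ROOT}/{_clean(path)}/"


print(array_meta_key("array1"))
print(chunk_prefix("array1"))
print(group_meta_key("group1"))
```

Note that 'array1' gets its own metadata and chunk keys without any enclosing group key being created, which is the situation described above.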

This sounds fine for now, but I am concerned that it will slow the adoption of Zarr v3. Eventually, we would presumably want to change the default to version 3, but this is difficult to do if it entirely breaks backwards compatibility.

We did define DEFAULT_ZARR_VERSION=2 (privately). If we update that variable to 3 in a future release, the default when zarr_version is not specified will change.
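As a rough illustration of that mechanism, the sketch below resolves an unspecified zarr_version against a module-level default. Only the name DEFAULT_ZARR_VERSION and its value of 2 come from this comment; resolve_zarr_version is a hypothetical helper, and the actual zarr-python code may be organized differently.

```python
from typing import Optional

# Name and value as mentioned in this comment; everything else is illustrative.
DEFAULT_ZARR_VERSION = 2


def resolve_zarr_version(zarr_version: Optional[int] = None) -> int:
    """Return the requested version, or the module default if none was given."""
    if zarr_version is None:
        # Read the global at call time, so changing the constant in a
        # future release changes the behavior of every caller that
        # does not pass zarr_version explicitly.
        return DEFAULT_ZARR_VERSION
    if zarr_version not in (2, 3):
        raise ValueError(f"unsupported zarr_version: {zarr_version!r}")
    return zarr_version


print(resolve_zarr_version())   # falls back to the default
print(resolve_zarr_version(3))  # explicit request wins
```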

My preference would be for the default behavior to try opening Zarr v2, and fall back to opening in v3 mode, even if this requires attempting to open a file from the store. This is similar to how Xarray handles other Zarr versioning issues (e.g., for consolidated metadata). Perhaps Zarr-Python could raise an informative error that we could catch if the Zarr version is incorrect, or even handle this behavior itself?

Yeah, something like this seems feasible on the Zarr side for convenience routines like open_group. The v3 spec requires zarr.json metadata that specifies the protocol version. If we traverse up the directory tree and do not find this file, then the store does not follow the v3 (or a later) spec and we can try opening it as v2.
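A minimal sketch of that detection idea for a plain directory store follows. The function guess_zarr_version and its stop_at parameter are hypothetical, not an existing zarr-python API, and the sketch only checks that zarr.json exists; it does not parse the protocol version from it.

```python
import tempfile
from pathlib import Path


def guess_zarr_version(path, stop_at=None) -> int:
    """Return 3 if zarr.json exists at ``path`` or a parent directory, else 2.

    ``stop_at`` (an assumption of this sketch) bounds the upward search so
    that it does not walk all the way to the filesystem root.
    """
    current = Path(path).resolve()
    stop = Path(stop_at).resolve() if stop_at is not None else None
    for candidate in (current, *current.parents):
        if (candidate / "zarr.json").is_file():
            return 3
        if stop is not None and candidate == stop:
            break
    # No zarr.json found: fall back to treating the store as v2.
    return 2


with tempfile.TemporaryDirectory() as tmp:
    # A v3-style store: zarr.json at the top, group folder nested below.
    v3_store = Path(tmp) / "v3_store"
    nested = v3_store / "meta" / "root" / "group1"
    nested.mkdir(parents=True)
    (v3_store / "zarr.json").write_text("{}")  # placeholder content

    # A v2-style store: no zarr.json anywhere.
    v2_store = Path(tmp) / "v2_store"
    v2_store.mkdir()
    (v2_store / ".zgroup").write_text('{"zarr_format": 2}')

    print(guess_zarr_version(nested, stop_at=tmp))
    print(guess_zarr_version(v2_store, stop_at=tmp))
```

For remote or key-value stores the same check would have to be expressed as key lookups rather than directory traversal, which is where the cost of "attempting to open a file from the store" comes in.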

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1200581329