issue_comments: 1034196986

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/6069#issuecomment-1034196986 https://api.github.com/repos/pydata/xarray/issues/6069 1034196986 IC_kwDOAMm_X849pJf6 1217238 2022-02-09T21:12:31Z 2022-02-09T21:12:31Z MEMBER

This isn't allowed because it's ambiguous what to do with the other variables that are not restricted to the region (['cell', 'face', 'layer', 'max_cell_node', 'max_face_nodes', 'node', 'siglay'] in this case).

I can imagine quite a few different ways this behavior could be implemented:

  1. Ignore these variables entirely.
  2. Ignore variables that already exist, but write new ones.
  3. Write or overwrite these variables, both new and existing.
  4. Write new variables. Ignore existing variables only if they already exist with the same values; if not, raise an error.
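
To make the differences between these four options concrete, here is a minimal sketch in plain Python. It is not xarray code: `plan_writes`, the `policy` numbers, and the dict-of-lists inputs are all hypothetical stand-ins for "variables outside the region" and "variables already in the store".

```python
from typing import List, Mapping, Sequence


def plan_writes(
    policy: int,
    incoming: Mapping[str, Sequence],
    existing: Mapping[str, Sequence],
) -> List[str]:
    """Return the names of non-region variables that would be written.

    Hypothetical helper: `incoming` holds the variables outside the region,
    `existing` holds what is already stored. `policy` is 1-4, matching the
    numbered options above.
    """
    new_names = [name for name in incoming if name not in existing]
    if policy == 1:
        # (1) Ignore these variables entirely.
        return []
    if policy == 2:
        # (2) Skip variables that already exist, write the new ones.
        return new_names
    if policy == 3:
        # (3) Write or overwrite everything.
        return list(incoming)
    if policy == 4:
        # (4) Write new ones; existing ones must match or we raise.
        # This comparison is the step that requires reading stored values.
        for name in incoming:
            if name in existing and list(incoming[name]) != list(existing[name]):
                raise ValueError(f"variable {name!r} differs from the stored values")
        return new_names
    raise ValueError(f"unknown policy: {policy}")
```

Note that only policy 4 has to read the stored values, which is where the extra cost comes from.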

I believe your proposal here (removing these checks from _validate_region) would achieve (3), but I'm not sure that's the best option.

(4) seems like perhaps the most user-friendly option, but checking existing variables can add significant overhead. When experimenting with adding region support to Xarray-Beam, I found many cases where it was easy to inadvertently make large parallel pipelines much slower by downloading existing variables.

The current solution is not to do any of these, and to force the user to make an explicit choice by either dropping new variables or writing them in a separate call to to_zarr. I think it would also be OK to let a user explicitly opt in to one of these behaviors, but I don't think guessing what the user wants would be ideal.
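
For reference, the explicit choice might look roughly like the sketch below. The helper names are made up for illustration; it assumes `ds` is an xarray Dataset, that `store` is a Zarr store already containing the full-size arrays, and that "not restricted to the region" means a variable shares no dimension with the keys of `region`.

```python
from typing import List, Mapping, Sequence


def vars_outside_region(
    var_dims: Mapping[str, Sequence[str]],
    region: Mapping[str, slice],
) -> List[str]:
    """Return names of variables that share no dimension with the region."""
    region_dims = set(region)
    return [
        name
        for name, dims in var_dims.items()
        if not region_dims.intersection(dims)
    ]


def write_region_explicitly(ds, store, region) -> List[str]:
    """Drop variables outside the region, then write only the region.

    Hypothetical helper, not part of xarray. Returns the dropped names so
    the caller can decide whether to write them in a separate to_zarr call.
    """
    var_dims = {name: var.dims for name, var in ds.variables.items()}
    outside = vars_outside_region(var_dims, region)
    # The explicit choice: this call does not touch these variables.
    ds.drop_vars(outside).to_zarr(store, region=region)
    return outside
```

With a region such as `{"time": slice(0, 10)}`, variables like 'node' or 'siglay' that have no 'time' dimension would be dropped from the region write and returned to the caller.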

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1077079208