
issue_comments: 965803186


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/pull/5961#issuecomment-965803186 https://api.github.com/repos/pydata/xarray/issues/5961 965803186 IC_kwDOAMm_X845kPyy 35968931 2021-11-10T22:26:12Z 2021-11-11T01:08:14Z MEMBER

Update: I tried making a custom mapping class (code in drop-down below), then swapping out ._variables = dict(variables) for ._variables = DataManifest(variables=variables) in the Dataset constructors to see if that would break anything, as a first step towards that kind of integration. (At some point it would be good to be able to automatically run the test_dataset.py tests again for the manifest case.)

It kind of works?

From an xarray.Dataset perspective, Dataset._variables just needs to be a MutableMapping of xarray.Variable objects.

It's not quite as simple as this - you also need a .copy method (fine) and a repr (okay), and there are several places inside Dataset and DataArray that explicitly check that the type of ._variables is a dict.
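To make the type-check problem concrete, here is a minimal sketch of what relaxing such a check could look like. The helpers `check_variables_strict` and `check_variables_relaxed` are hypothetical and do not exist in xarray, and `TinyStore` is only a stand-in for a custom mapping; the point is just that a `MutableMapping` subclass fails an `isinstance(..., dict)` test while a plain dict passes both.

```python
from collections.abc import MutableMapping


class TinyStore(MutableMapping):
    """Hypothetical stand-in for a custom mapping; wraps a plain dict."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


def check_variables_strict(variables):
    """Hypothetical strict check: accepts only real dicts."""
    return isinstance(variables, dict)


def check_variables_relaxed(variables):
    """Hypothetical relaxed check: accepts any MutableMapping, dicts included."""
    return isinstance(variables, MutableMapping)


store = TinyStore({"a": 1})
print(check_variables_strict(store))      # False
print(check_variables_relaxed(store))     # True
print(check_variables_relaxed({"a": 1}))  # True
```

Because `dict` is registered as a virtual subclass of `MutableMapping`, the relaxed check would not reject any input the strict one accepts.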

To get tests to pass I can either relax those type constraints (which leads to >2/3 of test_dataset.py passing immediately) or maybe try making DataManifest inherit from dict so that it passes isinstance(ds._variables, dict)? This probably deserves a new PR...

(EDIT: Though maybe inheriting from dict is more trouble than it's worth)
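One possible reason inheriting from dict could be troublesome (an assumption on my part, not necessarily the concern meant here): in CPython, built-in dict methods such as `update` do not route through an overridden `__setitem__`, so a guard placed there can be bypassed. A small sketch with a hypothetical `GuardedDict`:

```python
class GuardedDict(dict):
    """Hypothetical dict subclass that tries to reserve certain keys."""

    def __init__(self, *args, forbidden=(), **kwargs):
        self._forbidden = set(forbidden)
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        if key in self._forbidden:
            raise KeyError(f"key {key!r} is reserved")
        super().__setitem__(key, value)


d = GuardedDict(forbidden={"child"})

# Item assignment goes through the overridden __setitem__ and is blocked.
try:
    d["child"] = 1
    blocked = False
except KeyError:
    blocked = True

# dict.update does not call the overridden __setitem__ in CPython,
# so the reserved key slips in.
d.update({"child": 2})
bypassed = "child" in d

print(blocked, bypassed)  # True True
```

By contrast, the `update` mixin that `collections.abc.MutableMapping` provides is written in terms of `self[key] = value`, so a guard in `__setitem__` also covers `update` there.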

Code for custom mapping class

```python
from collections.abc import MutableMapping
from typing import Dict, Hashable, Iterator, Optional

from xarray.core.variable import Variable

# from xarray.tree.datatree import DataTree


class DataTree:
    """Purely for type hinting purposes for now (and to avoid a circular import)"""

    ...


class DataManifest(MutableMapping):
    """
    Stores variables like a dict, but also stores children alongside in a hidden manner, to check against.

    Acts like a dict of keys to variables, but prevents setting variables to same key as any children. It prevents name
    collisions by acting as a common record of stored items for both the DataTree instance and its wrapped Dataset
    instance.
    """

    def __init__(
        self,
        variables: Optional[Dict[Hashable, Variable]] = None,
        children: Optional[Dict[Hashable, DataTree]] = None,
    ):
        # Use None defaults so that instances never share one mutable default dict
        variables = variables if variables is not None else {}
        children = children if children is not None else {}

        if variables and children:
            keys_in_both = set(variables.keys()) & set(children.keys())
            if keys_in_both:
                raise KeyError(
                    f"The keys {keys_in_both} exist in both the variables and child nodes"
                )

        self._variables = variables
        self._children = children

    @property
    def children(self) -> Dict[Hashable, DataTree]:
        """Stores the node's children"""
        return self._children

    @children.setter
    def children(self, children: Dict[Hashable, DataTree]):
        for key, child in children.items():
            # self.keys() only covers variables, so this detects variable/child name clashes
            if key in self.keys():
                raise KeyError(
                    f"Cannot add child under key {key} because a variable is already stored under that key"
                )
            if not isinstance(child, DataTree):
                raise TypeError
        self._children = children

    def __getitem__(self, key: Hashable) -> Variable:
        """Forward to the variables here so the manifest acts like a normal dict of variables"""
        return self._variables[key]

    def __setitem__(self, key: Hashable, value: Variable):
        """Allow adding new variables, but first check if they conflict with children"""
        if key in self._children:
            raise KeyError(
                f"key {key} already in use to denote a child "
                "node in wrapping DataTree node"
            )

        if isinstance(value, Variable):
            self._variables[key] = value
        else:
            raise TypeError(f"Cannot store object of type {type(value)}")

    def __delitem__(self, key: Hashable):
        """Forward to the variables here so the manifest acts like a normal dict of variables"""
        if key in self._variables:
            del self._variables[key]
        elif key in self.children:
            # TODO might be better not to del children here?
            del self._children[key]
        else:
            raise KeyError(f"Cannot remove item because nothing is stored under {key}")

    def __contains__(self, item: object) -> bool:
        """Forward to the variables here so the manifest acts like a normal dict of variables"""
        return item in self._variables

    def __iter__(self) -> Iterator:
        """Forward to the variables here so the manifest acts like a normal dict of variables"""
        return iter(self._variables)

    def __len__(self) -> int:
        """Forward to the variables here so the manifest acts like a normal dict of variables"""
        return len(self._variables)

    def copy(self) -> "DataManifest":
        """Required for consistency with dict"""
        return DataManifest(variables=self._variables.copy(), children=self._children.copy())

    # TODO __repr__
```
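As a rough illustration of the intended behaviour, the sketch below uses a trimmed-down stand-in, `MiniManifest`, rather than the class above, and a placeholder `Variable` class instead of `xarray.Variable` so that it runs without xarray installed. Both names are assumptions made for this example only.

```python
from collections.abc import MutableMapping


class Variable:
    """Placeholder for xarray.Variable (assumption, to avoid the dependency)."""


class MiniManifest(MutableMapping):
    """Trimmed-down stand-in: variables are visible, children are hidden."""

    def __init__(self, variables=None, children=None):
        self._variables = dict(variables or {})
        self._children = dict(children or {})
        clash = set(self._variables) & set(self._children)
        if clash:
            raise KeyError(f"The keys {clash} exist in both the variables and child nodes")

    def __getitem__(self, key):
        return self._variables[key]

    def __setitem__(self, key, value):
        if key in self._children:
            raise KeyError(f"key {key} already in use to denote a child node")
        if not isinstance(value, Variable):
            raise TypeError(f"Cannot store object of type {type(value)}")
        self._variables[key] = value

    def __delitem__(self, key):
        del self._variables[key]

    def __iter__(self):
        return iter(self._variables)

    def __len__(self):
        return len(self._variables)


manifest = MiniManifest(variables={"temp": Variable()}, children={"ocean": object()})

# Only variables are visible through the mapping interface.
print(list(manifest))       # ['temp']
print("ocean" in manifest)  # False

# Writing a variable under a child's name is rejected.
try:
    manifest["ocean"] = Variable()
    collided = False
except KeyError:
    collided = True

# The inherited update() goes through __setitem__, so it is guarded too.
try:
    manifest.update({"ocean": Variable()})
    update_guarded = False
except KeyError:
    update_guarded = True

print(collided, update_guarded)  # True True
```

From the outside the object looks like a dict holding one variable, while the hidden child names still block conflicting writes.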
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1048697792