issue_comments: 541315926
html_url: https://github.com/pydata/xarray/issues/3378#issuecomment-541315926
issue_url: https://api.github.com/repos/pydata/xarray/issues/3378
id: 541315926
node_id: MDEyOklzc3VlQ29tbWVudDU0MTMxNTkyNg==
user: 6213168
created_at: 2019-10-12T11:27:52Z
updated_at: 2019-10-12T11:38:11Z
author_association: MEMBER
body:

https://docs.dask.org/en/latest/custom-collections.html#implementing-deterministic-hashing

```python
@normalize_token.register(Dataset)
def tokenize_dataset(ds):
    return Dataset, ds._variables, ds._coord_names, ds._attrs


@normalize_token.register(DataArray)
def tokenize_dataarray(da):
    return DataArray, da._variable, da._coords, da._name


# Note: the @singledispatch for IndexVariable must be defined before
# the one for Variable
@normalize_token.register(IndexVariable)
def tokenize_indexvariable(v):
    # Don't waste time converting pd.Index to np.ndarray
    return IndexVariable, v._dims, v._data.array, v._attrs


@normalize_token.register(Variable)
def tokenize_variable(v):
    # Note: it's v.data, not v._data, in order to cope with the
    # wrappers around NetCDF and the like
    return Variable, v._dims, v.data, v._attrs
```

You'll need to write a dummy `normalize_token` for when dask is not installed.

Unit tests:

- running `tokenize()` twice on the same object returns the same result
- changing the content of a data_var (or the variable, for a DataArray) changes the output
- changing the content of a coord changes the output
- changing attrs, name, or dimension names changes the output
- whether a variable is a data_var or a coord changes the output
- dask arrays aren't computed
- non-numpy, non-dask NEP18 data is not converted to numpy
- works with xarray's fancy wrappers around NetCDF and the like
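The registration pattern above can be illustrated without dask or xarray installed, using `functools.singledispatch` as a stand-in for dask's `normalize_token` dispatcher. Everything here (`ToyVariable`, the local `tokenize` helper) is hypothetical scaffolding for illustration, not xarray or dask code:

```python
from functools import singledispatch
import hashlib


@singledispatch
def normalize_token(obj):
    # Fallback for unregistered types (sketch only; dask's real
    # dispatcher has many built-in handlers)
    return repr(obj)


class ToyVariable:
    # Hypothetical stand-in for xarray.Variable, reusing its field names
    def __init__(self, dims, data, attrs):
        self._dims = tuple(dims)
        self.data = tuple(data)
        self._attrs = tuple(sorted(attrs.items()))


@normalize_token.register(ToyVariable)
def tokenize_variable(v):
    # Same shape as the proposed handler: type + identity-defining fields
    return ToyVariable.__name__, v._dims, v.data, v._attrs


def tokenize(obj):
    # Deterministic digest of the normalized token (dask hashes similarly)
    return hashlib.sha256(repr(normalize_token(obj)).encode()).hexdigest()
```

With this in place, two objects with equal content hash identically, and any change to data, dims, or attrs changes the token, which is exactly what the unit tests above check for.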
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
issue: 503578688
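The suggested dummy `normalize_token` for when dask is not installed could be sketched as a try/except shim like the following. This is an illustration under stated assumptions, not xarray's actual compatibility code; `FakeDataset` and the handler name are made up:

```python
try:
    from dask.base import normalize_token
except ImportError:
    # Dummy dispatcher: keeps @normalize_token.register decorators
    # importable when dask is missing; registration becomes a no-op.
    class _DummyDispatch:
        def register(self, cls):
            def decorator(func):
                return func
            return decorator

    normalize_token = _DummyDispatch()


class FakeDataset:
    """Hypothetical minimal container standing in for xarray.Dataset."""
    def __init__(self, variables, attrs):
        self._variables = variables
        self._attrs = attrs


@normalize_token.register(FakeDataset)
def tokenize_fake_dataset(ds):
    # Mirrors the proposed Dataset handler, in sketch form
    return (FakeDataset,
            tuple(sorted(ds._variables.items())),
            tuple(sorted(ds._attrs.items())))
```

Whether or not dask is present, the module imports cleanly and the handler stays callable; with dask installed, the registration is real and `dask.base.tokenize` picks it up.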