html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/3378#issuecomment-541315926,https://api.github.com/repos/pydata/xarray/issues/3378,541315926,MDEyOklzc3VlQ29tbWVudDU0MTMxNTkyNg==,6213168,2019-10-12T11:27:52Z,2019-10-12T11:38:11Z,MEMBER,"https://docs.dask.org/en/latest/custom-collections.html#implementing-deterministic-hashing
```python
from dask.base import normalize_token
from xarray import DataArray, Dataset, IndexVariable, Variable


@normalize_token.register(Dataset)
def tokenize_dataset(ds):
    return Dataset, ds._variables, ds._coord_names, ds._attrs


@normalize_token.register(DataArray)
def tokenize_dataarray(da):
    return DataArray, da._variable, da._coords, da._name


# Note: the @singledispatch for IndexVariable must be defined before the one for Variable
@normalize_token.register(IndexVariable)
def tokenize_indexvariable(v):
    # Don't waste time converting pd.Index to np.ndarray
    return IndexVariable, v._dims, v._data.array, v._attrs


@normalize_token.register(Variable)
def tokenize_variable(v):
    # Note: it's v.data, not v._data, in order to cope with the
    # wrappers around NetCDF and the like
    return Variable, v._dims, v.data, v._attrs
```
You'll need to write a dummy normalize_token for when dask is not installed.
Unit tests:
- running tokenize() twice on the same object returns the same result
- changing the content of a data_var (or the variable, for DataArray) changes the output
- changing the content of a coord changes the output
- changing attrs, name, or dimension names changes the output
- whether a variable is a data_var or a coord changes the output
- dask arrays aren't computed
- non-numpy, non-dask NEP18 data is not converted to numpy
- works with xarray's fancy wrappers around NetCDF and the like","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,503578688