html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/3378#issuecomment-541315926,https://api.github.com/repos/pydata/xarray/issues/3378,541315926,MDEyOklzc3VlQ29tbWVudDU0MTMxNTkyNg==,6213168,2019-10-12T11:27:52Z,2019-10-12T11:38:11Z,MEMBER,"https://docs.dask.org/en/latest/custom-collections.html#implementing-deterministic-hashing

```python
@normalize_token.register(Dataset)
def tokenize_dataset(ds):
    return Dataset, ds._variables, ds._coord_names, ds._attrs

@normalize_token.register(DataArray)
def tokenize_dataarray(da):
    return DataArray, da._variable, da._coords, da._name

# Note: the @singledispatch for IndexVariable must be defined before the one for Variable
@normalize_token.register(IndexVariable)
def tokenize_indexvariable(v):
    # Don't waste time converting pd.Index to np.ndarray
    return IndexVariable, v._dims, v._data.array, v._attrs

@normalize_token.register(Variable)
def tokenize_variable(v):
    # Note: it's v.data, not v._data, in order to cope with the
    # wrappers around NetCDF and the like
    return Variable, v._dims, v.data, v._attrs
```

You'll need to write a dummy normalize_token for when dask is not installed.

Unit tests:
- running tokenize() twice on the same object returns the same result
- changing the content of a data_var (or the variable, for DataArray) changes the output
- changing the content of a coord changes the output
- changing attrs, name, or dimension names changes the output
- whether a variable is a data_var or a coord changes the output
- dask arrays aren't computed
- non-numpy, non-dask NEP18 data is not converted to numpy
- works with xarray's fancy wrappers around NetCDF and the like","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,503578688