html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/4208#issuecomment-663148752,https://api.github.com/repos/pydata/xarray/issues/4208,663148752,MDEyOklzc3VlQ29tbWVudDY2MzE0ODc1Mg==,306380,2020-07-23T17:57:55Z,2020-07-23T17:57:55Z,MEMBER,Dask collections tokenize quickly.  We just use the name I think.,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653430454
https://github.com/pydata/xarray/issues/4208#issuecomment-663135877,https://api.github.com/repos/pydata/xarray/issues/4208,663135877,MDEyOklzc3VlQ29tbWVudDY2MzEzNTg3Nw==,2448579,2020-07-23T17:31:18Z,2020-07-23T17:31:18Z,MEMBER,"Re:rechunk, this should be part of the spec I guess. We need this for `DataArray.chunk()`.

xarray does do some automatic rechunking in `variable.py`. But this comment:
```
            # chunked data should come out with the same chunks; this makes
            # it feasible to combine shifted and unshifted data
            # TODO: remove this once dask.array automatically aligns chunks
```
suggests that we could delete that automatic rechunking today.

>  This will probably be very fast because you're probably just returning the name of the underlying dask array as well as the unit of the pint array/quantity.

ah yes, we can rely on the underlying array library to optimize this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653430454
https://github.com/pydata/xarray/issues/4208#issuecomment-663123118,https://api.github.com/repos/pydata/xarray/issues/4208,663123118,MDEyOklzc3VlQ29tbWVudDY2MzEyMzExOA==,306380,2020-07-23T17:05:30Z,2020-07-23T17:05:30Z,MEMBER,"> That's exactly what's been done in Pint (see hgrecco/pint#1129)! @dcherian's points go beyond just that and address what Pint hasn't covered yet through the standard collection interface.

Ah, great.  My bad.


> how do we ask a duck dask array to rechunk itself? pint seems to forward the .rechunk call but that isn't formalized anywhere AFAICT.

I think that you would want to make a pint array rechunk method that called down to the dask array rechunk method.  My guess is that this might come up in other situations as well.
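
A minimal sketch of that forwarding, using hypothetical stand-in classes rather than the real pint or dask types (`ChunkedStandIn` and `QuantityStandIn` are made up for illustration, and the real `rechunk` signature may differ):

```python
class ChunkedStandIn:
    '''Hypothetical stand-in for a chunked 1-D array; it only tracks chunk sizes.'''

    def __init__(self, size, chunk):
        self.size = size
        self.chunk = chunk

    @property
    def chunks(self):
        # Mimic the dask convention: one tuple of block sizes per dimension.
        full, rest = divmod(self.size, self.chunk)
        sizes = (self.chunk,) * full + ((rest,) if rest else ())
        return (sizes,)

    def rechunk(self, chunk):
        # Return a new object with the requested block size.
        return ChunkedStandIn(self.size, chunk)


class QuantityStandIn:
    '''Hypothetical unit wrapper; this is not pint's actual Quantity class.'''

    def __init__(self, magnitude, units):
        self.magnitude = magnitude
        self.units = units

    @property
    def chunks(self):
        # Expose the wrapped array's chunks, or None if it has none.
        return getattr(self.magnitude, 'chunks', None)

    def rechunk(self, *args, **kwargs):
        # Call down to the wrapped array and re-attach the units.
        return QuantityStandIn(self.magnitude.rechunk(*args, **kwargs), self.units)


q = QuantityStandIn(ChunkedStandIn(size=10, chunk=4), 'meter')
r = q.rechunk(5)
print(q.chunks, r.chunks, r.units)
```

The wrapper never needs to know how rechunking works; it only passes the call through and keeps the units attached.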

> less important: should duck dask arrays cache their token somewhere? dask.array uses .name to do this and xarray uses that to check equality cheaply. We can use tokenize of course. But I'm wondering if it's worth asking duck dask arrays to cache their token as an optimization.

I think that implementing the `dask.base.normalize_token` method should be fine.  This will probably be very fast because you're probably just returning the name of the underlying dask array as well as the unit of the pint array/quantity.  I don't think that caching would be necessary here.

It's also possible that we could look at the `__dask_layers__` method to get this information.  My memory is a bit fuzzy here though.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653430454
https://github.com/pydata/xarray/issues/4208#issuecomment-663119539,https://api.github.com/repos/pydata/xarray/issues/4208,663119539,MDEyOklzc3VlQ29tbWVudDY2MzExOTUzOQ==,306380,2020-07-23T16:58:27Z,2020-07-23T16:58:27Z,MEMBER,My guess is that we could port the xarray.DataArray implementations over to Pint without causing harm.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653430454
https://github.com/pydata/xarray/issues/4208#issuecomment-663119334,https://api.github.com/repos/pydata/xarray/issues/4208,663119334,MDEyOklzc3VlQ29tbWVudDY2MzExOTMzNA==,306380,2020-07-23T16:58:06Z,2020-07-23T16:58:06Z,MEMBER,"In Xarray we implemented the Dask collection spec.  https://docs.dask.org/en/latest/custom-collections.html#the-dask-collection-interface

We might want to do that with Pint as well, if they're going to contain Dask things.  That way Dask operations like `dask.persist`, `dask.visualize`, and `dask.compute` will work normally.  ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653430454
https://github.com/pydata/xarray/issues/4208#issuecomment-663117842,https://api.github.com/repos/pydata/xarray/issues/4208,663117842,MDEyOklzc3VlQ29tbWVudDY2MzExNzg0Mg==,2448579,2020-07-23T16:55:11Z,2020-07-23T16:55:11Z,MEMBER,"A couple of things came up in #4221 
1. how do we ask a duck dask array to rechunk itself? pint seems to forward the `.rechunk` call but that isn't formalized anywhere AFAICT.
2. less important: should duck dask arrays cache their token somewhere? `dask.array` uses `.name` to do this and xarray uses that to check equality cheaply. We can use `tokenize` of course. But I'm wondering if it's worth asking duck dask arrays to cache their token as an optimization.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653430454
https://github.com/pydata/xarray/issues/4208#issuecomment-656358078,https://api.github.com/repos/pydata/xarray/issues/4208,656358078,MDEyOklzc3VlQ29tbWVudDY1NjM1ODA3OA==,2448579,2020-07-09T21:22:56Z,2020-07-09T21:22:56Z,MEMBER,"We have https://github.com/pydata/xarray/blob/master/xarray/core/pycompat.py which defines `dask_array_type` and `sparse_array_type` and then use `isinstance(da, dask_array_type)` in a bunch of places (e.g. duck_array_ops).

re duck array check: @keewis added this recently
https://github.com/pydata/xarray/blob/f3ca63a4ac5c091a92085b477a0d34c08df88aa6/xarray/core/utils.py#L250-L253","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653430454
https://github.com/pydata/xarray/issues/4208#issuecomment-656068407,https://api.github.com/repos/pydata/xarray/issues/4208,656068407,MDEyOklzc3VlQ29tbWVudDY1NjA2ODQwNw==,6213168,2020-07-09T11:18:15Z,2020-07-09T11:19:28Z,MEMBER,"> Is it acceptable for a Pint Quantity to always have the Dask collection interface defined (i.e., be a duck Dask array), even when its magnitude (what it wraps) is not a Dask Array?

I think there are already enough headaches with ``__iter__`` being always defined and confusing libraries such as pandas (https://github.com/hgrecco/pint/issues/1128).
I don't see why pint should be explicitly aware of dask (except in unit tests)? It should only deal with generic NEP18-compatible libraries (numpy, dask, sparse, cupy, etc.).

> How should xarray check for a duck Dask Array?

We should ask the dask team to formalize what defines a ""dask-array-like"", like they already did with dask collections, and implement their definition in xarray.
I'd personally make it ""whatever defines a numpy-array-like AND has a `chunks` attribute AND that attribute is a tuple"".","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653430454
https://github.com/pydata/xarray/issues/4208#issuecomment-655820797,https://api.github.com/repos/pydata/xarray/issues/4208,655820797,MDEyOklzc3VlQ29tbWVudDY1NTgyMDc5Nw==,1217238,2020-07-09T00:09:58Z,2020-07-09T00:09:58Z,MEMBER,"It might also make sense to check for one or more of the special dask collection attributes (`__dask_graph__`, `__dask_keys__`, etc)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653430454
https://github.com/pydata/xarray/issues/4208#issuecomment-655810311,https://api.github.com/repos/pydata/xarray/issues/4208,655810311,MDEyOklzc3VlQ29tbWVudDY1NTgxMDMxMQ==,1217238,2020-07-08T23:31:21Z,2020-07-08T23:31:21Z,MEMBER,"Maybe something like this would work?
```
def is_duck_dask_array(x):
    return getattr(x, 'chunks', None) is not None
```
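
To make the behaviour concrete, here is a rough sketch that runs the check against hypothetical stand-in objects (`Chunked` and `Unchunked` are invented for illustration and are not real dask or xarray classes):

```python
def is_duck_dask_array(x):
    return getattr(x, 'chunks', None) is not None


class Chunked:
    # Hypothetical stand-in for a dask-backed array: chunks is a tuple.
    chunks = ((2, 2),)


class Unchunked:
    # Hypothetical stand-in for a wrapper around in-memory data: chunks is None.
    chunks = None


# A plain list has no chunks attribute at all, so getattr falls back to None.
results = [is_duck_dask_array(obj) for obj in (Chunked(), Unchunked(), [1, 2, 3])]
print(results)
```

Only the first object is accepted: both a missing `chunks` attribute and `chunks = None` are treated as not chunked.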

`xarray.DataArray` would pass this test (`chunks` is either `None` for non-dask arrays or a tuple for dask arrays), so this would be consistent with what we already do.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653430454