html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/525#issuecomment-531603838,https://api.github.com/repos/pydata/xarray/issues/525,531603838,MDEyOklzc3VlQ29tbWVudDUzMTYwMzgzOA==,1217238,2019-09-15T22:13:31Z,2019-09-15T22:13:31Z,MEMBER,"> Would it help for this integration to have pint Quanitites implement the [dask custom collections interface](https://docs.dask.org/en/latest/custom-collections.html) for when it wraps a dask array? ... Then, instead of xarray checking for `isinstance(dask_array_type)`, it could for check for ""duck dask arrays"" (e.g., those with both `__array_function__` and `__dask_graph__`)?
Yes, great idea!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-524572226,https://api.github.com/repos/pydata/xarray/issues/525,524572226,MDEyOklzc3VlQ29tbWVudDUyNDU3MjIyNg==,1217238,2019-08-24T18:41:13Z,2019-08-24T18:41:13Z,MEMBER,"The other virtue of the separate package is a faster release cycle. We can (and should!) still put a full example in the xarray docs.
For IO integration, I think the simplest choice would be to write utility functions for going to pint arrays from unit-free arrays with string “units” in attrs (and back). That will be easily composable with xarray’s existing IO functionality.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-524569964,https://api.github.com/repos/pydata/xarray/issues/525,524569964,MDEyOklzc3VlQ29tbWVudDUyNDU2OTk2NA==,1217238,2019-08-24T18:03:24Z,2019-08-24T18:03:24Z,MEMBER,"That said, I agree that the dedicated accessor is a pretty good interface, especially if you want more methods/property than `units` and `to`. Even then `array.pint.units` and `array.pint.to()` is pretty readable. This could make sense if we are concerned about different interfaces for different unit libraries.
The accessor route is also definitely more conservative than putting first-class unit support in xarray proper, which I like.
As for where to put it, that's sort of up to you. I think it's probably going to get complicated enough that it should into a library that can be installed, rather than being a boilerplate example in xarray's docs. It could be in xarray if it's going to be very minimal (e.g., only one method + one property) but if you want more than that -- and judging by pint-pandas I would guess you do -- then it should probably go into pint or a dedicated package, e.g., pint-xarray.
I would not write a general purpose xarray+units library unless you're particularly excited about doing that for some reason -- it's much better to start with a particular integration that works well than to make a half-hearted effort at something general purpose without well-motivated use-cases. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-524568953,https://api.github.com/repos/pydata/xarray/issues/525,524568953,MDEyOklzc3VlQ29tbWVudDUyNDU2ODk1Mw==,1217238,2019-08-24T17:47:22Z,2019-08-24T17:47:22Z,MEMBER,"> With the units attribute, I was presuming based on the past comments that `DataArray.units` would be a new property; I forgot that `DataArray.` passes along to `DataArray.attrs.`, so that implementing something new for `DataArray.units` would be a breaking change!
I think the new property is still an option, even if we want to preserve accessing `""units""` from `attrs`, e.g.,
```python
@property
def units(self):
if hasattr(self.data, 'units'):
# data is an array with units
return self.data.units
elif 'units' in self.attrs:
# consider issuing a FutureWarning here?
return self.attrs['units']
else:
raise AttributeError('units')
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-514880353,https://api.github.com/repos/pydata/xarray/issues/525,514880353,MDEyOklzc3VlQ29tbWVudDUxNDg4MDM1Mw==,1217238,2019-07-25T03:25:47Z,2019-08-24T05:02:39Z,MEMBER,"I think we could do basic indexes with units after steps (1) and (2) in the big index refactor plan:
https://github.com/pydata/xarray/issues/1603#issuecomment-511126208
At that point, indexes will be something that are propagated entirely separately from arrays. So even if the index will get cast into a pandas.index, the corresponding coordinate array will stick around.
The next level of support would be ""unit array indexing"", e.g., `ds.sel(x=1000*u.meters)`. This will require an API for letting you define your own index classes -- something that we definitely want in the long term but will take more work to realize.","{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-524520588,https://api.github.com/repos/pydata/xarray/issues/525,524520588,MDEyOklzc3VlQ29tbWVudDUyNDUyMDU4OA==,1217238,2019-08-24T05:01:54Z,2019-08-24T05:01:54Z,MEMBER,"For the general xarray method, I think we would probably want something like `DataArray.units_convert` or `DataArray.units_to`. Or potentially this could use an accessor, e.g., `DataArray.pint.to` (which will always be a fallback option).
For the Dataset repr, it would probably be nice to print the units along with some of the array values. So yes, this could probably use some custom logic for recognizing quantity types, among other duck array types. If the number of distinct array types starts to get burdensomely large, we could expose an interface for registering new ones, e.g., `xarray.register_inline_array_repr(array_type, inline_repr)`.
For rolling out a new units attribute and/or IO integration, we will need to be careful to preserve backwards compatibility for now (at least with a warning). I’m sure there is lots of code that expects `array.attrs` to be a string attribute today, so we should consider our options carefully before breaking all that code. The conservative choice would be to keep existing uses working for now unchanged (as a fallback), and save breaking changes for later once we are confident we know the right solution.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-483814686,https://api.github.com/repos/pydata/xarray/issues/525,483814686,MDEyOklzc3VlQ29tbWVudDQ4MzgxNDY4Ng==,1217238,2019-04-16T19:39:23Z,2019-04-16T19:39:23Z,MEMBER,"> Would `__array_function__` solve the problem with operator precedence? I thought they are separate issues because `__mul__` and `__rmul__` need not call any `numpy` functions, and will therefore not necessary dispatch to `__array_function__`.
Let me try to answer this more clearly: these are independent examples of the same problem.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-483812516,https://api.github.com/repos/pydata/xarray/issues/525,483812516,MDEyOklzc3VlQ29tbWVudDQ4MzgxMjUxNg==,1217238,2019-04-16T19:32:28Z,2019-04-16T19:32:28Z,MEMBER,There's a whole section in NEP 13 about this: http://www.numpy.org/neps/nep-0013-ufunc-overrides.html#type-casting-hierarchy,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-483812113,https://api.github.com/repos/pydata/xarray/issues/525,483812113,MDEyOklzc3VlQ29tbWVudDQ4MzgxMjExMw==,1217238,2019-04-16T19:31:13Z,2019-04-16T19:31:13Z,MEMBER,"`__array_function__` and `__array_ufunc__` have the exact same operator precedence issues as `__mul__`/`__rmul__`. In all cases, properly written methods should return `NotImplemented` in some cases.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-483803067,https://api.github.com/repos/pydata/xarray/issues/525,483803067,MDEyOklzc3VlQ29tbWVudDQ4MzgwMzA2Nw==,1217238,2019-04-16T19:03:43Z,2019-04-16T19:03:43Z,MEMBER,Another place to get started with this would be implementing `__array_function__` in pint: https://github.com/hgrecco/pint/issues/790,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-483802699,https://api.github.com/repos/pydata/xarray/issues/525,483802699,MDEyOklzc3VlQ29tbWVudDQ4MzgwMjY5OQ==,1217238,2019-04-16T19:02:37Z,2019-04-16T19:02:37Z,MEMBER,"> Your first example dispatches to `dask.array.__mul__`, which doesn't know anything about pint and doesn't know how to compose its operations because there are no hooks--the pint array just gets coerced to a numpy array. The second goes to `pint.Quantity.__mul__`, which assumes it can wrap the `dask.array` (because it duck typing) and seems to succeed in doing so.
Unfortunately I don't think either dask or pint handle this properly right now.
There is a protocol for Python's `*` operator, which involves calling `__mul__` and `__rmul__` methods. But if both dask and pint *always* returns the result instead of `NotImplemented`, it is impossible to ensure a consistent result for `a * b` and `b * a` if `a` and `b` are different types. (This exact same issue exists for `__array_function__`, too, because the dispatching protocol is copied from Python.)
Dask and pint need *some* system -- either opt-in or opt-out -- for recognizing that they cannot handle operations with some argument types.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-482651181,https://api.github.com/repos/pydata/xarray/issues/525,482651181,MDEyOklzc3VlQ29tbWVudDQ4MjY1MTE4MQ==,1217238,2019-04-12T17:09:10Z,2019-04-12T17:09:10Z,MEMBER,"> This would be a good issue to fix upstream, by resolving whether dask should wrap pint or pint should wrap dask :).
For what it's worth, I suspect the optimal wrapping order is: xarrays > pint > dask > numpy. This is because it's useful to get unit compatibility errors at ""graph construction time"" rather than ""runtime"".","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-482650523,https://api.github.com/repos/pydata/xarray/issues/525,482650523,MDEyOklzc3VlQ29tbWVudDQ4MjY1MDUyMw==,1217238,2019-04-12T17:06:49Z,2019-04-12T17:06:49Z,MEMBER,"> One additional issue. It seems like pint has some odd behavior with dask. Multiplication (and I assume addition) is not commutative:
This would be a good issue to fix upstream, by resolving whether dask should wrap pint or pint should wrap dask :).
> I would really like to see units support in xarray, and I'm just wondering what the barrier to contribution to this issue is? Like is this a ""leave @shoyer to it"" kind of task? Or is it something which less experienced developers (such as myself) can help with?
I don't think there's a huge barrier to entry here and I would encourage someone else to dive into this. We could start by adding an experimental flag to xarray to disable coercing everything to numpy arrays, and do some experiments to see what works.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-482643387,https://api.github.com/repos/pydata/xarray/issues/525,482643387,MDEyOklzc3VlQ29tbWVudDQ4MjY0MzM4Nw==,1217238,2019-04-12T16:44:16Z,2019-04-12T16:44:16Z,MEMBER,"
(I just added a third bullet to my list above)
> > 2. once our minimum required numpy version is 1.17
>
> @shoyer - what would be an approximate time frame for this?
First, we'll need to wait for NumPy 1.17 to be released :). But more seriously, if we do a breaking release of xarray we can probably with bumping the required NumPy version significantly.
It's definitely a smoother experience for users if we allow at least slightly older versions of NumPy (e.g., so they can use newer xarray with a version of NumPy pre-packaged with their system), but if keeping existing things working with the current version of NumPy is a pain, then it may be worth upgrading the minimum required version.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-482641808,https://api.github.com/repos/pydata/xarray/issues/525,482641808,MDEyOklzc3VlQ29tbWVudDQ4MjY0MTgwOA==,1217238,2019-04-12T16:39:13Z,2019-04-12T16:40:47Z,MEMBER,"Three things will need to change internally in xarray:
1. `.data` is currently required to return a NumPy or dask array. This will need to be relaxed to include ""any duck array type"". (For now, we can store an explicit list of these types.)
2. We need to rewrite Xarray's internal array operations, found in `xarray/core/duck_array_ops.py`, to use NumPy's API when `__array_function__` is enabled instead of our ad-hoc checks. Eventually (once our minimum required numpy version is 1.17), we should be able to delete most of `duck_array_ops` entirely!
3. We should figure out what a minimal ""units layer"" would look like in xarray, exposing a few attributes or methods that call out to underlying unit implementations, e.g., `DataArray.units` should be redirected to pull out `DataArray.data.units`","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-457871736,https://api.github.com/repos/pydata/xarray/issues/525,457871736,MDEyOklzc3VlQ29tbWVudDQ1Nzg3MTczNg==,1217238,2019-01-26T22:35:19Z,2019-01-26T22:35:26Z,MEMBER,This is still relevant. Hopefully the advent of `__array_function__` in NumPy will make this easier/possible.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-248081325,https://api.github.com/repos/pydata/xarray/issues/525,248081325,MDEyOklzc3VlQ29tbWVudDI0ODA4MTMyNQ==,1217238,2016-09-19T18:35:14Z,2016-09-19T18:35:14Z,MEMBER,"@burnpanck Take a look at the approach described in #988 and let me know if you think that sounds viable.
NumPy subclasses inside xarray objects would probably mostly work, if we changed some internal uses of `np.asarray` to `np.asanyarray`. But it's also a pretty big rabbit hole. I'm still not sure there are any good ways to do operations like `concatenate`.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-242937257,https://api.github.com/repos/pydata/xarray/issues/525,242937257,MDEyOklzc3VlQ29tbWVudDI0MjkzNzI1Nw==,1217238,2016-08-27T19:50:22Z,2016-08-27T19:50:22Z,MEMBER,"#988 describes a possible approach for allowing third-party libraries to add units to xarray. It's not as comprehensive as a custom dtype, but might be enough to be useful.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-182744774,https://api.github.com/repos/pydata/xarray/issues/525,182744774,MDEyOklzc3VlQ29tbWVudDE4Mjc0NDc3NA==,1217238,2016-02-11T07:19:19Z,2016-02-11T07:19:19Z,MEMBER,"If anyone is excited about working on the NumPy improvement we need to make this more feasible (namely, custom dtypes and duck typing) at [BIDS](http://bids.berkeley.edu/), you should talk to @njsmith.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-181938629,https://api.github.com/repos/pydata/xarray/issues/525,181938629,MDEyOklzc3VlQ29tbWVudDE4MTkzODYyOQ==,1217238,2016-02-09T16:21:27Z,2016-02-09T16:21:27Z,MEMBER,"@hgrecco Are you suggesting that pint could wrap xarray objects, or that xarray could wrap pint? Either is certainly possible, though I'm a bit pessimistic that we can come up with a complete solution without the numpy fixes we've been discussing.
Also, just to note, xarray contains a `Dataset` type for representing collections of variables that often have different units (e.g., temperature and pressure). That suggests to me that it could make more sense to put pint and/or astropy.Quantity objects inside xarray arrays rather than the other way around.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-141853078,https://api.github.com/repos/pydata/xarray/issues/525,141853078,MDEyOklzc3VlQ29tbWVudDE0MTg1MzA3OA==,1217238,2015-09-21T01:28:18Z,2016-02-09T16:16:38Z,MEMBER,"@mhvk It would certainly be possible to extend dask.array to handle units, in either of the ways you suggest.
Although you could allow `Quantity` objects inside dask.arrays, I don't like that approach, because static checks like units really should be done only once when arrays are constructed (akin to dtype checks) rather than at evaluation time, and for every chunk. This suggests that tagging on the outside is the better approach.
So far, so good -- but with the current state of duck array typing in NumPy, it's really hard to be happy with this. Until `__numpy_ufunc__` lands, we can't override operations like `np.sqrt` in a way that is remotely feasible for dask.arrays (we can't afford to load big arrays into memory). Likewise, we need overrides for standard numpy array utility functions like `concatenate`. But the worst part is that the lack of standard interfaces means that we lose the possibility of composing different arrays backends with your `Quantity` type -- it will only be able to wrap dask or numpy arrays, not sparse matrices or [bolt](http://bolt-project.org/) arrays or some other type yet to be invented.
Once we have all that duck-array stuff, then yes, you certainly could write all a duck-array `Quantity` type that can wrap generic duck-arrays. But something like `Quantity` really only needs to override compute operations so that they can propagate dtypes -- there shouldn't be a need to override methods like `concatenate`. If you had an actual (parametric) dtype for units (e.g., `Quantity[float64, 'meters']`), then you would get all those dtype agnostic methods for free, which would make your life as an implementer much easier. Hence why I think custom dtypes would really be the ideal solution.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-131876284,https://api.github.com/repos/pydata/xarray/issues/525,131876284,MDEyOklzc3VlQ29tbWVudDEzMTg3NjI4NA==,1217238,2015-08-17T16:14:56Z,2015-08-17T16:14:56Z,MEMBER,"Unfortunately, the astropy approach uses a numpy.ndarray subclass, which means it's mutually exclusive with dask.array. Otherwise, it does look very nice, though.
On Mon, Aug 17, 2015 at 8:38 AM, Ryan Abernathey notifications@github.com
wrote:
> Astropy has pretty good units support:
> http://astropy.readthedocs.org/en/latest/units/
>
> ## Would it be possible to copy what they do?
>
> Reply to this email directly or view it on GitHub:
> https://github.com/xray/xray/issues/525#issuecomment-131866203
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585
https://github.com/pydata/xarray/issues/525#issuecomment-130134807,https://api.github.com/repos/pydata/xarray/issues/525,130134807,MDEyOklzc3VlQ29tbWVudDEzMDEzNDgwNw==,1217238,2015-08-12T02:01:01Z,2015-08-12T02:01:01Z,MEMBER,"Eventually, I hope so!
Unfortunately, doing this in a feasible and maintainable way will probably require upstream fixes in NumPy. In particular, better support for duck-array types (https://github.com/numpy/numpy/issues/4164) and/or the ability to write units as a custom NumPy dtypes. Both of these are on the NumPy roadmap, though they don't have a timeframe for when that will happen.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585