html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/525#issuecomment-733629234,https://api.github.com/repos/pydata/xarray/issues/525,733629234,MDEyOklzc3VlQ29tbWVudDczMzYyOTIzNA==,11289391,2020-11-25T10:48:57Z,2020-11-25T10:48:57Z,CONTRIBUTOR,"Hi! I'm just popping in as a very interested user of both xarray and unit packages to ask: since there's been some awesome progress made here and pint-xarray is now enough of A Thing to have [documentation](https://pint-xarray.readthedocs.io/en/stable/creation.html), though obviously experimental - how much work would you expect a corresponding package for astropy's Quantities to take, given the current state of things? Are there any limitations that would prevent that? I saw the discussion above about Quantities being more problematic due to taking the subclass-from-numpy-arrays route, but I'm not sure how much of a roadblock that still is. I would suspect the API could be shared with pint-xarray (which, obviously, is experimental for now).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-531603357,https://api.github.com/repos/pydata/xarray/issues/525,531603357,MDEyOklzc3VlQ29tbWVudDUzMTYwMzM1Nw==,3460034,2019-09-15T22:04:39Z,2019-09-15T22:04:39Z,CONTRIBUTOR,"Based on the points raised by @crusaderky in https://github.com/hgrecco/pint/issues/878#issue-492678605 about how much special-case handling xarray has for dask arrays, I was thinking recently about what it might take for the xarray > pint > dask.array wrapping discussed here and elsewhere to work as fluidly as xarray > dask.array currently does. 
Would it help for this integration to have pint Quantities implement the [dask custom collections interface](https://docs.dask.org/en/latest/custom-collections.html) for when it wraps a dask array? I would think that this would allow a pint Quantity to behave in a ""dask-array-like"" way rather than just an ""array-like"" way. Then, instead of xarray checking for `isinstance(dask_array_type)`, it could check for ""duck dask arrays"" (e.g., those with both `__array_function__` and `__dask_graph__`)? There are almost certainly some subtle implementation details that would need to be worked out, but I'm guessing that this could take care of the bulk of the integration. Also, if I'm incorrect with this line of thought, or there is a better way forward for implementing this wrapping pattern, please do let me know!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-524570522,https://api.github.com/repos/pydata/xarray/issues/525,524570522,MDEyOklzc3VlQ29tbWVudDUyNDU3MDUyMg==,3460034,2019-08-24T18:12:55Z,2019-08-24T18:39:49Z,CONTRIBUTOR,"@shoyer I agree, the accessor interface makes a lot of sense for this: it's more conservative on the xarray side, while also giving the most flexibility for the pint + xarray integration. Based on your feedback and what I'd hope to see out of the pint + xarray integration, I'm thinking a pint-adjacent package like pint-xarray may be the best route forward. 
~~I'll create an issue on pint to inquire about that possibility.~~ See https://github.com/hgrecco/pint/issues/849.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-524569782,https://api.github.com/repos/pydata/xarray/issues/525,524569782,MDEyOklzc3VlQ29tbWVudDUyNDU2OTc4Mg==,3460034,2019-08-24T18:00:37Z,2019-08-24T18:01:11Z,CONTRIBUTOR,"Oh, okay, having the fallback like that was how I thought about implementing it. (I'm sorry that I didn't describe that in my initial comment.) So would the way forward be to implement `DataArray.units_convert()`/`DataArray.units_to()` and `DataArray.units` as you described right now, but wait for and/or delegate IO integration? Or, should there also be a fully-backwards-compatible IO integration implemented now (such as an optional kwarg on `open_dataset` and `to_netcdf`)?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-524568112,https://api.github.com/repos/pydata/xarray/issues/525,524568112,MDEyOklzc3VlQ29tbWVudDUyNDU2ODExMg==,3460034,2019-08-24T17:34:31Z,2019-08-24T17:34:31Z,CONTRIBUTOR,"@shoyer Thank you for the reply! That sounds good about the repr custom logic. With the units attribute, I was presuming based on the past comments that `DataArray.units` would be a new property; I forgot that `DataArray.` passes along to `DataArray.attrs.`, so that implementing something new for `DataArray.units` would be a breaking change! In trying to avoid such a change, though, I think it would be confusing to have a DataArray-level `DataArray.units_convert` method and not a corresponding DataArray-level way of getting at the units. 
So, would it be okay to just implement this unit interface (unit access, unit conversion, and IO) through an accessor, and start out with just a pint accessor? If so, where should it be implemented? Possible ideas I had: - As a boilerplate example in the xarray documentation that downstream libraries or end-users can implement? - In xarray itself? - In pint or a new pint-adjacent package (similar to [pint-pandas](https://github.com/hgrecco/pint-pandas))? - A new xarray-adjacent package for general-purpose unit compatibility?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-524518305,https://api.github.com/repos/pydata/xarray/issues/525,524518305,MDEyOklzc3VlQ29tbWVudDUyNDUxODMwNQ==,3460034,2019-08-24T04:17:54Z,2019-08-24T04:17:54Z,CONTRIBUTOR,"With the progress being made with https://github.com/pydata/xarray/pull/2956, https://github.com/pydata/xarray/pull/3238, and https://github.com/hgrecco/pint/pull/764, I was thinking that now might be a good time to work out the details of the ""minimal units layer"" mentioned by @shoyer in https://github.com/pydata/xarray/issues/525#issuecomment-482641808 and https://github.com/pydata/xarray/issues/988#issuecomment-413732471? I'd be glad to try putting together a PR that could follow up on https://github.com/pydata/xarray/pull/3238 for it, but I would want to ask for some guidance: (For reference, below is the action list from https://github.com/pydata/xarray/issues/988#issuecomment-413732471) > - The `DataArray.units` property could forward to `DataArray.data.units`. > - A `DataArray.to` or `DataArray.convert` method could call the relevant method on data and re-wrap it in a DataArray. > - A minimal layer on top of xarray's netCDF IO could handle unit attributes by wrapping/unwrapping arrays with pint. 
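A rough sketch of the first two items in that list, using stand-in classes rather than real xarray or pint objects. `ToyQuantity`, `ToyDataArray`, and the `METERS_PER` table are hypothetical names made up for illustration; only the forwarding and re-wrapping pattern is the point:

```python
from dataclasses import dataclass, field

# Hypothetical conversion table: how many meters one of each unit is.
METERS_PER = {'m': 1.0, 'km': 1000.0}


@dataclass
class ToyQuantity:
    # Stand-in for a unit array such as a pint Quantity.
    magnitude: list
    units: str

    def to(self, new_units):
        # Convert through meters and return a new ToyQuantity.
        scaled = [v * METERS_PER[self.units] / METERS_PER[new_units]
                  for v in self.magnitude]
        return ToyQuantity(scaled, new_units)


@dataclass
class ToyDataArray:
    # Stand-in for xarray.DataArray: wrapped data plus metadata.
    data: ToyQuantity
    attrs: dict = field(default_factory=dict)

    @property
    def units(self):
        # Forward to the wrapped array instead of storing units separately.
        return self.data.units

    def to(self, new_units):
        # Call the relevant method on the data and re-wrap the result.
        return ToyDataArray(self.data.to(new_units), dict(self.attrs))


distance = ToyDataArray(ToyQuantity([1500.0, 250.0], 'm'), {'long_name': 'distance'})
converted = distance.to('km')
```

Here `converted.units` comes from the wrapped data and the `attrs` are copied through unchanged, so the units are never held in two places.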
**`DataArray.units`** Having `DataArray.units` forward to `DataArray.data.units` should work for `pint`, `unyt`, and `quantities`, but should a fallback to `DataArray.data.unit` be added for `astropy.units`? Also, how should `DataArray.units` behave if `DataArray.data` does not have a ""units"" or ""unit"" attribute, but `DataArray.attrs['units']` exists? **`DataArray.to()`/`DataArray.convert()`** `DataArray.to()` would be consistent with the methods for `pint`, `unyt`, and `astropy.units` (the relevant method for `quantities` looks to be `.rescale()`), however, it is very similar to the numerous output-related `DataArray.to_*()` methods. Is this okay, or would `DataArray.convert()` or some other method name be better to avoid confusion? **Units and IO** While wrapping and unwrapping arrays with `pint` itself should be straightforward, I really don't know what the best API for it should be, especially for input. Some possibilities that came to mind (by no means an exhaustive list): - Leave open_dataset as it is now, but provide examples in the documentation for how to reconstruct a new Dataset with unit arrays (perhaps provide a boilerplate function or accessor) - Add a kwarg like ""wrap_units"" to `open_dataset()` that accepts a quantity constructor (like `ureg.Quantity` in pint) that is applied within each variable - Devise some generalized system for specifying the internal array structure in the opened dataset (to handle other duck array types, not just unit arrays) With any of these, tests for lazy-loading would be crucial (I don't know yet how pint will handle that). Output may be easier: I was thinking that unwrapping could be done implicitly by automatically putting `str(DataArray.units)` as the ""units"" attribute and replacing the unit array with its magnitude/value? 
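To make the input and output ideas above concrete, here is a minimal sketch with stand-in objects instead of real xarray or pint ones. `ToyQuantity`, `unwrap_for_output`, and `wrap_on_input` are hypothetical names; `wrap_units` mimics the proposed kwarg and is not an existing `open_dataset()` argument:

```python
from dataclasses import dataclass


@dataclass
class ToyQuantity:
    # Hypothetical stand-in for a unit array such as a pint Quantity.
    magnitude: list
    units: str


def unwrap_for_output(data, attrs):
    # Store str(units) as the units attribute and keep only the magnitude.
    if hasattr(data, 'units') and hasattr(data, 'magnitude'):
        return data.magnitude, {**attrs, 'units': str(data.units)}
    # Plain data passes through untouched.
    return data, dict(attrs)


def wrap_on_input(values, attrs, wrap_units=None):
    # wrap_units is a quantity constructor, or None to leave data as it is.
    if wrap_units is None or 'units' not in attrs:
        return values, dict(attrs)
    remaining = {k: v for k, v in attrs.items() if k != 'units'}
    return wrap_units(values, attrs['units']), remaining


temperature = ToyQuantity([280.5, 281.0], 'kelvin')
values, attrs = unwrap_for_output(temperature, {'long_name': 'air temperature'})
restored, restored_attrs = wrap_on_input(values, attrs, wrap_units=ToyQuantity)
```

With `wrap_units=None` the data and attributes come back unchanged, which is what would keep such a kwarg backwards compatible. Lazy loading is not modelled here at all.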
**Extra questions based on sparse implementation** [`__repr__`](https://github.com/pydata/xarray/pull/3211) Will a set of repr functions for each unit array type need to be added like they were for sparse in https://github.com/pydata/xarray/pull/3211? Or should there be some more general system implemented because of all of the possible combinations that would arise with other duck array types? [`to_dense()`/`.to_numpy_data()`/`.to_numpy()`](https://github.com/pydata/xarray/issues/3245) What is the expected behavior with unit arrays with regards to this soon-to-be-implemented conversion method?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-514877824,https://api.github.com/repos/pydata/xarray/issues/525,514877824,MDEyOklzc3VlQ29tbWVudDUxNDg3NzgyNA==,3460034,2019-07-25T03:11:20Z,2019-07-25T03:11:20Z,CONTRIBUTOR,"Thank you for the insight! So if I'm understanding things correctly as they stand now, dimension coordinates store their values internally as a `pandas.Index`, which would mean, to implement this directly, this becomes an upstream issue in pandas to allow a ndarray-like unit array inside a `pandas.Index`? Based on what I've seen on the pandas side, this looks far from straightforward. With that in mind, would ""dimension coordinates with units"" (or more generally ""dimension coordinates with `__array_function__` implementers"") be another use case that best falls under flexible indices (#1603)? 
(In the meantime, I would guess that the best workaround is using an accessor interface to handle unit-related operations on coordinates, since the `attrs` are preserved.)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-514452182,https://api.github.com/repos/pydata/xarray/issues/525,514452182,MDEyOklzc3VlQ29tbWVudDUxNDQ1MjE4Mg==,3460034,2019-07-24T02:19:08Z,2019-07-24T02:19:08Z,CONTRIBUTOR,"In light of the recent activity with `__array_function__` in #3117, I took a quick look to see if it worked with Pint as modified in https://github.com/hgrecco/pint/pull/764. The basics of sticking a Pint `Quantity` in a `DataArray` seem to work well, and perhaps the greatest issues are on Pint's end: right now https://github.com/hgrecco/pint/pull/764 is limited in the functions it handles through `__array_function__`, and there are some quirks with operator precedence. However, the other main problem was that coordinates did not work with `Quantity` objects. Looking again at https://github.com/pydata/xarray/issues/1938#issuecomment-510953379 and #2956, this is not surprising. I'm curious, though, about what it would take to let indexing work with Pint (or other unit arrays). For most of my use cases (meteorological analysis as in MetPy), having units with coordinates is just as important as having units with the data itself. I'd be interested in helping implement it, but I would greatly appreciate some initial direction, since I'm new to that part of the xarray codebase. 
Also, cc @keewis, since I saw in #2956 you have a [`unit-support`](https://github.com/keewis/xarray/tree/unit-support) branch that looks like it attempts to extend `NumpyIndexingAdapter` to work with unit arrays, but still has the coordinates-with-units tests marked as xfail.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-483807351,https://api.github.com/repos/pydata/xarray/issues/525,483807351,MDEyOklzc3VlQ29tbWVudDQ4MzgwNzM1MQ==,1386642,2019-04-16T19:16:19Z,2019-04-16T19:16:19Z,CONTRIBUTOR,"Would `__array_function__` solve the problem with operator precedence? I thought they are separate issues because `__mul__` and `__rmul__` need not call any `numpy` functions, and will therefore not necessarily dispatch to `__array_function__`. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-483799967,https://api.github.com/repos/pydata/xarray/issues/525,483799967,MDEyOklzc3VlQ29tbWVudDQ4Mzc5OTk2Nw==,221526,2019-04-16T18:54:37Z,2019-04-16T18:54:37Z,CONTRIBUTOR,"@shoyer I agree with that wrapping order. I think I'd also be in favor of starting with an experiment to disable coercing to arrays. @nbren12 The non-commutative multiplication is a consequence of operator dispatch in Python, and the reason why we want `__array_function__` from numpy. Your first example dispatches to `dask.array.__mul__`, which doesn't know anything about pint and doesn't know how to compose its operations because there are no hooks--the pint array just gets coerced to a numpy array. 
The second goes to `pint.Quantity.__mul__`, which assumes it can wrap the `dask.array` (because of duck typing) and seems to succeed in doing so.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-482643700,https://api.github.com/repos/pydata/xarray/issues/525,482643700,MDEyOklzc3VlQ29tbWVudDQ4MjY0MzcwMA==,1386642,2019-04-12T16:45:17Z,2019-04-12T16:45:17Z,CONTRIBUTOR,"One additional issue. It seems like `pint` has some odd behavior with dask. Multiplication (and I assume addition) is not commutative: ``` In [42]: da.ones((10,)) * ureg.m Out[42]: dask.array In [43]: ureg.m * da.ones((10,)) Out[43]: dask.array ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-482639629,https://api.github.com/repos/pydata/xarray/issues/525,482639629,MDEyOklzc3VlQ29tbWVudDQ4MjYzOTYyOQ==,1386642,2019-04-12T16:32:25Z,2019-04-12T16:32:25Z,CONTRIBUTOR,@rabernat's recent post inspired me to check out this issue. What would this issue entail now that `__array_function__` is in numpy? 
Is there some reason this is more complicated than adding an appropriate `__array_function__` to `pint`'s quantity class?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-248255299,https://api.github.com/repos/pydata/xarray/issues/525,248255299,MDEyOklzc3VlQ29tbWVudDI0ODI1NTI5OQ==,1310437,2016-09-20T09:49:23Z,2016-09-20T09:51:30Z,CONTRIBUTOR,"Or another way to put it: While typical metadata/attributes are only relevant if you eventually read them (which is where you will notice if they were lost on the way), units are different: They work silently behind the scenes at all times, even if you do not explicitly look for them. You want an addition to fail if units don't match, without having to explicitly first test if the operands have units. So what should the ufunc_hook do if it finds two Variables that don't seem to carry units, raise an exception? Most probably not, as that would prevent using xarray without units at the same time. So if the units are lost on the way, you might never notice, but end up with wrong data. To me, that is just not unlikely enough, given the damage it can do (e.g. the time it takes to find out what's going on once you realise you get wrong data). ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-248255426,https://api.github.com/repos/pydata/xarray/issues/525,248255426,MDEyOklzc3VlQ29tbWVudDI0ODI1NTQyNg==,1310437,2016-09-20T09:50:00Z,2016-09-20T09:50:00Z,CONTRIBUTOR,"So for now, I'm hunting for `np.asarray`. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-248252494,https://api.github.com/repos/pydata/xarray/issues/525,248252494,MDEyOklzc3VlQ29tbWVudDI0ODI1MjQ5NA==,1310437,2016-09-20T09:36:24Z,2016-09-20T09:36:24Z,CONTRIBUTOR,"#988 would certainly allow me to implement unit functionality on xarray, probably by leveraging an existing units package. What I don't like with that approach is the fact that I essentially end up with a separate distinct implementation of units. I am afraid that I will have to re-implement many of the helpers that I wrote to work with physical quantities so that they are xarray-aware. Furthermore, one important aspect of units packages is that they prevent you from making conversion mistakes. But that only works as long as you don't forget to carry the units with you. With units stored only as attributes on xarray objects, losing them is as simple as forgetting to read the attributes when accessing the data. The units inside xarray approach would have the advantage that whenever you end up accessing the data inside xarray, you automatically have the units with you. From a conceptual point of view, the units are really an integral part of the data, so they should sit right there with the data. Whenever you do something with the data, you have to deal with the units. That is true no matter if it is implemented as an attribute handler or directly on the data array. My fear is, attributes leave the impression of ""optional"" metadata which are too easily lost. E.g. xarray doesn't call its _ufunc_hook_ for some operation where it should, and you silently lose units. My hope is that with nested arrays that carry units, you would instead fail verbosely. 
Of course, `np.concatenate` is precisely one of these cases where unit packages struggle to get their hook in (and where units on dtypes would help). So they fight the same problem. Nonetheless, these problems are known and solved as well as possible in the units packages, but in xarray, one would have to deal with them all over again. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-248059952,https://api.github.com/repos/pydata/xarray/issues/525,248059952,MDEyOklzc3VlQ29tbWVudDI0ODA1OTk1Mg==,1310437,2016-09-19T17:24:21Z,2016-09-19T17:24:21Z,CONTRIBUTOR,"+1 for units support. I agree, parametrised dtypes would be the preferred solution, but I don't want to wait that long (I would be willing to contribute to that end, but I'm afraid that would exceed my knowledge of numpy). I have never used dask. I understand that the support for dask arrays is a central feature for xarray. However, the way I see it, if one put a (unit-aware) ndarray subclass into an xarray, then units should work out of the box. As you discussed, this does not seem easy to make work together with dask (particularly in a generic way). However, shouldn't that be an issue that the dask community has to solve anyway (i.e.: currently there is no way to use any units package together with dask, right)? In that sense, allowing such arrays inside xarrays would force users to choose between dask and units, which is something they have to do anyway. But for a large share of users, that would be a very quick way to units! Or am I missing something here? I'll just try to monkeypatch xarray to that end, and see how far I get... 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-229421229,https://api.github.com/repos/pydata/xarray/issues/525,229421229,MDEyOklzc3VlQ29tbWVudDIyOTQyMTIyOQ==,221526,2016-06-29T17:02:28Z,2016-06-29T17:02:28Z,CONTRIBUTOR,"I agree that custom dtypes is the right solution (and I'll go dig some more there). In the meantime, I'm not sure why you couldn't wrap an xarray `DataArray` in one of pint's `Quantity` instances. With the exception of also wanting units on coordinates, this seems like a straightforward way to get at least some unit functionality. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585 https://github.com/pydata/xarray/issues/525#issuecomment-182195886,https://api.github.com/repos/pydata/xarray/issues/525,182195886,MDEyOklzc3VlQ29tbWVudDE4MjE5NTg4Ng==,6200806,2016-02-10T04:45:17Z,2016-02-10T04:45:17Z,CONTRIBUTOR,"Not to be pedantic, but just one more :+1: on ultimately implementing units support within xarray -- that would be huge. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100295585