issue_comments


8 rows where author_association = "CONTRIBUTOR" and issue = 484015016 sorted by updated_at descending



id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
532383428 https://github.com/pydata/xarray/pull/3238#issuecomment-532383428 https://api.github.com/repos/pydata/xarray/issues/3238 MDEyOklzc3VlQ29tbWVudDUzMjM4MzQyOA== jthielen 3460034 2019-09-17T20:14:33Z 2019-09-17T20:14:33Z CONTRIBUTOR

@keewis Thank you for catching this! I've pushed an update to https://github.com/andrewgsavage/pint/pull/6 with your suggested changes. Raising an error on incompatible/missing units is definitely something that will need to be added, but it may take some re-thinking of the current implementations and of how convert_to_consistent_units is used. My instinct is to auto-wrap non-Quantities (of known types on the type casting hierarchy) as dimensionless (as in https://github.com/hgrecco/pint/pull/764#issuecomment-523272038), and let pint's usual unit comparison checks handle the rest. But knowing exactly what to do here, and when, may have to wait until some other discussions with pint are cleared up a little more (like https://github.com/hgrecco/pint/pull/764 vs. https://github.com/hgrecco/pint/pull/875 and https://github.com/hgrecco/pint/issues/845 / https://github.com/hgrecco/pint/issues/878).
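A minimal sketch of the auto-wrap idea, using a hypothetical `(magnitude, units)` pair rather than pint's actual Quantity type: bare array-likes get wrapped as dimensionless, and an ordinary unit check then rejects any mismatch.

```python
import numpy as np

# Hypothetical sketch (not pint's API): a quantity is a (magnitude, units)
# pair; non-quantities are auto-wrapped as dimensionless so the usual unit
# comparison handles the rest.
def ensure_quantity(obj):
    if isinstance(obj, tuple) and len(obj) == 2:
        return obj  # already a (magnitude, units) pair
    return (np.asarray(obj), "dimensionless")

def check_units(a, b):
    a, b = ensure_quantity(a), ensure_quantity(b)
    if a[1] != b[1]:
        raise ValueError(f"incompatible units: {a[1]!r} vs {b[1]!r}")
    return a, b

check_units([1, 2], [3, 4])                   # both wrapped as dimensionless
# check_units((np.array([1]), "meter"), [3])  # would raise ValueError
```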

I'm thinking that discussion regarding bugs with pint's upcoming __array_function__ may fit best on https://github.com/andrewgsavage/pint/pull/6 right now, since https://github.com/hgrecco/pint/pull/764 seems to have halted in favor of https://github.com/hgrecco/pint/pull/875? (That is, until a new PR is made that follows up on https://github.com/hgrecco/pint/pull/875 to replicate what was done in https://github.com/hgrecco/pint/pull/764 and https://github.com/andrewgsavage/pint/pull/6.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tests for arrays with units 484015016
531606320 https://github.com/pydata/xarray/pull/3238#issuecomment-531606320 https://api.github.com/repos/pydata/xarray/issues/3238 MDEyOklzc3VlQ29tbWVudDUzMTYwNjMyMA== jthielen 3460034 2019-09-15T22:51:41Z 2019-09-15T22:51:41Z CONTRIBUTOR

@keewis Thank you for pointing that out, I forgot to mention that right now mixed types are not handled by https://github.com/hgrecco/pint/pull/764 (see https://github.com/hgrecco/pint/pull/764#issuecomment-523272038).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tests for arrays with units 484015016
531603936 https://github.com/pydata/xarray/pull/3238#issuecomment-531603936 https://api.github.com/repos/pydata/xarray/issues/3238 MDEyOklzc3VlQ29tbWVudDUzMTYwMzkzNg== jthielen 3460034 2019-09-15T22:15:16Z 2019-09-15T22:15:53Z CONTRIBUTOR

@keewis My inclination is to think of the units as part of the data, and that, for example, zeros_like returning an array of bare zeros instead of zeros-with-units is reasonably intuitive. But since this is likely not everyone's perspective, I think raising an issue on pint's end would be good.

I'm not sure about the indexing behavior. From the sounds of your prior comment, it works because it functions as an "object-type" index? If so, it may cause a decent hit on performance. I'd definitely want to hear others' thoughts on it too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tests for arrays with units 484015016
531600938 https://github.com/pydata/xarray/pull/3238#issuecomment-531600938 https://api.github.com/repos/pydata/xarray/issues/3238 MDEyOklzc3VlQ29tbWVudDUzMTYwMDkzOA== jthielen 3460034 2019-09-15T21:27:27Z 2019-09-15T21:27:27Z CONTRIBUTOR

@keewis In https://github.com/andrewgsavage/pint/pull/6, I implemented zeros_like, ones_like and empty_like to return base ndarrays rather than quantities, and full_like to have the unit of fill_value. This seemed like the most sensible behavior to me, since it avoids the ambiguities you mention in the full_like case if they were based on the unit of other, and for many of my own use cases, I've often wanted a zeros/ones array with a different unit than the quantity whose shape I am basing it off of.

Does this behavior seem reasonable to you? Also, would this be something that should be cleared up with an issue on pint's end?
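The behavior described above can be sketched with a toy `__array_function__` wrapper (a hypothetical illustration, not pint's implementation): `np.zeros_like` returns a bare ndarray, while `np.full_like` takes its unit from `fill_value` rather than from `other`.

```python
import numpy as np

# Toy unit-tagged array to illustrate the dispatch described above;
# names and behavior here are a sketch, not pint's actual code.
class ToyQuantity:
    def __init__(self, magnitude, units):
        self.magnitude = np.asarray(magnitude)
        self.units = units

    def __array_function__(self, func, types, args, kwargs):
        if func is np.zeros_like:
            # "Units are part of the data": return a bare ndarray of zeros.
            return np.zeros_like(self.magnitude)
        if func is np.full_like:
            fill = args[1] if len(args) > 1 else kwargs["fill_value"]
            if isinstance(fill, ToyQuantity):
                # full_like takes its unit from fill_value, not from `other`.
                return ToyQuantity(np.full_like(self.magnitude, fill.magnitude),
                                   fill.units)
            return np.full_like(self.magnitude, fill)
        return NotImplemented

q = ToyQuantity([1.0, 2.0], "meter")
print(type(np.zeros_like(q)))                             # plain ndarray
print(np.full_like(q, ToyQuantity(3.0, "second")).units)  # second
```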

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tests for arrays with units 484015016
527245079 https://github.com/pydata/xarray/pull/3238#issuecomment-527245079 https://api.github.com/repos/pydata/xarray/issues/3238 MDEyOklzc3VlQ29tbWVudDUyNzI0NTA3OQ== jthielen 3460034 2019-09-02T21:24:52Z 2019-09-02T21:31:05Z CONTRIBUTOR

After digging a bit more into np.prod based on your comment, the situation is more complicated than I thought it was. I forgot that array.prod() has an implementation because pint has a ufunc implementation for it (so, actually np.prod(array) gives a result by way of __array_ufunc__ before https://github.com/hgrecco/pint/pull/764, but it no longer functions that way when __array_function__ is used). I hesitate to delegate to that implementation though, since it gives incorrect units when the axis and/or where arguments are used (see https://github.com/hgrecco/pint/issues/867).

array.any() and array.all() work now because they fall back to .any() and .all() on the magnitude; they aren't explicitly handled (or tested) by pint. Falling back to the magnitude here seems mostly fine, except possibly for offset units. Once a decision is made about the expected behavior in that case, I can add them to the __array_function__ list and add tests in pint.
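Plain numpy is enough to sketch why offset units are the worry here (illustrative values only): falling back to the magnitude makes 0 °C falsy, even though 0 °C is 273.15 K rather than "nothing".

```python
import numpy as np

# Why magnitude-based any()/all() is questionable for offset units:
# the same physical temperatures get different truthiness depending on
# whether they are expressed in celsius or kelvin.
celsius = np.array([0.0, 20.0])
kelvin = celsius + 273.15  # identical physical temperatures

print(celsius.astype(bool))  # [False  True] -- 0 degC treated as "zero"
print(kelvin.astype(bool))   # [ True  True]
```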

Based on your tests and examples, I would agree that https://github.com/pydata/xarray/issues/3241 isn't really fixed. I also agree that adding the method tests is a good idea.

Thank you for clarifying about the sel/isel/loc tests. For some reason, I'm unfortunately getting results more consistent with the stripped unit behavior, rather than what your test results are showing. For example:

```python
import xarray as xr
import numpy as np
import pint

unit_registry = pint.UnitRegistry()

array = np.linspace(5, 10, 20).astype(int) * unit_registry.m
x = np.arange(len(array)) * unit_registry.s
data_array = xr.DataArray(data=array, coords={"x": x}, dims=["x"])

print(data_array.sel(x=[15, 16] * unit_registry.volts))
```

outputs

```
<xarray.DataArray (x: 2)>
<Quantity([8 9], 'meter')>
Coordinates:
  * x        (x) int64 15 16
```

whereas it should be giving a `KeyError` by your tests, if I'm interpreting them correctly. What happens in your environment when you run the above snippet?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tests for arrays with units 484015016
527230031 https://github.com/pydata/xarray/pull/3238#issuecomment-527230031 https://api.github.com/repos/pydata/xarray/issues/3238 MDEyOklzc3VlQ29tbWVudDUyNzIzMDAzMQ== jthielen 3460034 2019-09-02T19:32:16Z 2019-09-02T19:32:16Z CONTRIBUTOR

Thank you for the update! Here are responses to each issue brought up:

np.prod (along with np.product and np.nanprod) was not implemented yet since it seems non-trivial to determine the number of unit multiplications that occur given both axis and where arguments in a performant way. I can take a closer look at it though, and hopefully will be able to have an implementation soon. (If anyone has any ideas on this, let me know!)
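The difficulty can be seen with a short numpy sketch: once a `where` mask is involved, the number of multiplied elements, and hence the unit exponent of each output element, can differ per element.

```python
import numpy as np

# Why prod with axis= and where= is non-trivial for units: the number of
# multiplied elements (and so the unit exponent) can vary per output element.
a = np.full((2, 3), 2.0)
where = np.array([[True, True, False],
                  [True, True, True]])

counts = np.count_nonzero(where, axis=1)
print(counts)  # [2 3]: one output would carry m**2, the other m**3

# np.prod itself needs an identity value for the masked-out slots:
print(np.prod(a, axis=1, where=where, initial=1.0))  # [4. 8.]
```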

I passed up np.all and np.any since I associated those with boolean arrays, rather than arrays with units. However, since it sounds like they are needed here, I can add those to the list. (Looking at it though, behavior with non-multiplicative units is unclear right now...see https://github.com/hgrecco/pint/issues/866)

np.allclose should be easy to implement, and I agree it would be nice to not have to have that workaround.

np.maximum and np.minimum appear to be routed through __array_ufunc__ instead of __array_function__. I wasn't aware of that issue until you pointed it out, but looking into the implementation, "maximum" and "minimum" (and any other ufunc not explicitly listed by pint in BaseQuantity.__set_units, BaseQuantity.__copy_units, or BaseQuantity.__prod_units) will behave in this way, since checking input compatibility but not wrapping the output is the fallback behavior. It seems like a bug to me that this doesn't raise a UnitStrippedWarning, and at the very least, it is inconsistent with the fail-to-NotImplemented behavior of the __array_function__ implementation. I'll bring up this point at https://github.com/hgrecco/pint/pull/764.

np.argmin/np.argmax/np.sum: there seems to be a lot going on with these, at least when I ran the tests locally. First, when called on data_array, they appear to be going deep into xarray/core/duck_array_ops.py and xarray/core/nanops.py rather than ever calling np.nanargmin/np.nanargmax/np.nansum on data_array.data. Second, even if that is the proper pathway, it ends up failing because it calls np.result_type on mixed ndarray and QuantitySequence arguments. This is still unresolved on pint's end (see https://github.com/hgrecco/pint/pull/764#issuecomment-523272038 for the specific issue, and https://github.com/hgrecco/pint/issues/845 for the more general discussion). All this also only happens on float dtype...it works just fine with int dtype, which is puzzling to me.

np.median: This one has been driving me crazy. In my environment, when running in pytest, it errors out because somewhere np.median, called on data_array, tries to cast the result (a pint QuantityScalar) to a float/int, which is not allowed by pint when the quantity is not dimensionless. But I have no idea why it is ending up there. However, when replicating the test outside of pytest, it doesn't error; instead the result is missing the magnitude (with a UnitStrippedWarning raised when going through the __array_struct__ attribute on the quantity). data_array.median(), however, works just fine. I'm not really sure what to do here.

For indexing, when you say "working," would you be able to clarify what your expected behavior is for indexing? Based on https://github.com/pydata/xarray/issues/525#issuecomment-514880353 and the preceding discussion, right now indices will have units stripped, so for me at least, I would expect any attempt at unit-aware indexing to either not work or raise a UnitStrippedWarning.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tests for arrays with units 484015016
524099935 https://github.com/pydata/xarray/pull/3238#issuecomment-524099935 https://api.github.com/repos/pydata/xarray/issues/3238 MDEyOklzc3VlQ29tbWVudDUyNDA5OTkzNQ== jthielen 3460034 2019-08-22T22:20:49Z 2019-08-22T22:20:49Z CONTRIBUTOR

I noticed you have @pytest.mark.filterwarnings("error::pint.errors.UnitStrippedWarning"), which I believe is what is raising that warning to the error level. What happens when you remove that?

As to why the warning is happening in the first place, I think that is because the __array_function__ implementation in pint right now only exists in a PR. So, when xarray looks for an __array_function__ attribute on the pint Quantity right now, it falls back to this:

https://github.com/hgrecco/pint/blob/2afdc4bf5c5727ed9cef6fdaccb00b88813c4a24/pint/quantity.py#L1438-L1449

Hence, the UnitStrippedWarning. However, hasattr(data, "__array_function__") should still work, albeit for the wrong reason since pint doesn't yet have a true __array_function__ method.
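The capability check mentioned above can be sketched as follows; `WithFallback` stands in for a hypothetical class whose `__array_function__` exists but only falls back, like pint's Quantity at the time of this comment.

```python
import numpy as np

# hasattr-based detection of the __array_function__ protocol reports True
# even when the method is only a fallback that returns NotImplemented --
# the "works, albeit for the wrong reason" situation described above.
class WithFallback:
    def __array_function__(self, func, types, args, kwargs):
        return NotImplemented  # stand-in for a fallback implementation

class NoProtocol:
    pass

print(hasattr(np.arange(3), "__array_function__"))   # True
print(hasattr(WithFallback(), "__array_function__")) # True
print(hasattr(NoProtocol(), "__array_function__"))   # False
```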

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tests for arrays with units 484015016
524088212 https://github.com/pydata/xarray/pull/3238#issuecomment-524088212 https://api.github.com/repos/pydata/xarray/issues/3238 MDEyOklzc3VlQ29tbWVudDUyNDA4ODIxMg== jthielen 3460034 2019-08-22T21:39:38Z 2019-08-22T21:39:38Z CONTRIBUTOR

@keewis: In case it helps, I've added a bunch of additional __array_function__ implementations in pint in https://github.com/andrewgsavage/pint/pull/6, which will hopefully get merged soon into the main https://github.com/hgrecco/pint/pull/764 PR. If through these tests with xarray you find something missing on pint's end, let me know and I'd be glad to help.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tests for arrays with units 484015016

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 31.192ms · About: xarray-datasette