html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/4972#issuecomment-1216662525,https://api.github.com/repos/pydata/xarray/issues/4972,1216662525,IC_kwDOAMm_X85IhMv9,35968931,2022-08-16T13:47:05Z,2022-08-16T13:47:05Z,MEMBER,"> if we can get the strategies from https://github.com/pydata/xarray/pull/6908 to shrink well

I think they already do shrink well. Each of them has corresponding unit tests, and none of those tests fail due to hypothesis timeouts.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-1216441506,https://api.github.com/repos/pydata/xarray/issues/4972,1216441506,IC_kwDOAMm_X85IgWyi,14808389,2022-08-16T10:16:30Z,2022-08-16T10:16:30Z,MEMBER,"> We might just want to wait to merge this before merging that though anyway.

I actually think it should be the other way around: if we can get the strategies from #6908 to shrink well, we might be able to fix the occasional test timeouts here (which should be one of the final issues we have before we can merge this).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-1212138021,https://api.github.com/repos/pydata/xarray/issues/4972,1212138021,IC_kwDOAMm_X85IP8Il,35968931,2022-08-11T15:21:17Z,2022-08-11T15:21:17Z,MEMBER,"> I also agree that both the strategies as well as the base classes should eventually live in xarray.testing, but I'd probably keep them in xarray.tests for now (just so we can experiment a bit more).

So I may actually have overexcitedly already jumped the gun and made a PR to move strategies to `xarray.testing.strategies` last night :sweat_smile: #6908 We might just want to wait to merge this before merging that though anyway.

> Note, however, that it's probably best not to separate 2 and 3 because there might be issues with the base class design we don't notice without trying to actually use them.

Good point.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-1211580601,https://api.github.com/repos/pydata/xarray/issues/4972,1211580601,IC_kwDOAMm_X85IN0C5,35968931,2022-08-11T06:00:29Z,2022-08-11T06:00:29Z,MEMBER,"> Q: Shouldn't the base classes live in xarray.testing rather than xarray.tests?

Another Q on a similar note: Are we planning to eventually publicly expose the (awesome btw) strategies that you've built here @keewis ? They could be very useful for testing other parts of xarray.

We could also make this PR much more incremental by splitting it into 2, or even 3 separate PRs:
1) strategies, to live somewhere like `xarray.testing.strategies` and eventually be made public
2) duck array base classes to inherit from, to live somewhere like `xarray.testing.duckarrays`
3) specific tests for pint/sparse, to live in our own test suite for now but moved out eventually.

The advantage of that would be that (1) & (2) can move forwards without requiring all the tests in (3) to pass.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-1209719906,https://api.github.com/repos/pydata/xarray/issues/4972,1209719906,IC_kwDOAMm_X85IGtxi,35968931,2022-08-09T18:20:25Z,2022-08-09T18:22:35Z,MEMBER,"Q: Shouldn't the base classes live in `xarray.testing` rather than `xarray.tests`? 

Then the `test_units.py` and `test_sparse.py` live in `xarray.tests` (until they eventually get moved to another repo), but import the base classes from `xarray.tests`?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-1208462500,https://api.github.com/repos/pydata/xarray/issues/4972,1208462500,IC_kwDOAMm_X85IB6yk,35968931,2022-08-08T18:26:33Z,2022-08-08T18:26:33Z,MEMBER,See https://github.com/pydata/xarray/issues/6894 for general discussion of the general plans,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-1208349577,https://api.github.com/repos/pydata/xarray/issues/4972,1208349577,IC_kwDOAMm_X85IBfOJ,14808389,2022-08-08T16:33:43Z,2022-08-08T17:00:26Z,MEMBER,"I started with `reduce` because that seemed to be the easiest to check, but I guess I was wrong? In any case, checking that the wrapped data is indeed the expected should be very easy to add (not this PR, though, that has stalled long enough).

The idea is to do something like `pandas`'s `ExtensionArray` [test suite](https://pandas.pydata.org/docs/development/extending.html#testing-extension-arrays) that exposes a set of base classes that can be inherited from. So to test duck array support in a downstream library, you'd inherit from the appropriate classes and override `create` (which returns a strategy for the tested array) and `check_*` to check properties of the expected result (I think that's what you're asking for in 4).

It definitely doesn't feel very polished at the moment, but hopefully we can figure out a way to fix that (and then we can also hopefully figure out a good way to document this).

Edit: also, let's discuss the general plans in a new issue","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-1208311997,https://api.github.com/repos/pydata/xarray/issues/4972,1208311997,IC_kwDOAMm_X85IBWC9,35968931,2022-08-08T16:00:01Z,2022-08-08T16:00:01Z,MEMBER,"I'm watching the progress of this PR with bated breath! I literally want to be able to test 3 different array libraries right now: [pint](https://github.com/pydata/xarray/issues/3594), [cubed](https://github.com/pydata/xarray/issues/6807), and [awkward](https://github.com/pydata/xarray/issues/4285). :exploding_head: 

Thinking about the complexity of testing like this in general I have a bunch of follow-up questions for future PRs though, e.g:

1) Shouldn't we start with some very simple tests that first check if the correct properties are defined on the wrapped array class, i.e. `shape`, `dtype` etc?
2) What's the easiest way to expose a framework of tests like this to the user, so they can import them and run them on any duck array type?
3) Docs on how to use these public tests?
4) Can we possibly allow the user to run tests on data they created? I'm thinking in particular of creating ragged awkward arrays and then testing reduce etc. on them.

Should I therefore make a separate issue to specifically track how we test (and expose test frameworks for) duck array wrapping? (So we can discuss these questions there?)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-1208285136,https://api.github.com/repos/pydata/xarray/issues/4972,1208285136,IC_kwDOAMm_X85IBPfQ,14808389,2022-08-08T15:35:05Z,2022-08-08T15:35:27Z,MEMBER,xref pydata/sparse#555,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-1204086247,https://api.github.com/repos/pydata/xarray/issues/4972,1204086247,IC_kwDOAMm_X85HxOXn,2448579,2022-08-03T15:15:34Z,2022-08-03T15:15:34Z,MEMBER,"> I also wonder whether we should have a separate job for hypothesis: this makes CI run quite a bit longer (about 70s on my machine just for sparse and pint reduce tests, which are a small part of the API)

Sounds good to me!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-1203850549,https://api.github.com/repos/pydata/xarray/issues/4972,1203850549,IC_kwDOAMm_X85HwU01,14808389,2022-08-03T11:53:22Z,2022-08-03T11:53:22Z,MEMBER,"the remaining failures for `pint` are #6822.

I also wonder whether we should have a separate job for `hypothesis`: this makes CI run quite a bit longer (about 70s on my machine just for `sparse` and `pint` reduce tests, which are a small part of the API)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-1192996302,https://api.github.com/repos/pydata/xarray/issues/4972,1192996302,IC_kwDOAMm_X85HG63O,2448579,2022-07-22T23:10:11Z,2022-07-23T02:09:12Z,MEMBER,"> sparse seems to have different dtype casting behavior (it casts e.g. a mean of float16 to float64, while numpy stays at float16), so I'll need to figure out how to work around that.

Sparse doesn't [support float16](https://github.com/pydata/sparse/issues/263) 

-----


~The remaining local failure is https://github.com/pydata/xarray/issues/6817 (perhaps we should run numpy tests too ;) )~

Failure is https://github.com/pydata/sparse/issues/553

","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-1188483366,https://api.github.com/repos/pydata/xarray/issues/4972,1188483366,IC_kwDOAMm_X85G1tEm,35968931,2022-07-19T01:03:23Z,2022-07-19T01:03:23Z,MEMBER,"> Can we xfail these new tests and merge for now? As long as tests/test_sparse.py still passes, there are no issues, correct?

Yeah we now have [another array type](https://github.com/pydata/xarray/issues/6807) to consider testing, so I'm also in favour of merging now with passing tests for pint, and un-xfailing tests for other array types (i.e. sparse) in a later PR.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-977446461,https://api.github.com/repos/pydata/xarray/issues/4972,977446461,IC_kwDOAMm_X846QqY9,2448579,2021-11-24T02:44:00Z,2021-11-24T02:44:00Z,MEMBER,"> The pint tests pass so all that's left is to figure out how to fix the sparse tests.

Can we xfail these new tests and merge for now? As long as `tests/test_sparse.py` still passes, there are no issues, correct?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-899105279,https://api.github.com/repos/pydata/xarray/issues/4972,899105279,IC_kwDOAMm_X841l0H_,14808389,2021-08-15T20:25:08Z,2021-08-15T20:25:08Z,MEMBER,"The `pint` tests pass so all that's left is to figure out how to fix the `sparse` tests.

`sparse` seems to have different `dtype` casting behavior (it casts e.g. a mean of `float16` to `float64`, while `numpy` stays at `float16`), so I'll need to figure out how to work around that. There's also a `TypeError`, which I'll have to investigate.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-872438265,https://api.github.com/repos/pydata/xarray/issues/4972,872438265,MDEyOklzc3VlQ29tbWVudDg3MjQzODI2NQ==,41898282,2021-07-01T17:51:08Z,2021-08-15T13:15:44Z,CONTRIBUTOR,"## Unit Test Results
         6 files           6 suites   58m 27s :stopwatch:
16 289 tests 14 528 :heavy_check_mark: 1 753 :zzz:   8 :x:
90 930 runs  82 604 :heavy_check_mark: 8 284 :zzz: 42 :x:

For more details on these failures, see [this check](https://github.com/pydata/xarray/runs/3333487673).

Results for commit 1d98fec1.

:recycle: This comment has been updated with latest results.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-828924833,https://api.github.com/repos/pydata/xarray/issues/4972,828924833,MDEyOklzc3VlQ29tbWVudDgyODkyNDgzMw==,12229877,2021-04-29T04:03:39Z,2021-04-29T04:03:39Z,CONTRIBUTOR,"> is there a way to separate `sizes` into `dims` and `shape` without using `draw`? I tried `dims, shape = sizes.map(lambda s: tuple(zip(*s)))`, but since the map strategy is not iterable the final unpacking fails.

You've just got to use `draw` for this.

> I'd prefer the verbose repr of `st.builds()` when debugging strategies, but if I know the strategy is correct and understand what it generates I'd probably prefer the smaller repr of `@st.composite`.

That's a good way of thinking about it :slightly_smiling_face: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-825834926,https://api.github.com/repos/pydata/xarray/issues/4972,825834926,MDEyOklzc3VlQ29tbWVudDgyNTgzNDkyNg==,14808389,2021-04-23T18:15:33Z,2021-04-23T18:15:33Z,MEMBER,"ping @shoyer, I'm mostly done with cleaning up the code.

@Zac-HD, could I ask for another review? I *think* I applied most of your suggestions but I'm sure I missed something.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-816994501,https://api.github.com/repos/pydata/xarray/issues/4972,816994501,MDEyOklzc3VlQ29tbWVudDgxNjk5NDUwMQ==,14808389,2021-04-09T22:01:01Z,2021-04-09T22:01:44Z,MEMBER,"@shoyer, I finally found `pandas`' extension array test suite again, which divides the tests into groups and provides base classes for each of them. It then uses fixtures to parametrize those tests, but I think we can use `hypothesis` strategies instead.

This will most probably be a little bit easier to implement and maintain, and allow much more control, although `parametrize` marks are still stripped so the tests will be a bit more verbose.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-808739976,https://api.github.com/repos/pydata/xarray/issues/4972,808739976,MDEyOklzc3VlQ29tbWVudDgwODczOTk3Ng==,14808389,2021-03-27T14:16:40Z,2021-03-28T20:36:09Z,MEMBER,"For more context, the main idea is to provide a generic and cheap way to check the compatibility of any duckarray (including nested duckarrays) with `xarray`'s interface. I settled on having a function construct a test class based on callbacks and marks, but I'm not sure this is the best way (it is the best I can come up with, however).

I'm considering `hypothesis` (hence the separate branch) because I was thinking that it might be easier to go through with the whole ""have a function accepting some callbacks and marks generate the tests""-thing if instead of possibly parametrized callbacks I used `hypothesis`' strategies. There are still some potential issues left, though, most importantly I'm trying to find a generic way to compute the expected values or expected failures.

Thanks for hints 2 and 3, I'm still not quite used to the concepts of `hypothesis`. As for 4, `xarray` has parameters which are intended for use in label-based indexing (e.g. with `sel`), so I'm trying to find a way to mark those to be able to tell them from other types of parameters (like position-based indexers).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-808736180,https://api.github.com/repos/pydata/xarray/issues/4972,808736180,MDEyOklzc3VlQ29tbWVudDgwODczNjE4MA==,12229877,2021-03-27T13:51:54Z,2021-03-27T13:51:54Z,CONTRIBUTOR,"Looking at https://github.com/keewis/xarray/compare/duckarray-tests...duckarray-tests-hypothesis, for high-level feedback:

- Overall it looks pretty good; though ping me again if/when it's a PR and I'll do line-level feedback on idiom issues
- A more general test would generate [the shapes](https://hypothesis.readthedocs.io/en/latest/numpy.html#hypothesis.extra.numpy.array_shapes), and the [axes to reduce over](https://hypothesis.readthedocs.io/en/latest/numpy.html#hypothesis.extra.numpy.valid_tuple_axes) - reducing a 1D array over the first dimension is going to miss things
- You use `@st.composite` when [the `.map()` method](https://hypothesis.readthedocs.io/en/latest/data.html#adapting-strategies) and a lambda would suffice (though the perf gain is small enough that this is mostly a readability issue)
- I don't see the point of `Label`, and we advise against mixing ""a strategy or a value"".  We break this rule a few times for backwards-compatibility in our Numpy support, but wouldn't write such an API these days.

And I'm always delighted to see people using Hypothesis to test libraries that I use and love 🥰🤩","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-808414458,https://api.github.com/repos/pydata/xarray/issues/4972,808414458,MDEyOklzc3VlQ29tbWVudDgwODQxNDQ1OA==,14808389,2021-03-26T18:00:27Z,2021-03-26T18:54:23Z,MEMBER,"@Zac-HD, would you have any advice for something like this?

Edit: [this](https://github.com/keewis/xarray/tree/duckarray-tests-hypothesis) is a initial draft using `hypothesis`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250
https://github.com/pydata/xarray/pull/4972#issuecomment-800366523,https://api.github.com/repos/pydata/xarray/issues/4972,800366523,MDEyOklzc3VlQ29tbWVudDgwMDM2NjUyMw==,14808389,2021-03-16T15:32:08Z,2021-03-26T18:00:22Z,MEMBER,"@shoyer: I asked about this in pytest-dev/pytest#8450, but it seems we are on our own here. The recommendation was to use inheritance, but that currently breaks `parametrize` (see the `pad` tests in `test_variable`).

I wonder if using `hypothesis` would make this easier? It would probably require a bit more reading because I don't yet fully understand how to choose the properties but dynamic parametrization would not be necessary: we would pass strategies instead of creation functions and potentially find more bugs, at the cost of longer test runs.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250