issues: 1337337135
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1337337135 | I_kwDOAMm_X85PtiUv | 6911 | Public hypothesis strategies for generating xarray data | 35968931 | open | 0 | 0 | 2022-08-12T15:17:40Z | 2022-08-12T17:46:48Z | MEMBER |

Proposal

We should expose a public set of hypothesis strategies for use in testing xarray code. It could be useful for downstream users, but also for our own internal test suite. It should live in
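To give a flavour of what such public strategies might look like, here is a minimal hedged sketch built on `hypothesis` and `hypothesis.extra.numpy`. The strategy name `dataarrays` and its parameters are hypothetical, not an agreed-upon API:

```python
# Hedged sketch only: `dataarrays` and its parameters are illustrative,
# not a proposed public API. Built on hypothesis + hypothesis.extra.numpy.
import hypothesis.extra.numpy as npst
import hypothesis.strategies as st
import xarray as xr
from hypothesis import given


@st.composite
def dataarrays(draw, max_dims=3, max_side=4):
    """Generate small xr.DataArray objects with named dimensions."""
    data = draw(
        npst.arrays(
            dtype=npst.floating_dtypes(),
            shape=npst.array_shapes(max_dims=max_dims, max_side=max_side),
        )
    )
    dims = [f"dim_{i}" for i in range(data.ndim)]
    return xr.DataArray(data, dims=dims)


# Downstream users (and our own test suite) could then write property tests like:
@given(dataarrays())
def test_reduction_drops_the_reduced_dim(da):
    reduced = da.mean(dim=da.dims[0])
    assert da.dims[0] not in reduced.dims
```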
This issue is different from #1846 because that issue describes how we could use such strategies in our own testing code, whereas this issue is about how we create general strategies that we could use in many places (including exposing publicly). I've become interested in this as part of wanting to see #6894 happen. #6908 would effectively close this issue, but is itself just a pulled-out section of all the work @keewis did in #4972. (Also xref https://github.com/pydata/xarray/issues/2686. Also also @max-sixty didn't you have an issue somewhere about creating better and public test fixtures?)

Previous work

I was pretty surprised to see this comment by @Zac-HD in #1846
given that we might have just used that instead of writing new ones in #4972! (@keewis had you already seen that extension?) We could literally just include that extension in xarray and call this issue solved...

Shrinking performance of strategies

However, I was also reading yesterday about strategies that shrink, and I think we should make some effort to come up with strategies for producing xarray objects that shrink in a performant and well-motivated manner. In particular, by pooling the knowledge of the @xarray-dev core team we could try to create strategies that search for many of the edge cases that we are collectively aware of. My understanding of that guide is that our strategies ideally should:

1) Quickly include or exclude complexity
2) Deliberately generate known edge cases
3) Be very modular internally, to help with "keeping things local". Each sub-strategy should be in its own function, so that hypothesis' decision tree can cut branches off as soon as possible (see the sketch below).
4) Avoid obvious inefficiencies e.g. not

Perhaps the solutions implemented in #6894 or this hypothesis xarray extension already meet these criteria - I'm not sure. I just wanted a dedicated place to discuss building the strategies specifically, without it getting mixed in with complicated discussions about whatever we're trying to use the strategies for! |
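To make points 2) and 3) above concrete, here is a hedged sketch of splitting the generation into small sub-strategies, with known edge cases such as zero-length dimensions and awkward dimension names deliberately mixed in. Every function name below is hypothetical and this is not the extension or PR discussed above:

```python
# Illustrative only: each building block is its own small strategy so that
# hypothesis' decision tree can discard or shrink it independently.
import hypothesis.extra.numpy as npst
import hypothesis.strategies as st
import xarray as xr


def dimension_names(min_dims=1, max_dims=3):
    """Unique dimension names, deliberately including awkward ones."""
    names = st.one_of(
        st.sampled_from(["x", "time", "dim_0", "with space", "ünïcode"]),
        st.text(min_size=1, max_size=5),
    )
    return st.lists(names, min_size=min_dims, max_size=max_dims, unique=True)


def dimension_sizes():
    """Per-dimension sizes, including the zero-length edge case."""
    return st.integers(min_value=0, max_value=4)


def small_attrs():
    """Small attribute dicts, including the empty dict."""
    return st.dictionaries(st.text(max_size=5), st.integers(), max_size=3)


@st.composite
def variables(draw):
    """Compose the sub-strategies into an xr.Variable."""
    dims = draw(dimension_names())
    shape = tuple(draw(dimension_sizes()) for _ in dims)
    data = draw(npst.arrays(dtype=npst.scalar_dtypes(), shape=shape))
    return xr.Variable(dims, data, attrs=draw(small_attrs()))
```

Because each piece is its own function, a failing example involving, say, a zero-sized dimension can shrink the sizes without disturbing the dtype or attrs choices.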
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6911/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
13221727 | issue |