html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/pull/6908#issuecomment-1251922767,https://api.github.com/repos/pydata/xarray/issues/6908,1251922767,IC_kwDOAMm_X85KntNP,12229877,2022-09-20T06:59:10Z,2022-09-20T06:59:10Z,CONTRIBUTOR,"> @Zac-HD if I could request one more review please! The two remaining problems for me are: Absolutely! Some quick comments this evening; I would also like to do a full review again before merge but that might be next week or weekend - I'm out for a conference from early Thursday. > 1. How should we alter the API of `datasets` to make it easier to construct `Dataset` objects containing duck-typed arrays (see [Hypothesis strategies in xarray.testing.strategies #6908 (comment)](https://github.com/pydata/xarray/pull/6908#discussion_r974066789)) Replied in the thread above. > 2. Why does the example generation performance seem to have gotten worse? 😕 I added what I thought were small refactors (e.g. [`_sizes_from_dim_names`](https://github.com/pydata/xarray/pull/6908/files#diff-6eeb2015d4a907b3fd1ef578050a9d30971e2c1a59eec1c784bd937719478da3R353)) which may somehow be related, but I'm not sure. I'd be pretty surprised if that was related, `st.fixed_dictionaries()` is internally basically just that `zip()` trick anyway. I'd *guess* that this is mostly ""as you implement the last few complicated data-gen options, they start taking nonzero time"", but not confident in that without reading closely and probably measuring some perf things.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1336119080 https://github.com/pydata/xarray/pull/6908#issuecomment-1238674803,https://api.github.com/repos/pydata/xarray/issues/6908,1238674803,IC_kwDOAMm_X85J1K1z,12229877,2022-09-06T21:37:07Z,2022-09-06T21:37:07Z,CONTRIBUTOR,(generally staying unsubbed; so please ping me whenever you've got questions or would like another review!),"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1336119080 https://github.com/pydata/xarray/pull/6908#issuecomment-1218435300,https://api.github.com/repos/pydata/xarray/issues/6908,1218435300,IC_kwDOAMm_X85In9jk,12229877,2022-08-17T19:59:55Z,2022-08-17T19:59:55Z,CONTRIBUTOR,"(unsubbing for noise, please @-me when you'd like another review!)","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1336119080 https://github.com/pydata/xarray/pull/6908#issuecomment-1217524909,https://api.github.com/repos/pydata/xarray/issues/6908,1217524909,IC_kwDOAMm_X85IkfSt,12229877,2022-08-17T06:39:42Z,2022-08-17T06:39:42Z,CONTRIBUTOR,"OK, reviewed ✅ Overall it looks pretty good, but there are a couple of places where the API you've got is pushing you into some really nasty performance traps where you have to use rejection sampling to keep things consistent. 
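To make the trap concrete, here's the shape of the problem (a hypothetical sketch, not code from this PR):

```python
from hypothesis import strategies as st

# Rejection sampling: two independent draws only rarely agree, and the odds
# get exponentially worse as more values have to match (illustrative only).
dims = st.lists(st.sampled_from("abcd"), unique=True)
consistent_by_filtering = st.tuples(dims, dims).filter(lambda t: t[0] == t[1])

# Constructive alternative: draw the shared value once, then reuse it.
consistent_by_construction = dims.flatmap(
    lambda d: st.tuples(st.just(d), st.just(d))
)
```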
We have some tricks to help, but it's fundamentally exponential scaling - which means that it works right up until it doesn't work at all, and most of your users are likely to hit that regime πŸ˜• Definitely possible to fix that, but it's more like a redesign than patching a typo.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1336119080 https://github.com/pydata/xarray/pull/6908#issuecomment-1216920806,https://api.github.com/repos/pydata/xarray/issues/6908,1216920806,IC_kwDOAMm_X85IiLzm,12229877,2022-08-16T17:12:40Z,2022-08-16T17:12:40Z,CONTRIBUTOR,"> @Zac-HD would you be able to give your hypothesis-expert opinions on the general design of these strategies? Especially the whole ""strategies that accept multiple other strategies"" thing. I'll aim for a proper review tonight! Quick remarks: strategies accepting strategies is fine, though [our API style guide](https://github.com/HypothesisWorks/hypothesis/blob/master/guides/api-style.rst) suggests accepting strategies _xor_ values (this is violated in a few places in our Numpy extra, but it's still a good base principle). The guides might have other useful advice, I'd recommend skimming them since you're planning to ship Hypothesis-extending code to quite a few people!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1336119080 https://github.com/pydata/xarray/pull/4972#issuecomment-828924833,https://api.github.com/repos/pydata/xarray/issues/4972,828924833,MDEyOklzc3VlQ29tbWVudDgyODkyNDgzMw==,12229877,2021-04-29T04:03:39Z,2021-04-29T04:03:39Z,CONTRIBUTOR,"> is there a way to separate `sizes` into `dims` and `shape` without using `draw`? I tried `dims, shape = sizes.map(lambda s: tuple(zip(*s)))`, but since the map strategy is not iterable the final unpacking fails. You've just got to use `draw` for this. > I'd prefer the verbose repr of `st.builds()` when debugging strategies, but if I know the strategy is correct and understand what it generates I'd probably prefer the smaller repr of `@st.composite`. That's a good way of thinking about it :slightly_smiling_face: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250 https://github.com/pydata/xarray/pull/4972#issuecomment-808736180,https://api.github.com/repos/pydata/xarray/issues/4972,808736180,MDEyOklzc3VlQ29tbWVudDgwODczNjE4MA==,12229877,2021-03-27T13:51:54Z,2021-03-27T13:51:54Z,CONTRIBUTOR,"Looking at https://github.com/keewis/xarray/compare/duckarray-tests...duckarray-tests-hypothesis, for high-level feedback: - Overall it looks pretty good; though ping me again if/when it's a PR and I'll do line-level feedback on idiom issues - A more general test would generate [the shapes](https://hypothesis.readthedocs.io/en/latest/numpy.html#hypothesis.extra.numpy.array_shapes), and the [axes to reduce over](https://hypothesis.readthedocs.io/en/latest/numpy.html#hypothesis.extra.numpy.valid_tuple_axes) - reducing a 1D array over the first dimension is going to miss things - You use `@st.composite` when [the `.map()` method](https://hypothesis.readthedocs.io/en/latest/data.html#adapting-strategies) and a lambda would suffice (though the perf gain is small enough that this is mostly a readability issue) - I don't see the point of `Label`, and we advise against mixing ""a strategy or a value"". 
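To illustrate the two styles (a sketch of my own, not the API under review):

```python
from hypothesis import strategies as st
from hypothesis.strategies import SearchStrategy

# Discouraged: one parameter that may be either a plain value or a strategy,
# normalised internally.
def names_mixed(name):
    strategy = name if isinstance(name, SearchStrategy) else st.just(name)
    return strategy.map(str.title)

# Preferred: always take a strategy; callers wrap fixed values in st.just().
def names_clean(name: SearchStrategy) -> SearchStrategy:
    return name.map(str.title)
```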
We break this rule a few times for backwards-compatibility in our Numpy support, but wouldn't write such an API these days. And I'm always delighted to see people using Hypothesis to test libraries that I use and love πŸ₯°πŸ€©","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,818059250 https://github.com/pydata/xarray/pull/4089#issuecomment-633336577,https://api.github.com/repos/pydata/xarray/issues/4089,633336577,MDEyOklzc3VlQ29tbWVudDYzMzMzNjU3Nw==,12229877,2020-05-25T01:39:49Z,2020-05-25T08:29:16Z,CONTRIBUTOR,"> Also, I don't think there is a hypothesis.extra.xarray module, yet. Any comments on that, @Zac-HD? Not yet, though I'm interested in collaborations to write one! Our existing Numpy + Pandas integrations, along with the `st.builds()` and `@st.composite` strategies, make it reasonably straightforward-if-tedious to compose the objects you want in exquisite detail... the real challenge is to design a clean high-level API which lets you control only the parts you care about. Plausibly we have to let go of that last part to make this work, and just point people at the low-level tools if high-level isn't working. [This third-party module](https://github.com/uber/hypothesis-gufunc/blob/master/hypothesis_gufunc/extra/xr.py) is unidiomatic in that it doesn't have the API design above, but I believe it works. rdturnermtl has a history of great features; we eventually got idiomatic high-performance gufunc shapes upstream and I'm confident we'll get Xarray support eventually too... and sooner if there are people who can help design it :smile: Just \@-mention me again if this comes up, or I won't be notified.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,623751213 https://github.com/pydata/xarray/issues/1846#issuecomment-616967346,https://api.github.com/repos/pydata/xarray/issues/1846,616967346,MDEyOklzc3VlQ29tbWVudDYxNjk2NzM0Ng==,12229877,2020-04-21T05:51:10Z,2020-04-21T05:51:10Z,CONTRIBUTOR,"@rdturnermtl wrote [a Hypothesis extension for Xarray](https://github.com/uber/hypothesis-gufunc/blob/master/hypothesis_gufunc/extra/xr.py), which is at least a nice demo of what's possible. If Xarray contributors want to keep working on this (#3283 seems stalled?) I'd be happy to help, since I'm both invested in Xarray working and would like to ship a `hypothesis.extra.xarray` some day :slightly_smiling_face: ","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,290244473 https://github.com/pydata/xarray/pull/3285#issuecomment-547646808,https://api.github.com/repos/pydata/xarray/issues/3285,547646808,MDEyOklzc3VlQ29tbWVudDU0NzY0NjgwOA==,12229877,2019-10-29T21:54:35Z,2019-10-29T21:56:21Z,CONTRIBUTOR,">> OK, looks like the test failure now is real. Let me know if you want me to comment out the relevant line so the tests pass. > If we want to merge a subset of the tests then that's fine. Ofc even better if we can use these tests to find & fix the errors In my experience it's better to open an issue, add an xfail decorator to the test, and merge the tests PR. Otherwise the initial PR can take a very long time and no other property-based tests get added. In this case I'd duplicate the test, so there's one which does not allow empty dataframes and one (xfailing) which does. 
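Roughly this pattern (a sketch with stand-in names; the real tests would round-trip through a file):

```python
import pytest
from hypothesis import given, strategies as st

def roundtrip(values):
    # Stand-in for the real save/load pair under test (hypothetical helper).
    assert list(reversed(list(reversed(values)))) == values

@given(st.lists(st.integers(), min_size=1))
def test_roundtrip_nonempty(values):
    roundtrip(values)

@pytest.mark.xfail(reason="empty inputs do not roundtrip yet - see tracking issue")
@given(st.lists(st.integers(), max_size=0))
def test_roundtrip_empty(values):
    roundtrip(values)
```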
It's also likely that the person who found the bug is not the best person to fix it, and requiring that they do so in order to merge a useful test just disincentivizes testing! ","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",,490316894 https://github.com/pydata/xarray/pull/3283#issuecomment-541046819,https://api.github.com/repos/pydata/xarray/issues/3283,541046819,MDEyOklzc3VlQ29tbWVudDU0MTA0NjgxOQ==,12229877,2019-10-11T12:38:06Z,2019-10-11T12:38:06Z,CONTRIBUTOR,"> The datetime64 & timedelta64 cases don't roundtrip completely, so there are test failures. I've added these in a separate commit at the end, so it's easy to cut that back out if you prefer. The basic problem seems to be that Hypothesis is generating `timedelta64[Y]`, but Xarray is dealing with `timedelta64[ns]`: ``` ____________________________ test_netcdf_roundtrip _____________________________ tmp_path = PosixPath('/tmp/pytest-of-vsts/pytest-0/test_netcdf_roundtrip0') data = data(...), arr = array([293], dtype='timedelta64[Y]') ... with xr.open_dataset(tmp_path / ""test.nc"") as roundtripped: > xr.testing.assert_identical(original, roundtripped) E AssertionError: Left and right Dataset objects are not identical E E Differing data variables: E L data (0) timedelta64[ns] -106488 days +01:41:02.290448 E R data (0) timedelta64[ns] -106488 days +01:41:02.290449 properties/test_netcdf_roundtrip.py:51: AssertionError ---------------------------------- Hypothesis ---------------------------------- Falsifying example: test_netcdf_roundtrip( tmp_path=PosixPath('/tmp/pytest-of-vsts/pytest-0/test_netcdf_roundtrip0'), data=data(...), arr=array([293], dtype='timedelta64[Y]') ) ``` So either that's a pretty serious bug, or you should specify [the `max_period` of the timedelta dtype strategy](https://hypothesis.readthedocs.io/en/latest/numpy.html#hypothesis.extra.numpy.timedelta64_dtypes).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,490228661 https://github.com/pydata/xarray/issues/1566#issuecomment-522436770,https://api.github.com/repos/pydata/xarray/issues/1566,522436770,MDEyOklzc3VlQ29tbWVudDUyMjQzNjc3MA==,12229877,2019-08-19T06:50:15Z,2019-08-19T06:50:15Z,CONTRIBUTOR,"It's at least out of date, and doesn't appear to be helpful, so I'm happy to close this issue.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,256557897 https://github.com/pydata/xarray/pull/3194#issuecomment-519760729,https://api.github.com/repos/pydata/xarray/issues/3194,519760729,MDEyOklzc3VlQ29tbWVudDUxOTc2MDcyOQ==,12229877,2019-08-09T03:10:37Z,2019-08-09T03:10:37Z,CONTRIBUTOR,"[My config](https://github.com/Zac-HD/hypothesmith/blob/master/tox.ini) uses `tox` and `bash` instead of pre-commit. Doesn't work well for Windows, but it's a tiny Python script to glob for .py files and `subprocess.run` the pyupgrade command. Balancing powerful CI against new-contributor-friendliness is hard! ```ini [testenv:check] deps = -r requirements.txt whitelist_externals = bash commands = autoflake --recursive --in-place --remove-all-unused-imports --remove-duplicate-keys --remove-unused-variables . bash -c \""pyupgrade --py36-plus **.py\"" isort --recursive --apply . black . flake8 mypy --config-file=tox.ini . src/ bandit --recursive --exclude=./.tox/** --skip=B101,B110,B310 --quiet . 
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,478690528 https://github.com/pydata/xarray/pull/3194#issuecomment-519720166,https://api.github.com/repos/pydata/xarray/issues/3194,519720166,MDEyOklzc3VlQ29tbWVudDUxOTcyMDE2Ng==,12229877,2019-08-08T23:14:19Z,2019-08-08T23:14:19Z,CONTRIBUTOR,"The author pushes pre-commit pretty hard, but that shouldn't be a problem for Xarray as we're already using it. Then `pre-commit run --all-files` will execute it. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,478690528 https://github.com/pydata/xarray/pull/3194#issuecomment-519713454,https://api.github.com/repos/pydata/xarray/issues/3194,519713454,MDEyOklzc3VlQ29tbWVudDUxOTcxMzQ1NA==,12229877,2019-08-08T22:42:12Z,2019-08-08T22:42:12Z,CONTRIBUTOR,"In all seriousness, pyupgrade will do this and more automatically with the --py3-only option. I'd add it to CI! ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,478690528 https://github.com/pydata/xarray/issues/3092#issuecomment-519711707,https://api.github.com/repos/pydata/xarray/issues/3092,519711707,MDEyOklzc3VlQ29tbWVudDUxOTcxMTcwNw==,12229877,2019-08-08T22:34:33Z,2019-08-08T22:34:33Z,CONTRIBUTOR,@max-sixty - closed by #3142 I think? ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,466750687 https://github.com/pydata/xarray/pull/3142#issuecomment-518929174,https://api.github.com/repos/pydata/xarray/issues/3142,518929174,MDEyOklzc3VlQ29tbWVudDUxODkyOTE3NA==,12229877,2019-08-07T03:54:03Z,2019-08-07T03:54:03Z,CONTRIBUTOR,"Sadly if there's a way to automatically fix up merge conflicts I don't know of it 😭. However I'd be happy to help anyone who is having trouble rebasing their open PR. On a different note, have you considered adding the `pyupgrade` and `autoflake8` autoformatters? They go further than Black in that they do change the AST, but it's a nice way to ensure consistency of style, automatically remove unused imports and variables for flake8, and so on.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,469871658 https://github.com/pydata/xarray/pull/2777#issuecomment-469067311,https://api.github.com/repos/pydata/xarray/issues/2777,469067311,MDEyOklzc3VlQ29tbWVudDQ2OTA2NzMxMQ==,12229877,2019-03-03T21:41:21Z,2019-03-03T21:41:21Z,CONTRIBUTOR,"No objection to going with #2792; I'm just happy to have the change merged πŸ˜„ It would be nice for someone to cherry-pick 63da214d697345ebdd0ecc0967c72eafc70bcb0d before releasing 0.12 though, just to fix that warning.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,411755105 https://github.com/pydata/xarray/pull/2777#issuecomment-467667604,https://api.github.com/repos/pydata/xarray/issues/2777,467667604,MDEyOklzc3VlQ29tbWVudDQ2NzY2NzYwNA==,12229877,2019-02-27T00:07:34Z,2019-02-27T00:07:34Z,CONTRIBUTOR,"> maybe you have a syntax error? ...yep, an unmatched paren. πŸ˜₯ > If `a` and `b` have the same name, then you'd get an index with the two duplicate entries in the second case but not the first case. [which would be bad] I think it's impossible to avoid this when using inference in the general case. 
Two options I think would be decent-if-unsatisfying: 1. Explicitly manage this in the new combining functions, e.g. clear the concat dim coords if they are not unique and the input arrays did not have coords in that dimension. 2. Add an argument to `xr.concat` to enable or disable this, e.g. `infer_coords=True`, and disable it when calling `xr.concat` from other combining functions.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,411755105 https://github.com/pydata/xarray/pull/2777#issuecomment-466587431,https://api.github.com/repos/pydata/xarray/issues/2777,466587431,MDEyOklzc3VlQ29tbWVudDQ2NjU4NzQzMQ==,12229877,2019-02-22T23:47:30Z,2019-02-22T23:47:30Z,CONTRIBUTOR,"@shoyer - don't worry about the docs build, I'm pretty sure that was just a flaky network from Travis and it's working now in any case. I've left `tutorial.load_dataset` in, just changed ""removed in 0.12"" to ""removed in a future version"".","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,411755105 https://github.com/pydata/xarray/pull/2777#issuecomment-466369284,https://api.github.com/repos/pydata/xarray/issues/2777,466369284,MDEyOklzc3VlQ29tbWVudDQ2NjM2OTI4NA==,12229877,2019-02-22T11:41:37Z,2019-02-22T11:41:37Z,CONTRIBUTOR,"OK! @shoyer, I've got everything passing and it's ready for review. Even the accidental tutorial/docs fixes :smile: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,411755105 https://github.com/pydata/xarray/pull/2784#issuecomment-465843922,https://api.github.com/repos/pydata/xarray/issues/2784,465843922,MDEyOklzc3VlQ29tbWVudDQ2NTg0MzkyMg==,12229877,2019-02-21T03:05:33Z,2019-02-21T03:05:33Z,CONTRIBUTOR,"This function is deprecated and planned for removal in Xarray 0.12 (i.e. very soon, see #2776), e.g. by https://github.com/pydata/xarray/pull/2777/commits/d6e15cc1647909cebde734a5d6b92f262b234112. 
Merging both will just create a merge conflict, but if you want to pull that commit into this PR that would be fine with me!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,412694880 https://github.com/pydata/xarray/pull/2777#issuecomment-465812738,https://api.github.com/repos/pydata/xarray/issues/2777,465812738,MDEyOklzc3VlQ29tbWVudDQ2NTgxMjczOA==,12229877,2019-02-21T00:32:10Z,2019-02-21T00:32:10Z,CONTRIBUTOR,"Hmm, it looks like the failure to download [the naturalearth coastlines.zip](http://naciscdn.org/naturalearth/10m/physical/ne_10m_coastline.zip) wasn't so transient after all - but it *does* work on my machine!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,411755105 https://github.com/pydata/xarray/pull/2777#issuecomment-465802643,https://api.github.com/repos/pydata/xarray/issues/2777,465802643,MDEyOklzc3VlQ29tbWVudDQ2NTgwMjY0Mw==,12229877,2019-02-20T23:51:04Z,2019-02-20T23:51:15Z,CONTRIBUTOR,"The docs build failed due to a (transient) http error when loading tutorial data for the docs, so I've also finalised the planned conversion from `xarray.tutorial.load_dataset` to `xarray.tutorial.open_dataset`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,411755105 https://github.com/pydata/xarray/issues/2776#issuecomment-465535523,https://api.github.com/repos/pydata/xarray/issues/2776,465535523,MDEyOklzc3VlQ29tbWVudDQ2NTUzNTUyMw==,12229877,2019-02-20T11:25:23Z,2019-02-20T11:25:23Z,CONTRIBUTOR,"I'd love to get #2777 merged - it's a small patch, but really useful in a remote sensing niche. My students are going to be graded on how well their figures are labelled, and I really don't want to have to teach them matplotlib πŸ˜…","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,411738552 https://github.com/pydata/xarray/pull/2777#issuecomment-465029759,https://api.github.com/repos/pydata/xarray/issues/2777,465029759,MDEyOklzc3VlQ29tbWVudDQ2NTAyOTc1OQ==,12229877,2019-02-19T08:11:56Z,2019-02-19T08:11:56Z,CONTRIBUTOR,"Thanks for the support and quick review @shoyer! Any idea when Xarray 0.12 might be out? I'm teaching some remote sensing workshops in mid-March and would love to have this merged, as a colleague's review of those notebooks prompted this PR πŸ˜„ ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,411755105 https://github.com/pydata/xarray/issues/2499#issuecomment-431718845,https://api.github.com/repos/pydata/xarray/issues/2499,431718845,MDEyOklzc3VlQ29tbWVudDQzMTcxODg0NQ==,12229877,2018-10-22T00:50:22Z,2018-10-22T00:50:22Z,CONTRIBUTOR,"I'd also try to find a way to use a groupby or apply_along_axis without stacking and unstacking the data, and to choose chunks that match the layout on disk - i.e. try `lon=1` if the order is `time, lat, lon`. 
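For example, you can check the file's own chunking before deciding (a sketch, with made-up file and variable names):

```python
import xarray as xr

# Inspect how the file is chunked on disk before picking dask chunks.
ds = xr.open_dataset("precip.nc")
print(ds["precip"].encoding.get("chunksizes"))  # e.g. (1, 100, 100)

# Then open with dask chunks that are whole multiples of the disk chunks.
ds = xr.open_dataset("precip.nc", chunks={"time": 1, "lat": 500, "lon": 500})
```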
If the time observations are *not* contiguous in memory, it's probably worth reshaping the whole array and writing it back to disk up front.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,372244156 https://github.com/pydata/xarray/issues/2499#issuecomment-431657200,https://api.github.com/repos/pydata/xarray/issues/2499,431657200,MDEyOklzc3VlQ29tbWVudDQzMTY1NzIwMA==,12229877,2018-10-21T10:30:23Z,2018-10-21T10:30:23Z,CONTRIBUTOR,"> `dataset = xr.open_dataset(netcdf_precip, chunks={'lat': 1})` This makes me *really* suspicious - `lat=1` is a very very small chunk size, and completely unchunked in time and lon. Without knowing anything else, I'd try `chunks=dict(lat=200, lon=200)` or higher depending on the time dim - Dask is most efficient with chunks of around 10MB for most workloads. This all also depends on the data layout on disk too - can you share `repr(xr.open_dataset(netcdf_precip))`? What does `ncdump` say?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,372244156 https://github.com/pydata/xarray/pull/2442#issuecomment-424916084,https://api.github.com/repos/pydata/xarray/issues/2442,424916084,MDEyOklzc3VlQ29tbWVudDQyNDkxNjA4NA==,12229877,2018-09-27T00:35:46Z,2018-09-27T00:35:46Z,CONTRIBUTOR,No problem!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,364247513 https://github.com/pydata/xarray/issues/2441#issuecomment-424902226,https://api.github.com/repos/pydata/xarray/issues/2441,424902226,MDEyOklzc3VlQ29tbWVudDQyNDkwMjIyNg==,12229877,2018-09-26T23:21:33Z,2018-09-26T23:21:33Z,CONTRIBUTOR,"Wow, this is embarrassing - I strengthened that check because I thought people might be relying on a silent no-op, and the first case I hear about... was me. Um. #2442 fixes it, at least?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,364148137 https://github.com/pydata/xarray/pull/2204#issuecomment-396381891,https://api.github.com/repos/pydata/xarray/issues/2204,396381891,MDEyOklzc3VlQ29tbWVudDM5NjM4MTg5MQ==,12229877,2018-06-11T20:47:21Z,2018-06-11T20:47:21Z,CONTRIBUTOR,"@jhamman - also closes #2211. Note though that you have to put the word ""closes"" (or ""fixes"") before each issue reference; it doesn't handle CSV so they don't get closed, and that's an easy way to have zombie issues that cause some confusion later.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,327905732 https://github.com/pydata/xarray/issues/2093#issuecomment-387052622,https://api.github.com/repos/pydata/xarray/issues/2093,387052622,MDEyOklzc3VlQ29tbWVudDM4NzA1MjYyMg==,12229877,2018-05-07T12:43:41Z,2018-05-07T12:43:41Z,CONTRIBUTOR,"With the benefit of almost a year's worth of procrastination, I think the best approach is to take the heuristics from #1440, but only support `chunks=True` - if a decent default heuristic isn't good enough, the user can specify exact chunks. 
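In usage terms the proposal would look something like this (hypothetical - `chunks=True` is the feature being discussed, not an existing option):

```python
import xarray as xr

# Proposed: let xarray pick dask chunks compatible with the file's layout.
ds = xr.open_dataset("data.nc", chunks=True)

# Escape hatch when the heuristic is not good enough: exact, explicit chunks.
ds = xr.open_dataset("data.nc", chunks={"time": 1000, "lat": 180, "lon": 360})
```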
The underlying logic for this issue would be identical to that of #1440, so supporting both is ""just"" a matter of plumbing it in correctly.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,318950038 https://github.com/pydata/xarray/pull/1972#issuecomment-374573132,https://api.github.com/repos/pydata/xarray/issues/1972,374573132,MDEyOklzc3VlQ29tbWVudDM3NDU3MzEzMg==,12229877,2018-03-20T12:03:13Z,2018-03-20T12:03:13Z,CONTRIBUTOR,"> I would say we should merge it now and iterate in future PRs. Merge away then!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,303103716 https://github.com/pydata/xarray/pull/1972#issuecomment-373545890,https://api.github.com/repos/pydata/xarray/issues/1972,373545890,MDEyOklzc3VlQ29tbWVudDM3MzU0NTg5MA==,12229877,2018-03-15T22:41:20Z,2018-03-15T22:41:20Z,CONTRIBUTOR,"@shoyer & @fmaussion - I've just given up on the plotting tests as being more effort than they're worth. Are there any: - blockers to merging this as-is? - other APIs you think it would be reasonably easy to write property tests for? [Here's a nice list of properties](https://fsharpforfunandprofit.com/posts/property-based-testing-2/) to test :smile: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,303103716 https://github.com/pydata/xarray/pull/1972#issuecomment-371796211,https://api.github.com/repos/pydata/xarray/issues/1972,371796211,MDEyOklzc3VlQ29tbWVudDM3MTc5NjIxMQ==,12229877,2018-03-09T12:11:02Z,2018-03-09T12:11:02Z,CONTRIBUTOR,"@jklymak - I'm getting `RuntimeError: Invalid DISPLAY variable` in the Qt backend now that I've added `plt.clf()`. It works on my machine now (:tada:, thanks!) - any suggestions for Travis? (I'm also getting Zarr errors, but I assume those will go away soon as I didn't cause them)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,303103716 https://github.com/pydata/xarray/pull/1972#issuecomment-371722392,https://api.github.com/repos/pydata/xarray/issues/1972,371722392,MDEyOklzc3VlQ29tbWVudDM3MTcyMjM5Mg==,12229877,2018-03-09T06:05:46Z,2018-03-09T06:05:46Z,CONTRIBUTOR,"...that also explains why I was having trouble reproducing the error, whoops. I'll see how it goes with those problems excluded later tonight!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,303103716 https://github.com/pydata/xarray/pull/1972#issuecomment-371689343,https://api.github.com/repos/pydata/xarray/issues/1972,371689343,MDEyOklzc3VlQ29tbWVudDM3MTY4OTM0Mw==,12229877,2018-03-09T02:05:18Z,2018-03-09T02:05:39Z,CONTRIBUTOR,"@shoyer - that depends mostly on whether you want to run these tests as part of a standard run. A test-time dependency on Hypothesis is very cheap compared to the other dev dependencies, so I'd think more about the impact on eg CI resources than contributors. Upside is more power and coverage of edge-cases with odd data; downside is that they take a lot longer by virtue of trying hundreds of examples, and in this case also having to generate arrays takes a while (~log(elements) average via sparse filling). ------- @tacaswell - I would be delighted to write a test suite like this for matplotlib! 
The only reason I *haven't* is because I thought it would be rude to report so many bugs that I don't have time to help fix. If we can get a group project going though I'd be very enthusiastic πŸ˜„ - For this suite I was passing in 2D arrays of unsigned ints, signed ints, or floats (in all sizes), with edge sizes between 2 and 10. - The failing outputs were reported as 2x2 arrays (~30 times), or 3x2 (once - there's a cap on number of shrinks that I think I hit there). - Values tended to be either uniform small integers - 0 or 1, dtype int8 or uint8; or for some an array of floats with one corner zero and the others *very* large. I didn't keep the exact tracebacks, but I remember seeing many come from overflow in tick spacing calculations. Again, happy to write a test suite and make more detailed reports upstream if people want to fix this - in which case let's open an issue there!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,303103716 https://github.com/pydata/xarray/pull/1972#issuecomment-371671971,https://api.github.com/repos/pydata/xarray/issues/1972,371671971,MDEyOklzc3VlQ29tbWVudDM3MTY3MTk3MQ==,12229877,2018-03-09T00:26:13Z,2018-03-09T00:26:13Z,CONTRIBUTOR,"> This looks like a great start to me -- thank you! You're welcome! Same questions as above though, plus ""is there anything else you need in this initial PR?"". If not, can we merge it > It's impressive that it's possible to break every plotting type with matplotlib :). As much as I love matplotlib, it's a steaming pile of hacks and I want to avoid it more than I want it cleaned up πŸ˜₯ (entirely because the process is dysfunctional, not the code)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,303103716 https://github.com/pydata/xarray/pull/1972#issuecomment-371403712,https://api.github.com/repos/pydata/xarray/issues/1972,371403712,MDEyOklzc3VlQ29tbWVudDM3MTQwMzcxMg==,12229877,2018-03-08T07:29:18Z,2018-03-08T07:29:18Z,CONTRIBUTOR,"Ping @fmaussion / @shoyer - would love your opinions on this, including high-value targets to test. (btw Appveyor had an internal error; build is otherwise green)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,303103716 https://github.com/pydata/xarray/pull/1967#issuecomment-371011595,https://api.github.com/repos/pydata/xarray/issues/1967,371011595,MDEyOklzc3VlQ29tbWVudDM3MTAxMTU5NQ==,12229877,2018-03-07T03:34:37Z,2018-03-07T03:34:37Z,CONTRIBUTOR,"Not any more :wink: - it links to the docs that ""[`xarray.plot.plot()`] calls an xarray plotting function based on the dimensions of darray.squeeze()"". I'd actually like to keep the magic in `plot()` - the whole point of this is that we can do *something* with the data, not that we do anything in particular. For interactive use, this is valuable enough - in my view - to justify keeping an inconsistent API; particularly when there is an obvious non-magic version to use in a script. 
In any case, that's a topic for another pull.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302695966 https://github.com/pydata/xarray/pull/1967#issuecomment-370974503,https://api.github.com/repos/pydata/xarray/issues/1967,370974503,MDEyOklzc3VlQ29tbWVudDM3MDk3NDUwMw==,12229877,2018-03-07T00:06:15Z,2018-03-07T03:08:31Z,CONTRIBUTOR,"Ah, I see what you mean but don't think we need any change or additional test. There's a catch though - `xarray.plot()` is special, because it (and only it) [squeezes the dimensions of the array](http://xarray.pydata.org/en/stable/generated/xarray.plot.plot.html#xarray.plot.plot) before plotting it. Therefore, the following calls all produce the same plot: xr.DataArray(np.arange(9).reshape((3,3))).plot() xr.DataArray(np.arange(9).reshape((1,3,3))).plot() xr.DataArray(np.arange(9).reshape((1,1,3,1,1,1,1,3,1,1))).plot() My view is that the test you linked to is sufficient for the test you're asking for - `imshow` is a special case because it *can* accept 3D input for RGB plots. TLDR - working as intended IMO, it's just that nobody reads the docs. Changing the API would avoid this but at cost of convenience which is the whole point of `DataArray.plot()`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302695966 https://github.com/pydata/xarray/pull/1967#issuecomment-370813600,https://api.github.com/repos/pydata/xarray/issues/1967,370813600,MDEyOklzc3VlQ29tbWVudDM3MDgxMzYwMA==,12229877,2018-03-06T15:13:43Z,2018-03-06T15:13:43Z,CONTRIBUTOR,"No problem - I had an hour free and no open pulls waiting on me, so the timing was good. As a regression test it's specific to imshow, so I'm not sure what you'd want here (or whether it would work at all on the 2d mixin). More details please? ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302695966 https://github.com/pydata/xarray/issues/1966#issuecomment-370774614,https://api.github.com/repos/pydata/xarray/issues/1966,370774614,MDEyOklzc3VlQ29tbWVudDM3MDc3NDYxNA==,12229877,2018-03-06T13:04:03Z,2018-03-06T13:08:54Z,CONTRIBUTOR,"Ugh, this is literally a one-line fix: I shouldn't have put `.squeeze()` [here](https://github.com/pydata/xarray/blob/3419e9e61beb551850ddc283bb963f09967cb6c3/xarray/plot/utils.py#L299). If you also ensure that the dtype is OK (specify uint8, or divide by nine) the example then works. It's *also* exactly the kind of thing that Hypothesis would catch for us via #1846. Work (moved to less coding :cry:) and the upcoming major release of Hypothesis have priority, but I'll try to find time.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302679890 https://github.com/pydata/xarray/pull/1919#issuecomment-369117775,https://api.github.com/repos/pydata/xarray/issues/1919,369117775,MDEyOklzc3VlQ29tbWVudDM2OTExNzc3NQ==,12229877,2018-02-28T04:27:16Z,2018-02-28T04:27:16Z,CONTRIBUTOR,"Last time I used it Hound *didn't* update the PR status, so if so that removes one of my objections πŸ˜„ As a maintainer I prefer automating checks - I make mistakes, especially when tired, and it's nice knowing that the system has my back. Hound only runs checks in the diff range though, so I think it's still valuable to keep running flake8 ourselves. 
Trying both for a few months before making a final decision sounds like a good idea to me!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297794911 https://github.com/pydata/xarray/pull/1919#issuecomment-369111117,https://api.github.com/repos/pydata/xarray/issues/1919,369111117,MDEyOklzc3VlQ29tbWVudDM2OTExMTExNw==,12229877,2018-02-28T03:38:47Z,2018-02-28T03:39:07Z,CONTRIBUTOR,"[home-assistant](https://github.com/home-assistant/home-assistant) uses https://houndci.com, and it seems to work pretty well for them - it just leaves a review with any flake8 comments at the relevant lines of the pull. Note that while HoundCI is used to provide faster and more readable feedback to contributors, they still check flake8 in Travis to ensure everything is fixed before merging.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297794911 https://github.com/pydata/xarray/issues/1912#issuecomment-369106762,https://api.github.com/repos/pydata/xarray/issues/1912,369106762,MDEyOklzc3VlQ29tbWVudDM2OTEwNjc2Mg==,12229877,2018-02-28T03:09:46Z,2018-02-28T03:09:46Z,CONTRIBUTOR,"Per comment on the pull: my experience is that bots are a great complement, but can't replace style checks in CI. Only remove flake8 from Travis if you're OK with merging pulls that don't meet the style guide (and then amend the contributing guide to match) - that's a valid decision, but also a weaker position than Xarray has previously taken.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297452821 https://github.com/pydata/xarray/pull/1919#issuecomment-369106191,https://api.github.com/repos/pydata/xarray/issues/1919,369106191,MDEyOklzc3VlQ29tbWVudDM2OTEwNjE5MQ==,12229877,2018-02-28T03:05:48Z,2018-02-28T03:05:48Z,CONTRIBUTOR,"I would be very disappointed to remove the flake8 check from Travis - in my experience, style guides are only applied *consistently* if they are enforced by CI. Ensuring that each pull maintains our standards is much less burdensome than regular ""big push"" efforts to get back to the standard (see eg #1741 and #1824 - it took two people!), **and** avoids periods of lower code quality. In short, I think that Travis *is* the right place to put these checks - review bots are a complement, not a substitute.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297794911 https://github.com/pydata/xarray/issues/1821#issuecomment-368686954,https://api.github.com/repos/pydata/xarray/issues/1821,368686954,MDEyOklzc3VlQ29tbWVudDM2ODY4Njk1NA==,12229877,2018-02-26T23:20:45Z,2018-02-26T23:20:45Z,CONTRIBUTOR,"First: thanks, everyone, for such a prompt and helpful response! I'm excited both to have 10.1 (:tada:), and by the prospect of faster/automated releases in future. Reading over the releasing instructions, I think there are three parts we need to work on to go fully automated. By fully automated, I mean ""no maintainer action whatsoever beyond merging pulls, which are not release-specific"": - Most things can be automated with [continuous deployment scripts like Hypothesis'](http://hypothesis.works/articles/continuous-releases/). We deploy a patch or minor release for *every* pull request, but if that's undesirable you could run the deployment from a weekly ""cron"" job on Travis. 
- Changelog management (for weekly rather than per-PR releases) can be automated with https://github.com/hawkowl/towncrier - there's a substitute for all of us :smile: - Xarray would need some novel scripts for uploading Github releases and driving the conda-forge feedstock. (Which Hypothesis would borrow in turn - it's lower priority for us but still nice to have) In short, my advice is to be creative, and if release processes can't be automated - change them!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,287852184 https://github.com/pydata/xarray/issues/1821#issuecomment-368303298,https://api.github.com/repos/pydata/xarray/issues/1821,368303298,MDEyOklzc3VlQ29tbWVudDM2ODMwMzI5OA==,12229877,2018-02-25T11:55:31Z,2018-02-25T11:55:31Z,CONTRIBUTOR,"@jhamman & @shoyer - I think Xarray has a release frequency problem. - #1782 has been merged, but I still can't plot timeseries with an all-nan step - #1796, #1819, and #1893 have been merged - but I still can't plot RGB images - #1840 has been merged - but loading CF-int-encoded data still uses double the memory it needs. - *All of these daily tasks for myself and my colleagues are harder than they should be.* It's been more than three months now without a patch release. This is *really, really* frustrating as an Xarray contributor, user, and advocate - getting my work merged upstream literally isn't worth anything until it's released, my colleagues have trouble using it (and go back to Matlab or IDL!), and it's harder to ask for anything in meetings with eg @opendatacube. **Moving to weekly patch releases would fix all of these problems.** Maintainer availability doesn't need to be a limiting factor, either - for example, @HypothesisWorks has a deployment pipeline where the only human involvement is to click 'merge', and I'd be happy to help out if you'd like to set up a similar system.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,287852184 https://github.com/pydata/xarray/issues/1440#issuecomment-308928211,https://api.github.com/repos/pydata/xarray/issues/1440,308928211,MDEyOklzc3VlQ29tbWVudDMwODkyODIxMQ==,12229877,2017-06-16T04:11:10Z,2018-02-10T07:13:11Z,CONTRIBUTOR,"@matt-long, I think that's a separate issue. Please open a new pull request, including a link to data that will let us reproduce the problem. @jhamman - [updated] I was always keen to work on this if I could make time, but have since changed jobs. However I'd still be happy to help anyone who wants to work on it with design and review. I definitely want to preserve the exact present semantics of dict arguments (so users have exact control, with a warning if it's incompatible with disk chunks). I may interpret int arguments as a (deprecated) hint though, as that's what it's mostly used for, and will add a fairly limited hints API to start with - more advanced users can just specify exact chunks.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,233350060 https://github.com/pydata/xarray/issues/1789#issuecomment-364392462,https://api.github.com/repos/pydata/xarray/issues/1789,364392462,MDEyOklzc3VlQ29tbWVudDM2NDM5MjQ2Mg==,12229877,2018-02-09T10:13:43Z,2018-02-09T10:13:43Z,CONTRIBUTOR,"Related to rtfd/readthedocs.org#1820, I think. 
Searching reveals that there have been a variety of problems with the 'Edit on Github' link over time.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,282916278 https://github.com/pydata/xarray/pull/1787#issuecomment-362976003,https://api.github.com/repos/pydata/xarray/issues/1787,362976003,MDEyOklzc3VlQ29tbWVudDM2Mjk3NjAwMw==,12229877,2018-02-05T04:01:16Z,2018-02-05T04:01:16Z,CONTRIBUTOR,"This is now far enough down my list of priorities that I'm unlikely to get to it any time soon, and (it turns out) there are also some design issues that would need to be resolved. I therefore suggest that anyone wishing to take this on opens an issue first to clarify the scope of any such change, and then make a PR.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,282369945 https://github.com/pydata/xarray/issues/1880#issuecomment-362902142,https://api.github.com/repos/pydata/xarray/issues/1880,362902142,MDEyOklzc3VlQ29tbWVudDM2MjkwMjE0Mg==,12229877,2018-02-04T12:11:27Z,2018-02-04T12:11:59Z,CONTRIBUTOR,"Ouch, this is definitely a bug in the code I wrote - that's terrible 😭 Probably related to matplotlib/matplotlib/pull/10220, where I'm looking at clipping instead of wrapping colours from the other side. (not as in code overlap or causing a bug, but dealing with similar logic) Fixing this issue *may* be as simple as setting alpha to 255 instead of 1 for integer dtypes, or it might be more involved. Either way I'll try to open a pull within the next few days.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,293858326 https://github.com/pydata/xarray/pull/1840#issuecomment-359168533,https://api.github.com/repos/pydata/xarray/issues/1840,359168533,MDEyOklzc3VlQ29tbWVudDM1OTE2ODUzMw==,12229877,2018-01-20T12:39:30Z,2018-01-20T12:39:30Z,CONTRIBUTOR,"Added tests; float32 not upcast, float16 ([""intended for storage of many floating-point values where higher precision is not needed, not for performing arithmetic""](https://en.wikipedia.org/wiki/Half-precision_floating-point_format)) *is* upcast but only to float32. I'll open a new issue to add a basic suite of property-based tests to Xarray πŸ˜„","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,289853579 https://github.com/pydata/xarray/pull/1840#issuecomment-358882878,https://api.github.com/repos/pydata/xarray/issues/1840,358882878,MDEyOklzc3VlQ29tbWVudDM1ODg4Mjg3OA==,12229877,2018-01-19T06:58:14Z,2018-01-19T06:58:14Z,CONTRIBUTOR,"Thanks - I was actually writing up an issue and decided it would be easier to demonstrate the proposed fix in a PR, but I'll open an issue first next time. The checkbox about flake8 could be removed from the issue template now - since #1824 we run flake8 on everything in CI so if tests pass flake8 is passing too. Re: tests: what do you (and @shoyer) think about using [Hypothesis](https://hypothesis.readthedocs.io) for some property-based tests of variable coding? 
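Something like this minimal sketch, assuming a toy scale/offset pair in place of the real coders:

```python
import numpy as np
from hypothesis import given, strategies as st
from hypothesis.extra import numpy as npst

def encode(data, scale=0.5, offset=10.0):
    return (data - offset) / scale  # stand-in for the real CF-style encoder

def decode(data, scale=0.5, offset=10.0):
    return data * scale + offset  # stand-in for the real CF-style decoder

@given(npst.arrays(dtype=np.float64, shape=npst.array_shapes(),
                   elements=st.floats(-1e6, 1e6)))
def test_encode_then_decode_is_noop(arr):
    # atol absorbs float roundoff from the scale/offset arithmetic
    np.testing.assert_allclose(decode(encode(arr)), arr, atol=1e-9)
```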
""encoding then decoding is a no-op"" is a classic property πŸ˜„ Upside, more powerful and better at finding edge cases; downside slower simply because it checks more cases (a configurable number).","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,289853579 https://github.com/pydata/xarray/pull/1819#issuecomment-358831413,https://api.github.com/repos/pydata/xarray/issues/1819,358831413,MDEyOklzc3VlQ29tbWVudDM1ODgzMTQxMw==,12229877,2018-01-19T00:48:21Z,2018-01-19T00:48:21Z,CONTRIBUTOR,Ping @jhamman?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,287747803 https://github.com/pydata/xarray/pull/1819#issuecomment-358511917,https://api.github.com/repos/pydata/xarray/issues/1819,358511917,MDEyOklzc3VlQ29tbWVudDM1ODUxMTkxNw==,12229877,2018-01-18T01:54:35Z,2018-01-18T01:54:35Z,CONTRIBUTOR,"Method extracted, default for missing bounds fixed, comment added, import moved. Also added a check that you haven't accidentally reversed the bounds while supplying only one.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,287747803 https://github.com/pydata/xarray/pull/1819#issuecomment-358167385,https://api.github.com/repos/pydata/xarray/issues/1819,358167385,MDEyOklzc3VlQ29tbWVudDM1ODE2NzM4NQ==,12229877,2018-01-17T01:44:45Z,2018-01-17T01:44:45Z,CONTRIBUTOR,"Ping @shoyer, @jhamman - I think we're out of blockers and ready to merge πŸ˜„ ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,287747803 https://github.com/pydata/xarray/issues/1829#issuecomment-357812702,https://api.github.com/repos/pydata/xarray/issues/1829,357812702,MDEyOklzc3VlQ29tbWVudDM1NzgxMjcwMg==,12229877,2018-01-15T23:35:22Z,2018-01-15T23:35:22Z,CONTRIBUTOR,"Related: when does Xarray plan to drop Python 2? IMO we should *at least* join [python3statement.org](http://www.python3statement.org) and drop it by 2020, and clearly document the timeline whatever it happens to be. 
I'd be keen for an earlier date TBH - keyword-only arguments are great, compatibility shims kinda suck, and dependencies are moving to py3-only at an increasing rate (including matplotlib 3.0, scheduled for July) - but can see the other argument too.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,288465429 https://github.com/pydata/xarray/pull/1819#issuecomment-357550651,https://api.github.com/repos/pydata/xarray/issues/1819,357550651,MDEyOklzc3VlQ29tbWVudDM1NzU1MDY1MQ==,12229877,2018-01-14T23:08:16Z,2018-01-14T23:08:16Z,CONTRIBUTOR,@jhamman - all done πŸ˜„ ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,287747803 https://github.com/pydata/xarray/pull/1824#issuecomment-357494081,https://api.github.com/repos/pydata/xarray/issues/1824,357494081,MDEyOklzc3VlQ29tbWVudDM1NzQ5NDA4MQ==,12229877,2018-01-14T07:37:53Z,2018-01-14T07:37:53Z,CONTRIBUTOR,@shoyer; done πŸŽ‰ ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,288322322 https://github.com/pydata/xarray/pull/1819#issuecomment-356903275,https://api.github.com/repos/pydata/xarray/issues/1819,356903275,MDEyOklzc3VlQ29tbWVudDM1NjkwMzI3NQ==,12229877,2018-01-11T11:10:22Z,2018-01-11T11:10:22Z,CONTRIBUTOR,CC @shoyer - hopefully I'm in time for Xarray 0.10.1 πŸ˜‰ ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,287747803 https://github.com/pydata/xarray/pull/1796#issuecomment-356498223,https://api.github.com/repos/pydata/xarray/issues/1796,356498223,MDEyOklzc3VlQ29tbWVudDM1NjQ5ODIyMw==,12229877,2018-01-10T04:42:44Z,2018-01-10T13:55:50Z,CONTRIBUTOR,"@shoyer makes some good points about things that should be fixed upstream in Matplotlib - namely normalization of RGB images, but I'm also going to move my color fixes (clipping instead of modulo) upstream instead. Timeline: 1. I edit this PR, it gets merged, Xarray 10.1 (ETA soonish?) can display RGB images just like Matplotlib 2. I finish my upstream PR, Matplotlib 2.2 (ETA later this month (!)) supports normalization of RGB images and fixes the modulo/clip bug 3. Once the matplotlib API is merged, I make a new PR against Xarray to support the same interface (including on older matplotlib versions) ","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283566613 https://github.com/pydata/xarray/pull/1796#issuecomment-356519422,https://api.github.com/repos/pydata/xarray/issues/1796,356519422,MDEyOklzc3VlQ29tbWVudDM1NjUxOTQyMg==,12229877,2018-01-10T07:14:16Z,2018-01-10T07:14:16Z,CONTRIBUTOR,"Looks like `TestDatetimePlot.test_datetime_line_plot` is failing, but I'm pretty sure that's nothing to do with me (and it passed twice on Travis!).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283566613 https://github.com/pydata/xarray/pull/1796#issuecomment-356272643,https://api.github.com/repos/pydata/xarray/issues/1796,356272643,MDEyOklzc3VlQ29tbWVudDM1NjI3MjY0Mw==,12229877,2018-01-09T12:38:28Z,2018-01-09T12:38:28Z,CONTRIBUTOR,"Okay... I've now seen three Travis runs. In every one there's been a pass, a fail, and three errors... but different jobs each time. 
At this point I'm ready to give up and wait for the Travis team to fix it πŸ˜• The code is ready to go though! πŸŽ‰ ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283566613 https://github.com/pydata/xarray/pull/1796#issuecomment-356223193,https://api.github.com/repos/pydata/xarray/issues/1796,356223193,MDEyOklzc3VlQ29tbWVudDM1NjIyMzE5Mw==,12229877,2018-01-09T09:07:58Z,2018-01-09T09:07:58Z,CONTRIBUTOR,"More review fixes πŸ˜„ I'm pretty sure the build failure is just Travis being flaky - it passes on my machine. Could someone restart it?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283566613 https://github.com/pydata/xarray/pull/1787#issuecomment-355815405,https://api.github.com/repos/pydata/xarray/issues/1787,355815405,MDEyOklzc3VlQ29tbWVudDM1NTgxNTQwNQ==,12229877,2018-01-07T11:21:25Z,2018-01-07T11:21:25Z,CONTRIBUTOR,"- Including units in labels more generally sounds good to me, but I'll hold off until the general concept is approved πŸ˜‰ - I don't think there's an objectively correct way to label units, but parentheses seem to be fairly standard and least prone to misinterpretation. - Happy to add tests, though again I'll wait for in-principle approval. Adding a `units` attribute to some of the test data is easy enough! - `units` is the convention, and if this is unsatisfactory it's quite trivial to override already - I'd hold off on custom options until someone has an actual use-case for the feature.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,282369945 https://github.com/pydata/xarray/issues/37#issuecomment-355814550,https://api.github.com/repos/pydata/xarray/issues/37,355814550,MDEyOklzc3VlQ29tbWVudDM1NTgxNDU1MA==,12229877,2018-01-07T11:03:37Z,2018-01-07T11:03:37Z,CONTRIBUTOR,Closed by #1750?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,28596269 https://github.com/pydata/xarray/pull/1796#issuecomment-355743363,https://api.github.com/repos/pydata/xarray/issues/1796,355743363,MDEyOklzc3VlQ29tbWVudDM1NTc0MzM2Mw==,12229877,2018-01-06T12:18:56Z,2018-01-06T13:33:48Z,CONTRIBUTOR,"OK, @shoyer & @fmaussion - I think I'm done again! The only thing I *haven't* done is change the handling of `robust=True`, as anything beyond ""scale and clip 2nd-to-98th percentiles to [0, 1]"" seems fraught with complications. If this is not what the user intended, it's at least the obvious default behaviour and should be simple to understand. Hopefully this covers the substantial changes; but let me know if there's anything else as I'd really like this merged before the 0.10.1 release - and ideally released in time for demos at my summer school in January πŸ˜„ ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283566613 https://github.com/pydata/xarray/pull/1787#issuecomment-355745453,https://api.github.com/repos/pydata/xarray/issues/1787,355745453,MDEyOklzc3VlQ29tbWVudDM1NTc0NTQ1Mw==,12229877,2018-01-06T12:59:31Z,2018-01-06T12:59:31Z,CONTRIBUTOR,"Ping @fmaussion for review? 
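For reference, the `robust=True` scaling described above amounts to something like this (a sketch, not the exact implementation):

```python
import numpy as np

def robust_normalise(arr):
    # Map the 2nd..98th percentile range onto [0, 1] and clip the rest
    # (real code would also need to guard against lo == hi).
    lo, hi = np.nanpercentile(arr, [2, 98])
    return np.clip((arr - lo) / (hi - lo), 0, 1)
```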
(the build error is Travis download trouble, not failing tests)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,282369945 https://github.com/pydata/xarray/pull/1796#issuecomment-353773905,https://api.github.com/repos/pydata/xarray/issues/1796,353773905,MDEyOklzc3VlQ29tbWVudDM1Mzc3MzkwNQ==,12229877,2017-12-24T09:08:10Z,2018-01-06T12:06:44Z,CONTRIBUTOR,"Thanks for the review @shoyer! I'll do what I can in the next few days, but that might not be much at all before I get back from a no-internet camping trip around Jan 8th. Items: - [x] add validation and tests for `rgb` argument - [x] don't mutate the input data! (I *think* it doesn't, but obviously this needs to be checked) - [x] simplify order heuristic, to ""use last dimension of size 3 or 4 as rgb; warn if there are multiple possible"" - [x] update rasterio example in gallery","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283566613 https://github.com/pydata/xarray/pull/1796#issuecomment-353351362,https://api.github.com/repos/pydata/xarray/issues/1796,353351362,MDEyOklzc3VlQ29tbWVudDM1MzM1MTM2Mg==,12229877,2017-12-21T13:31:28Z,2017-12-21T13:31:28Z,CONTRIBUTOR,"Done - now with optional argument and nicer commit history, Anything else?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283566613 https://github.com/pydata/xarray/pull/1796#issuecomment-353260924,https://api.github.com/repos/pydata/xarray/issues/1796,353260924,MDEyOklzc3VlQ29tbWVudDM1MzI2MDkyNA==,12229877,2017-12-21T05:21:41Z,2017-12-21T05:21:41Z,CONTRIBUTOR,"Optional argument coming up then πŸ˜„ Anything else I need to do?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283566613 https://github.com/pydata/xarray/pull/1796#issuecomment-353225923,https://api.github.com/repos/pydata/xarray/issues/1796,353225923,MDEyOklzc3VlQ29tbWVudDM1MzIyNTkyMw==,12229877,2017-12-21T00:51:35Z,2017-12-21T00:53:02Z,CONTRIBUTOR,"> I like this. My main concern is that I'd like a fully explicit way to opt into this behavior. Perhaps rgb='band' and/or rgba='band'? I have a pretty strong preference for RGB images 'just working' - *requiring* a new argument would mean imshow has a different signature to all other plot methods without adding any information. If it's an optional argument, it barely does anything - in the rare case where the heuristic fails, you can supply `x` and `y` dims instead. 
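Concretely (illustrative dims; `rgb=` is the optional spelling being debated):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.random.rand(10, 10, 3), dims=["y", "x", "band"])
da.plot.imshow()               # heuristic: last dim of size 3 or 4 is RGB
da.plot.imshow(x="x", y="y")   # explicit spatial dims if the guess is wrong
da.plot.imshow(rgb="band")     # the fully explicit optional argument
```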
In many ways showing an RGB image is a special case, but this is the least-special-case I could make work - and it's consistent with the obvious meaning and upstream behavior of `imshow`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,283566613 https://github.com/pydata/xarray/pull/1787#issuecomment-351963684,https://api.github.com/repos/pydata/xarray/issues/1787,351963684,MDEyOklzc3VlQ29tbWVudDM1MTk2MzY4NA==,12229877,2017-12-15T09:56:32Z,2017-12-15T22:23:59Z,CONTRIBUTOR,"The exact key [`units` is required by the CF Conventions](http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#units) and most similar conventions, and is very common in other landscape and hydrology data I work with.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,282369945 https://github.com/pydata/xarray/pull/1782#issuecomment-352110041,https://api.github.com/repos/pydata/xarray/issues/1782,352110041,MDEyOklzc3VlQ29tbWVudDM1MjExMDA0MQ==,12229877,2017-12-15T20:57:36Z,2017-12-15T20:57:36Z,CONTRIBUTOR,"No problem! Do you have any idea when this might be released? I can install from git in my own environment, but I probably can't convince GA to do that in their production environment 😄 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,282087995 https://github.com/pydata/xarray/issues/1780#issuecomment-351701244,https://api.github.com/repos/pydata/xarray/issues/1780,351701244,MDEyOklzc3VlQ29tbWVudDM1MTcwMTI0NA==,12229877,2017-12-14T12:46:08Z,2017-12-14T12:46:16Z,CONTRIBUTOR,"> I investigated a bit and it seems that in the case of all nan data you would probably have to force add_colorbar to False and do other things for matplotlib to accept your all nan data. I don't *need* to do this, but I'd like to for another feature (`imshow` RGB images) - the question is how! Passing `add_colorbar=False` to `plot.imshow` just gives me errors, and I can't find anywhere in the internals where this would be valid either - it always gets passed through to the matplotlib Artist, which then raises an exception. Any ideas?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,282000017 https://github.com/pydata/xarray/issues/1780#issuecomment-351675151,https://api.github.com/repos/pydata/xarray/issues/1780,351675151,MDEyOklzc3VlQ29tbWVudDM1MTY3NTE1MQ==,12229877,2017-12-14T10:45:31Z,2017-12-14T10:45:31Z,CONTRIBUTOR,"I'd argue that Xarray should handle these cases - it already does for lower dimensions, eg `xr.DataArray(np.full(1, np.nan)).plot()` works, and this would be really nice for exploratory analysis as well as consistency. This is sufficiently annoying to my research group that I'm willing to write the patch, if that helps! 
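For concreteness, a minimal pair showing the inconsistency I mean (assuming I've understood the failure mode correctly):

```python
import numpy as np
import xarray as xr

xr.DataArray(np.full(1, np.nan)).plot()       # 1D all-NaN: works today
xr.DataArray(np.full((3, 3), np.nan)).plot()  # 2D all-NaN: currently errors, while picking colour limits I believe
```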
I also have some ideas for tests using Hypothesis to ferret out some other problems - for example there's a similar failure when plotting an array with size one in some dimension.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,282000017 https://github.com/pydata/xarray/issues/1604#issuecomment-351649842,https://api.github.com/repos/pydata/xarray/issues/1604,351649842,MDEyOklzc3VlQ29tbWVudDM1MTY0OTg0Mg==,12229877,2017-12-14T09:03:58Z,2017-12-14T09:03:58Z,CONTRIBUTOR,"Closed by #1496, I think.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,262696381 https://github.com/pydata/xarray/issues/1405#issuecomment-338564810,https://api.github.com/repos/pydata/xarray/issues/1405,338564810,MDEyOklzc3VlQ29tbWVudDMzODU2NDgxMA==,12229877,2017-10-23T06:59:06Z,2017-10-23T06:59:14Z,CONTRIBUTOR,I think this was closed by #1473.,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,228023777 https://github.com/pydata/xarray/issues/1541#issuecomment-338557287,https://api.github.com/repos/pydata/xarray/issues/1541,338557287,MDEyOklzc3VlQ29tbWVudDMzODU1NzI4Nw==,12229877,2017-10-23T06:12:40Z,2017-10-23T06:43:34Z,CONTRIBUTOR,"I'll take this one on 😄 I've also spotted a few things in the sphinx config that could be updated and will keep an eye out for others. - adding Numba to the intersphinx mapping - pointing at python /3/ instead of /3.5/ docs - using ISO8601 date formats - keeping the copyright year up to date - removing warning filter on seaborn import - removing the IPython image location workaround - looping over modules to import rather than the unrolled style","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,254368462 https://github.com/pydata/xarray/issues/1440#issuecomment-309353545,https://api.github.com/repos/pydata/xarray/issues/1440,309353545,MDEyOklzc3VlQ29tbWVudDMwOTM1MzU0NQ==,12229877,2017-06-19T06:50:57Z,2017-07-14T02:35:04Z,CONTRIBUTOR,"I've just had a meeting at [NCI](http://nci.org.au/) which has helped clarify what I'm trying to do and how to tell if it's working. This comment is mostly for my own notes, and public for anyone interested. (I'll refer to dask chunks as 'blocks' (as in 'blocked algorithms'), and netcdf chunks in a file as 'chunks', to avoid confusion.) The approximate algorithm I'm thinking about is outlined in [this comment](https://github.com/pydata/xarray/issues/1440#issuecomment-307002325) above. Considerations, in rough order of performance impact, are: - No block should include data from multiple files (near-absolute, due to locking - though concurrent read is supported on lower levels?) - Contiguous access is critical in un-chunked files - reading the fastest-changing dimension in small parts murders IO performance. It *may* be important in chunked files too, at the level of chunks, but that's mostly down to Dask and user access patterns. - Each chunk, if the file is chunked, must fall entirely into a single block. - Blocks should usually be around 100MB, to balance scheduler overhead with memory usage of intermediate results. - Blocks should be of even(ish) size - if a dimension is of size 100 with chunks of 30, better to have blocks of 60 at the expense of relative shape than have blocks of 90 with one almost empty. 
- Chunks are cached by underlying libraries, so benchmarks need to clear the caches each run for valid results. Note that this affects IO but typically not decompression of chunks. - Contra contiguous access (above), it might be good to prevent very skewed block shapes (ie 1\*X\*Y or T\*1\*1). Possibly limiting the lowest edge length of a block (10px?), or limiting the edge ratio of a block (20:1?) - If a dimension hint is given (eg 'my analysis will be spatial'), making blocks larger along other dimensions will make whole-file processing more efficient, and subsetting less efficient. (Unless Dask can optimise away loading of part of a block?) Bottom line, I could come up with *something* pretty quickly but would prefer to take a little longer to write and explore some benchmarks.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,233350060 https://github.com/pydata/xarray/pull/1457#issuecomment-308926818,https://api.github.com/repos/pydata/xarray/issues/1457,308926818,MDEyOklzc3VlQ29tbWVudDMwODkyNjgxOA==,12229877,2017-06-16T03:57:57Z,2017-06-16T03:57:57Z,CONTRIBUTOR,"The tests for Hypothesis take almost twice as long to run on Travis at certain times of day, so I certainly wouldn't use it for benchmarking anything! I'm also concerned that a dedicated benchmarking machine may lead to software (accidentally!) optimized for a particular architecture or balance of machine resources without due consideration. Maybe @wesm could investigate fault injection to (eg) slow down disk access or add latency for some sets of benchmarks?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,236347050 https://github.com/pydata/xarray/pull/1457#issuecomment-308923548,https://api.github.com/repos/pydata/xarray/issues/1457,308923548,MDEyOklzc3VlQ29tbWVudDMwODkyMzU0OA==,12229877,2017-06-16T03:29:12Z,2017-06-16T03:29:12Z,CONTRIBUTOR,"I like the idea of benchmarks, but have some serious concerns. For Dask and IO-bound work in general, benchmark results will vary widely depending on the hardware and (if relevant) network properties. Results will be noncomparable between SSD and HDD, local and remote network access, and in general depend heavily on the specific IO patterns and storage/compute relationship of the computer. This isn't a reason not to benchmark though, just a call for very cautious interpretation - it's clearly useful to catch some of the subtle-but-pathological performance problems that have cropped up. In short, I think benchmarks should have a very clear warnings section in the documentation, and no decision should be taken to change code without benchmarking on a variety of computers (SSD/HDD, PC/cluster, local/remote data...). Also, JSON cannot include comments, and there are a number of entries that you need to update, but that's a passing concern.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,236347050 https://github.com/pydata/xarray/issues/1440#issuecomment-307074048,https://api.github.com/repos/pydata/xarray/issues/1440,307074048,MDEyOklzc3VlQ29tbWVudDMwNzA3NDA0OA==,12229877,2017-06-08T11:16:30Z,2017-06-08T11:16:30Z,CONTRIBUTOR,"🎉 My view is actually that anyone who can beat the default heuristic should just specify their chunks - you'd already need a good sense for the data and your computation (and the heuristic!). 
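To be explicit, by ""specify their chunks"" I just mean the existing kwarg - the file name and sizes below are made up:

```python
import xarray as xr

# explicit per-dimension chunk sizes, which would always override any heuristic
ds = xr.open_dataset('example.nc', chunks={'time': 100, 'y': 1000, 'x': 1000})
```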
IMO, the few cases where tuning is desirable - but manual chunks are impractical - don't justify adding yet another kwarg to the fairly busy interfaces.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,233350060 https://github.com/pydata/xarray/issues/1440#issuecomment-307002325,https://api.github.com/repos/pydata/xarray/issues/1440,307002325,MDEyOklzc3VlQ29tbWVudDMwNzAwMjMyNQ==,12229877,2017-06-08T05:28:04Z,2017-06-08T05:28:04Z,CONTRIBUTOR,"I love a real-world example 😄 This sounds pretty similar to how I'm thinking of doing it, with a few caveats - mostly that `cate` assumes that the data is 3D, has lat and lon, a single time step, and spatial dimensions wholly divisible by some small N. Obviously this is fine for CCI data, but not generally true of things Xarray might open. Taking a step back for a moment, chunks are great for avoiding out-of-memory errors, faster processing of reorderable operations, and efficient indexing. The overhead is not great when data is small or chunks are small; it's bad when a single on-disk chunk is on multiple dask chunks, and very bad when a dask chunk includes several files. (Of course all of these are generalisations with pathological cases, but IMO good enough to build some heuristics on.) With that in mind, here's how I'd decide whether to use the heuristic: - If `chunks` is a dict, never use the heuristic (always use explicit user chunks) - If `chunks` is a *hint*, eg a set or list as discussed above, or a later proposal, always use heuristic mode - guided by the hint, of course. Files which may otherwise default to non-heuristic or non-chunking mode (eg in mfdataset) could use eg the empty set to activate the heuristics without hints. - If `chunks` is None, and the uncompressed data is above a size threshold (eg 500MB, 1GB), use chunks given by the heuristic. Having decided to use a heuristic, we know the array shape and dimensions, the chunk shape if any, and the hint if any: - Start by selecting a maximum nbytes for the chunk, eg 250 MB - If the total array nbytes is <= max_nbytes, use a single chunk for the whole thing; return - If the array is stored as `(a, b, ...)` chunks on disk, our dask chunks must be `(m.a, n.b, ...)`, ie each dimension has some independent integer multiple. - Loop over dimensions *not* to chunk (per above, either those not in the set, or those in a string), adding one to the respective multiplier. Alternatively, if the file is now five or fewer dask chunks across, simply divide the on-disk chunks into 4, 3, 2, or 1 dask chunks (avoiding a dimension of 1.1 chunks, etc) - For each step, if this would make the chunk larger than max_nbytes, return. - Repeat the loop for remaining dimensions. - If the array is *not* chunked on disk, increase a divisor for each dimension to chunk until the chunk size is <= max_nbytes and return. It's probably a good idea to constrain this further, so that the ratio of chunk edge length along dimensions should not exceed the greater of 100:1 or four times the ratio of chunks on disk (I don't have universal profiling to back this up, but it's always worked well for me). This will mitigate the potentially-very-large effects of dimension order, especially in unchunked files or large chunks. 
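In code, a deliberately simplified sketch of the multiplier idea - placeholder names throughout, the 250MB constant from above, the hint reduced to ""grow these dimensions first"", and the evenness and edge-ratio constraints omitted:

```python
import numpy as np

MAX_NBYTES = 250 * 2**20  # ~250MB target chunk size, per above

def guess_chunks(shape, itemsize, disk_chunks=None, grow_first=()):
    # The whole array fits in one chunk - don't split at all.
    if np.prod(shape) * itemsize <= MAX_NBYTES:
        return tuple(shape)
    # Dask chunks must be integer multiples of the on-disk chunks.
    base = tuple(disk_chunks) if disk_chunks else (1,) * len(shape)
    chunks = list(base)
    # Grow hinted dimensions first, then the rest, one on-disk-chunk
    # step at a time, stopping before a chunk would exceed MAX_NBYTES.
    order = list(grow_first) + [i for i in range(len(shape)) if i not in grow_first]
    for i in order:
        while chunks[i] < shape[i]:
            trial = min(chunks[i] + base[i], shape[i])
            other_dims = np.prod(chunks) // chunks[i]
            if other_dims * trial * itemsize > MAX_NBYTES:
                break
            chunks[i] = trial
    return tuple(chunks)
```

Obviously the real thing needs the evenness and edge-ratio constraints bolted on, but something like this would be enough to start benchmarking against.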
For datasets (as opposed to arrays), I'd calculate chunks once for the largest dtype and just reuse that shape.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,233350060 https://github.com/pydata/xarray/issues/1440#issuecomment-306688091,https://api.github.com/repos/pydata/xarray/issues/1440,306688091,MDEyOklzc3VlQ29tbWVudDMwNjY4ODA5MQ==,12229877,2017-06-07T05:09:06Z,2017-06-07T05:09:06Z,CONTRIBUTOR,"> I'd certainly support a warning when dask chunks do not align with the on-disk chunks. This sounds like a very good idea to me 👍 > I think its unavoidable that users understand how their data will be processed (e.g., whether operations will be mapped over time or space). But maybe some sort of heuristics (if not a fully automated solution) are possible. I think that depends on the size of the data - a very common workflow in our group is to open some national-scale collection, select a small (MB to low GB) section, and proceed with that. At this scale we only use chunks because many of the input files are larger than memory, and shape is basically irrelevant - chunks avoid loading anything until after selecting the subset (I think this is related to #1396). It's certainly *good* to know the main processing dimensions though, and user-guided chunk selection heuristics could take us a long way - I actually think a dimension hint and good heuristics are likely to perform better than most users (who are not experts and have not profiled their performance). The set notation is also very elegant, but I wonder about the interpretation. With `chunks=`, I specify how to break up the data - and any omitted dimensions are not chunked. For the hint, I'd expect to express which dimension(s) to *keep* - ie `{'lat', 'lon'}` should indicate that my analysis is mostly spatial, rather than mostly not. Maybe we could use a string (eg `time` for timeseries or `lat lon` for spatial) instead of a set to specify large chunk dimensions?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,233350060 https://github.com/pydata/xarray/pull/1260#issuecomment-303005011,https://api.github.com/repos/pydata/xarray/issues/1260,303005011,MDEyOklzc3VlQ29tbWVudDMwMzAwNTAxMQ==,12229877,2017-05-22T05:49:37Z,2017-05-22T05:49:37Z,CONTRIBUTOR,"I would also favour doing nothing (ie 3), because most users will already have some solution. It's also easier to change later if we don't do anything now - no need to think at all about backwards compatibility, and the design can be guided by how people are using the existing parts.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,206905158 https://github.com/pydata/xarray/pull/1322#issuecomment-290390365,https://api.github.com/repos/pydata/xarray/issues/1322,290390365,MDEyOklzc3VlQ29tbWVudDI5MDM5MDM2NQ==,12229877,2017-03-30T12:03:28Z,2017-03-30T12:03:28Z,CONTRIBUTOR,"@shoyer - my preference is to process and truncate all strings, including object reprs. 
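Roughly this sort of behaviour, say - a hypothetical sketch, not the actual helper in this PR, plus the kind of Hypothesis property I'd check it with:

```python
from hypothesis import given, strategies as st

def summarize(obj, maxlen=500):
    # Hypothetical stand-in: take the repr, escape tabs and newlines,
    # then truncate to a fixed width with an ellipsis.
    s = obj if isinstance(obj, str) else repr(obj)
    s = s.replace('\t', '\\t').replace('\n', '\\n')
    return s if len(s) <= maxlen else s[:maxlen - 3] + '...'

@given(st.text(), st.integers(min_value=10, max_value=1000))
def test_summary_fits_on_one_line(s, maxlen):
    out = summarize(s, maxlen)
    assert len(out) <= maxlen                   # never exceeds the width budget
    assert '\t' not in out and '\n' not in out  # always a single line
```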
If you want to overrule that feel free; otherwise I think it's ready to merge.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216611104 https://github.com/pydata/xarray/pull/1322#issuecomment-289344566,https://api.github.com/repos/pydata/xarray/issues/1322,289344566,MDEyOklzc3VlQ29tbWVudDI4OTM0NDU2Ng==,12229877,2017-03-27T03:18:29Z,2017-03-27T03:18:29Z,CONTRIBUTOR,Ping @shoyer ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216611104 https://github.com/pydata/xarray/pull/1322#issuecomment-289185538,https://api.github.com/repos/pydata/xarray/issues/1322,289185538,MDEyOklzc3VlQ29tbWVudDI4OTE4NTUzOA==,12229877,2017-03-25T03:30:38Z,2017-03-25T03:30:38Z,CONTRIBUTOR,"If the logic is ""clever"", time to ditch it. I've rebased to a clearer version which simply handles indentation and replaces tabs and newlines with their backslash representations. What properties of this would you want to test?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216611104 https://github.com/pydata/xarray/issues/1319#issuecomment-288898100,https://api.github.com/repos/pydata/xarray/issues/1319,288898100,MDEyOklzc3VlQ29tbWVudDI4ODg5ODEwMA==,12229877,2017-03-24T00:15:06Z,2017-03-24T00:15:06Z,CONTRIBUTOR,"Sure, I'd be happy to. The above example will look much nicer, especially in wrapping environments: ``` Dimensions: (time: 246, x: 4000, y: 4000) Coordinates: * y (y) float64 -3.9e+06 -3.9e+06 -3.9e+06 -3.9e+06 -3.9e+06 ... * x (x) float64 1.5e+06 1.5e+06 1.5e+06 1.5e+06 1.5e+06 1.5e+06 ... * time (time) datetime64[ns] 1999-07-16T23:49:39 1999-07-25T23:43:07 ... Data variables: crs int32 ... blue (time, y, x) float64 ... green (time, y, x) float64 ... red (time, y, x) float64 ... nir (time, y, x) float64 ... swir1 (time, y, x) float64 ... swir2 (time, y, x) float64 ... Attributes: date_created: 2017-03-07T11:57:26.511217 Conventions: CF-1.6, ACDD-1.3 history: 2017-03-07T11:57:26.511307+11:00 adh547 datacube... geospatial_bounds: POLYGON ((148.49626113888138 -34.828378308133452... geospatial_bounds_crs: EPSG:4326 geospatial_lat_min: -35.7203183267 geospatial_lat_max: -34.7089119078 geospatial_lat_units: degrees_north geospatial_lon_min: 148.496261139 geospatial_lon_max: 149.734176111 geospatial_lon_units: degrees_east comment: - Ground Control Points (GCP): new GCP chips ... product_suite: Surface Reflectance NBAR+T 25m publisher_email: earth.observation@ga.gov.au keywords_vocabulary: GCMD product_version: 2 cdm_data_type: Grid references: - Berk, A., Anderson, G.P., Acharya, P.K., Ho... platform: LANDSAT-7 keywords: AU/GA,NASA/GSFC/SED/ESD/LANDSAT,REFLECTANCE,ETM+... publisher_name: Section Leader, Operations Section, NEMO, Geosci... institution: Commonwealth of Australia (Geoscience Australia) acknowledgment: Landsat data is provided by the United States Ge... license: CC BY Attribution 4.0 International License title: Surface Reflectance NBAR+T 25 v2 summary: Surface Reflectance (SR) is a suite of Earth Obs... instrument: ETM source: LANDSAT 7 ETM+ surface observation publisher_url: http://www.ga.gov.au ```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216329175