html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/pull/4461#issuecomment-741889071,https://api.github.com/repos/pydata/xarray/issues/4461,741889071,MDEyOklzc3VlQ29tbWVudDc0MTg4OTA3MQ==,1872600,2020-12-09T16:31:37Z,2021-01-19T14:46:49Z,NONE,"I'm really looking forward to getting this merged so I can open the National Water Model Zarr I created last week thusly: ```python ds = xr.open_dataset(s3://noaa-nwm-retro-v2.0-zarr-pds', engine='zarr', backend_kwargs={'consolidated':True, ""storage_options"": {'anon':True}}) ``` @martindurant tells me this takes only **3 s** with the new async capability! That would be pretty awesome, because now it takes **1min 15s** to open this dataset!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-762428604,https://api.github.com/repos/pydata/xarray/issues/4461,762428604,MDEyOklzc3VlQ29tbWVudDc2MjQyODYwNA==,6042212,2021-01-18T19:15:25Z,2021-01-18T19:15:25Z,CONTRIBUTOR,"All interested parties, please see new attempt at https://github.com/pydata/xarray/pull/4823","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-748554375,https://api.github.com/repos/pydata/xarray/issues/4461,748554375,MDEyOklzc3VlQ29tbWVudDc0ODU1NDM3NQ==,7799184,2020-12-20T02:35:40Z,2020-12-20T09:10:27Z,CONTRIBUTOR,"> @rabernat , awesome! I was stunned by the difference -- I guess the async loading of coordinate data is the big win, right? @rsignell-usgs one other thing that can largely speed up loading of metadata / coordinates is ensuring coordinate variables are stored in one single chunk. For this particular dataset, chunk size for `time` coordinate is 672 yielding 339 chunks, which can take a while to load from remote bucket stores. If you rewrite `time` coordinate setting `dset.time.encoding[""chunks""] = (227904,)` you should see a very large performance increase. One thing we have been doing for the cases of zarr archives that are appended in time, is defining time coordinate with a very large chunk size (e.g., `dset.time.encoding[""chunks""] = (10000000,)`) when we first write the store. This ensures time coordinate will still fit in one single chunk after appending over time dimension, and does not affect chunking of the actual data variables. One thing we have been having performance issues with is with loading coordinates / metadata from zarr archives that have too many chunks (millions), even when metadata is consolidated and coordinates are in one single chunk. There is an [open issue](https://github.com/dask/dask/issues/6363) in dask about this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-743287803,https://api.github.com/repos/pydata/xarray/issues/4461,743287803,MDEyOklzc3VlQ29tbWVudDc0MzI4NzgwMw==,6042212,2020-12-11T16:19:26Z,2020-12-11T16:19:26Z,CONTRIBUTOR,"> Martin has gained by implementing this PR is transferrable I'm not sure, it's been a while now...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-741949159,https://api.github.com/repos/pydata/xarray/issues/4461,741949159,MDEyOklzc3VlQ29tbWVudDc0MTk0OTE1OQ==,1197350,2020-12-09T18:02:03Z,2020-12-09T18:02:11Z,MEMBER,"I think @shoyer has laid out the options in a very clear way. I weakly favor option 2, as I think it preferable in terms of software architecture and our broader roadmap for Xarray. However, I am cognizant of the significant effort that @martindurant has put into this, and I don't want his effort to go to waste. Some mitigating factors are: - The example I gave above (https://github.com/pydata/xarray/pull/4461#issuecomment-741939277) shows that one high-impact feature that users want (async capabilities in Zarr) already works, albiet with a different syntax. So this PR is more about convenience. - Presumably the knowledge about Xarray that Martin has gained by implementing this PR is transferrable to a different context, and so we would not be starting from scratch if we went with 2.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-741942375,https://api.github.com/repos/pydata/xarray/issues/4461,741942375,MDEyOklzc3VlQ29tbWVudDc0MTk0MjM3NQ==,1872600,2020-12-09T17:50:04Z,2020-12-09T17:50:04Z,NONE,"@rabernat , awesome! I was stunned by the difference -- I guess the async loading of coordinate data is the big win, right?","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 1, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-741939277,https://api.github.com/repos/pydata/xarray/issues/4461,741939277,MDEyOklzc3VlQ29tbWVudDc0MTkzOTI3Nw==,1197350,2020-12-09T17:44:55Z,2020-12-09T17:44:55Z,MEMBER,"@rsignell-usgs: note that your example works without this PR (but with the just released zarr 2.6.1) as follows ```python mapper = fsspec.get_mapper('s3://noaa-nwm-retro-v2.0-zarr-pds') ds = xr.open_zarr(mapper, consolidated=True) ``` Took 4s on my laptop (outside of AWS).","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-741928992,https://api.github.com/repos/pydata/xarray/issues/4461,741928992,MDEyOklzc3VlQ29tbWVudDc0MTkyODk5Mg==,1217238,2020-12-09T17:37:01Z,2020-12-09T17:37:01Z,MEMBER,"We are excited about adding this feature! We love fsspec and think this would be very useful for xarray's users. In the long term, we would love to support fsspec for all the file formats that can handle file objects, e.g., including `engine='h5netcdf'` and `engine='scipy'`. The concern right now is that this adds special case logic for zarr in open_dataset(), which @alexamici and @aurghs are presently (simultaneously!) trying to remove as part of paying down technical debt in the ongoing backends refactor. I see two potential paths forwards: 1. Merge this as is. It has good test coverage and porting should (hopefully!) be relatively straightforward to port. 2. Insert this into the new backend API code instead, and require using the v2 backend API instead for this feature. @alexamici could you please take a look and weigh in here? In particular, it would be helpful if you could point to where this would belong in the new refactor. This is also a good motivation for deleting the ""v1"" API code as soon as possible in favor of the ""v2"" code -- nothing is worse than needing to implement a new feature twice!","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-741881966,https://api.github.com/repos/pydata/xarray/issues/4461,741881966,MDEyOklzc3VlQ29tbWVudDc0MTg4MTk2Ng==,6042212,2020-12-09T16:20:33Z,2020-12-09T16:20:33Z,CONTRIBUTOR,ping again,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-738214364,https://api.github.com/repos/pydata/xarray/issues/4461,738214364,MDEyOklzc3VlQ29tbWVudDczODIxNDM2NA==,2448579,2020-12-03T18:48:25Z,2020-12-03T18:48:25Z,MEMBER,"> We need to resolve this discussion in order to decide what to do about this PR. Any more thoughts from other devs. ping @pydata/xarray ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-712144995,https://api.github.com/repos/pydata/xarray/issues/4461,712144995,MDEyOklzc3VlQ29tbWVudDcxMjE0NDk5NQ==,24736507,2020-10-19T13:09:32Z,2020-11-30T14:39:18Z,NONE,"Hello @martindurant! Thanks for updating this PR. We checked the lines you've touched for [PEPĀ 8](https://www.python.org/dev/peps/pep-0008) issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers: ##### Comment last updated at 2020-11-30 14:39:18 UTC","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-735814666,https://api.github.com/repos/pydata/xarray/issues/4461,735814666,MDEyOklzc3VlQ29tbWVudDczNTgxNDY2Ng==,1197350,2020-11-30T14:21:17Z,2020-11-30T14:21:17Z,MEMBER,We let this go stale again. I just resolve the conflicts.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-721365827,https://api.github.com/repos/pydata/xarray/issues/4461,721365827,MDEyOklzc3VlQ29tbWVudDcyMTM2NTgyNw==,6042212,2020-11-03T20:46:57Z,2020-11-03T20:46:57Z,CONTRIBUTOR,One completely unrelated failure (test_polyfit_warnings). Can I please get a final say here (@max-sixty @alexamici ?),"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-712218057,https://api.github.com/repos/pydata/xarray/issues/4461,712218057,MDEyOklzc3VlQ29tbWVudDcxMjIxODA1Nw==,14808389,2020-10-19T14:48:23Z,2020-10-19T14:48:23Z,MEMBER,"> (failures look like something in pandas dev) yep, that's #4516","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-712194464,https://api.github.com/repos/pydata/xarray/issues/4461,712194464,MDEyOklzc3VlQ29tbWVudDcxMjE5NDQ2NA==,6042212,2020-10-19T14:22:23Z,2020-10-19T14:22:23Z,CONTRIBUTOR,(failures look like something in pandas dev),"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-699179096,https://api.github.com/repos/pydata/xarray/issues/4461,699179096,MDEyOklzc3VlQ29tbWVudDY5OTE3OTA5Ng==,2448579,2020-09-25T22:07:47Z,2020-09-25T22:26:45Z,MEMBER,"We'll have to maintain backward compatibility with older zarr versions for a bit so you'll have to skip the tests appropriately using a version check EDIT: I didn't realize there are no tests in this PR yet. We definitely want current CI tests passing with older zarr versions.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-699174009,https://api.github.com/repos/pydata/xarray/issues/4461,699174009,MDEyOklzc3VlQ29tbWVudDY5OTE3NDAwOQ==,14808389,2020-09-25T21:54:06Z,2020-09-25T21:54:06Z,MEMBER,"> is it reasonable to install from master you might want to take a look at the `upstream-dev` CI which installs `zarr` from github (and is currently passing)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212 https://github.com/pydata/xarray/pull/4461#issuecomment-699155033,https://api.github.com/repos/pydata/xarray/issues/4461,699155033,MDEyOklzc3VlQ29tbWVudDY5OTE1NTAzMw==,6042212,2020-09-25T21:05:42Z,2020-09-25T21:05:42Z,CONTRIBUTOR,"Question: to eventually get tests to pass, will need changes only just now going into zarr. Those may be released some time soon, but in the meantime is it reasonable to install from master?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,709187212