html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2004#issuecomment-738189796,https://api.github.com/repos/pydata/xarray/issues/2004,738189796,MDEyOklzc3VlQ29tbWVudDczODE4OTc5Ng==,291576,2020-12-03T18:15:35Z,2020-12-03T18:15:35Z,CONTRIBUTOR,"I think so, at least in terms of my original problem.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-738183069,https://api.github.com/repos/pydata/xarray/issues/2004,738183069,MDEyOklzc3VlQ29tbWVudDczODE4MzA2OQ==,2448579,2020-12-03T18:03:29Z,2020-12-03T18:03:29Z,MEMBER,Can this be closed?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-460881018,https://api.github.com/repos/pydata/xarray/issues/2004,460881018,MDEyOklzc3VlQ29tbWVudDQ2MDg4MTAxOA==,1217238,2019-02-06T02:32:46Z,2019-02-06T02:32:46Z,MEMBER,The performance difference here does indeed seem to have been fixed with netCDF-C 4.6.2 (but see also https://github.com/pydata/xarray/issues/2747),"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-396317995,https://api.github.com/repos/pydata/xarray/issues/2004,396317995,MDEyOklzc3VlQ29tbWVudDM5NjMxNzk5NQ==,579593,2018-06-11T17:16:43Z,2018-06-11T17:16:43Z,NONE,"netcdf-c master now includes the same mechanism for strided access of HDF5 files as h5py. If netcdf4-python is linked against netcdf-c >= 4.6.2, performance for strided access should be greatly improved.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375102231,https://api.github.com/repos/pydata/xarray/issues/2004,375102231,MDEyOklzc3VlQ29tbWVudDM3NTEwMjIzMQ==,579593,2018-03-21T21:29:34Z,2018-03-21T21:29:34Z,NONE,Confirmed that the slow performance of netcdf4-python on strided access is due to the way that netcdf-c calls HDF5. There's now an issue on the netcdf-c issue tracker to implement fast strided access for HDF5 files (https://github.com/Unidata/netcdf-c/issues/908).,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375067743,https://api.github.com/repos/pydata/xarray/issues/2004,375067743,MDEyOklzc3VlQ29tbWVudDM3NTA2Nzc0Mw==,1217238,2018-03-21T19:29:51Z,2018-03-21T19:29:51Z,MEMBER,"H5py is doing all the hard work for this in h5netcdf.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375056363,https://api.github.com/repos/pydata/xarray/issues/2004,375056363,MDEyOklzc3VlQ29tbWVudDM3NTA1NjM2Mw==,291576,2018-03-21T18:50:58Z,2018-03-21T18:50:58Z,CONTRIBUTOR,"Ah, nevermind, I see that our examples only had one greater-than-one stride","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375056077,https://api.github.com/repos/pydata/xarray/issues/2004,375056077,MDEyOklzc3VlQ29tbWVudDM3NTA1NjA3Nw==,291576,2018-03-21T18:50:01Z,2018-03-21T18:50:01Z,CONTRIBUTOR,"Dunno. I can't seem to get that engine working on my system.
Reading through that thread, I wonder if the optimization they added only applies if there is only one stride greater than one?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375054212,https://api.github.com/repos/pydata/xarray/issues/2004,375054212,MDEyOklzc3VlQ29tbWVudDM3NTA1NDIxMg==,579593,2018-03-21T18:44:14Z,2018-03-21T18:44:14Z,NONE,"netcdf4-python does `reopened[::1, ::10]` by making a bunch of calls to the C lib routine `nc_get_vara`. As pointed out in Unidata/netcdf4-python#680, this is faster than a single call to `nc_get_vars` (which does strided access, but is *very* slow). Note that `reopened[::1, ::1][:,::10]` is very fast, but you have to have enough memory to hold the entire array. I wonder how h5netcdf is reading the data - is it pulling the entire array into memory and then selecting a subset?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375036951,https://api.github.com/repos/pydata/xarray/issues/2004,375036951,MDEyOklzc3VlQ29tbWVudDM3NTAzNjk1MQ==,291576,2018-03-21T17:51:54Z,2018-03-21T17:51:54Z,CONTRIBUTOR,"This might be relevant: https://github.com/Unidata/netcdf4-python/issues/680
Still reading through the thread.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375034973,https://api.github.com/repos/pydata/xarray/issues/2004,375034973,MDEyOklzc3VlQ29tbWVudDM3NTAzNDk3Mw==,291576,2018-03-21T17:46:09Z,2018-03-21T17:46:09Z,CONTRIBUTOR,My bet is probably netCDF4-python. Don't want to write up the C code though to confirm it. Sigh... this isn't going to be a fun one to track down. Shall I open a bug report over there?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375020977,https://api.github.com/repos/pydata/xarray/issues/2004,375020977,MDEyOklzc3VlQ29tbWVudDM3NTAyMDk3Nw==,1217238,2018-03-21T17:08:15Z,2018-03-21T17:08:15Z,MEMBER,"The culprit appears to be netCDF4-python and/or netCDF-C:
```python
import netCDF4
# Read the same variable directly with netCDF4-python, bypassing xarray
f = netCDF4.Dataset('test.nc')
%time f['__xarray_dataarray_variable__'][:, ::10]
# CPU times: user 313 ms, sys: 1.23 s, total: 1.54 s
```
When I try doing the same operation with h5netcdf, it runs very quickly:
```python
reopened = xr.open_dataarray('test.nc', engine='h5netcdf')
%time reopened[::1, ::10].compute()
# CPU times: user 6.11 ms, sys: 3.63 ms, total: 9.74 ms
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375014480,https://api.github.com/repos/pydata/xarray/issues/2004,375014480,MDEyOklzc3VlQ29tbWVudDM3NTAxNDQ4MA==,291576,2018-03-21T16:50:59Z,2018-03-21T16:56:13Z,CONTRIBUTOR,"Yeah, good example. Eliminates a lot of possible variables such as problems with netcdf4 compression and such. Probably should check whether it happens in v0.10.0 to see if the changes to the indexing system caused this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375010010,https://api.github.com/repos/pydata/xarray/issues/2004,375010010,MDEyOklzc3VlQ29tbWVudDM3NTAxMDAxMA==,1217238,2018-03-21T16:38:59Z,2018-03-21T16:38:59Z,MEMBER,"Here's a simpler case that gets at the essence of the problem:
```python
import xarray as xr
import numpy as np
source = xr.DataArray(np.zeros((100, 12000)), dims=['time', 'x'])
source.to_netcdf('test.nc', format='NETCDF4')
reopened = xr.open_dataarray('test.nc')
%time reopened[::1, ::1].compute()
# CPU times: user 1.35 ms, sys: 6.77 ms, total: 8.12 ms
%time reopened[::1, ::10].compute()
# CPU times: user 371 ms, sys: 1.33 s, total: 1.7 s
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224