html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2004#issuecomment-738189796,https://api.github.com/repos/pydata/xarray/issues/2004,738189796,MDEyOklzc3VlQ29tbWVudDczODE4OTc5Ng==,291576,2020-12-03T18:15:35Z,2020-12-03T18:15:35Z,CONTRIBUTOR,"I think so, at least in terms of my original problem.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-738183069,https://api.github.com/repos/pydata/xarray/issues/2004,738183069,MDEyOklzc3VlQ29tbWVudDczODE4MzA2OQ==,2448579,2020-12-03T18:03:29Z,2020-12-03T18:03:29Z,MEMBER,can this be closed?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-460881018,https://api.github.com/repos/pydata/xarray/issues/2004,460881018,MDEyOklzc3VlQ29tbWVudDQ2MDg4MTAxOA==,1217238,2019-02-06T02:32:46Z,2019-02-06T02:32:46Z,MEMBER,The performance difference here does indeed seem to have been fixed with netCDF-C 4.6.2 (but see also https://github.com/pydata/xarray/issues/2747),"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-396317995,https://api.github.com/repos/pydata/xarray/issues/2004,396317995,MDEyOklzc3VlQ29tbWVudDM5NjMxNzk5NQ==,579593,2018-06-11T17:16:43Z,2018-06-11T17:16:43Z,NONE,"netcdf-c master now includes the same mechanism for strided access of HDF5 files as h5py.
If netcdf4-python is linked against netcdf-c >= 4.6.2, performance for strided access should be greatly improved.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375102231,https://api.github.com/repos/pydata/xarray/issues/2004,375102231,MDEyOklzc3VlQ29tbWVudDM3NTEwMjIzMQ==,579593,2018-03-21T21:29:34Z,2018-03-21T21:29:34Z,NONE,Confirmed that the slow performance of netcdf4-python on strided access is due to the way that netcdf-c calls HDF5. There's now an issue on the netcdf-c issue tracker to implement fast strided access for HDF5 files (https://github.com/Unidata/netcdf-c/issues/908).,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375067743,https://api.github.com/repos/pydata/xarray/issues/2004,375067743,MDEyOklzc3VlQ29tbWVudDM3NTA2Nzc0Mw==,1217238,2018-03-21T19:29:51Z,2018-03-21T19:29:51Z,MEMBER,"H5py is doing all the hard work for this in h5netcdf.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375056363,https://api.github.com/repos/pydata/xarray/issues/2004,375056363,MDEyOklzc3VlQ29tbWVudDM3NTA1NjM2Mw==,291576,2018-03-21T18:50:58Z,2018-03-21T18:50:58Z,CONTRIBUTOR,"Ah, nevermind, I see that our examples only had one greater-than-one stride","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375056077,https://api.github.com/repos/pydata/xarray/issues/2004,375056077,MDEyOklzc3VlQ29tbWVudDM3NTA1NjA3Nw==,291576,2018-03-21T18:50:01Z,2018-03-21T18:50:01Z,CONTRIBUTOR,"Dunno. I can't seem to get that engine working on my system. Reading through that thread, I wonder if the optimization they added only applies if there is only one stride greater than one?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375054212,https://api.github.com/repos/pydata/xarray/issues/2004,375054212,MDEyOklzc3VlQ29tbWVudDM3NTA1NDIxMg==,579593,2018-03-21T18:44:14Z,2018-03-21T18:44:14Z,NONE,"netcdf4-python does `reopened[::1, ::10]` by making a bunch of calls to the C lib routine `nc_get_vara`. As pointed out in Unidata/netcdf4-python#680, this is faster than a single call to `nc_get_vars` (which does strided access, but is *very* slow). Note that `reopened[::1, ::1][:,::10]` is very fast, but you have to have enough memory to hold the entire array.
I wonder how h5netcdf is reading the data - is it pulling the entire array into memory and then selecting a subset?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375036951,https://api.github.com/repos/pydata/xarray/issues/2004,375036951,MDEyOklzc3VlQ29tbWVudDM3NTAzNjk1MQ==,291576,2018-03-21T17:51:54Z,2018-03-21T17:51:54Z,CONTRIBUTOR,"This might be relevant: https://github.com/Unidata/netcdf4-python/issues/680
Still reading through the thread.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375034973,https://api.github.com/repos/pydata/xarray/issues/2004,375034973,MDEyOklzc3VlQ29tbWVudDM3NTAzNDk3Mw==,291576,2018-03-21T17:46:09Z,2018-03-21T17:46:09Z,CONTRIBUTOR,My bet is probably netCDF4-python. Don't want to write up the C code though to confirm it. Sigh... this isn't going to be a fun one to track down. Shall I open a bug report over there?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375020977,https://api.github.com/repos/pydata/xarray/issues/2004,375020977,MDEyOklzc3VlQ29tbWVudDM3NTAyMDk3Nw==,1217238,2018-03-21T17:08:15Z,2018-03-21T17:08:15Z,MEMBER,"The culprit appears to be netCDF4-python and/or netCDF-C:
```python
import netCDF4
# Read every 10th column directly through netCDF4-python
f = netCDF4.Dataset('test.nc')
%time f['__xarray_dataarray_variable__'][:, ::10]
# CPU times: user 313 ms, sys: 1.23 s, total: 1.54 s
```
When I try doing the same operation with h5netcdf, it runs very quickly:
```python
import xarray as xr
# Same strided read, but through the h5netcdf engine
reopened = xr.open_dataarray('test.nc', engine='h5netcdf')
%time reopened[::1, ::10].compute()
# CPU times: user 6.11 ms, sys: 3.63 ms, total: 9.74 ms
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375014480,https://api.github.com/repos/pydata/xarray/issues/2004,375014480,MDEyOklzc3VlQ29tbWVudDM3NTAxNDQ4MA==,291576,2018-03-21T16:50:59Z,2018-03-21T16:56:13Z,CONTRIBUTOR,"Yeah, good example. Eliminates a lot of possible variables such as problems with netcdf4 compression and such.
Probably should see if it happens in v0.10.0 to see if the changes to the indexing system caused this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375010010,https://api.github.com/repos/pydata/xarray/issues/2004,375010010,MDEyOklzc3VlQ29tbWVudDM3NTAxMDAxMA==,1217238,2018-03-21T16:38:59Z,2018-03-21T16:38:59Z,MEMBER,"Here's a simpler case that gets at the essence of the problem:
```python
import xarray as xr
import numpy as np

source = xr.DataArray(np.zeros((100, 12000)), dims=['time', 'x'])
source.to_netcdf('test.nc', format='NETCDF4')
reopened = xr.open_dataarray('test.nc')

%time reopened[::1, ::1].compute()
# CPU times: user 1.35 ms, sys: 6.77 ms, total: 8.12 ms

%time reopened[::1, ::10].compute()
# CPU times: user 371 ms, sys: 1.33 s, total: 1.7 s
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224