issues: 307318224
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
307318224 | MDU6SXNzdWUzMDczMTgyMjQ= | 2004 | Slicing DataArray can take longer than not slicing | 291576 | closed | 0 | 14 | 2018-03-21T16:20:49Z | 2020-12-03T18:15:35Z | 2020-12-03T18:15:35Z | CONTRIBUTOR | Code Sample, a copy-pastable example if possible
```ipython
In [1]: import xarray as xr

In [2]: radmax_ds = xr.open_dataset('tests/radmax_baseline.nc')

In [3]: radmax_ds
Out[3]:
<xarray.Dataset>
Dimensions:    (latitude: 5650, longitude: 12050, time: 3)
Coordinates:
  * latitude   (latitude) float32 13.505002 13.515002 13.525002 13.535002 ...
  * longitude  (longitude) float32 -170.495 -170.485 -170.475 -170.465 ...
  * time       (time) datetime64[ns] 2017-03-07T01:00:00 2017-03-07T02:00:00 ...
Data variables:
    RadarMax   (time, latitude, longitude) float32 ...
Attributes:
    start_date:   03/07/2017 01:00
    end_date:     03/07/2017 01:55
    elapsed:      60
    data_rights:  Respond (TM) Confidential Data. (c) Insurance Services Offi...

In [4]: %timeit foo = radmax_ds.RadarMax.load()
The slowest run took 35509.20 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 3: 216 µs per loop

In [5]: 216 * 35509.2
Out[5]: 7669987.199999999
```
So, without any slicing, it takes approximately 7.5 seconds for me to load this complete file into memory. Now, let's see what happens when I slice the DataArray and load it:
```ipython
In [1]: import xarray as xr

In [2]: radmax_ds = xr.open_dataset('tests/radmax_baseline.nc')

In [3]: %timeit foo = radmax_ds.RadarMax[::1, ::1, ::1].load()
1 loop, best of 3: 7.56 s per loop

In [4]: radmax_ds.close()

In [5]: radmax_ds = xr.open_dataset('tests/radmax_baseline.nc')

In [6]: %timeit foo = radmax_ds.RadarMax[::1, ::10, ::10].load()
```
Let me know if you want a copy of the file. It is a compressed netcdf4, taking up only 1.7MB. I wonder if this is related to #1985? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2004/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
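
The first timing in the issue body is reconstructed by multiplying the best `%timeit` result by the slowest-to-fastest ratio, because every run after the first hits data that `.load()` has already cached. A minimal sketch of that arithmetic, plus a way to time a single cold call directly, might look like the following. The helper name `time_once` is hypothetical and not part of xarray or IPython; the numbers are copied from the session in the issue.

```python
import time


def time_once(fn):
    """Call fn exactly once and return (result, elapsed_seconds).

    Hypothetical helper: a single run avoids the caching effect that
    makes repeated %timeit runs of .load() misleading.
    """
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    return result, elapsed


# Figures reported by %timeit in the issue: best run in microseconds,
# and how many times slower the slowest (uncached) run was.
best_us = 216
slowest_to_fastest = 35509.2

# Estimated cold-load time in seconds.
cold_seconds = best_us * slowest_to_fastest / 1e6
print(f"estimated cold load: {cold_seconds:.2f} s")

# Assumed usage with xarray (not executed here):
#   ds = xr.open_dataset('tests/radmax_baseline.nc')
#   data, seconds = time_once(lambda: ds.RadarMax.load())
```

The estimate comes out at roughly 7.67 s, in line with the 7.56 s measured for the `[::1, ::1, ::1]` slice on a freshly opened dataset.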