pydata/xarray issue #2004: Slicing DataArray can take longer than not slicing

Opened 2018-03-21 by a contributor · 14 comments · closed 2020-12-03

Code Sample, a copy-pastable example if possible

```ipython
In [1]: import xarray as xr

In [2]: radmax_ds = xr.open_dataset('tests/radmax_baseline.nc')

In [3]: radmax_ds
Out[3]:
<xarray.Dataset>
Dimensions:   (latitude: 5650, longitude: 12050, time: 3)
Coordinates:
  * latitude   (latitude) float32 13.505002 13.515002 13.525002 13.535002 ...
  * longitude  (longitude) float32 -170.495 -170.485 -170.475 -170.465 ...
  * time       (time) datetime64[ns] 2017-03-07T01:00:00 2017-03-07T02:00:00 ...
Data variables:
    RadarMax   (time, latitude, longitude) float32 ...
Attributes:
    start_date:   03/07/2017 01:00
    end_date:     03/07/2017 01:55
    elapsed:      60
    data_rights:  Respond (TM) Confidential Data. (c) Insurance Services Offi...

In [4]: %timeit foo = radmax_ds.RadarMax.load()
The slowest run took 35509.20 times longer than the fastest. This could mean
that an intermediate result is being cached.
1 loop, best of 3: 216 µs per loop

In [5]: 216 * 35509.2
Out[5]: 7669987.199999999
```

So, without any slicing, it takes approximately 7.5 seconds for me to load this complete file into memory. Now let's see what happens when I slice the DataArray and load it:

```ipython
In [1]: import xarray as xr

In [2]: radmax_ds = xr.open_dataset('tests/radmax_baseline.nc')

In [3]: %timeit foo = radmax_ds.RadarMax[::1, ::1, ::1].load()
1 loop, best of 3: 7.56 s per loop

In [4]: radmax_ds.close()

In [5]: radmax_ds = xr.open_dataset('tests/radmax_baseline.nc')

In [6]: %timeit foo = radmax_ds.RadarMax[::1, ::10, ::10].load()
```

I killed this session after 17 minutes. `top` did not report any unusual I/O wait, and memory usage was not out of control. I am using v0.10.2 of xarray.

My suspicion is that something in the indexing system is causing xarray to read the data in a bad order. Notice that if I slice all of the data (`[::1, ::1, ::1]`), the timing works out the same as reading it all in straight-up. Not shown here is a run where, if I slice every 100th latitude and longitude, the timing is shorter again, but still not as fast as reading it all in at once.
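One plausible mechanism (my speculation, not something confirmed in this issue): a stride-1 slice leaves the selection contiguous, while a strided slice produces a non-contiguous selection, which a file backend may have to service with many small reads instead of one big sequential one. A minimal NumPy sketch of the contiguity difference:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

full = a[::1, ::1]  # stride-1 slice: still a C-contiguous view, cheap
sub = a[::2, ::2]   # strided slice: a non-contiguous view

# A backend reading `full` can do one sequential pass; reading the
# equivalent of `sub` from disk may require scattered small reads.
print(full.flags["C_CONTIGUOUS"])  # True
print(sub.flags["C_CONTIGUOUS"])   # False
```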

Let me know if you want a copy of the file. It is a compressed netcdf4, taking up only 1.7MB.

I wonder if this is related to #1985?
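A workaround sketch (not from this issue thread; the data below is synthetic and the names are only meant to mirror the report): load the whole variable into memory first, then apply the strided slice to the resulting NumPy-backed array, so the file is read in a single contiguous pass rather than through lazy strided indexing:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the RadarMax variable (smaller dims for speed).
radar = xr.DataArray(
    np.zeros((3, 565, 1205), dtype="float32"),
    dims=("time", "latitude", "longitude"),
    name="RadarMax",
)

# Load fully first, then slice the in-memory array.
subset = radar.load()[::1, ::10, ::10]
print(subset.shape)  # (3, 57, 121)
```

This trades peak memory (the full variable is materialized) for avoiding the pathological strided-read pattern.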

