html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6561#issuecomment-1116344892,https://api.github.com/repos/pydata/xarray/issues/6561,1116344892,IC_kwDOAMm_X85CihI8,8419421,2022-05-03T17:13:02Z,2022-05-03T17:13:02Z,NONE,"Thanks for the feedback and explanation. It seems the poorly constructed netCDF file is fundamentally to blame for triggering this behavior. A warning is a good idea, though.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1223031600
https://github.com/pydata/xarray/issues/4043#issuecomment-1065536538,https://api.github.com/repos/pydata/xarray/issues/4043,1065536538,IC_kwDOAMm_X84_gswa,8419421,2022-03-11T21:16:59Z,2022-03-11T21:16:59Z,NONE,"I believe I am experiencing a similar issue, although with code that I thought was smart enough to chunk the data request into smaller pieces:

```
import numpy as np
import xarray as xr
from dask.diagnostics import ProgressBar
import intake

wrf_url = ('https://rda.ucar.edu/thredds/catalog/files/g/ds612.0/'
           'PGW3D/2006/catalog.xml')
catalog_u = intake.open_thredds_merged(wrf_url, path=['*_U_2006060*'])
catalog_v = intake.open_thredds_merged(wrf_url, path=['*_V_2006060*'])

ds_u = catalog_u.to_dask()
ds_u['U'] = ds_u.U.chunk(""auto"")
ds_v = catalog_v.to_dask()
ds_v['V'] = ds_v.V.chunk(""auto"")
ds = xr.merge((ds_u, ds_v))


def unstagger(ds, var, coord, new_coord):
    var1 = ds[var].isel({coord: slice(None, -1)})
    var2 = ds[var].isel({coord: slice(1, None)})
    return ((var1 + var2) / 2).rename({coord: new_coord})


with ProgressBar():
    ds['U_unstaggered'] = unstagger(ds, 'U', 'west_east_stag', 'west_east')
    ds['V_unstaggered'] = unstagger(ds, 'V', 'south_north_stag', 'south_north')
    ds['speed'] = np.hypot(ds.U_unstaggered, ds.V_unstaggered)
    ds.speed.isel(bottom_top=10).sel(Time='2006-06-07T18:00').plot()
```

This throws an error
because, according to the RDA help folks, a request for an entire variable is made, which far exceeds their server's 500 MB request limit: ``` rda.ucar.edu/thredds/dodsC/files/g/ds612.0/PGW3D/2006/wrf3d_d01_PGW_U_20060607.nc.dods?U%5B0:1: 7%5D%5B0:1:49%5D%5B0:1:1014%5D%5B0:1:1359%5D ``` Here's the error: ``` Traceback (most recent call last): File ""/home/decker/classes/met325/rda_plot.py"", line 29, in ds.speed.isel(bottom_top=10).sel(Time='2006-06-07T18:00').plot() File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/plot/plot.py"", line 862, in __call__ return plot(self._da, **kwargs) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/plot/plot.py"", line 293, in plot darray = darray.squeeze().compute() File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/dataarray.py"", line 951, in compute return new.load(**kwargs) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/dataarray.py"", line 925, in load ds = self._to_temp_dataset().load(**kwargs) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/dataset.py"", line 862, in load evaluated_data = da.compute(*lazy_data.values(), **kwargs) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/base.py"", line 571, in compute results = schedule(dsk, keys, **kwargs) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/threaded.py"", line 79, in get results = get_async( File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/local.py"", line 507, in get_async raise_exception(exc, tb) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/local.py"", line 315, in reraise raise exc File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/local.py"", line 220, in execute_task result = _execute_task(task, data) 
File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/core.py"", line 119, in _execute_task return func(*(_execute_task(a, cache) for a in args)) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/array/core.py"", line 116, in getter c = np.asarray(c) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/indexing.py"", line 357, in __array__ return np.asarray(self.array, dtype=dtype) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/indexing.py"", line 521, in __array__ return np.asarray(self.array, dtype=dtype) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/indexing.py"", line 422, in __array__ return np.asarray(array[self.key], dtype=None) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/conventions.py"", line 62, in __getitem__ return np.asarray(self.array[key], dtype=self.dtype) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/indexing.py"", line 422, in __array__ return np.asarray(array[self.key], dtype=None) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/backends/pydap_.py"", line 39, in __getitem__ return indexing.explicit_indexing_adapter( File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/indexing.py"", line 711, in explicit_indexing_adapter result = raw_indexing_method(raw_key.tuple) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/backends/pydap_.py"", line 47, in _getitem result = robust_getitem(array, key, catch=ValueError) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/backends/common.py"", line 64, in robust_getitem return array[key] File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/pydap/model.py"", line 323, in 
__getitem__ out.data = self._get_data_index(index) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/pydap/model.py"", line 353, in _get_data_index return self._data[index] File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/pydap/handlers/dap.py"", line 170, in __getitem__ raise_for_status(r) File ""/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/pydap/net.py"", line 38, in raise_for_status raise HTTPError( webob.exc.HTTPError: 403 403 ``` I thought smaller requests would automagically happen with this code. Is it intended that a large request be made?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,614144170 https://github.com/pydata/xarray/issues/2534#issuecomment-613068333,https://api.github.com/repos/pydata/xarray/issues/2534,613068333,MDEyOklzc3VlQ29tbWVudDYxMzA2ODMzMw==,8419421,2020-04-13T19:57:08Z,2020-04-13T19:57:08Z,NONE,"Here is another example: ``` import xarray as xr ncdata = xr.open_dataset('https://thredds.ucar.edu/thredds/dodsC/nws/metar/ncdecoded/files/Surface_METAR_20200411_0000.nc') df = ncdata.to_dataframe() ``` Output is: ``` Traceback (most recent call last): File ""bug.py"", line 4, in df = ncdata.to_dataframe() File ""/home/decker/miniconda3/envs/met212/lib/python3.8/site-packages/xarray/core/dataset.py"", line 4399, in to_dataframe return self._to_dataframe(self.dims) File ""/home/decker/miniconda3/envs/met212/lib/python3.8/site-packages/xarray/core/dataset.py"", line 4385, in _to_dataframe data = [ File ""/home/decker/miniconda3/envs/met212/lib/python3.8/site-packages/xarray/core/dataset.py"", line 4386, in self._variables[k].set_dims(ordered_dims).values.reshape(-1) MemoryError: Unable to allocate array with shape (117819, 5021) and data type |S64 ``` If I'm doing the math right, xarray is trying to allocate roughly 35 GB even though this NetCDF file is only on the order of 50 
MB in size. Output of `xr.show_versions()`
```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.1 | packaged by conda-forge | (default, Jan 5 2020, 20:58:18) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.31-1-MANJARO
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.3
xarray: 0.14.1
pandas: 0.25.3
numpy: 1.17.5
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.1.2
cartopy: 0.17.0
seaborn: None
numbagg: None
setuptools: 45.1.0.post20200119
pip: 19.3.1
conda: None
pytest: 5.3.4
IPython: 7.11.1
sphinx: None
```
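
Going back to the allocation estimate above: this is a minimal sketch of the arithmetic, assuming the shape `(117819, 5021)` and the dtype `|S64` in the MemoryError describe the single array that could not be allocated. The variable names are illustrative only and are not taken from xarray.

```python
import numpy as np

# Shape and dtype copied from the MemoryError message (assumed to describe
# one dense array that the to_dataframe() call tried to allocate).
shape = (117819, 5021)
dtype = np.dtype('S64')  # fixed-width byte strings, 64 bytes per element

n_elements = shape[0] * shape[1]
n_bytes = n_elements * dtype.itemsize

print(f'elements: {n_elements:,}')
print(f'bytes:    {n_bytes:,}')
print(f'GiB:      {n_bytes / 2**30:.1f}')
print(f'GB:       {n_bytes / 1e9:.1f}')
```

That works out to about 35.3 GiB (37.9 GB in decimal units) for this one array, which is consistent with the rough 35 GB figure.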
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,376370028