
issue_comments


3 rows where author_association = "NONE" and user = 8419421 sorted by updated_at descending


id: 1116344892
html_url: https://github.com/pydata/xarray/issues/6561#issuecomment-1116344892
issue_url: https://api.github.com/repos/pydata/xarray/issues/6561
node_id: IC_kwDOAMm_X85CihI8
user: sgdecker (8419421)
created_at: 2022-05-03T17:13:02Z
updated_at: 2022-05-03T17:13:02Z
author_association: NONE
body:

Thanks for the feedback and explanation. It seems the poorly constructed netCDF file is fundamentally to blame for triggering this behavior. A warning is a good idea, though.
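The warning suggested here could be a cheap size estimate made before anything is allocated. A minimal sketch of that idea, assuming a hypothetical standalone helper and an arbitrary 1 GiB threshold (this is not xarray's actual implementation):

```
import math
import warnings

import xarray as xr

def warn_if_to_dataframe_is_large(ds: xr.Dataset, limit_bytes: int = 2**30):
    # to_dataframe() broadcasts every data variable across the product of
    # all dimension sizes (see the set_dims call in the traceback further
    # down this page), so the eventual allocation can be estimated up front
    # from the dtypes alone.
    n_rows = math.prod(ds.sizes.values())
    estimate = sum(n_rows * v.dtype.itemsize for v in ds.data_vars.values())
    if estimate > limit_bytes:
        warnings.warn(
            f"to_dataframe() may allocate ~{estimate / 2**30:.1f} GiB "
            f"({n_rows:,} rows x {len(ds.data_vars)} columns)"
        )
```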

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: Excessive memory consumption by to_dataframe() (1223031600)
id: 1065536538
html_url: https://github.com/pydata/xarray/issues/4043#issuecomment-1065536538
issue_url: https://api.github.com/repos/pydata/xarray/issues/4043
node_id: IC_kwDOAMm_X84_gswa
user: sgdecker (8419421)
created_at: 2022-03-11T21:16:59Z
updated_at: 2022-03-11T21:16:59Z
author_association: NONE
body:

I believe I am experiencing a similar issue, although with code that I thought was smart enough to chunk the data request into smaller pieces:

```
import numpy as np
import xarray as xr
from dask.diagnostics import ProgressBar
import intake

wrf_url = ('https://rda.ucar.edu/thredds/catalog/files/g/ds612.0/'
           'PGW3D/2006/catalog.xml')
catalog_u = intake.open_thredds_merged(wrf_url, path=['_U_2006060'])
catalog_v = intake.open_thredds_merged(wrf_url, path=['_V_2006060'])

ds_u = catalog_u.to_dask()
ds_u['U'] = ds_u.U.chunk("auto")
ds_v = catalog_v.to_dask()
ds_v['V'] = ds_v.V.chunk("auto")
ds = xr.merge((ds_u, ds_v))

def unstagger(ds, var, coord, new_coord):
    var1 = ds[var].isel({coord: slice(None, -1)})
    var2 = ds[var].isel({coord: slice(1, None)})
    return ((var1 + var2) / 2).rename({coord: new_coord})

with ProgressBar():
    ds['U_unstaggered'] = unstagger(ds, 'U', 'west_east_stag', 'west_east')
    ds['V_unstaggered'] = unstagger(ds, 'V', 'south_north_stag', 'south_north')
    ds['speed'] = np.hypot(ds.U_unstaggered, ds.V_unstaggered)
    ds.speed.isel(bottom_top=10).sel(Time='2006-06-07T18:00').plot()
```

This throws an error because, according to the RDA help folks, a request for an entire variable is made, which far exceeds their server's 500 MB request limit:

```
rda.ucar.edu/thredds/dodsC/files/g/ds612.0/PGW3D/2006/wrf3d_d01_PGW_U_20060607.nc.dods?U%5B0:1:7%5D%5B0:1:49%5D%5B0:1:1014%5D%5B0:1:1359%5D
```

Here's the error:

```
Traceback (most recent call last):
  File "/home/decker/classes/met325/rda_plot.py", line 29, in <module>
    ds.speed.isel(bottom_top=10).sel(Time='2006-06-07T18:00').plot()
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/plot/plot.py", line 862, in __call__
    return plot(self._da, **kwargs)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/plot/plot.py", line 293, in plot
    darray = darray.squeeze().compute()
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/dataarray.py", line 951, in compute
    return new.load(**kwargs)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/dataarray.py", line 925, in load
    ds = self._to_temp_dataset().load(**kwargs)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/dataset.py", line 862, in load
    evaluated_data = da.compute(*lazy_data.values(), **kwargs)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/base.py", line 571, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/threaded.py", line 79, in get
    results = get_async(
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/local.py", line 507, in get_async
    raise_exception(exc, tb)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/local.py", line 315, in reraise
    raise exc
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/local.py", line 220, in execute_task
    result = _execute_task(task, data)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/dask/array/core.py", line 116, in getter
    c = np.asarray(c)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/indexing.py", line 357, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/indexing.py", line 521, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/indexing.py", line 422, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/conventions.py", line 62, in __getitem__
    return np.asarray(self.array[key], dtype=self.dtype)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/indexing.py", line 422, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/backends/pydap_.py", line 39, in __getitem__
    return indexing.explicit_indexing_adapter(
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/core/indexing.py", line 711, in explicit_indexing_adapter
    result = raw_indexing_method(raw_key.tuple)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/backends/pydap_.py", line 47, in _getitem
    result = robust_getitem(array, key, catch=ValueError)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/xarray/backends/common.py", line 64, in robust_getitem
    return array[key]
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/pydap/model.py", line 323, in __getitem__
    out.data = self._get_data_index(index)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/pydap/model.py", line 353, in _get_data_index
    return self._data[index]
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/pydap/handlers/dap.py", line 170, in __getitem__
    raise_for_status(r)
  File "/home/decker/local/miniconda3/envs/met325/lib/python3.10/site-packages/pydap/net.py", line 38, in raise_for_status
    raise HTTPError(
webob.exc.HTTPError: 403 403
```

I thought smaller requests would automagically happen with this code. Is it intended that a large request be made?

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: Opendap access failure error (614144170)
id: 613068333
html_url: https://github.com/pydata/xarray/issues/2534#issuecomment-613068333
issue_url: https://api.github.com/repos/pydata/xarray/issues/2534
node_id: MDEyOklzc3VlQ29tbWVudDYxMzA2ODMzMw==
user: sgdecker (8419421)
created_at: 2020-04-13T19:57:08Z
updated_at: 2020-04-13T19:57:08Z
author_association: NONE
body:

Here is another example:

```
import xarray as xr

ncdata = xr.open_dataset('https://thredds.ucar.edu/thredds/dodsC/nws/metar/ncdecoded/files/Surface_METAR_20200411_0000.nc')
df = ncdata.to_dataframe()
```

Output is:

```
Traceback (most recent call last):
  File "bug.py", line 4, in <module>
    df = ncdata.to_dataframe()
  File "/home/decker/miniconda3/envs/met212/lib/python3.8/site-packages/xarray/core/dataset.py", line 4399, in to_dataframe
    return self._to_dataframe(self.dims)
  File "/home/decker/miniconda3/envs/met212/lib/python3.8/site-packages/xarray/core/dataset.py", line 4385, in _to_dataframe
    data = [
  File "/home/decker/miniconda3/envs/met212/lib/python3.8/site-packages/xarray/core/dataset.py", line 4386, in <listcomp>
    self._variables[k].set_dims(ordered_dims).values.reshape(-1)
MemoryError: Unable to allocate array with shape (117819, 5021) and data type |S64
```

If I'm doing the math right, xarray is trying to allocate roughly 35 GB even though this NetCDF file is only on the order of 50 MB in size.
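The arithmetic checks out: the failing allocation is a (117819, 5021) array of |S64, that is, 64-byte fixed-width strings.

```
# Size of the array the MemoryError reports: shape (117819, 5021), dtype |S64
rows, cols, itemsize = 117819, 5021, 64
print(f"{rows * cols * itemsize / 2**30:.1f} GiB")  # -> 35.3 GiB
```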

Output of xr.show_versions():

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.1 | packaged by conda-forge | (default, Jan 5 2020, 20:58:18) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.31-1-MANJARO
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.3

xarray: 0.14.1
pandas: 0.25.3
numpy: 1.17.5
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.1.2
cartopy: 0.17.0
seaborn: None
numbagg: None
setuptools: 45.1.0.post20200119
pip: 19.3.1
conda: None
pytest: 5.3.4
IPython: 7.11.1
sphinx: None
```
reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: to_dataframe() excessive memory usage (376370028)

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
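Given this schema, the filtered view at the top of the page corresponds to a short query. A sketch using Python's sqlite3 module, assuming a local copy of the underlying database (the filename is hypothetical):

```
import sqlite3

conn = sqlite3.connect("github.db")  # hypothetical local copy of the database
rows = conn.execute(
    """
    select id, updated_at, body
    from issue_comments
    where author_association = 'NONE' and user = 8419421
    order by updated_at desc
    """
).fetchall()
for comment_id, updated_at, _body in rows:
    print(comment_id, updated_at)
```

The idx_issue_comments_user index above lets the user filter run without a full table scan.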