issues: 1223031600
This data as json
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1223031600 | I_kwDOAMm_X85I5fsw | 6561 | Excessive memory consumption by to_dataframe() | 8419421 | closed | 0 | 4 | 2022-05-02T15:33:33Z | 2023-12-15T20:47:32Z | 2023-12-15T20:47:32Z | NONE | What happened?This is a reincarnation of #2534 with a reproduceable example. A 51 MB netCDF file leads to to_dataframe() requesting 23 GB. What did you expect to happen?I expect to_dataframe() to require much less than 23 GB of memory for this operation. Minimal Complete Verifiable Example```Python import urllib.request import xarray as xr url = 'http://people.envsci.rutgers.edu/decker/Surface_METAR_20220501_0000.nc' fname = 'metar.nc' urllib.request.urlretrieve(url, filename=fname) ncdata = xr.open_dataset(fname) df = ncdata.to_dataframe() ``` MVCE confirmation
Relevant log output
Anything else we need to know?No response Environment
/home/decker/local/miniconda3/envs/xarraybug/lib/python3.10/site-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.4 | packaged by conda-forge | (main, Mar 24 2022, 17:39:04) [GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.62.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2022.3.0
pandas: 1.4.2
numpy: 1.22.3
scipy: None
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
setuptools: 62.1.0
pip: 22.0.4
conda: None
pytest: None
IPython: None
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6561/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |