issues: 627735640

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
627735640 MDU6SXNzdWU2Mjc3MzU2NDA= 4113 xarray.DataArray.stack load data into memory 36678697 closed 0     6 2020-05-30T13:45:38Z 2022-04-19T16:10:26Z 2022-04-19T16:10:25Z NONE      

Stacking is loading the data into memory, which is unexpected, or at least undocumented, afaik.

MCVE Code Sample

```python
import os
import psutil
import numpy as np
import xarray as xr


def main():
    xr.DataArray(
        np.random.randn(1024, 1024, 100),
        dims=("x", "y", "z"),
    ).to_netcdf("da.nc")

    da = xr.open_dataarray("da.nc")
    print(f" da: {mb(da.nbytes)} MB")
    print_ram_state()

    mda = da.stack(px=("x", "y"))
    print_ram_state()


def print_ram_state():
    # https://stackoverflow.com/a/21632554
    process = psutil.Process(os.getpid())
    ram_state = process.memory_info().rss
    print(f"RAM: {mb(ram_state) :.2f} MB")


def mb(nbytes):
    return nbytes / (1024 * 1024)


if __name__ == "__main__":
    main()
```

Problem Description

Calling the `xarray.DataArray.stack` method loads the data into memory, which is unexpected behavior, or at least undocumented, afaik.
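RSS readings like the ones in the MCVE can be noisy, so a second measurement may help confirm the effect. Below is a minimal sketch using the standard-library `tracemalloc` module to report the peak memory allocated during a single call. The helper name `traced_peak_mb` is hypothetical and not part of xarray. Note that `tracemalloc` only sees allocations routed through Python's allocators (NumPy array buffers are normally visible to it), so memory allocated inside C libraries may not show up.

```python
import tracemalloc


def traced_peak_mb(func, *args, **kwargs):
    """Run func and return (result, peak MiB allocated while it ran).

    Hypothetical helper: assumes tracemalloc is not already tracing,
    since it starts and stops tracing around the call.
    """
    tracemalloc.start()
    try:
        # Memory already traced before the call (close to zero right after start).
        baseline, _ = tracemalloc.get_traced_memory()
        result = func(*args, **kwargs)
        # Peak traced memory since tracing started.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, (peak - baseline) / (1024 * 1024)


if __name__ == "__main__":
    # Stand-in workload: allocate a buffer of 8 MiB.
    buf, peak = traced_peak_mb(bytearray, 8 * 1024 * 1024)
    print(f"peak: {peak:.2f} MiB")
```

With the MCVE's `da`, the call would be `traced_peak_mb(da.stack, px=("x", "y"))`. A peak close to `da.nbytes` would be consistent with the array being loaded during stacking.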

Versions

Output of `xr.show_versions()`

    INSTALLED VERSIONS
    ------------------
    commit: None
    python: 3.7.6 | packaged by conda-forge | (default, Mar 23 2020, 23:03:20) [GCC 7.3.0]
    python-bits: 64
    OS: Linux
    OS-release: 5.3.0-53-generic
    machine: x86_64
    processor: x86_64
    byteorder: little
    LC_ALL: None
    LANG: en_US.UTF-8
    LOCALE: en_US.UTF-8
    libhdf5: 1.10.5
    libnetcdf: None

    xarray: 0.15.1
    pandas: 1.0.3
    numpy: 1.17.5
    scipy: 1.4.1
    netCDF4: None
    pydap: None
    h5netcdf: None
    h5py: 2.10.0
    Nio: None
    zarr: None
    cftime: None
    nc_time_axis: None
    PseudoNetCDF: None
    rasterio: None
    cfgrib: None
    iris: None
    bottleneck: 1.3.2
    dask: 2.16.0
    distributed: 2.16.0
    matplotlib: 3.2.1
    cartopy: None
    seaborn: 0.10.1
    numbagg: None
    setuptools: 46.4.0.post20200518
    pip: 20.1.1
    conda: 4.8.3
    pytest: 5.4.2
    IPython: 7.14.0
    sphinx: 3.0.4
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4113/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed 13221727 issue
