
issues: 321553778


id: 321553778
node_id: MDU6SXNzdWUzMjE1NTM3Nzg=
number: 2109
title: Dataset.expand_dims() not lazy
user: 206773
state: closed
locked: 0
comments: 2
created_at: 2018-05-09T12:39:44Z
updated_at: 2018-05-09T15:45:31Z
closed_at: 2018-05-09T15:45:31Z
author_association: NONE

The following either does not return for a very long time or fails with an out-of-memory error:

```python
import xarray as xr

# Raw string so the backslashes in the Windows path are not treated as escapes.
ds = xr.open_dataset(r"D:\EOData\LC-CCI\ESACCI-LC-L4-LCCS-Map-300m-P1Y-2015-v2.0.8.nc")
ds
# <xarray.Dataset>
# Dimensions:              (lat: 64800, lon: 129600)
# Coordinates:
#   * lat                  (lat) float32 89.9986 89.9958 89.9931 89.9903 ...
#   * lon                  (lon) float32 -179.999 -179.996 -179.993 -179.99 ...
# Data variables:
#     change_count         (lat, lon) int8 ...
#     crs                  int32 ...
#     current_pixel_state  (lat, lon) int8 ...
#     observation_count    (lat, lon) int16 ...
#     processed_flag       (lat, lon) int8 ...
#     lccs_class           (lat, lon) uint8 ...
# Attributes:
#     title:       ESA CCI Land Cover Map
#     summary:     This dataset contains the global ESA CCI land...
#     type:        ESACCI-LC-L4-LCCS-Map-300m-P1Y
#     id:          ESACCI-LC-L4-LCCS-Map-300m-P1Y-2015-v2.0.7
#     project:     Climate Change Initiative - European Space Ag...
#     references:  http://www.esa-landcover-cci.org/
#     ...

ds_with_time = ds.expand_dims('time')
# Zzzzzzz...
```

Problem description

When I call Dataset.expand_dims('time') on one of my ~2 GB (compressed) datasets, it seems to load all the data into memory. Memory consumption climbs beyond 12 GB and the call eventually ends in an out-of-memory exception.

(Sorry for the German UI.)

Expected Output

Dataset.expand_dims should execute lazily and quickly and should not require considerable memory, because adding a length-1 time dimension should only affect indexing, not an array's memory layout. Array data should not be loaded into memory (through Dask, Zarr, etc.).
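To illustrate the expectation at the plain NumPy level, here is a minimal sketch. It is not xarray's implementation, and the small `lccs_class` array is only a stand-in for one (lat, lon) variable of the dataset above. It shows that prepending a length-1 axis to an in-memory array yields a view over the same buffer rather than a copy:

```python
import numpy as np

# Small stand-in for one (lat, lon) variable; the real one is 64800 x 129600.
lccs_class = np.zeros((648, 1296), dtype=np.uint8)

# Prepend a length-1 axis, analogous to adding a 'time' dimension.
with_time = np.expand_dims(lccs_class, axis=0)

# The result has the extra leading axis but shares the original buffer.
print(with_time.shape)
print(np.shares_memory(lccs_class, with_time))
print(with_time.flags.owndata)
```

Because the result is a view, its cost does not depend on the array size, which is the behaviour I would expect from Dataset.expand_dims as well.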

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

xarray: 0.10.2
pandas: 0.20.3
numpy: 1.13.1
scipy: 0.19.1
netCDF4: 1.3.1
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: 2.2.0
bottleneck: 1.2.1
cyordereddict: None
dask: 0.15.2
distributed: 1.19.1
matplotlib: 2.1.1
cartopy: 0.16.0
seaborn: None
setuptools: 36.3.0
pip: 9.0.1
conda: None
pytest: 3.1.3
IPython: None
sphinx: None

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2109/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
