issue_comments: 371891466


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/1385#issuecomment-371891466 https://api.github.com/repos/pydata/xarray/issues/1385 371891466 MDEyOklzc3VlQ29tbWVudDM3MTg5MTQ2Ng== 1197350 2018-03-09T17:53:15Z 2018-03-09T17:53:15Z MEMBER

Calling `ds = xr.decode_cf(ds, decode_times=False)` on the dataset returns instantly. However, the variable data is wrapped in the adaptors, effectively destroying the chunks

```python
ds.SST.variable._data
LazilyIndexedArray(array=DaskIndexingAdapter(array=dask.array<_apply_mask, shape=(16401, 2400, 3600), dtype=float32, chunksize=(1, 2400, 3600)>), key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None))))
```

Calling `__getitem__` on this array triggers computation of the whole dask array, which would take forever and would completely blow out the notebook memory. This is because of #1372, which would be fixed by #1725.
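To make the failure mode concrete, here is a rough toy model of what it looks like when a wrapper hides the chunking from the caller. It does not use xarray or dask, and `ChunkedArray` and `EagerWrapper` are made-up names for illustration only; the real adapters are more involved, so treat this as a sketch of the general idea rather than a description of the actual code path.

```python
# Toy model only: hypothetical stand-ins, not xarray or dask internals.


class ChunkedArray:
    """A 1-D array split into fixed-size chunks that are 'computed' on demand."""

    def __init__(self, n_chunks, chunk_size):
        self.n_chunks = n_chunks
        self.chunk_size = chunk_size
        self.computed = []  # indices of the chunks computed so far

    def compute_chunk(self, i):
        # Record the work, then produce the values held by chunk i.
        self.computed.append(i)
        start = i * self.chunk_size
        return list(range(start, start + self.chunk_size))

    def compute_all(self):
        # Materialize every chunk, as a full compute would.
        out = []
        for i in range(self.n_chunks):
            out.extend(self.compute_chunk(i))
        return out

    def __getitem__(self, index):
        # Chunk-aware indexing: only touch the chunk that holds `index`.
        chunk, offset = divmod(index, self.chunk_size)
        return self.compute_chunk(chunk)[offset]


class EagerWrapper:
    """A wrapper that hides the chunking: indexing materializes everything first."""

    def __init__(self, array):
        self.array = array

    def __getitem__(self, index):
        return self.array.compute_all()[index]


# Indexing the chunked array directly computes a single chunk.
direct = ChunkedArray(n_chunks=8, chunk_size=4)
direct_value = direct[5]

# Indexing through the wrapper computes all eight chunks for the same element.
wrapped_source = ChunkedArray(n_chunks=8, chunk_size=4)
wrapped_value = EagerWrapper(wrapped_source)[5]

print(direct_value, len(direct.computed))
print(wrapped_value, len(wrapped_source.computed))
```

Both lookups return the same element, but the wrapped one pays for every chunk. With 16401 chunks of shape (2400, 3600) instead of eight chunks of four values, that difference is what makes the decoded dataset unusable.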

This has actually become a major showstopper for me. I need to work with this dataset in decoded form.

Versions

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 3.12.62-60.64.8-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.10.1
pandas: 0.22.0
numpy: 1.13.3
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: 2.2.0a2.dev176
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.1
distributed: 1.21.3
matplotlib: 2.1.2
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 38.4.0
pip: 9.0.1
conda: None
pytest: 3.3.2
IPython: 6.2.1
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  224553135