issue_comments: 636619598

html_url: https://github.com/pydata/xarray/issues/4113#issuecomment-636619598
issue_url: https://api.github.com/repos/pydata/xarray/issues/4113
id: 636619598
node_id: MDEyOklzc3VlQ29tbWVudDYzNjYxOTU5OA==
user: 6815844
created_at: 2020-06-01T05:24:35Z
updated_at: 2020-06-01T05:24:35Z
author_association: MEMBER

> Reading with chunks loads more memory than reading without chunks, but not an amount of memory equal to the size of the array (300 MB for an 800 MB array in the example below). And, by the way, stacking also loads a bit more memory.

I think it depends on the chunk size. If I use chunks=dict(x=128, y=128), the memory usage is:

    RAM: 118.14 MB
    da: 800.0 MB
    RAM: 119.14 MB
    RAM: 125.59 MB
    RAM: 943.79 MB
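
For reference, below is a minimal sketch of how RAM figures like these can be produced. The file name "da.nc", the array shape (z=100, x=1000, y=1000, roughly 800 MB of float64), and the use of psutil are assumptions for illustration, not taken from the original report.

```python
import psutil
import xarray as xr

proc = psutil.Process()  # current process

def report_ram(label):
    # Resident set size (RSS) of this process, in MB.
    print(f"{label:>9}  RAM: {proc.memory_info().rss / 1e6:.2f} MB")

report_ram("start")
da = xr.open_dataarray("da.nc", chunks=dict(x=128, y=128))  # lazy, dask-backed
print(f"da: {da.nbytes / 1e6} MB")                          # logical size, not RAM
report_ram("opened")

stacked = da.stack(px=("x", "y"))                           # still lazy
report_ram("stacked")

values = stacked.compute()                                  # loads the full array
report_ram("computed")
```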

> When stacking a chunked array, only the chunks along the first stacked dimension are preserved, and the chunks along the second stacked dimension seem to be merged.

I am not sure where 512 comes from in your example (maybe dask is doing something). If I work with chunks=dict(x=128, y=128), the chunk size after stacking was (100, 16384), which is reasonable (z=100, px=(128, 128)).
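
To see what happens to the chunk structure, here is a small self-contained sketch; the array shape is illustrative, and dask.array.zeros is used so that nothing is actually allocated.

```python
import dask.array as darr
import xarray as xr

# Illustrative lazy array: z=100, x=1000, y=1000, chunked 128 x 128 in x/y.
da = xr.DataArray(
    darr.zeros((100, 1000, 1000), chunks=(100, 128, 128)),
    dims=("z", "x", "y"),
)
print(da.chunks)       # ((100,), (128, ..., 104), (128, ..., 104))

stacked = da.stack(px=("x", "y"))
print(stacked.chunks)  # chunks along z are kept, chunks along x/y are merged
# The report above saw a chunk size of (100, 16384) here, i.e. 128 * 128 along px.
```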

> A workaround could have been to save the data already stacked, but "MultiIndex cannot yet be serialized to netCDF".

You can call reset_index before saving to netCDF, but it requires another computation to re-create the MultiIndex after loading; see the sketch below.
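
This round trip looks roughly as follows; the array contents, coordinate names, and file name are made up for illustration.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(24.0).reshape(2, 3, 4),
    dims=("z", "x", "y"),
    coords={"z": [0, 1], "x": [0, 1, 2], "y": [0, 1, 2, 3]},
    name="da",
)
stacked = da.stack(px=("x", "y"))

# A MultiIndex cannot be written to netCDF, so drop it first:
# reset_index keeps the x and y levels as plain coordinates along px.
stacked.reset_index("px").to_netcdf("stacked.nc")

# After loading, rebuild the MultiIndex from the saved x/y coordinates
# (this is the extra computation mentioned above).
restored = xr.open_dataarray("stacked.nc").set_index(px=["x", "y"])
```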

reactions:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

issue: 627735640