issues: 1340994913

id: 1340994913
node_id: I_kwDOAMm_X85P7fVh
number: 6924
title: Memory Leakage Issue When Running to_netcdf
user: 64621312
state: closed
locked: 0
comments: 2
created_at: 2022-08-16T23:58:17Z
updated_at: 2023-01-17T18:38:40Z
closed_at: 2023-01-17T18:38:40Z
author_association: NONE
state_reason: completed
repo: 13221727
type: issue

What is your issue?

I have a zarr store that I'd like to convert to netCDF, but the dataset is too large to fit in memory. My computer has 32 GB of RAM, so writing ~5.5 GB chunks shouldn't be a problem. However, within seconds of running the script below, memory usage quickly tops out, consuming the available ~20 GB, and the script fails.

Data: Dropbox link to a zarr store (around 1.8 GB in total) containing radar rainfall data for 2014-06-28 over the United States.

Code:

```python
import xarray as xr
import zarr

fpath_zarr = "out_zarr_20140628.zarr"

# open the zarr store lazily with ~5.5 GB dask chunks
ds_from_zarr = xr.open_zarr(store=fpath_zarr, chunks={"outlat": 3500, "outlon": 7000, "time": 30})

# write to netCDF with zlib compression on the rainrate variable
ds_from_zarr.to_netcdf("ds_zarr_to_nc.nc", encoding={"rainrate": {"zlib": True}})
```
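Not part of the original report: a common way to work around this kind of blow-up is to open the store with smaller dask chunks and defer the write with `compute=False`, then execute it under a single-threaded scheduler so only one chunk is resident at a time. A minimal sketch, assuming the same file names and variable as above:

```python
import dask
import xarray as xr

fpath_zarr = "out_zarr_20140628.zarr"

# Smaller chunks along time keep each task well below the ~5.5 GB per chunk seen above.
ds = xr.open_zarr(store=fpath_zarr, chunks={"time": 1})

# Build the write as a delayed task graph instead of executing it immediately.
delayed_write = ds.to_netcdf(
    "ds_zarr_to_nc.nc",
    encoding={"rainrate": {"zlib": True}},
    compute=False,
)

# Run single-threaded so only one chunk is loaded and written at a time.
with dask.config.set(scheduler="synchronous"):
    delayed_write.compute()
```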

Outputs:

```python
MemoryError: Unable to allocate 5.48 GiB for an array with shape (30, 3500, 7000) and data type float64
```
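For reference (not in the original report), the 5.48 GiB figure matches one full chunk held as float64:

```python
# 30 * 3500 * 7000 elements at 8 bytes each (float64) per chunk
nbytes = 30 * 3500 * 7000 * 8
print(nbytes / 2**30)  # ~5.48 GiB
```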

Package versions:

  • dask 2022.7.0
  • xarray 2022.3.0
  • zarr 2.8.1
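As an aside (not in the original report), xarray ships a helper that prints the full environment, which is the usual way to report versions in issues:

```python
import xarray as xr

# Prints Python, OS, and library versions (dask, zarr, netCDF4, ...)
xr.show_versions()
```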
