issue_comments: 410361639


html_url: https://github.com/pydata/xarray/issues/2159#issuecomment-410361639
issue_url: https://api.github.com/repos/pydata/xarray/issues/2159
id: 410361639
node_id: MDEyOklzc3VlQ29tbWVudDQxMDM2MTYzOQ==
user: 2622379
created_at: 2018-08-03T20:03:21Z
updated_at: 2018-08-03T20:14:17Z
author_association: NONE

Yes, xarray should support that very easily -- assuming you have dask installed:

```python
ds = auto_merge('*.nc')
ds.to_netcdf('larger_than_memory.nc')
```

`auto_merge` conserves the chunk sizes resulting from the individual files. If the single files are still too large to fit into memory individually, you can rechunk to smaller chunk sizes. The same goes, of course, for the original `xarray.open_mfdataset`.
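To make the rechunking step concrete, here is a minimal sketch using `xarray.open_mfdataset` (mentioned above) on a tiny synthetic file; the file name, variable, and chunk sizes are illustrative only, and both xarray and dask are assumed to be installed:

```python
import numpy as np
import xarray as xr

# One small stand-in file (the real files would be much larger).
xr.Dataset(
    {"temp": (("time", "x"), np.random.rand(8, 4))},
    coords={"time": np.arange(8)},
).to_netcdf("part0.nc")

# Open lazily as dask arrays, then rechunk to smaller blocks so that
# no single chunk has to fit a whole file's worth of data in memory.
ds = xr.open_mfdataset("part0.nc", chunks={"time": 8})
ds = ds.chunk({"time": 2})
print(dict(ds.chunks))  # 'time' is now split into 2-element blocks
```

Rechunking is lazy as well: it only changes how dask will slice the data when a computation or write is finally triggered.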

I tested it on a ~25 GB dataset (on a machine with less memory than that).

Note: `ds = auto_merge('*.nc')` actually runs in a matter of milliseconds, as it merely builds a lazy view of the merged dataset. The disk I/O happens only once you call `ds.to_netcdf('larger_than_memory.nc')`.
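The same lazy behaviour can be seen with plain `xarray.open_mfdataset`; a self-contained sketch, where two tiny synthetic files stand in for the real dataset and the file names are illustrative (xarray and dask assumed installed):

```python
import numpy as np
import xarray as xr

# Two small files standing in for the real, larger-than-memory dataset.
for i in range(2):
    xr.Dataset(
        {"temp": (("time",), np.random.rand(4))},
        coords={"time": np.arange(i * 4, i * 4 + 4)},
    ).to_netcdf(f"part_{i}.nc")

# Returns almost immediately: this only builds a lazy view of the merge.
merged = xr.open_mfdataset("part_*.nc", combine="by_coords")

# All the disk I/O happens here, streamed chunk by chunk through dask.
merged.to_netcdf("merged.nc")
```

Because the merge is only a view, intermediate steps never materialize the full dataset; only the final write walks the chunks.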

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
issue: 324350248