issue_comments: 497381301


html_url: https://github.com/pydata/xarray/issues/2501#issuecomment-497381301
issue_url: https://api.github.com/repos/pydata/xarray/issues/2501
id: 497381301
node_id: MDEyOklzc3VlQ29tbWVudDQ5NzM4MTMwMQ==
user: 1872600
created_at: 2019-05-30T15:55:56Z
updated_at: 2019-05-30T15:58:48Z
author_association: NONE

I'm also hitting memory issues when using open_mfdataset with a cluster.

Specifically, I'm trying to open 8760 NetCDF files with an 8-node, 40-CPU LocalCluster.

When I issue ds = xr.open_mfdataset(files, parallel=True), all looks good on the Dask dashboard, and the tasks complete with no errors in about 4 minutes.
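For reference, the call pattern above can be reproduced on a small scale. This is a minimal sketch, not the reporter's actual setup: the file names, variable name, and sizes are invented, and it assumes xarray, dask, and scipy are installed (engine="scipy" is used here only to avoid a netCDF4 dependency).

```python
import os
import tempfile

import numpy as np
import xarray as xr

# Create a few tiny NetCDF files standing in for the 8760 real ones.
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(3):
    ds = xr.Dataset(
        {"t2m": (("time", "x"), np.random.rand(4, 5))},
        # int32 time values keep the file writable as NETCDF3 via scipy.
        coords={"time": np.arange(i * 4, (i + 1) * 4, dtype="int32")},
    )
    path = os.path.join(tmpdir, f"part_{i}.nc")
    ds.to_netcdf(path, engine="scipy")
    paths.append(path)

# parallel=True opens the files concurrently via dask.delayed;
# chunks={} keeps each file's data as a lazy dask chunk.
combined = xr.open_mfdataset(paths, parallel=True, chunks={}, engine="scipy")
print(combined.sizes["time"])
```

With the real 8760 files, this same lazy-opening step is what runs on the cluster before any data is actually loaded.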

Then 4 more minutes go by before I get a bunch of errors like:

distributed.nanny - WARNING - Worker exceeded 95% memory budget. Restarting
distributed.nanny - WARNING - Worker process 26054 was killed by unknown signal
distributed.nanny - WARNING - Restarting worker

and my cell doesn't complete.

Any suggestions?

reactions: none (total_count: 0)
issue: 372848074