issue_comments: 776160305


html_url: https://github.com/pydata/xarray/issues/3781#issuecomment-776160305
issue_url: https://api.github.com/repos/pydata/xarray/issues/3781
id: 776160305
node_id: MDEyOklzc3VlQ29tbWVudDc3NjE2MDMwNQ==
user: 885575
created_at: 2021-02-09T18:51:13Z
updated_at: 2021-02-09T18:51:13Z
author_association: NONE
performed_via_github_app:

body:

@lvankampenhout, I ran into the same problem you did. The OP's issue seems to actually be in to_netcdf(), but I think yours (and mine) is in Dask's lazy loading and therefore unrelated.

In short, ds will have some Dask arrays whose contents don't actually get loaded until you call to_netcdf(). By default, Dask loads them in parallel, and the default Dask parallel scheduler chokes when you layer your own parallelism on top of it. In my case, I was able to get around it by calling

```python
ds.load(scheduler='sync')
```

at some point. If the call happens outside do_work(), I think you can skip the scheduler='sync' part, but inside do_work() it's required. This bypasses Dask's own parallelism, which is probably what you want if you're doing your own parallelism anyway.
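
For concreteness, here's a minimal sketch of what I mean, under my assumptions about the setup: a do_work() worker that opens a NetCDF file with Dask chunks and writes its own output file. The function name, file names, chunking, and pool size are all placeholders, not code from the issue.

```python
import multiprocessing as mp

import xarray as xr


def do_work(i):
    # chunks= gives Dask-backed (lazy) arrays, mirroring the situation above
    ds = xr.open_dataset("input.nc", chunks={})
    # Force the data into memory with the single-threaded Dask scheduler so
    # Dask doesn't spin up its own parallelism inside this worker process
    ds.load(scheduler="sync")
    ds.to_netcdf(f"out_{i}.nc")


if __name__ == "__main__":
    # the process pool is the "own parallelism" layered on top of Dask
    with mp.Pool(4) as pool:
        pool.map(do_work, range(8))
```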

reactions:

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: 567678992