issues: 361016974
field | value
---|---
id | 361016974
node_id | MDU6SXNzdWUzNjEwMTY5NzQ=
number | 2417
title | Limiting threads/cores used by xarray(/dask?)
user | 10819524
state | closed
locked | 0
assignee | 
milestone | 
comments | 9
created_at | 2018-09-17T19:50:07Z
updated_at | 2019-02-11T18:07:41Z
closed_at | 2019-02-11T18:07:40Z
author_association | CONTRIBUTOR
active_lock_reason | 
draft | 
pull_request | 
reactions | { "url": "https://api.github.com/repos/pydata/xarray/issues/2417/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
performed_via_github_app | 
state_reason | completed
repo | 13221727
type | issue

body:

I'm fairly new to xarray and I'm currently trying to leverage it to subset some NetCDFs. I'm running this on a shared server and would like to know how best to limit the processing power used by xarray so that it plays nicely with others. I've read through the dask and xarray documentation a bit, but it isn't clear to me how to set a cap on CPUs/threads. Here's an example of a spatial subset:

```python
import glob
import os
import xarray as xr
from multiprocessing.pool import ThreadPool
import dask

wd = os.getcwd()
test_data = os.path.join(wd, 'test_data')
lat_bnds = (43, 50)
lon_bnds = (-67, -80)
output = 'test_data_subset'

def subset_nc(ncfile, lat_bnds, lon_bnds, output):
    # Create the output directory on first use.
    if not os.path.exists(output):
        os.makedirs(output)
    outfile = os.path.join(output, os.path.basename(ncfile).replace('.nc', '_subset.nc'))
    # ... (the open/subset/write steps are truncated in this record)

list_files = glob.glob(os.path.join(test_data, '*'))
print(list_files)

for i in list_files:
    subset_nc(i, lat_bnds, lon_bnds, output)
```

I've tried a few variations on this by moving the …
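A minimal sketch of one way to set the cap the body asks about: dask's documented `dask.config.set` accepts a fixed-size pool (or a `num_workers` count) for its threaded scheduler, and xarray only routes work through dask when a file is opened with `chunks=`. The file name, chunk sizes, pool size of 4, and the `lat`/`lon` coordinate names below are illustrative assumptions, not from the issue.

```python
# Sketch only: file name, chunk sizes, pool size, and lat/lon coordinate
# names are assumptions for illustration, not taken from the issue.
from multiprocessing.pool import ThreadPool

import dask
import xarray as xr

# Hand dask's threaded scheduler a fixed-size pool, capping the number of
# threads any dask-backed xarray computation can use.
dask.config.set(pool=ThreadPool(4))
# Equivalent alternative: dask.config.set(scheduler='threads', num_workers=4)

# Opening with `chunks=` is what puts dask in the loop; without it, xarray
# loads the file eagerly in a single thread and the pool is never used.
ds = xr.open_dataset('some_file.nc', chunks={'time': 10})

# Subset lazily, then write; the computation runs on at most 4 threads.
ds_sub = ds.sel(lat=slice(43, 50), lon=slice(-80, -67))
ds_sub.to_netcdf('some_file_subset.nc')
```

On a shared machine, `dask.config.set(scheduler='synchronous')` is a stricter option that forces fully single-threaded execution.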