issue_comments

9 rows where issue = 361016974 (Limiting threads/cores used by xarray(/dask?)), sorted by updated_at descending

462422387 · Zeitsperre (CONTRIBUTOR) · 2019-02-11T17:41:47Z · https://github.com/pydata/xarray/issues/2417#issuecomment-462422387

Hi @jhamman, please excuse the lateness of this reply. It turned out that in the end all I needed to do was set OMP_NUM_THREADS to the number of threads I wanted based on my cores (2 threads/core) before launching my processes. Thanks for the help and for keeping this open. Feel free to close this thread.
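A minimal sketch of that fix (the thread count here is illustrative, not from the original comment); the variable must be set before the worker processes import numpy/dask, so the OpenMP runtime reads it at load time:

```python
import os

# 8 cores x 2 threads/core = 16, for example.
# Must be set before importing numpy/dask in the worker processes.
os.environ["OMP_NUM_THREADS"] = "16"

import numpy  # OpenMP/BLAS now respect the limit
```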

460393715 · jhamman (MEMBER) · 2019-02-04T20:07:56Z · https://github.com/pydata/xarray/issues/2417#issuecomment-460393715

@Zeitsperre - are you still having problems in this area? If not, is it okay if we close this issue?

460325261 · andytraumueller (NONE) · 2019-02-04T16:57:27Z, edited 2019-02-04T20:07:09Z · https://github.com/pydata/xarray/issues/2417#issuecomment-460325261

Hi, my test code is now running properly on 5 threads. Thanks for the help.

```python
import xarray as xr
import os
import numpy
import sys
import dask
from multiprocessing.pool import ThreadPool

# dask-worker --nthreads 1

with dask.config.set(scheduler='threads', pool=ThreadPool(5)):
    dset = xr.open_mfdataset("/data/Environmental_Data/Sea_Surface_Height//.nc",
                             engine='netcdf4', concat_dim='time',
                             chunks={"latitude": 180, "longitude": 360})
    dset1 = dset["adt"] - dset["sla"]
    dset1.to_dataset(name='ssh_mean')
    dset["ssh_mean"] = dset1
    dset = dset.drop("crs")
    dset = dset.drop("lat_bnds")
    dset = dset.drop("lon_bnds")
    dset = dset.drop("xarray_dataarray_variable")
    dset = dset.drop("nv")
    dset_all_over_monthly_mean = dset.groupby("time.month").mean(dim="time", skipna=True)
    dset_all_over_season1_mean = dset_all_over_monthly_mean.sel(month=[1, 2, 3])
    dset_all_over_season1_mean.mean(dim="month", skipna=True)
    dset_all_over_season1_mean.to_netcdf("/data/Environmental_Data/dump/mean/all_over_season1_mean_ssh_copernicus_0.25deg_season1_data_mean.nc")
```

Reactions: 👍 2
460298993 · jhamman (MEMBER) · 2019-02-04T15:50:09Z, edited 2019-02-04T15:51:43Z · https://github.com/pydata/xarray/issues/2417#issuecomment-460298993

On a few systems, I've noticed that I need to set the environment variable OMP_NUM_THREADS to 1 to limit parallel evaluation within dask threads. I wonder if something like this is happening here?

xref: https://stackoverflow.com/questions/39422092/error-with-omp-num-threads-when-using-dask-distributed
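The environment variable above limits OpenMP parallelism inside each dask thread; the number of dask threads itself can be capped separately through dask's configuration. A small sketch, using dask's `num_workers` option on the threaded scheduler:

```python
import dask
import dask.array as da

x = da.ones((1000, 1000), chunks=(100, 100))

# Cap the threaded scheduler at 4 worker threads for this computation
with dask.config.set(scheduler="threads", num_workers=4):
    total = x.sum().compute()

print(total)  # sum of a 1000x1000 array of ones
```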

460292772 · andytraumueller (NONE) · 2019-02-04T15:34:04Z · https://github.com/pydata/xarray/issues/2417#issuecomment-460292772

I am also interested; I am running a lot of critical processes and I want to keep at least 5 cores idle.

460020879 · jhamman (MEMBER) · 2019-02-03T03:54:59Z · https://github.com/pydata/xarray/issues/2417#issuecomment-460020879

@Zeitsperre - this issue has been inactive for a while. Did you find a solution to your problem?

422461245 · shoyer (MEMBER) · 2018-09-18T16:31:03Z · https://github.com/pydata/xarray/issues/2417#issuecomment-422461245

If your data uses in-file HDF5 chunks/compression, it's possible that HDF5 is uncompressing the data in parallel, though I haven't personally seen that before.

422445732 · Zeitsperre (CONTRIBUTOR) · 2018-09-18T15:44:03Z · https://github.com/pydata/xarray/issues/2417#issuecomment-422445732

As per your suggestion, I retried with chunking and found a new error (due to the nature of my data having rotated poles, dask demanded that I save my data with astype(); this isn't my major concern, so I'll deal with it elsewhere).

What I did notice was that when chunking was specified (ds = xr.open_dataset(ncfile).chunk({'time': 10})), I lost all parallelism: although I had specified different thread counts, the performance never crossed 110% (I imagine the extra 10% was due to I/O).

This is really a mystery, and unfortunately I haven't a clue how this behaviour is possible if parallel processing is disabled by default. The speed of my results when dask multiprocessing isn't specified suggests that it must be using more processing power:

  • using Multiprocessing calls to CDO with 5 ForkPoolWorkers = ~2 h / 5 files (100% x 5 CPUs)
  • xarray without dask multiprocessing specifications = ~3 min / 5 files (spikes of 3500% on one CPU)

Could these spikes in CPU usage be due to other processes (e.g. memory usage, I/O)?

422206083 · shoyer (MEMBER) · 2018-09-17T23:40:52Z · https://github.com/pydata/xarray/issues/2417#issuecomment-422206083

Step 1 would be making sure that you're actually using dask :). Xarray only uses dask with open_dataset() if you supply the chunks keyword argument.

That said, xarray's only built-in support for parallelism is through Dask, so I'm not sure what is using all your CPU.
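The NumPy-vs-dask distinction above can be sketched with a toy in-memory dataset rather than a real file (`.chunk()` has the same effect as passing `chunks=` to `open_dataset()`):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"t": ("time", np.arange(10))})
print(ds["t"].chunks)  # None: plain NumPy-backed, no dask involved

# Chunking converts the variable into a lazy dask array
dsc = ds.chunk({"time": 5})
print(dsc["t"].chunks)  # chunk sizes along each dimension
```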


```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
```