github: issue_comments: 16 rows where issue = 729980097 sorted by updated

16 rows where issue = 729980097 sorted by updated_at descending

Search:

descending

id	html_url	issue_url	node_id	user	created_at	updated_at ▲	author_association	body	reactions	issue
719907040	https://github.com/pydata/xarray/issues/4541#issuecomment-719907040	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxOTkwNzA0MA==	mathause 10194086	2020-10-31T09:10:10Z	2020-10-31T09:10:10Z	MEMBER	Yes that would be great.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
717625573	https://github.com/pydata/xarray/issues/4541#issuecomment-717625573	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNzYyNTU3Mw==	jbusecke 14314623	2020-10-28T00:45:31Z	2020-10-28T00:45:31Z	CONTRIBUTOR	Another option would be to put the check in a `.map_blocks` call for dask arrays. This would only run and raise at compute time. Uh that sounds great actually. Same functionality, no triggered computation, and no intervention needed from the user. Should I try to implement this?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
717343483	https://github.com/pydata/xarray/issues/4541#issuecomment-717343483	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNzM0MzQ4Mw==	dcherian 2448579	2020-10-27T15:57:50Z	2020-10-27T15:57:50Z	MEMBER	Another option would be to put the check in a `.map_blocks` call for dask arrays. This would only run and raise at compute time.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
717342942	https://github.com/pydata/xarray/issues/4541#issuecomment-717342942	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNzM0Mjk0Mg==	dcherian 2448579	2020-10-27T15:57:03Z	2020-10-27T15:57:03Z	MEMBER	The discussion goes back to here: #2922 (comment) (by @dcherian) Ah, sorry! I was thinking of weights as being numpy arrays, not so much dask arrays. Do you do something between w = data.weighted(weights) and w.mean()? Yeah I think this is the issue. `.weighted` should be lazy. Thinking a bit more about this I now favour the isnull().any() test and would add a check_weights kwargs. This would be OK. We could also drop the check and let users deal with it, and also add a warning to the docstring.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
717320425	https://github.com/pydata/xarray/issues/4541#issuecomment-717320425	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNzMyMDQyNQ==	mathause 10194086	2020-10-27T15:23:55Z	2020-10-27T15:23:55Z	MEMBER	The discussion goes back to here: https://github.com/pydata/xarray/pull/2922#issuecomment-545200082 (by @dcherian) I decided to replace all NaN in the weights with 0. Can we raise an error instead? It should be easy for the user to do `weights.fillna(0)` instead of relying on xarray's magical behaviour. Thinking a bit more about this I now favour the `isnull().any()` test and would add a `check_weights` kwargs. I would even be fine to set `check_weights=False` per default and say the user is responsible to supply valid weights (but I'd want others to weigh in here). In addition, `a.isnull().any()` is quite a bit faster than `a.fillna(0)` (even if there are no nans present). This is mostly true for numpy arrays, not so much for dask (by my limited tests). On the other hand the `isnull().any()` test is a small percentage of the total time (https://github.com/pydata/xarray/issues/3883#issuecomment-630387515). I am also not entirely sure I understand where your issue lies. You eventually have to compute, right? Do you do something between `w = data.weighted(weights)` and `w.mean()`? Ah maybe I understand, your data looks like: `data`: `<xarray.DataArray (time: 1000, models: 1)>` `weights`: `<xarray.DataArray (time: 1000, models: 100)>` And now `weights` gets checked for all 100 models where only one would be relevant. Is this correct? (So as another workaround would be using `xr.align` before sending `weights` to `weighted`.) My limited speed tests: ```python import numpy as np import xarray as xr a = xr.DataArray(np.random.randn(1000, 1000, 10, 10)) %timeit a.isnull().any() %timeit a.fillna(0) b = xr.DataArray(np.random.randn(1000, 1000, 10, 10)).chunk(100) %timeit b.isnull().any() %timeit b.fillna(0) ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
717266102	https://github.com/pydata/xarray/issues/4541#issuecomment-717266102	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNzI2NjEwMg==	jbusecke 14314623	2020-10-27T14:03:34Z	2020-10-27T14:03:34Z	CONTRIBUTOR	Thanks @mathause , I was wondering how much of a performance trade off `.fillna(0)` is on a dask array with no nans, compared to the check. I favor this, since it allows slicing before the calculation is triggered: I have a current situation where I do a bunch of operations on a large multi-model dataset. The weights are time and member dependent and I am trying to save each member separately. Having the calculation triggered for the full dataset is problematic and `fillna(0)` avoids that (working with a hacked branch where I simply removed the check for nans)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
717240738	https://github.com/pydata/xarray/issues/4541#issuecomment-717240738	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNzI0MDczOA==	mathause 10194086	2020-10-27T13:24:43Z	2020-10-27T13:24:43Z	MEMBER	The other possibility would be to do sth like: ```python def init(..., skipna=False): `if skipna: weights = weighs.fillna(0)` ``` we did decide to not do this somewhere in the discussion, not entirely sure anymore why.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
717107362	https://github.com/pydata/xarray/issues/4541#issuecomment-717107362	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNzEwNzM2Mg==	mathause 10194086	2020-10-27T09:27:25Z	2020-10-27T09:27:25Z	MEMBER	`weights` cannot contain `NaN`s else the result will just be `NaN`, even with `skipna=True`. But then the weights rarely contain `NaN`. So this test is a bit a trade-off between time and convenience. A kwarg can certainly make sense (was also requested before). I would probably not call the kwarg `skipna`. Maybe `check_weights`? or `check_nan`? (better names welcome) I think `da.isnull().any()` is lazy and it's the `if` that makes it eager. So an alternative would be to make the statement lazy but I don't know how this would be done. The relevant test is here: https://github.com/pydata/xarray/blob/adc55ac4d2883e0c6647f3983c3322ca2c690514/xarray/tests/test_weighted.py#L22	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
716937842	https://github.com/pydata/xarray/issues/4541#issuecomment-716937842	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNjkzNzg0Mg==	max-sixty 5635139	2020-10-27T02:29:59Z	2020-10-27T06:15:51Z	MEMBER	If it leads to incorrect results, I agree. If it leads to a lazy error (even if more confusing), or a result array full of NaNs, then I think it's fine. Not super confident on the latter case, tbc. If we want more control, I would advocate for using a standard kwarg that offers control over the computation — e.g. `skip_na` often gives more performance in exchange for (edit: the user) ensuring no `NaN`s — rather than an idiosyncratic kwarg that's derived by the internals of this implementation	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
716974071	https://github.com/pydata/xarray/issues/4541#issuecomment-716974071	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNjk3NDA3MQ==	jbusecke 14314623	2020-10-27T04:33:04Z	2020-10-27T04:33:04Z	CONTRIBUTOR	Sounds good. I'll see if I can make some time to test and put up a PR this week.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
716930400	https://github.com/pydata/xarray/issues/4541#issuecomment-716930400	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNjkzMDQwMA==	jbusecke 14314623	2020-10-27T02:06:35Z	2020-10-27T02:06:35Z	CONTRIBUTOR	What would happen in this case if a dask array with nans is passed? Would this somehow silently influence the results or would it not matter (in that case I wonder what the check was for). If this could lead to undetected errors I would still consider a kwargs a safer alternative, especially for new users?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
716928594	https://github.com/pydata/xarray/issues/4541#issuecomment-716928594	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNjkyODU5NA==	max-sixty 5635139	2020-10-27T02:00:40Z	2020-10-27T02:00:40Z	MEMBER	Sorry if my initial issue was unclear. Not at all, my mistake So you favor not having a 'skip' kwarg to just internally skipping the call to `.any()` if `weights` is a dask array? 👍	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
716927242	https://github.com/pydata/xarray/issues/4541#issuecomment-716927242	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNjkyNzI0Mg==	jbusecke 14314623	2020-10-27T01:56:28Z	2020-10-27T01:56:28Z	CONTRIBUTOR	Sorry if my initial issue was unclear. So you favor not having a 'skip' kwarg to just internally skipping the call to `.any()` if `weights` is a dask array?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
716913449	https://github.com/pydata/xarray/issues/4541#issuecomment-716913449	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNjkxMzQ0OQ==	max-sixty 5635139	2020-10-27T01:13:04Z	2020-10-27T01:13:04Z	MEMBER	Sorry, I completely misunderstood! I thought you were asking about skipping tests as in pytest, hence my confusion. For sure re skipping those checks with dask arrays.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
716908428	https://github.com/pydata/xarray/issues/4541#issuecomment-716908428	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNjkwODQyOA==	max-sixty 5635139	2020-10-27T00:57:04Z	2020-10-27T01:10:30Z	MEMBER	I don't have that much context on `xgcm` so others may know better on this. ~Could you help me understand in what context you're running the tests?~ ~IIRC we used to have `--skip-slow` here, but it wasn't used that much and so is no longer there. It's definitely possible to add that sort of flag.~	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097
716910761	https://github.com/pydata/xarray/issues/4541#issuecomment-716910761	https://api.github.com/repos/pydata/xarray/issues/4541	MDEyOklzc3VlQ29tbWVudDcxNjkxMDc2MQ==	dcherian 2448579	2020-10-27T01:04:14Z	2020-10-27T01:04:22Z	MEMBER	The relevant context is that `.any()` will trigger computation on a dask array. Maybe we skip the check using `is_duck_dask_array`?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Option to skip tests in `weighted()` 729980097

Advanced export

JSON shape: default, array, newline-delimited, object

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);