github: issue_comments: 7 rows where issue = 331981984 sorted by updated

7 rows where issue = 331981984 sorted by updated_at descending

id	html_url	issue_url	node_id	user	created_at	updated_at ▲	author_association	body	reactions	issue
399059204	https://github.com/pydata/xarray/issues/2230#issuecomment-399059204	https://api.github.com/repos/pydata/xarray/issues/2230	MDEyOklzc3VlQ29tbWVudDM5OTA1OTIwNA==	rpnaut 30219501	2018-06-21T10:48:21Z	2018-06-21T10:48:21Z	NONE	Thank you for considering that issue in your pull request #2236. I will switch to commenting on your work in the related thread, but I would leave this issue open until a solution is found for the min_count option.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Inconsistency between Sum of NA's and Mean of NA's: resampling gives 0 or 'NA' 331981984
398045641	https://github.com/pydata/xarray/issues/2230#issuecomment-398045641	https://api.github.com/repos/pydata/xarray/issues/2230	MDEyOklzc3VlQ29tbWVudDM5ODA0NTY0MQ==	fujiisoup 6815844	2018-06-18T12:59:48Z	2018-06-18T12:59:48Z	MEMBER	@rpnaut, thanks for looking inside the code. See #2236.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Inconsistency between Sum of NA's and Mean of NA's: resampling gives 0 or 'NA' 331981984
397313140	https://github.com/pydata/xarray/issues/2230#issuecomment-397313140	https://api.github.com/repos/pydata/xarray/issues/2230	MDEyOklzc3VlQ29tbWVudDM5NzMxMzE0MA==	rpnaut 30219501	2018-06-14T14:20:10Z	2018-06-14T14:34:18Z	NONE	I really have problems reading the code in duck_array_ops.py. The program starts by defining 12 operators. One of them is: `sum = _create_nan_agg_method('sum', numeric_only=True)` I really do not understand where the train is going. That's due to my limited programming skills for object-oriented code. I have no idea what '_create_nan_agg_method' is doing. I tried to change the code in the method `def _nansum_object(value, axis=None, **kwargs): """ In house nansum for object array """ return _dask_or_eager_func('sum')(value, axis=axis, **kwargs) #return np.array(np.nan)` but it seems that this method is never called during the 'resample().sum()' process. I need some help to really modify the operators. Is there any hint for me? For the pandas code it seems to be much easier.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Inconsistency between Sum of NA's and Mean of NA's: resampling gives 0 or 'NA' 331981984
397092870	https://github.com/pydata/xarray/issues/2230#issuecomment-397092870	https://api.github.com/repos/pydata/xarray/issues/2230	MDEyOklzc3VlQ29tbWVudDM5NzA5Mjg3MA==	shoyer 1217238	2018-06-13T21:27:33Z	2018-06-13T21:27:33Z	MEMBER	OK, I see you already saw the pandas issues :). "For earth science it would be nice to have an option telling xarray what to do in case of a sum over values being all NA. Do you see a chance to have a fast fix for that issue in the model code?" Yes, I would be very open to adding a `min_count` argument. We could probably copy the implementation of `sum` with `min_count` largely from pandas: https://github.com/pandas-dev/pandas/blob/0c4e611927772af44b02204192b29282341a5716/pandas/core/nanops.py#L329 In xarray this would go into `_create_nan_agg_method` in https://github.com/pydata/xarray/blob/master/xarray/core/duck_array_ops.py (sorry, this has gotten a little messy!)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Inconsistency between Sum of NA's and Mean of NA's: resampling gives 0 or 'NA' 331981984
397090519	https://github.com/pydata/xarray/issues/2230#issuecomment-397090519	https://api.github.com/repos/pydata/xarray/issues/2230	MDEyOklzc3VlQ29tbWVudDM5NzA5MDUxOQ==	shoyer 1217238	2018-06-13T21:19:55Z	2018-06-13T21:19:55Z	MEMBER	The difference between `mean` and `sum` here isn't resample specific. Xarray consistently interprets a "NA skipping sum" as returning 0 in the case of all NaN inputs: `float(xarray.DataArray([np.nan]).sum())` gives `0.0`. This is consistent with the sum of an empty set being 0, e.g., `float(xarray.DataArray([]).sum())` gives `0.0`. The reason why a "NA skipping mean" is different in the case of all NaN inputs is that the mean simply isn't well defined on an empty set. The mean would literally be a sum of zero divided by a count of zero, which is not a valid number: the literal meaning of NaN as "not a number". There was a long discussion/debate about this recently in pandas. See https://github.com/pandas-dev/pandas/issues/18678 and links therein. There are certainly use-cases where it is nicer for the sum of all NaN inputs to be NaN (exactly as you mention here), but ultimately pandas decided that the answer for this operation should be zero. The decisive considerations were simplicity and consistency with other tools (including NumPy and R). What pandas added to solve this use-case is an optional `min_count` argument (see pandas.DataFrame.sum for an example). We could definitely copy this behavior in xarray if someone is interested in implementing it.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Inconsistency between Sum of NA's and Mean of NA's: resampling gives 0 or 'NA' 331981984
396934730	https://github.com/pydata/xarray/issues/2230#issuecomment-396934730	https://api.github.com/repos/pydata/xarray/issues/2230	MDEyOklzc3VlQ29tbWVudDM5NjkzNDczMA==	rpnaut 30219501	2018-06-13T13:21:40Z	2018-06-13T13:47:56Z	NONE	I can overcome this by using `In [14]: fcut.resample(dim='time',freq='M',how='mean',skipna=False) Out[14]: <xarray.Dataset> Dimensions: (bnds: 2, time: 5) Coordinates: * time (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ... Dimensions without coordinates: bnds Data variables: rotated_pole (time) float64 1.0 1.0 1.0 1.0 1.0 time_bnds (time, bnds) float64 1.438e+07 1.438e+07 1.702e+07 ... TOT_PREC (time) float64 nan nan nan nan nan` BUT THE PROBLEM IS: A) that this behaviour contradicts the computation of a mean. I can always compute a mean with the default option `skipna=True`, regardless of whether I have a few NA's in the timeseries (the output is a number not considering the NA's) or only NA's in the timeseries (the output is NA). This is what I would expect. B) that setting `skipna=False` does not allow for computations if only one value of the timeseries is NA. I would like to have the behaviour of the mean operator also for the sum operator. Also for the climate data operators (CDO) the developers decided to give the users two options, `skipna=True` and `skipna=False`. But `skipna=True` should result in the same behaviour for both operators (mean and sum).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Inconsistency between Sum of NA's and Mean of NA's: resampling gives 0 or 'NA' 331981984
396928537	https://github.com/pydata/xarray/issues/2230#issuecomment-396928537	https://api.github.com/repos/pydata/xarray/issues/2230	MDEyOklzc3VlQ29tbWVudDM5NjkyODUzNw==	fujiisoup 6815844	2018-06-13T13:00:45Z	2018-06-13T13:01:13Z	MEMBER	Thank you for raising an issue. Could you try using `.sum(skipna=False)` for resampled data? Similar to `pandas.DataFrame.sum`, our `.sum` (and other reduction methods) assumes `skipna=True` unless explicitly specified.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Inconsistency between Sum of NA's and Mean of NA's: resampling gives 0 or 'NA' 331981984
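
The thread contrasts a NaN-skipping sum (0 for all-NaN input), a NaN-skipping mean (NaN for all-NaN input), and a sum with `min_count`. A minimal pure-Python sketch of those semantics follows. The helper names `nansum_min_count` and `nanmean` are hypothetical, chosen only for illustration; this is not the xarray or pandas implementation.

```python
import math


def nansum_min_count(values, min_count=0):
    """Sum the non-NaN values; return NaN if fewer than min_count are valid.

    Hypothetical helper illustrating the min_count idea from the thread.
    """
    valid = [v for v in values if not math.isnan(v)]
    if len(valid) < min_count:
        return math.nan
    # math.fsum of an empty list is 0.0, matching "sum of an empty set is 0".
    return math.fsum(valid)


def nanmean(values):
    """Mean of the non-NaN values; NaN when nothing valid remains (0 / 0)."""
    valid = [v for v in values if not math.isnan(v)]
    if not valid:
        return math.nan
    return math.fsum(valid) / len(valid)


all_nan = [math.nan, math.nan]
default_sum = nansum_min_count(all_nan)               # skips NaN, sums nothing
guarded_sum = nansum_min_count(all_nan, min_count=1)  # requires one valid value
mean_value = nanmean(all_nan)                         # undefined on an empty set
```

With `min_count=1`, the all-NaN sum becomes NaN and lines up with the mean, which is the behaviour requested in the issue. The default of `min_count=0` keeps the existing result of 0.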

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
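
The page heading corresponds to a filter-and-sort query over this table. The sketch below reproduces it with Python's built-in `sqlite3` module against an in-memory database. It makes several assumptions: the table is reduced to a few columns, the foreign-key references are dropped so it can stand alone, the three rows are invented placeholders, and `comments_for_issue` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE issue_comments (
        id INTEGER PRIMARY KEY,
        user INTEGER,
        created_at TEXT,
        updated_at TEXT,
        body TEXT,
        issue INTEGER
    )
    """
)
conn.execute("CREATE INDEX idx_issue_comments_issue ON issue_comments (issue)")

# Placeholder rows, not real data from the export.
rows = [
    (1, 100, "2018-06-13T13:00:45Z", "2018-06-13T13:01:13Z", "first", 331981984),
    (2, 101, "2018-06-21T10:48:21Z", "2018-06-21T10:48:21Z", "latest", 331981984),
    (3, 100, "2018-06-15T00:00:00Z", "2018-06-15T00:00:00Z", "other issue", 42),
]
conn.executemany("INSERT INTO issue_comments VALUES (?, ?, ?, ?, ?, ?)", rows)


def comments_for_issue(connection, issue_id):
    """Return (id, updated_at) for one issue, most recently updated first."""
    cursor = connection.execute(
        "SELECT id, updated_at FROM issue_comments "
        "WHERE issue = ? ORDER BY updated_at DESC",
        (issue_id,),
    )
    return cursor.fetchall()


result = comments_for_issue(conn, 331981984)
```

Because the timestamps are ISO 8601 strings in the same time zone, sorting them as text gives chronological order, so the `TEXT` column type is sufficient for this query.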