issues: 331981984


Issue #2230: Inconsistency between Sum of NA's and Mean of NA's: resampling gives 0 or 'NA'

  • id: 331981984
  • node_id: MDU6SXNzdWUzMzE5ODE5ODQ=
  • user: 30219501
  • state: closed
  • locked: 0
  • comments: 7
  • created_at: 2018-06-13T12:54:47Z
  • updated_at: 2018-08-16T06:59:33Z
  • closed_at: 2018-08-16T06:59:33Z
  • author_association: NONE

Problem description

When data mining with xarray, I keep running into the following issue with the resample method.
If I resample, e.g., a daily time series over one month and the data are 'NA' on every day, I get zero as the result. That is a problem for a time series of precipitation: it makes a real difference whether the monthly precipitation is zero (zero precipitation on each day) or was not measured because of problems with the device (NA on each day).

Data example

I have a dataset 'fcut' with hourly values for 5 months.

    <xarray.Dataset>
    Dimensions:       (bnds: 2, time: 3672)
    Coordinates:
        rlon          float32 22.06
        rlat          float32 5.06
      * time          (time) datetime64[ns] 2006-05-01 2006-05-01T01:00:00 ...
    Dimensions without coordinates: bnds
    Data variables:
        rotated_pole  int32 1
        time_bnds     (time, bnds) float64 1.304e+07 1.305e+07 1.305e+07 ...
        TOT_PREC      (time) float64 nan nan nan nan nan nan nan nan nan nan nan ...
    Attributes:

Resampling it gives only zero values for each month.

    In [10]: fcut.resample(dim='time',freq='M',how='sum')
    Out[10]:
    <xarray.Dataset>
    Dimensions:       (bnds: 2, time: 5)
    Coordinates:
      * time          (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ...
    Dimensions without coordinates: bnds
    Data variables:
        rotated_pole  (time) int64 1 1 1 1 1
        time_bnds     (time, bnds) float64 1.07e+10 1.07e+10 1.225e+10 1.225e+10 ...
        TOT_PREC      (time) float64 0.0 0.0 0.0 0.0 0.0

But I expect to have NA for each month, as it is the case for the operator 'mean'
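The difference between the two operators can be illustrated without xarray. The following is only a minimal sketch in plain Python, assuming that both reductions drop NaN values before aggregating; the helper names `nan_skipping_sum` and `nan_skipping_mean` are hypothetical and are not xarray functions.

```python
import math


def nan_skipping_sum(values):
    """Sum after dropping NaN entries (hypothetical helper, not xarray API)."""
    kept = [v for v in values if not math.isnan(v)]
    # The sum of an empty list is 0, so an all-NaN input yields 0.
    return sum(kept)


def nan_skipping_mean(values):
    """Mean after dropping NaN entries (hypothetical helper, not xarray API)."""
    kept = [v for v in values if not math.isnan(v)]
    # With nothing left there is no count to divide by, so report NaN.
    if not kept:
        return math.nan
    return sum(kept) / len(kept)


all_missing = [math.nan] * 24  # a day on which the device recorded nothing
dry_day = [0.0] * 24           # a day with measured zero precipitation

missing_sum = nan_skipping_sum(all_missing)
missing_mean = nan_skipping_mean(all_missing)
dry_sum = nan_skipping_sum(dry_day)
```

Under that assumption, `missing_sum` and `dry_sum` are both zero and cannot be told apart, while `missing_mean` is NaN. This is the inconsistency described above.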

I know that there is an ongoing discussion about that topic (see for example https://github.com/pandas-dev/pandas/issues/9422).

For earth science it would be nice to have an option telling xarray what to do in case of a sum over values being all NA. Do you see a chance to have a fast fix for that issue in the model code?
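For readers looking for such an option today: newer xarray releases accept a `min_count` argument on `sum`, which returns NA when fewer than `min_count` valid values are present. The sketch below is a hedged illustration and not the original dataset: the daily series is synthetic, `"MS"` (month start) is chosen as the resampling frequency for illustration, and the call uses the current `resample(time=...)` form instead of the older `resample(dim=..., freq=..., how=...)` signature shown above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily precipitation (assumption): May is entirely missing,
# June has 1.0 on every day.
time = pd.date_range("2006-05-01", "2006-06-30", freq="D")
precip = np.full(time.size, np.nan)
precip[time.month == 6] = 1.0
da = xr.DataArray(precip, coords={"time": time}, dims="time", name="TOT_PREC")

# Default behaviour: NaN values are skipped, so an all-NaN month sums to 0.
monthly_default = da.resample(time="MS").sum()

# Require at least one valid value per month; otherwise the result is NaN.
monthly_strict = da.resample(time="MS").sum(min_count=1)

# The mean of an all-NaN month is NaN, as described in the issue.
monthly_mean = da.resample(time="MS").mean()
```

With `min_count=1`, the all-NaN month stays NaN in the sum, matching the behaviour of the mean, while months with valid data are summed as usual.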

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2230/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  • state_reason: completed
  • repo: 13221727
  • type: issue
