home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

6 rows where author_association = "NONE" and issue = 333248242 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 1

  • rpnaut 6

issue 1

  • Refactor nanops · 6 ✖

author_association 1

  • NONE · 6 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
412054270 https://github.com/pydata/xarray/pull/2236#issuecomment-412054270 https://api.github.com/repos/pydata/xarray/issues/2236 MDEyOklzc3VlQ29tbWVudDQxMjA1NDI3MA== rpnaut 30219501 2018-08-10T11:20:40Z 2018-08-10T11:24:37Z NONE

Ok. The strange thing with the spatial dimensions is, that the new syntax forces the user to tell exactly, on which dimension the mathematical operator for resampling (like sum) should be applied. The syntax is now data.resample(time="M").sum(dim="time",min_count=1).

That weird - two times giving the dimension. However, doing so the xarray is not summing up for, e.g. the first month, all values he finds in a specific dataaray, but only those values along the dimension time.

AND NOW, the good new is that I got the following picture with your 'min_count=0' and 'min_count=1':

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Refactor nanops 333248242
411713980 https://github.com/pydata/xarray/pull/2236#issuecomment-411713980 https://api.github.com/repos/pydata/xarray/issues/2236 MDEyOklzc3VlQ29tbWVudDQxMTcxMzk4MA== rpnaut 30219501 2018-08-09T10:34:19Z 2018-08-09T16:15:55Z NONE

To wrap it up. Your implementation works for timeseries - data. There is something strange with time-space data, which should be fixed. If this is fixed, it is worth to test in my evaluation environment. Do you have a feeling, why the new syntax is giving such strange behaviour? Shall we put the bug onto the issue list?

And maybe, it would be interesting to have in the future the min_count argument also available for the old syntax and not only the new. The reason: The dimension name is not flexible anymore - it cannot be a variable like dim=${dim}

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Refactor nanops 333248242
411705656 https://github.com/pydata/xarray/pull/2236#issuecomment-411705656 https://api.github.com/repos/pydata/xarray/issues/2236 MDEyOklzc3VlQ29tbWVudDQxMTcwNTY1Ng== rpnaut 30219501 2018-08-09T10:01:32Z 2018-08-09T10:30:15Z NONE

Thanks, @fujiisoup .

I have good news and i have bad news.

A) Your min_count argument still seems to work only if using the new syntax for resample, i.e. data.resample($dim=$freq).sum(). I guess, this is due to the cancellation of the old syntax in the future. Using the old syntax data.resample(dim=$dim,freq=$freq,how=$oper) your code seems to ignore the min_count argument.

B) Your min_count argument is not allowed for type 'dataset' but only for type 'dataarray'. Starting with the dataset located here: https://swiftbrowser.dkrz.de/public/dkrz_c0725fe8741c474b97f291aac57f268f/GregorMoeller/ I've got the following message: data.resample(time="M").sum(min_count=1) TypeError: sum() got an unexpected keyword argument 'min_count'

Thus, I have tested your implementation only on dataarrays. I take the netcdf - array 'TOT_PREC' and try to compute the monthly sum: In [39]: data = array.open_dataset("eObs_gridded_0.22deg_rot_v14.0.TOT_PREC.1950-2016.nc_CutParamTimeUnitCor_FinalEvalGrid") In [40]: datamonth = data["TOT_PREC"].resample(time="M").sum() In [41]: datamonth Out[41]: <xarray.DataArray 'TOT_PREC' (time: 5)> array([ 551833.25 , 465640.09375, 328445.90625, 836892.1875 , 503601.5 ], dtype=float32) Coordinates: time (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ...

So, the xarray-package is still throwing away the dimensions in x- and y-direction. It has nothing to do with any min-count argument. THIS MUST BE A BUG OF XARRAY. The afore-mentioned dimensions only survive using the old-syntax: ``` In [41]: datamonth = data["TOT_PREC"].resample(dim="time",freq="M",how="sum") /usr/bin/ipython3:1: FutureWarning: .resample() has been modified to defer calculations. Instead of passing 'dim' and how="sum", instead consider using .resample(time="M").sum('time') #!/usr/bin/env python3

In [42]: datamonth Out[42]: <xarray.DataArray 'TOT_PREC' (time: 5, rlat: 136, rlon: 144)> array([[[ 0. , 0. , ..., 0. , 0. ], [ 0. , 0. , ..., 0. , 0. ], ..., [ 0. , 0. , ..., 44.900028, 41.400024], [ 0. , 0. , ..., 49.10001 , 46.5 ]]], dtype=float32) Coordinates: * time (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ... * rlon (rlon) float32 -22.6 -22.38 -22.16 -21.94 -21.72 -21.5 -21.28 ... * rlat (rlat) float32 -12.54 -12.32 -12.1 -11.88 -11.66 -11.44 -11.22 ...

```

Nevertheless, I have started to use your min_count argument only at one point (the x- and y-dimensions do not matter). In that case, your implementation works fine: ``` In [46]: pointdata = data.isel(rlon=10,rlat=10) In [47]: pointdata["TOT_PREC"] Out[47]: <xarray.DataArray 'TOT_PREC' (time: 153)> array([ nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, ... nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan], dtype=float32) Coordinates: rlon float32 -20.4 rlat float32 -10.34 * time (time) datetime64[ns] 2006-05-01T12:00:00 2006-05-02T12:00:00 ... Attributes: standard_name: precipitation_amount long_name: Precipitation units: kg m-2 grid_mapping: rotated_pole cell_methods: time: sum

In [48]: pointdata["TOT_PREC"].resample(time="M").sum(min_count=1) Out[48]: <xarray.DataArray 'TOT_PREC' (time: 5)> array([ nan, nan, nan, nan, nan]) Coordinates: * time (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ... rlon float32 -20.4 rlat float32 -10.34

In [49]: pointdata["TOT_PREC"].resample(time="M").sum(min_count=0) Out[49]: <xarray.DataArray 'TOT_PREC' (time: 5)> array([ 0., 0., 0., 0., 0.], dtype=float32) Coordinates: * time (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ... rlon float32 -20.4 rlat float32 -10.34

```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Refactor nanops 333248242
399076615 https://github.com/pydata/xarray/pull/2236#issuecomment-399076615 https://api.github.com/repos/pydata/xarray/issues/2236 MDEyOklzc3VlQ29tbWVudDM5OTA3NjYxNQ== rpnaut 30219501 2018-06-21T11:50:19Z 2018-06-21T12:04:58Z NONE

Okay. Using the old resample-nomenclature I tried to compare the results with your modified code (min_count = 2) and with the tagged version 0.10.7 (no min_count argument). But I am not sure, if this also works. Do you examine the keywords given from the old-resample method?.

However, in the comparison I did not saw the nan's I expected over water.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Refactor nanops 333248242
399071095 https://github.com/pydata/xarray/pull/2236#issuecomment-399071095 https://api.github.com/repos/pydata/xarray/issues/2236 MDEyOklzc3VlQ29tbWVudDM5OTA3MTA5NQ== rpnaut 30219501 2018-06-21T11:26:45Z 2018-06-21T12:03:23Z NONE

!!!Correction!!!. The resample-example above gives also a missing lon-lat dimensions in case of unmodified model code. The resulting numbers are the same.

data_aggreg = data["TOT_PREC"].resample(time="M").sum() data_aggreg <xarray.DataArray 'TOT_PREC' (time: 5)> array([ 551833.25 , 465640.09375, 328445.90625, 836892.1875 , 503601.5 ], dtype=float32) Coordinates: * time (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ...

Now I am a little bit puzzled. But ...

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!IT SEEMS TO BE A BUG!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

If I do the resample process using the old nomenclature: data_aggreg = data["TOT_PREC"].resample(dim="time",how="sum",freq="M") it works. DO we have a bug in xarray?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Refactor nanops 333248242
399061170 https://github.com/pydata/xarray/pull/2236#issuecomment-399061170 https://api.github.com/repos/pydata/xarray/issues/2236 MDEyOklzc3VlQ29tbWVudDM5OTA2MTE3MA== rpnaut 30219501 2018-06-21T10:55:56Z 2018-06-21T11:03:16Z NONE

Hello from me hopefully contributing some needfull things.

At first, I would like to comment that I checked out your code.

I ran the following code example using a datafile uploaded under the following link: https://swiftbrowser.dkrz.de/public/dkrz_c0725fe8741c474b97f291aac57f268f/GregorMoeller/

``` import xarray import matplotlib.pyplot as plt

data = xarray.open_dataset("eObs_gridded_0.22deg_rot_v14.0.TOT_PREC.1950-2016.nc_CutParamTimeUnitCor_FinalEvalGrid") data <xarray.Dataset> Dimensions: (rlat: 136, rlon: 144, time: 153) Coordinates: * rlon (rlon) float32 -22.6 -22.38 -22.16 -21.94 -21.72 -21.5 ... * rlat (rlat) float32 -12.54 -12.32 -12.1 -11.88 -11.66 -11.44 ... * time (time) datetime64[ns] 2006-05-01T12:00:00 ... Data variables: rotated_pole int32 ... TOT_PREC (time, rlat, rlon) float32 ... Attributes: CDI: Climate Data Interface version 1.8.0 (http://m... Conventions: CF-1.6 history: Thu Jun 14 12:34:59 2018: cdo -O -s -P 4 remap... CDO: Climate Data Operators version 1.8.0 (http://m... cdo_openmp_thread_number: 4

data_aggreg = data["TOT_PREC"].resample(time="M").sum(min_count=0) data_aggreg2 = data["TOT_PREC"].resample(time="M").sum(min_count=1) I have recognized that the min_count option at recent state technically only works for DataArrays and not for DataSets. However, more interesting is the fact that the dimensions are destroyed: data_aggreg <xarray.DataArray 'TOT_PREC' (time: 5)> array([ 551833.25 , 465640.09375, 328445.90625, 836892.1875 , 503601.5 ], dtype=float32) Coordinates: * time (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ... no longitude and latitude survives your operation. If I would use the the sum-operator on the full dataset (where maybe the code was not modified?) I got

data_aggreg = data.resample(time="M").sum() data_aggreg <xarray.Dataset> Dimensions: (rlat: 136, rlon: 144, time: 5) Coordinates: * time (time) datetime64[ns] 2006-05-31 2006-06-30 2006-07-31 ... * rlon (rlon) float32 -22.6 -22.38 -22.16 -21.94 -21.72 -21.5 ... * rlat (rlat) float32 -12.54 -12.32 -12.1 -11.88 -11.66 -11.44 ... Data variables: rotated_pole (time) int64 1 1 1 1 1 TOT_PREC (time, rlat, rlon) float32 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... ```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Refactor nanops 333248242

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 239.944ms · About: xarray-datasette