issue_comments

4 rows where issue = 243270042 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
316810057 https://github.com/pydata/xarray/issues/1480#issuecomment-316810057 https://api.github.com/repos/pydata/xarray/issues/1480 MDEyOklzc3VlQ29tbWVudDMxNjgxMDA1Nw== rabernat 1197350 2017-07-20T19:46:15Z 2017-07-20T19:46:15Z MEMBER

> As I understand he is getting monthly data out of groupby-method and in his example the "time" survives. It seems to be that the functionality of groupby-month changed during the years, because the groupby-method in Nicolas's example did not aggregate same calendar month to one time stamp.

There has been no change in xarray's groupby behavior. Nicolas' example would work with today's code. When you call datset.groupby('time.month').mean('time'), you remove the time dimension by aggregating over it. If you had applied a different (non-reducing) function to each group (e.g. datset.groupby('time.month').apply(lambda x: x**2)), you would preserve the time dimension.
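The difference between a reducing and a non-reducing group operation can be sketched as follows. This is a minimal illustration, not the code from the issue: the dataset contents and the variable name `tas` are invented, and it assumes a recent xarray release where `.map` is the current name for the `.apply` call mentioned above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical one-year daily dataset standing in for `datset` from the thread.
time = pd.date_range("2000-01-01", "2000-12-31", freq="D")
datset = xr.Dataset(
    {"tas": ("time", np.arange(time.size, dtype=float))},
    coords={"time": time},
)

# Reducing over "time" inside each group: the "time" dimension is
# aggregated away and replaced by a "month" dimension.
monthly_mean = datset.groupby("time.month").mean("time")

# Non-reducing function applied per group: every input value yields one
# output value, so the original "time" dimension is preserved.
squared = datset.groupby("time.month").map(lambda x: x ** 2)

print(monthly_mean.sizes)
print(squared.sizes)
```

Inspecting the two `sizes` mappings shows which dimension each result carries.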

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Time Dimension, Big problem with methods 'groupby' and 'to_netcdf' 243270042
316729712 https://github.com/pydata/xarray/issues/1480#issuecomment-316729712 https://api.github.com/repos/pydata/xarray/issues/1480 MDEyOklzc3VlQ29tbWVudDMxNjcyOTcxMg== rpnaut 30219501 2017-07-20T14:57:11Z 2017-07-20T14:57:11Z NONE

You are so right. I did not realize that there is a resample method, which hopefully can also be combined with the 'apply' functionality. The documentation I mentioned was from "nicolasfauchereau.github.io/climatecode/posts/xray" (look at In[24] and In[25]). As I understand it, he gets monthly data out of the groupby method, and in his example the "time" dimension survives. It seems that the behavior of groupby by month has changed over the years, because the groupby call in Nicolas's example did not aggregate the same calendar months into one time stamp.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Time Dimension, Big problem with methods 'groupby' and 'to_netcdf' 243270042
315849728 https://github.com/pydata/xarray/issues/1480#issuecomment-315849728 https://api.github.com/repos/pydata/xarray/issues/1480 MDEyOklzc3VlQ29tbWVudDMxNTg0OTcyOA== jhamman 2443309 2017-07-17T19:01:01Z 2017-07-17T19:01:01Z MEMBER

@rpnaut - @byersiiasa is correct. It sounds like you want

    datset.resample('MS', dim='time', how='mean')

The month labels (1, 2, 3, ...) come from pandas TimeGrouper and this is functionality we will want to keep around.
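The call above uses the resample signature that was current in 2017. As a hedged sketch, assuming a recent xarray release, the same monthly mean is usually written with the dimension passed as a keyword and the reduction chained as a method. The dataset below is invented for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical one-year daily dataset standing in for `datset`.
time = pd.date_range("2000-01-01", "2000-12-31", freq="D")
datset = xr.Dataset({"tas": ("time", np.ones(time.size))}, coords={"time": time})

# Keyword form: resample along "time" at month-start ("MS") frequency,
# then reduce each monthly bin with its mean.
monthly = datset.resample(time="MS").mean()

# The result keeps a "time" coordinate labelled with the first day of each month.
print(monthly["time"].values[:3])
```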

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Time Dimension, Big problem with methods 'groupby' and 'to_netcdf' 243270042
315782686 https://github.com/pydata/xarray/issues/1480#issuecomment-315782686 https://api.github.com/repos/pydata/xarray/issues/1480 MDEyOklzc3VlQ29tbWVudDMxNTc4MjY4Ng== byersiiasa 17701232 2017-07-17T15:04:56Z 2017-07-17T15:04:56Z NONE

As far as I know, this is the intended functionality.

> The examples given in the documentation seems to have a different behaviour. That is, the timestamps are retained and the first date of each month is used.

I cannot find where this is the case, apart from when using .resample. Could you put a link to the doc page?

The issue is perhaps more with the example that you present (only 1 year of data) and the behaviour you expect from it. Normally groupby('time.month') would be applied to multiple years of data, i.e. you group the data by month and find the monthly averages for Jan-Dec over 30 years of data, e.g. a climatology.

In that case it absolutely makes sense to keep the months as 1 to 12, or something similar (perhaps 'Jan', 'Feb', etc.). Applying a datestring of the first day of the month wouldn't make sense, because which year would you choose when you have 30 years of data?

If you do want a time series of monthly means, then .resample is the function you want and it will give you the datestamps in the format that you desire.
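The distinction drawn above between a climatology and a time series of monthly means shows up clearly with more than one year of data. The sketch below uses three years of invented random daily values (the name `tas` is an assumption) and assumes a recent xarray release.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Three years of hypothetical daily data.
time = pd.date_range("2000-01-01", "2002-12-31", freq="D")
rng = np.random.default_rng(0)
da = xr.DataArray(
    rng.normal(size=time.size), coords={"time": time}, dims="time", name="tas"
)

# Climatology: all Januaries are pooled, all Februaries are pooled, and so on.
# The result is indexed by calendar month (1 to 12), with no year attached.
climatology = da.groupby("time.month").mean("time")

# Time series of monthly means: one value per month of each year,
# labelled with a month-start time stamp.
monthly_series = da.resample(time="MS").mean()

print(climatology.sizes, monthly_series.sizes)
```

With three years of input, the climatology has 12 entries while the resampled series has 36, which is why a single date label cannot be assigned to a groupby month.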

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Time Dimension, Big problem with methods 'groupby' and 'to_netcdf' 243270042

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);