issues: 214088387


id: 214088387
node_id: MDU6SXNzdWUyMTQwODgzODc=
number: 1308
title: Using groupby with custom index
user: 7300413
state: closed (state_reason: completed)
locked: 0
comments: 8
created_at: 2017-03-14T14:24:11Z
updated_at: 2017-03-15T15:32:34Z
closed_at: 2017-03-15T15:32:34Z
author_association: NONE
reactions: 0
repo: 13221727 (pydata/xarray)
type: issue

Hello,

I have 6-hourly data (ERA Interim) for around 10 years. I want to calculate the annual 6-hourly climatology, i.e., 366 * 4 values, each corresponding to one 6-hourly interval of the year. I am chunking the data along longitude. I'm using xarray 0.9.1 with Python 3.6 (Anaconda).
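For context, the dataset is opened with dask chunks along longitude, along these lines (the file name, variable name, and chunk size below are illustrative placeholders, not my exact setup):

```python
import xarray as xr

# Open the 6-hourly dataset lazily, chunked along longitude with dask.
# 'era_interim.nc', 't2m', and the chunk size are placeholders.
ds = xr.open_dataset('era_interim.nc', chunks={'longitude': 60})
data = ds['t2m']
```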

For a daily climatology on this data, I do the usual:

```python
mean = data.groupby('time.dayofyear').mean(dim='time').compute()
```

For the 6-hourly version, I am trying the following:

```python
test = (data['time.hour'] / 24 + data['time.dayofyear'])
test.name = 'dayHourly'
new_test = data.groupby(test).mean(dim='time').compute()
```

The first one (the daily climatology) takes around 15 minutes for my data, whereas the second one ran for almost 30 minutes, after which I gave up and killed the process.

Is there some obvious reason why the first is so much faster than the second? `data` in both cases is the same 6-hourly dataset. And is there an alternative way of expressing this computation that would make it faster?
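One alternative I have been considering is an integer-valued group key, so that each 6-hourly slot of the year gets a distinct integer label instead of a float (a minimal sketch; the `slot` name is arbitrary, and I have not verified whether it is actually faster):

```python
# Label each 6-hourly slot of the year with an integer: 4 slots per day,
# so (dayofyear - 1) * 4 + hour // 6 runs from 0 to 1463 in a leap year.
slot = (data['time.dayofyear'] - 1) * 4 + data['time.hour'] // 6
slot.name = 'sixHourlySlot'
clim = data.groupby(slot).mean(dim='time').compute()
```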

TIA, Joy

