issues: 243270042

  • id: 243270042
  • node_id: MDU6SXNzdWUyNDMyNzAwNDI=
  • number: 1480
  • title: Time Dimension, Big problem with methods 'groupby' and 'to_netcdf'
  • user: 30219501
  • state: closed
  • locked: 0
  • comments: 4
  • created_at: 2017-07-16T22:10:52Z
  • updated_at: 2017-07-20T19:46:15Z
  • closed_at: 2017-07-17T19:01:01Z
  • author_association: NONE

I would like to use the convenient functionality of the xarray library in Python, but I run into problems with the time dimension when aggregating data and when writing netCDF files. I am using pandas version 0.17.1 and xarray 0.9.6.

I have opened a dataset, which contains daily data over the year 2013: datset=xr.open_dataset(filein).

The contents of the file are:

    <xarray.Dataset>
    Dimensions:       (bnds: 2, rlat: 228, rlon: 234, time: 365)
    Coordinates:
      * rlon          (rlon) float64 -28.24 -28.02 -27.8 -27.58 -27.36 -27.14 ...
      * rlat          (rlat) float64 -23.52 -23.3 -23.08 -22.86 -22.64 -22.42 ...
      * time          (time) datetime64[ns] 2013-01-01T11:30:00 ...
    Dimensions without coordinates: bnds
    Data variables:
        rotated_pole  |S1 ''
        time_bnds     (time, bnds) float64 1.073e+09 1.073e+09 1.073e+09 ...
        ASWGLOB_S     (time, rlat, rlon) float64 nan nan nan nan nan nan nan nan ...
    Attributes:
        CDI:          Climate Data Interface version 1.7.0 (http://m...
        Conventions:  CF-1.4

When I now use the groupby method to compute the monthly means, the time dimension is destroyed:

```
datset.groupby('time.month').mean('time')
<xarray.Dataset>
Dimensions:    (bnds: 2, month: 12, rlat: 228, rlon: 234)
Coordinates:
  * rlon       (rlon) float64 -28.24 -28.02 -27.8 -27.58 -27.36 -27.14 ...
  * rlat       (rlat) float64 -23.52 -23.3 -23.08 -22.86 -22.64 -22.42 -22.2 ...
  * month      (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
Dimensions without coordinates: bnds
Data variables:
    time_bnds  (month, bnds) float64 1.074e+09 1.074e+09 1.077e+09 1.077e+09 ...
    ASWGLOB_S  (month, rlat, rlon) float64 nan nan nan nan nan nan nan nan ...
```

Instead of a time dimension, I now have a month dimension with values from 1 to 12. Is this a side effect of the 'mean' function? As long as I do not use the mean function, the time variable is retained.

The examples given in the documentation seem to behave differently: there, the timestamps are retained and the first date of each month is used.
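For comparison, here is a minimal sketch of the two behaviours on synthetic data. It assumes a recent xarray release (0.10.0 or later), where `resample` accepts the `time="MS"` keyword form; the variable values are made up, and only the variable name is borrowed from the file above. Grouping by `time.month` indexes the result by month number, while resampling to month-start frequency keeps a datetime `time` axis.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the daily 2013 data (values are invented).
time = pd.date_range("2013-01-01", "2013-12-31", freq="D")
ds = xr.Dataset(
    {"ASWGLOB_S": ("time", np.arange(time.size, dtype="float64"))},
    coords={"time": time},
)

# groupby on the month number: the result is indexed by 'month' (1..12),
# so the original datetime axis is gone.
by_month = ds.groupby("time.month").mean("time")

# resample to month-start frequency: the result keeps a datetime 'time'
# axis, labelled with the first day of each month.
monthly = ds.resample(time="MS").mean()

print(list(by_month.sizes))
print(monthly["time"].values[:2])
```

With xarray 0.9.6 the `resample` signature is different, so the call above would need to be adapted to that version.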

It seems to be impossible to restore my old time dimension.

  • Method A: I tried to create my own time variable with endresult.assign_coords(time=pd.date_range(start='2013-01', end='2014-01', freq='M')). That gives me a new coordinate with the correct dates. Afterwards, I have to swap the dimensions from month to time, which was only possible by changing the dimension of the coordinate 'time' to the dimension of the coordinate 'month'. However, the netCDF output contained wrong dates, i.e. values from 1 to 12: the first time step was at 31 January 2013, the next one a day later, the next one another day later, and so on. If I add the attributes 'calendar' and 'units' to the time coordinate, the output seems to be correct, but the type int64 is not readable by programs like ncview.
  • Method B: I create my own time variable with pandas and then convert the datetime64 dates to ordinary Python datetime objects. These are then converted to numbers with the netCDF4.date2num method. I assign these numbers to the time coordinate and add the encoding attributes for units and calendar. However, the encoded units are not written to the netCDF file, so I have to add them with an external program like ncatted.
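Both methods rely on the CF idea of storing time as numbers relative to a reference date given in the 'units' attribute. The sketch below reproduces that arithmetic with the standard library only; it is not the netCDF4 implementation, and the reference date, the units string, and the helper names (`month_ends`, `encode_days`, `decode_days`) are assumptions chosen for illustration. It also shows that bare values such as 1, 2, 3 decode to consecutive days, which resembles the symptom described in Method A, though the actual cause is not confirmed here.

```python
from datetime import datetime, timedelta

# Assumed reference date and matching CF-style units string.
REFERENCE = datetime(2013, 1, 1)
UNITS = "days since 2013-01-01 00:00:00"


def month_ends(year):
    """Return the last day of each month of `year` as datetime objects."""
    ends = []
    for month in range(1, 13):
        # First day of the following month, minus one day.
        if month == 12:
            first_of_next = datetime(year + 1, 1, 1)
        else:
            first_of_next = datetime(year, month + 1, 1)
        ends.append(first_of_next - timedelta(days=1))
    return ends


def encode_days(dates, reference=REFERENCE):
    """Convert datetimes to float days elapsed since `reference`."""
    return [(d - reference).total_seconds() / 86400.0 for d in dates]


def decode_days(values, reference=REFERENCE):
    """Convert day offsets back to datetimes."""
    return [reference + timedelta(days=v) for v in values]


dates = month_ends(2013)
encoded = encode_days(dates)
print(UNITS, encoded[:3])

# Bare month numbers read as day offsets land on consecutive days.
print(decode_days([1, 2, 3]))
```

Storing the offsets as floating-point values rather than int64 may also help with tools that cannot read 64-bit integers, but that is untested here.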

How can I improve methods A and B so that my nc file has correct time stamps?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1480/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  • state_reason: completed
  • repo: 13221727
  • type: issue
