
issue_comments


9 rows where issue = 104484316 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
751265292 https://github.com/pydata/xarray/issues/557#issuecomment-751265292 https://api.github.com/repos/pydata/xarray/issues/557 MDEyOklzc3VlQ29tbWVudDc1MTI2NTI5Mg== stale[bot] 26384082 2020-12-25T15:49:52Z 2020-12-25T15:49:52Z NONE

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  CDO-like convenience methods to select times 104484316
427335183 https://github.com/pydata/xarray/issues/557#issuecomment-427335183 https://api.github.com/repos/pydata/xarray/issues/557 MDEyOklzc3VlQ29tbWVudDQyNzMzNTE4Mw== shoyer 1217238 2018-10-05T11:33:45Z 2018-10-05T11:33:45Z MEMBER

dataset.time.dt.year.isin(elnino_years) should already work.
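
A minimal sketch of that expression in use, assuming a current xarray release. The synthetic monthly dataset and the variable name `pr` are invented for illustration and do not come from the thread:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for real data: 50 years of monthly values.
time = pd.date_range("1950-01-01", "1999-12-01", freq="MS")
dataset = xr.Dataset(
    {"pr": ("time", np.arange(time.size, dtype=float))},
    coords={"time": time},
)

elnino_years = [1958, 1973, 1983, 1987, 1998]

# Boolean DataArray, True where the year of a time step is in the list.
mask = dataset.time.dt.year.isin(elnino_years)

# Positional boolean indexing keeps only the selected time steps.
elnino = dataset.isel(time=mask.values)
```

Here the mask is handed to `isel` as a plain NumPy boolean array, which sidesteps any question of how label-based selection treats boolean keys.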

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  CDO-like convenience methods to select times 104484316
427291294 https://github.com/pydata/xarray/issues/557#issuecomment-427291294 https://api.github.com/repos/pydata/xarray/issues/557 MDEyOklzc3VlQ29tbWVudDQyNzI5MTI5NA== markelg 6883049 2018-10-05T08:46:12Z 2018-10-05T08:46:12Z CONTRIBUTOR

I thought this issue was long forgotten ; ) The fact is that today I still call this seltime function very often in my code, so I am glad to see it back. Thank you.

I like the .isin(elnino_years) syntax, and I see that it is consistent with pandas. Something like dataset.time.year.isin(elnino_years) would be very nice too; right now it is just a "to_index()" away, as this works: dataset.time.to_index().year.isin(elnino_years). However, I am not sure how this is related to multidimensional indexing, as time is only one dimension.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  CDO-like convenience methods to select times 104484316
427279439 https://github.com/pydata/xarray/issues/557#issuecomment-427279439 https://api.github.com/repos/pydata/xarray/issues/557 MDEyOklzc3VlQ29tbWVudDQyNzI3OTQzOQ== shoyer 1217238 2018-10-05T07:59:47Z 2018-10-05T08:00:28Z MEMBER

I don’t think there’s an easy way to do this with vectorized indexing, but if we supported multidimensional indexing with boolean keys as proposed in https://github.com/pydata/xarray/issues/1887 (equivalent to where with drop=True) we could write pr_dataset[pr_dataset['time.year'].isin(elnino_years)]
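
The "where with drop=True" equivalent mentioned above can be sketched as follows. This is only an illustration under assumptions: the dataset is synthetic, the variable name `pr` is invented, and the mask is built with the `.dt` accessor instead of the `'time.year'` key used in the comment:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical monthly dataset standing in for pr_dataset.
time = pd.date_range("1950-01-01", "1999-12-01", freq="MS")
pr_dataset = xr.Dataset(
    {"pr": ("time", np.ones(time.size))},
    coords={"time": time},
)

elnino_years = [1958, 1973, 1983, 1987, 1998]
mask = pr_dataset.time.dt.year.isin(elnino_years)

# where() masks values outside the condition; drop=True then removes
# the time steps for which the condition is False everywhere.
subset = pr_dataset.where(mask, drop=True)
```

Because `where` fills masked positions with NaN before dropping them, integer variables may be promoted to float, which is one reason to prefer direct boolean indexing when only a subset is needed.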

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  CDO-like convenience methods to select times 104484316
427078606 https://github.com/pydata/xarray/issues/557#issuecomment-427078606 https://api.github.com/repos/pydata/xarray/issues/557 MDEyOklzc3VlQ29tbWVudDQyNzA3ODYwNg== rabernat 1197350 2018-10-04T16:12:43Z 2018-10-04T16:12:43Z MEMBER

Is there a way to do this easily now with vectorized indexing?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  CDO-like convenience methods to select times 104484316
137478562 https://github.com/pydata/xarray/issues/557#issuecomment-137478562 https://api.github.com/repos/pydata/xarray/issues/557 MDEyOklzc3VlQ29tbWVudDEzNzQ3ODU2Mg== markelg 6883049 2015-09-03T15:05:35Z 2015-09-04T09:49:54Z CONTRIBUTOR

I agree with your arguments against creating too many datetime-specific methods. However, I am not convinced about using .sel with 'time.year'; it looks a bit hackish to me. Maybe all "CDO methods" can be integrated into one single "seltime" method. Please look at the function below. The time_coord argument lets you choose the time coordinate that you want to filter, and by using numpy.logical_and it is possible to do the whole operation in one single function call. It could be turned into a method. What do you think?

``` python
%matplotlib inline
import xray
import numpy as np
from matplotlib import pyplot as plt

ifile = "HadCRUT.4.3.0.0.median.nc"
ixd = xray.open_dataset(ifile)
```

You can download the example file from here: http://www.metoffice.gov.uk/hadobs/hadcrut4/data/4.3.0.0/gridded_fields/HadCRUT.4.3.0.0.median_netcdf.zip

Now comes the function.

``` python
def seltime(ixd, time_coord, **kwargs):
    """
    Select time steps by groups of years, months, etc.

    Parameters
    ----------
    ixd : xray.Dataset or xray.DataArray
        Input dataset. Will be "self" if the function is turned into a method.
    time_coord : str
        Name of the time coordinate to use. Needed to avoid ambiguities.
    **kwargs : string or iterable
        Currently year, month, day and hour are supported.

    Returns
    -------
    xray.Dataset or xray.DataArray filtered by the constraints defined
    by the kwargs.
    """
    ntimes = len(ixd.coords[time_coord])
    # Start with every time step selected, then narrow the mask down.
    time_mask = np.repeat(True, ntimes)

    # .items() works on Python 2 and 3 (the original used .iteritems()).
    for time_unit, time_values in kwargs.items():
        if time_unit not in ("year", "month", "day", "hour"):
            raise KeyError(time_unit)
        # Name of the virtual variable, e.g. "time.year".
        time_str = "{}.{}".format(time_coord, time_unit)
        time_mask_temp = np.in1d(ixd[time_str], time_values)
        time_mask = np.logical_and(time_mask, time_mask_temp)
    # Use the combined mask as a boolean indexer along the time coordinate.
    oxd = ixd.sel(**{time_coord: time_mask})
    return oxd
```

It works, and it looks fast enough.

    elnino_years = (1958, 1973, 1983, 1987, 1998)
    months = (1, 2, 3, 4)
    %timeit elnino_xd = seltime(ixd, "time", year=elnino_years, month=months)
    elnino_xd.time

```
1000 loops, best of 3: 1.05 ms per loop

<xray.DataArray 'time' (time: 20)>
array(['1958-01-16T12:00:00.000000000Z', '1958-02-15T00:00:00.000000000Z',
       '1958-03-16T12:00:00.000000000Z', '1958-04-16T00:00:00.000000000Z',
       '1973-01-16T13:00:00.000000000+0100',
       '1973-02-15T01:00:00.000000000+0100',
       '1973-03-16T13:00:00.000000000+0100',
       '1973-04-16T01:00:00.000000000+0100',
       '1983-01-16T13:00:00.000000000+0100',
       '1983-02-15T01:00:00.000000000+0100',
       '1983-03-16T13:00:00.000000000+0100',
       '1983-04-16T02:00:00.000000000+0200',
       '1987-01-16T13:00:00.000000000+0100',
       '1987-02-15T01:00:00.000000000+0100',
       '1987-03-16T13:00:00.000000000+0100',
       '1987-04-16T02:00:00.000000000+0200',
       '1998-01-16T13:00:00.000000000+0100',
       '1998-02-15T01:00:00.000000000+0100',
       '1998-03-16T13:00:00.000000000+0100',
       '1998-04-16T02:00:00.000000000+0200'], dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 1958-01-16T12:00:00 1958-02-15 ...
Attributes:
    start_month: 1
    start_year: 1850
    end_month: 4
    end_year: 2015
    long_name: time
    standard_name: time
    axis: T

```

If we average in time we can see the warm tongue in the Pacific.

    plt.figure(figsize=(12, 5))
    elnino_xd.temperature_anomaly.mean("time").plot()
    plt.gca().invert_yaxis()
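
For readers on current xarray, the seltime function above might be ported roughly as follows. This is a sketch, not the author's code: the name `seltime_dt` is hypothetical, the `.dt` accessor replaces the `"time.year"` style lookup, and the small test dataset is synthetic:

```python
import numpy as np
import pandas as pd
import xarray as xr


def seltime_dt(obj, time_coord, **kwargs):
    """Hypothetical port of seltime: keep time steps matching every constraint."""
    mask = np.ones(obj.sizes[time_coord], dtype=bool)
    for unit, values in kwargs.items():
        if unit not in ("year", "month", "day", "hour"):
            raise KeyError(unit)
        # e.g. obj["time"].dt.year, one integer per time step
        component = getattr(obj[time_coord].dt, unit)
        # atleast_1d lets callers pass a single value or an iterable
        mask &= component.isin(np.atleast_1d(values)).values
    # Positional boolean indexing along the chosen time coordinate
    return obj.isel({time_coord: mask})


# Synthetic monthly data, 1950 through 1999 (assumption for illustration).
time = pd.date_range("1950-01-01", "1999-12-01", freq="MS")
ds = xr.Dataset(
    {"temperature_anomaly": ("time", np.zeros(time.size))},
    coords={"time": time},
)

elnino_years = (1958, 1973, 1983, 1987, 1998)
months = (1, 2, 3, 4)
elnino_ds = seltime_dt(ds, "time", year=elnino_years, month=months)
```

As in the original, the constraints are combined with a logical AND, so the call above keeps January through April of each listed year.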

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  CDO-like convenience methods to select times 104484316
137257727 https://github.com/pydata/xarray/issues/557#issuecomment-137257727 https://api.github.com/repos/pydata/xarray/issues/557 MDEyOklzc3VlQ29tbWVudDEzNzI1NzcyNw== jhamman 2443309 2015-09-02T21:57:54Z 2015-09-02T21:57:54Z MEMBER

I'd also like to see this come into being.

My two cents on the syntax is that your second idea is best:

    pr_dataset.sel('time.year', elnino_years)

I don't care much for option 3 since the variable name is being used as an attribute (what if my time variable is called "Times"?).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  CDO-like convenience methods to select times 104484316
137182504 https://github.com/pydata/xarray/issues/557#issuecomment-137182504 https://api.github.com/repos/pydata/xarray/issues/557 MDEyOklzc3VlQ29tbWVudDEzNzE4MjUwNA== shoyer 1217238 2015-09-02T17:37:56Z 2015-09-02T17:37:56Z MEMBER

Currently, the way to do this is to create a boolean indexer, with something like the following:

pr_dataset.sel(time=np.in1d(pr_dataset['time.year'], elnino_years))

I agree that this is overly verbose and we can come up with something better. I'm not quite happy with selyear, though:

- It's ambiguous which datetime variable the year refers to (there can be more than one time variable on some datasets).
- I'm also not a big fan of adding a bunch of new API methods that are datetime specific -- it creates a lot of noise (pandas has an issue with this).

Something like pr_dataset.sel(year=elnino_years) would be the ideal fix for this second concern (we've discussed this in another issue, can't remember which one now), but it's still ambiguous which time variable it refers to.

So, some other possible ways to spell this:

1. pr_dataset.sel('time', year=elnino_years)
2. pr_dataset.sel('time.year', elnino_years)
3. pr_dataset.sel.time.year(elnino_years)
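
The boolean-indexer idea at the top of this comment can be illustrated with NumPy alone. This is a sketch under assumptions: the monthly time axis is made up, and `np.isin` is used as the newer NumPy spelling of the `np.in1d` membership test from the thread:

```python
import numpy as np

# Hypothetical monthly time axis, 1950-01 through 1999-12.
times = np.arange("1950-01", "2000-01", dtype="datetime64[M]")

# datetime64[Y] counts years since 1970, so add the epoch year back.
years = times.astype("datetime64[Y]").astype(int) + 1970

elnino_years = (1958, 1973, 1983, 1987, 1998)

# True for every time step whose year is in elnino_years.
mask = np.isin(years, elnino_years)

# Boolean indexing keeps only the matching time steps.
selected = times[mask]
```

The proposed spellings in the list above would all hide this mask construction behind the selection call.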

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  CDO-like convenience methods to select times 104484316
137134794 https://github.com/pydata/xarray/issues/557#issuecomment-137134794 https://api.github.com/repos/pydata/xarray/issues/557 MDEyOklzc3VlQ29tbWVudDEzNzEzNDc5NA== clarkfitzg 5356122 2015-09-02T15:32:17Z 2015-09-02T15:32:17Z MEMBER

Just curious: how would we currently do this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  CDO-like convenience methods to select times 104484316


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);