issue_comments


4 rows where author_association = "NONE", issue = 146182176 ("Multidimensional groupby") and user = 167164 (naught101), sorted by updated_at descending

218675077 · naught101 (167164) · 2016-05-12T06:54:53Z · NONE · issue: Multidimensional groupby (146182176)
https://github.com/pydata/xarray/pull/818#issuecomment-218675077

`forcing_data.isel(lat=lat, lon=lon).values()` returns a `ValuesView`, which scikit-learn doesn't accept. However, `forcing_data.isel(lat=lat, lon=lon).to_array().T` seems to work.
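A minimal sketch of the difference, using a synthetic single-cell dataset (variable names here are illustrative, not the original data):

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for one grid cell's forcing data:
# two variables over four time steps.
ds = xr.Dataset({
    "SWdown": ("time", np.linspace(400.0, 460.0, 4)),
    "Tair":   ("time", np.linspace(280.0, 290.0, 4)),
})

# Dataset is a Mapping, so .values() is a dict-style ValuesView of
# DataArrays -- not something scikit-learn can consume directly.
print(type(ds.values()).__name__)   # ValuesView

# .to_array() stacks the data variables into one DataArray with a new
# 'variable' dimension; transposing gives the (time, variable) layout
# scikit-learn expects (rows = samples, columns = features).
X = ds.to_array().T
print(X.shape)                      # (4, 2)
```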

218667702 · naught101 (167164) · 2016-05-12T06:02:55Z · NONE · issue: Multidimensional groupby (146182176)
https://github.com/pydata/xarray/pull/818#issuecomment-218667702

@shoyer: Where does `times` come from in that code?

218654978 · naught101 (167164) · created 2016-05-12T04:02:43Z, updated 2016-05-12T04:03:01Z · NONE · issue: Multidimensional groupby (146182176)
https://github.com/pydata/xarray/pull/818#issuecomment-218654978

Example forcing data:

```
<xarray.Dataset>
Dimensions:  (lat: 360, lon: 720, time: 2928)
Coordinates:
  * lon      (lon) float64 0.25 0.75 1.25 1.75 2.25 2.75 3.25 3.75 4.25 4.75 ...
  * lat      (lat) float64 -89.75 -89.25 -88.75 -88.25 -87.75 -87.25 -86.75 ...
  * time     (time) datetime64[ns] 2012-01-01 2012-01-01T03:00:00 ...
Data variables:
    SWdown   (time, lat, lon) float64 446.5 444.9 445.3 447.8 452.4 456.3 ...
```

Where there might be an arbitrary number of data variables, and the scikit-learn input would be time (rows) by data variables (columns). I'm currently doing this:

```python
import numpy as np
import xarray as xr

def predict_gridded(model, forcing_data, flux_vars):
    """Predict model results for gridded data.

    :model: TODO
    :data: TODO
    :returns: TODO
    """
    # Set prediction metadata
    prediction = forcing_data[list(forcing_data.coords)]

    # Arrays like (var, lon, lat, time)
    result = np.full([len(flux_vars),
                      forcing_data.dims['lon'],
                      forcing_data.dims['lat'],
                      forcing_data.dims['time']],
                     np.nan)
    print("predicting for lon: ")
    for lon in range(len(forcing_data['lon'])):
        print(lon, end=', ')
        for lat in range(len(forcing_data['lat'])):
            result[:, lon, lat, :] = model.predict(
                forcing_data.isel(lat=lat, lon=lon)
                            .to_dataframe()
                            .drop(['lat', 'lon'], axis=1)
            ).T
    print("")
    for i, fv in enumerate(flux_vars):
        prediction.update(
            {fv: xr.DataArray(result[i, :, :, :],
                              dims=['lon', 'lat', 'time'],
                              coords=forcing_data.coords)}
        )

    return prediction
```

and I think it's working (still debugging, and it's pretty slow to run).
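As a sanity check on the inner loop, this small sketch (synthetic data; the variable name is illustrative) shows the 2-D (time, variables) frame that each `model.predict` call receives:

```python
import numpy as np
import xarray as xr

# Tiny synthetic forcing grid: 3 time steps on a 2x2 lat/lon grid.
forcing = xr.Dataset(
    {"SWdown": (("time", "lat", "lon"), np.random.rand(3, 2, 2))},
    coords={"time": np.arange(3.0), "lat": [0.0, 1.0], "lon": [0.0, 1.0]},
)

# What model.predict() receives for one grid cell: after isel, lat/lon
# are scalar coords that to_dataframe() emits as columns, hence the drop.
X = (forcing.isel(lat=0, lon=0)
            .to_dataframe()
            .drop(['lat', 'lon'], axis=1))
print(X.shape)   # (3, 1): time rows, one data-variable column
```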

218372591 · naught101 (167164) · 2016-05-11T06:24:11Z · NONE · issue: Multidimensional groupby (146182176)
https://github.com/pydata/xarray/pull/818#issuecomment-218372591

I want to be able to run a scikit-learn model over a bunch of variables in a 3D (lat/lon/time) dataset, and return values for each coordinate point. Is something like this multi-dimensional groupby required (I'm thinking groupby(lat, lon) => 2D matrices that can be fed straight into scikit-learn), or is there already some other mechanism that could achieve something like this? Or is the best way at the moment just to create a null dataset, and loop over lat/lon and fill in the blanks as you go?
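One existing mechanism that gets close to the groupby(lat, lon) idea is `Dataset.stack`, which collapses lat/lon into a single point dimension; each point then yields a 2-D matrix. A sketch with a synthetic dataset (names and sizes are illustrative):

```python
import numpy as np
import xarray as xr

# Synthetic 3-D (time/lat/lon) dataset with one data variable.
ds = xr.Dataset(
    {"SWdown": (("time", "lat", "lon"), np.random.rand(4, 2, 3))},
    coords={"time": np.arange(4.0),
            "lat": [0.0, 1.0],
            "lon": [0.0, 1.0, 2.0]},
)

# Collapse lat/lon into one 'point' dimension: 2 * 3 = 6 grid cells.
stacked = ds.stack(point=("lat", "lon"))
print(stacked.sizes["point"])       # 6

# Each point gives a (time, variable) matrix ready for scikit-learn.
X0 = stacked.isel(point=0).to_array().T
print(X0.shape)                     # (4, 1)
```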



CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · About: xarray-datasette