issue_comments

5 rows where issue = 91109966 and user = 1217238 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
115933651 https://github.com/pydata/xarray/issues/443#issuecomment-115933651 https://api.github.com/repos/pydata/xarray/issues/443 MDEyOklzc3VlQ29tbWVudDExNTkzMzY1MQ== shoyer 1217238 2015-06-27T01:22:25Z 2015-06-27T22:44:02Z MEMBER

OK, I understand now. One of these files looks like:

    <xray.Dataset>
    Dimensions:            (lat: 39, lon: 59, mean_height_agl: 50, time: 1)
    Coordinates:
      * time               (time) datetime64[ns] 2011-05-21T13:00:00
      * lon                (lon) float64 -29.0 -28.0 -27.0 -26.0 -25.0 -24.0 ...
      * lat                (lat) float64 32.0 33.0 34.0 35.0 36.0 37.0 38.0 39.0 ...
      * mean_height_agl    (mean_height_agl) float64 28.28 97.21 191.1 310.7 ...
    Data variables:
        ash_concentration  (mean_height_agl, lat, lon) float64 9.583e-16 ...
        ash_mass_loading   (lat, lon) float64 1.091e-11 1.091e-11 1.091e-11 ...
        ash_drydep         (lat, lon) float64 4.086e-10 4.084e-10 4.08e-10 ...
        ash_wetdep         (lat, lon) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
        so2_concentration  (mean_height_agl, lat, lon) float64 3.199e-13 ...
        so2_mass_loading   (lat, lon) float64 2.602e-09 2.602e-09 2.602e-09 ...

The problem is that the mean_height_agl coordinate changes between each file.

Another interesting aspect of this file, which relates to how I was hoping to fix https://github.com/xray/xray/issues/438, is that it includes a time coordinate with length 1, but none of the other dataset variables use that coordinate.

This suggests to me that we need some sort of hook that lets you transform each dataset before they are joined by open_mfdataset. Perhaps a preprocess argument? Then you could write, e.g.:

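```python
import xray

def fix_my_data(ds):
    # Add an integer index `agl` along the height dimension, make it the
    # dimension coordinate, and squeeze out the length-1 time dimension.
    return (ds.assign_coords(
                agl=('mean_height_agl', range(ds.dims['mean_height_agl'])))
            .swap_dims({'mean_height_agl': 'agl'})
            .squeeze('time'))

# `preprocess` is the argument proposed above.
ds = xray.open_mfdataset('*.nc', preprocess=fix_my_data)
```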
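
To make the hook idea concrete, here is a minimal pure-Python sketch of the pattern. It is not xray's implementation: plain dicts stand in for datasets, the names `open_many`, `fake_open`, and `replace_height_with_index` are hypothetical, the error message is only modelled on the issue title, and the height values for the second file are invented for illustration.

```python
def open_many(paths, opener, preprocess=None):
    """Open each path, optionally transform it, then check that coords agree.

    Hypothetical stand-in for a multi-file opener; not xray's implementation.
    """
    datasets = []
    for path in paths:
        ds = opener(path)
        if preprocess is not None:
            ds = preprocess(ds)  # the hook runs on each dataset before combining
        datasets.append(ds)

    first = datasets[0]
    for ds in datasets[1:]:
        for name, values in ds["coords"].items():
            if values != first["coords"].get(name):
                # Message modelled on the issue title, for illustration only.
                raise ValueError("variable %r not equal across datasets" % name)
    return datasets


# Two fake "files" whose height coordinate differs.
# The values for b.nc are invented for this sketch.
FILES = {
    "a.nc": {"coords": {"mean_height_agl": [28.28, 97.21, 191.1]}},
    "b.nc": {"coords": {"mean_height_agl": [30.0, 99.0, 195.0]}},
}


def fake_open(path):
    # Copy the coords dict so preprocessing does not mutate the fixtures.
    return {"coords": dict(FILES[path]["coords"])}


def replace_height_with_index(ds):
    """Swap the file-specific heights for a shared integer index."""
    heights = ds["coords"].pop("mean_height_agl")
    ds["coords"]["agl"] = list(range(len(heights)))
    ds["heights"] = heights  # keep the original values as ordinary data
    return ds
```

Without the hook, `open_many(["a.nc", "b.nc"], fake_open)` raises because the heights differ. Passing `preprocess=replace_height_with_index` gives both datasets the same `agl` index, so the check passes.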

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  multiple files - variable X not equal across datasets 91109966
115903258 https://github.com/pydata/xarray/issues/443#issuecomment-115903258 https://api.github.com/repos/pydata/xarray/issues/443 MDEyOklzc3VlQ29tbWVudDExNTkwMzI1OA== shoyer 1217238 2015-06-26T22:02:50Z 2015-06-26T22:02:50Z MEMBER

Do you want to concatenate these files along one of the existing axes or along a new axis? This might require a new API, but it should probably be supported.

Could you print two of these netCDF files that you want to automatically combine with open_mfdataset? I know they have the same structure but it's useful to see how/if the values differ.
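
For reference, the difference between the two options can be sketched with nested Python lists standing in for arrays. The helper names and the values are hypothetical and are not part of xray.

```python
def concat_existing_axis(a, b):
    """Join along the existing first axis: lengths add, depth is unchanged."""
    return a + b


def stack_new_axis(a, b):
    """Join along a new leading axis: one entry per input, depth grows by one."""
    return [a, b]


def shape(x):
    """Shape of a non-empty rectangular nested list."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0]
    return tuple(dims)


# Two small stand-in arrays, e.g. with dimensions (lat: 2, lon: 2).
file1 = [[1.0, 2.0], [3.0, 4.0]]
file2 = [[5.0, 6.0], [7.0, 8.0]]

along_existing = concat_existing_axis(file1, file2)  # first axis grows
along_new = stack_new_axis(file1, file2)             # extra leading axis
```

Concatenating along an existing axis needs the coordinate values on that axis to differ between files, while stacking along a new axis needs every existing coordinate to match.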

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  multiple files - variable X not equal across datasets 91109966
115887945 https://github.com/pydata/xarray/issues/443#issuecomment-115887945 https://api.github.com/repos/pydata/xarray/issues/443 MDEyOklzc3VlQ29tbWVudDExNTg4Nzk0NQ== shoyer 1217238 2015-06-26T21:26:19Z 2015-06-26T21:26:19Z MEMBER

Marking this as a bug. I'll see if I can reproduce this with a similar dataset.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  multiple files - variable X not equal across datasets 91109966
115492420 https://github.com/pydata/xarray/issues/443#issuecomment-115492420 https://api.github.com/repos/pydata/xarray/issues/443 MDEyOklzc3VlQ29tbWVudDExNTQ5MjQyMA== shoyer 1217238 2015-06-26T03:48:08Z 2015-06-26T03:48:08Z MEMBER

Do you get the error message if you specify the full path to this file in open_mfdataset?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  multiple files - variable X not equal across datasets 91109966
115449452 https://github.com/pydata/xarray/issues/443#issuecomment-115449452 https://api.github.com/repos/pydata/xarray/issues/443 MDEyOklzc3VlQ29tbWVudDExNTQ0OTQ1Mg== shoyer 1217238 2015-06-26T01:01:45Z 2015-06-26T01:01:45Z MEMBER

Could you print two of the incompatible datasets? I'm not sure whether there is a general pattern here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  multiple files - variable X not equal across datasets 91109966

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
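
The filter shown at the top of this page (issue = 91109966 and user = 1217238, sorted by updated_at descending) can be reproduced against this schema. The sketch below uses Python's built-in sqlite3 module with an in-memory database, a trimmed copy of the table (only the columns the query touches, foreign-key clauses omitted), three of the rows listed above, and one hypothetical row from another user. The function name `comments_for` is an assumption, not part of Datasette.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed copy of the schema: only the columns used by the query.
conn.execute(
    """
    CREATE TABLE [issue_comments] (
       [id] INTEGER PRIMARY KEY,
       [user] INTEGER,
       [updated_at] TEXT,
       [issue] INTEGER
    )
    """
)
conn.execute(
    "CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue])"
)

rows = [
    (115449452, 1217238, "2015-06-26T01:01:45Z", 91109966),
    (115933651, 1217238, "2015-06-27T22:44:02Z", 91109966),
    (115903258, 1217238, "2015-06-26T22:02:50Z", 91109966),
    (1, 42, "2015-06-26T12:00:00Z", 91109966),  # hypothetical other user
]
conn.executemany(
    "INSERT INTO [issue_comments] ([id], [user], [updated_at], [issue]) "
    "VALUES (?, ?, ?, ?)",
    rows,
)


def comments_for(conn, issue, user):
    """Return comment ids for one issue and user, most recently updated first."""
    cur = conn.execute(
        "SELECT [id] FROM [issue_comments] "
        "WHERE [issue] = ? AND [user] = ? "
        "ORDER BY [updated_at] DESC",
        (issue, user),
    )
    return [row[0] for row in cur]


ids = comments_for(conn, 91109966, 1217238)
```

Because the timestamps are ISO 8601 strings in the same time zone, sorting the TEXT column lexicographically also sorts it chronologically.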