issue_comments

5 rows where issue = 109202603 (Aggregating NetCDF files), sorted by updated_at descending. None of the five comments has any reactions.

j08lue (CONTRIBUTOR) · 370381376 · created 2018-03-05T10:49:19Z
https://github.com/pydata/xarray/issues/597#issuecomment-370381376

Can't this be closed?

shoyer (MEMBER) · 144794506 · created 2015-10-01T17:30:55Z
https://github.com/pydata/xarray/issues/597#issuecomment-144794506

So unfortunately there isn't an easy way to handle irregular data like this with xray. Before you put this stuff in a single dataset, you would need to align the time variables, probably by doing .reindex with the method argument, which lets you do nearest-neighbor interpolation, or by using .resample. That way, you could make, for example, a dataset with the latest image for each location at the start of each month.
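
A minimal sketch of this alignment step, assuming two neighbouring tiles and a monthly target axis (the paths, date range, and frequency are illustrative, not from the thread):

```
# Sketch: put two tiles on a common monthly time axis with nearest-
# neighbour reindexing, then join them along a spatial dimension.
# Paths and the target range are invented for illustration.
import pandas as pd
import xray

target_times = pd.date_range('1987-01-01', '1987-12-01', freq='MS')

paths = ['LS5TM_1987_-34_147.nc', 'LS5TM_1987_-34_148.nc']  # illustrative
tiles = [xray.open_dataset(p, decode_coords=False) for p in paths]

# After reindexing, every tile shares an identical time coordinate,
# so they can be concatenated spatially without further interpolation.
aligned = [t.reindex(time=target_times, method='nearest') for t in tiles]
combined = xray.concat(aligned, dim='longitude')
```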

The other option (probably also useful) is to concatenate only a single spatial tile along time, so you don't need to do any interpolation or resampling. This should work out of the box with open_mfdataset.
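
A minimal sketch of this per-tile option, assuming all years of one tile share the same grid (the glob pattern is invented to match the file naming shown below):

```
# Sketch: concatenate every year of a single 1x1 degree tile along time.
# The glob pattern is an assumption based on the file names in this thread.
import xray

ds = xray.open_mfdataset('LS5TM_*_-34_147.nc')
print(ds)
```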

monkeybutter (NONE) · 144680733 · created 2015-10-01T09:49:11Z, updated 2015-10-01T09:59:56Z
https://github.com/pydata/xarray/issues/597#issuecomment-144680733

I created the NetCDF files myself from GeoTIFFs, and I made them so that there is no geographical overlap between them. Basically, each file contains a 1x1 degree area and a year's worth of satellite data. The only problem I can see with this approach is that the time dimension differs between files (the satellite covers different areas at different times). This might be a problem when aggregating a big area, because if the time dimension has to be homogenised, it will be filled mostly with no-data over the whole area (sparse arrays). Depending on how this sparsity is implemented, it can fill memory pretty quickly.
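
To make the memory concern concrete, a small sketch with invented timestamps: aligning two tiles on the union of their time axes pads each with missing values wherever only the other was observed.

```
# Sketch of the sparsity effect (timestamps invented): an outer join over
# two differing time axes fills each dataset with NaN at the other's times.
import numpy as np
import pandas as pd
import xray

t1 = pd.date_range('1987-01-01', periods=3, freq='16D')
t2 = t1 + pd.Timedelta(days=5)   # same cadence, offset acquisition times

a = xray.Dataset({'B10': ('time', np.ones(3))}, coords={'time': t1})
b = xray.Dataset({'B10': ('time', np.ones(3))}, coords={'time': t2})

a2, b2 = xray.align(a, b, join='outer')
print(len(a2['time']))   # 6: the union of both time axes
print(a2['B10'].values)  # NaN at the three timestamps unique to b
```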

Some of these files can be found at: http://dapds00.nci.org.au/thredds/catalog/uc0/rs0_dev/gdf_trial/20150709/LS5TM/catalog.html

A sample of open_dataset output for one of these files:

```
import xray

dap_file = 'http://dapds00.nci.org.au/thredds/dodsC/uc0/rs0_dev/gdf_trial/20150709/LS5TM/LS5TM_1987_-34_147.nc'
ds = xray.open_dataset(dap_file, decode_coords=False)
print(ds)

<xray.Dataset>
Dimensions:    (latitude: 4000, longitude: 4000, time: 11)
Coordinates:
  * time       (time) datetime64[ns] 1987-05-27T23:26:36 1987-08-31T23:29:21 ...
  * latitude   (latitude) float64 -33.0 -33.0 -33.0 -33.0 -33.0 -33.0 -33.0 ...
  * longitude  (longitude) float64 147.0 147.0 147.0 147.0 147.0 147.0 147.0 ...
Data variables:
    crs        int32 ...
    B10        (time, latitude, longitude) float64 ...
    B20        (time, latitude, longitude) float64 ...
    B30        (time, latitude, longitude) float64 ...
    B40        (time, latitude, longitude) float64 ...
    B50        (time, latitude, longitude) float64 ...
    B70        (time, latitude, longitude) float64 ...
Attributes:
    history: NetCDF-CF file created 20150709.
    license: Generalised Data Framework NetCDF-CF Test File
    spatial_coverage: 1.000000 degrees grid
    featureType: grid
    geospatial_lat_min: -34.0
    geospatial_lat_max: -33.0
    geospatial_lat_units: degrees_north
    geospatial_lat_resolution: -0.00025
    geospatial_lon_min: 147.0
    geospatial_lon_max: 148.0
    geospatial_lon_units: degrees_east
    geospatial_lon_resolution: 0.00025
```

Thank you very much for your help.

shoyer (MEMBER) · 144577690 · created 2015-09-30T23:59:04Z
https://github.com/pydata/xarray/issues/597#issuecomment-144577690

It would probably be helpful to show the results of printing several of these datasets when opened via open_dataset.

shoyer (MEMBER) · 144577524 · created 2015-09-30T23:57:38Z
https://github.com/pydata/xarray/issues/597#issuecomment-144577524

What do these different time ranges and geographical regions look like?

If they are adjacent and non-overlapping, then xray could be a very good fit and I can help whip up an example to get you started. If this is not the case (and I know that can be an issue with satellite data), then it's going to be more awkward to put them together into a single logical Dataset. It might make more sense to work with individual Datasets, e.g., at the level of an individual satellite image.

Table schema

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
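
The row selection shown at the top of this page corresponds to a simple query over this table. A sketch using Python's sqlite3 against a local copy of the database (the file name github.db is an assumption):

```
# Reproduce this page's selection: comments on issue 109202603,
# newest update first. 'github.db' is an assumed local database path.
import sqlite3

conn = sqlite3.connect('github.db')
rows = conn.execute(
    "select id, user, created_at, body from issue_comments "
    "where issue = 109202603 order by updated_at desc"
).fetchall()
print(len(rows))  # 5
```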