issue_comments

18 rows where issue = 206905158 and user = 10050469 sorted by updated_at descending

user 1

  • fmaussion · 18 ✖

issue 1

  • Add RasterIO backend · 18 ✖

author_association 1

  • MEMBER 18
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
306445080 https://github.com/pydata/xarray/pull/1260#issuecomment-306445080 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDMwNjQ0NTA4MA== fmaussion 10050469 2017-06-06T10:25:01Z 2017-06-06T10:25:01Z MEMBER

OK, let's get this one in and see what people will report about it.

Thanks @shoyer for your patience, @NicWayand and @jhamman for the original PR, @gidden for the testing/reviews and @sgillies for rasterio!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
305446339 https://github.com/pydata/xarray/pull/1260#issuecomment-305446339 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDMwNTQ0NjMzOQ== fmaussion 10050469 2017-06-01T09:53:24Z 2017-06-01T09:53:24Z MEMBER

@gidden I just updated the documentation recipe to use an accessor to compute the lons and lats, let me know what you think

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
305428608 https://github.com/pydata/xarray/pull/1260#issuecomment-305428608 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDMwNTQyODYwOA== fmaussion 10050469 2017-06-01T08:39:02Z 2017-06-01T08:39:02Z MEMBER

@gidden yes absolutely please give it a try, I think it's ready

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
305240650 https://github.com/pydata/xarray/pull/1260#issuecomment-305240650 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDMwNTI0MDY1MA== fmaussion 10050469 2017-05-31T16:23:30Z 2017-05-31T16:23:30Z MEMBER

OK, all green.

Currently the rasterio tests are running on py36 only. Should I add rasterio to the other test suites as well?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
304137429 https://github.com/pydata/xarray/pull/1260#issuecomment-304137429 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDMwNDEzNzQyOQ== fmaussion 10050469 2017-05-25T22:05:32Z 2017-05-25T22:05:41Z MEMBER

Thanks @shoyer, I have addressed all your comments but one, which I didn't understand. Maybe we should also wait for an answer from the rasterio devs about the dtype stuff before going on.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
303800141 https://github.com/pydata/xarray/pull/1260#issuecomment-303800141 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDMwMzgwMDE0MQ== fmaussion 10050469 2017-05-24T17:48:05Z 2017-05-24T17:48:05Z MEMBER

This is ready for another round of reviews! I think this has come out quite nicely. Everything is much simpler now.

I have:

- included all your comments
- removed the GIS part
- added an example on how to parse lons and lats in the new "recipes" section

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
303369876 https://github.com/pydata/xarray/pull/1260#issuecomment-303369876 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDMwMzM2OTg3Ng== fmaussion 10050469 2017-05-23T11:29:39Z 2017-05-23T11:29:39Z MEMBER

> Would it be better to use the string representation of the CRS internally after reading in?

Yes this was my intention.

BTW, your serialization above doesn't work because the variable "raster" also has a CRS attr. This problem will be solved by the next iteration of my code (when data arrays will be returned instead of datasets)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
303327051 https://github.com/pydata/xarray/pull/1260#issuecomment-303327051 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDMwMzMyNzA1MQ== fmaussion 10050469 2017-05-23T08:23:59Z 2017-05-23T08:23:59Z MEMBER

@shoyer @gidden thanks for testing, this is very useful.

Indeed rasterio uses a dict-like mapping for the PROJ4 strings (source: rasterio docs). This isn't a big deal since it provides the to_string() and from_string() methods, which allow doing the round-trip.

I personally never noticed the difference because pyproj (the Python interface to the PROJ.4 library) can handle both representations:

```
In [1]: import xarray as xr

In [2]: ds = xr.open_rasterio('RGB.byte.tif')

In [3]: ds.crs
Out[3]: CRS({'init': 'epsg:32618'})

In [4]: import pyproj

In [5]: pyproj.Proj(ds.crs)
Out[5]: <pyproj.Proj at 0x7f12317c0468>

In [6]: pyproj.Proj(ds.crs.to_string())
Out[6]: <pyproj.Proj at 0x7f12317c0408>
```

My suggestion here (to avoid the serialisation problems you mention @gidden ) is to convert the CRS object to a string at read time
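To make the two representations concrete, here is a minimal pure-Python sketch of the round-trip between a PROJ.4 string and a dict-like mapping. The helper names `proj4_to_dict` and `dict_to_proj4` are hypothetical and this is not rasterio's implementation; in practice rasterio's own to_string() and from_string() would be the natural choice. The point is only that the string form is a plain value that can be stored as an attribute.

```python
def proj4_to_dict(proj4):
    """Parse a PROJ.4 string such as '+proj=longlat +no_defs' into a dict.

    Hypothetical helper for illustration only; values are kept as strings.
    """
    out = {}
    for token in proj4.split():
        token = token.lstrip('+')
        if not token:
            continue
        key, sep, value = token.partition('=')
        # Flags without a value (e.g. '+no_defs') are stored as True.
        out[key] = value if sep else True
    return out


def dict_to_proj4(mapping):
    """Serialise the mapping back to a PROJ.4 string (keys sorted for stability)."""
    parts = []
    for key in sorted(mapping):
        value = mapping[key]
        if value is True:
            parts.append('+{}'.format(key))
        else:
            parts.append('+{}={}'.format(key, value))
    return ' '.join(parts)


crs_string = '+init=epsg:32618'
crs_dict = proj4_to_dict(crs_string)
print(crs_dict)
print(dict_to_proj4(crs_dict))
```

Because the round-trip is lossless for strings like these, keeping only the string at read time would not prevent anyone from rebuilding the mapping later.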

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
303159959 https://github.com/pydata/xarray/pull/1260#issuecomment-303159959 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDMwMzE1OTk1OQ== fmaussion 10050469 2017-05-22T17:01:58Z 2017-05-22T17:01:58Z MEMBER

Uh, right, this is obviously not a string!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
302949290 https://github.com/pydata/xarray/pull/1260#issuecomment-302949290 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDMwMjk0OTI5MA== fmaussion 10050469 2017-05-21T17:04:33Z 2017-05-21T17:04:33Z MEMBER

Thanks for looking into this!

> I'm slightly reluctant to add a dedicated method for converting rasterio CRS objects into lat/lon arrays. It feels a little overly specialized.

OK, so this corresponds to my solution 3 above (do nothing). I will however add the relevant lines of code to the documentation so that users wanting to add lons and lats to their data can do so. This will leave room for solution 2 if someone has the time to do it later. (Side note: "rasterio CRS objects" are in fact strings corresponding to a PROJ4 string that will always be understood by pyproj, rasterio, gdal, etc. Examples: '+proj=aea +lat_1=-18 +lat_2=-32 +lat_0=0 +lon_0=24 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs ' or 'EPSG:4326'.)

> Why return a Dataset rather than a DataArray? The latter feels a little more natural to me, given that a rasterio file describes a single array.

Agreed. Again this is built out of discussions on https://github.com/pydata/xarray/issues/790, but now obsolete. This speaks even more strongly against the use of a DataStore, right? Is there any class I should inherit from to create a DataArray from a rasterio file? I basically need an `__init__()` and a `__getitem__()`...
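As a rough illustration of the "`__init__()` plus `__getitem__()`" idea, the sketch below wraps a reader callable in a small array-like object that only loads data when it is indexed. The class name `LazyRasterArray` and the `read_window` callable are hypothetical stand-ins; this is not xarray's internal API, and an in-memory numpy array plays the role of the file.

```python
import numpy as np


class LazyRasterArray:
    """Hypothetical array-like wrapper: data is only read inside __getitem__."""

    def __init__(self, read_window, shape, dtype):
        # read_window(key) stands in for a windowed read from an open file.
        self._read_window = read_window
        self.shape = tuple(shape)
        self.dtype = np.dtype(dtype)
        self.ndim = len(self.shape)

    def __getitem__(self, key):
        # Only the requested part is loaded and returned as a numpy array.
        return np.asarray(self._read_window(key), dtype=self.dtype)


# Stand-in for a file on disk: an in-memory array with (band, y, x) axes.
fake_file = np.arange(12, dtype='float32').reshape(1, 3, 4)
reads = []


def read_window(key):
    reads.append(key)  # record each access to show that reading is on demand
    return fake_file[key]


lazy = LazyRasterArray(read_window, fake_file.shape, fake_file.dtype)
subset = lazy[0, :2, 1:3]
print(subset.shape, len(reads))
```

Constructing the wrapper touches no data; the single entry in `reads` comes from the one indexing operation.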

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
302941653 https://github.com/pydata/xarray/pull/1260#issuecomment-302941653 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDMwMjk0MTY1Mw== fmaussion 10050469 2017-05-21T14:54:19Z 2017-05-21T16:05:14Z MEMBER

Thanks @gidden for the comments! Will look into it.

Your questions about add_latlon make me think that a kwarg at read time isn't the right approach. Here are the scenarios where you don't want to have the lat/lon computed by default:

- when your data's crs is already WGS84: in that case x and y are lons and lats already
- when your file isn't georeferenced properly
- when your use case doesn't need them (salem for example will make better use of crs than xarray)
- when your file is large: computing lons and lats on a huge 2D grid is going to be prohibitively expensive.

This latter use case is important, because it might be useful for users to first subset their data and then compute the lat lons. For this use case we could go for one of the following options in place of the kwarg:

1. add a top level utility function get_latlon_from_crs which would work on any dataset with a proper crs (and could be extended)
2. compute lons and lats lazily (only when asked for)
3. do nothing and let the users do their own cuisine (consistent with xarray's general purpose)

I don't know how to do 2 because it implies using dask to compute two related variables at the same time. Furthermore, 2 requires dask while 1 could be extended towards other datasets which have a crs.

Right now I tend towards 3 (because I use salem), although I guess that many users will benefit from 1...
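A minimal sketch of what option 1 could look like, using numpy only. The function name `latlon_from_xy` and its `transform` argument are hypothetical; `transform` stands in for a reprojection step (for example one built on pyproj), and when it is omitted the coordinates are assumed to be lons and lats already. The example also shows the "subset first, then compute" pattern mentioned above.

```python
import numpy as np


def latlon_from_xy(x, y, transform=None):
    """Return 2D (lon, lat) arrays for 1D coordinates x and y.

    transform is a hypothetical callable (xx, yy) -> (lon, lat). With
    transform=None, x and y are assumed to be lons and lats already.
    """
    xx, yy = np.meshgrid(np.asarray(x), np.asarray(y))
    if transform is None:
        return xx, yy
    return transform(xx, yy)


x = np.linspace(0.0, 10.0, 1001)
y = np.linspace(50.0, 40.0, 1001)

# Subset first, then compute: only a 10 x 20 grid is materialised
# instead of the full 1001 x 1001 one.
lon, lat = latlon_from_xy(x[:20], y[:10])
print(lon.shape, lat.shape)
```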

@gidden @shoyer @benbovy : thoughts?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
302652106 https://github.com/pydata/xarray/pull/1260#issuecomment-302652106 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDMwMjY1MjEwNg== fmaussion 10050469 2017-05-19T09:14:47Z 2017-05-19T09:14:47Z MEMBER

> which is also why I was questioning reusing the existing xarray backends system.

I am starting to understand what you mean, but in the absence of a template I guess this was the easiest way to go (I took over the design of the original PR). If you agree, I'd suggest you have a more detailed look at the current PR when you have time, and we can decide what to do from here. Since the public facing API shouldn't be affected we could also keep the current design for now and go back to it later when https://github.com/pydata/xarray/pull/1087 is ready.

> Just a small suggestion: to me open_raster seems a slightly better name than open_rasterio as the dataset is a 'raster', not a 'rasterio'.

Yes I thought about it too, but a vast majority of the datasets xarray is reading are raster datasets (although in NetCDF format), hence "open_raster" could be confusing. "open_rasterio" underlines the fact that this opens "all datasets rasterio can open". I have no strong opinion about this though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
302255969 https://github.com/pydata/xarray/pull/1260#issuecomment-302255969 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDMwMjI1NTk2OQ== fmaussion 10050469 2017-05-17T23:07:44Z 2017-05-17T23:10:11Z MEMBER

Folks, I finally managed to find a couple of hours to wrap this up: this is now ready for review.

Everything seems to work the way I'd like it to work, and only one thing is missing: the lazy computation of lons and lats with dask (I don't know how to do this quickly and I don't have enough time to spend on this right now, unfortunately). This has been waiting for too long now, so I suggest merging this when ready and adding this feature later on.

The solution retained for the API is to add a new open_rasterio top-level function: this makes sense since many keywords of open_dataset aren't relevant for rasterio datasets. It also underlines that rasterio datasets are quite different from xarray's data model.

Another example I could add to the soon to come xarray gallery could be:

```python
import xarray as xr
import matplotlib.pyplot as plt
import cartopy.crs as ccrs

ds = xr.open_rasterio('RGB.byte.tif', add_latlon=True)
ax = plt.subplot(projection=ccrs.PlateCarree())
ds.raster.sel(band=1).plot(ax=ax, x='lon', y='lat', transform=ccrs.PlateCarree())
ax.coastlines('10m')
```

cc @gidden @jhamman @shoyer

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
284225363 https://github.com/pydata/xarray/pull/1260#issuecomment-284225363 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDI4NDIyNTM2Mw== fmaussion 10050469 2017-03-05T12:42:45Z 2017-03-05T12:42:45Z MEMBER

@shoyer thanks for the tips, I think that getting the lons and lats from dask is probably the most elegant method.

After trying various things I am still struggling with dask, and in particular on how to apply elemwise to functions like np.meshgrid (I get shape broadcasting errors). To get me started, I'd be grateful for an example on how to use dask to replace the code snippet below:

```python
import xarray as xr
import numpy as np

ds = xr.DataArray(np.zeros((2, 3)),
                  coords={'x': np.arange(3), 'y': np.arange(2)},
                  dims=['y', 'x']).to_dataset(name='data')

# non-dask version
lon, lat = np.meshgrid(ds.x, ds.y)
ds['lon'] = (('y', 'x'), lon)
ds['lat'] = (('y', 'x'), lat)
ds = ds.set_coords(['lon', 'lat'])
print(ds)
```

Thanks a lot!
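One possible dask counterpart of the meshgrid step is sketched below. It assumes a dask version that provides `dask.array.meshgrid`, and it only covers building the lazy 2D grids, not the reprojection to lons and lats that the real use case would need on top.

```python
import dask.array as da
import numpy as np

# Lazy 1D coordinates, mirroring the x and y of the snippet above.
x = da.arange(3, chunks=3)
y = da.arange(2, chunks=2)

# da.meshgrid builds a task graph; nothing is computed at this point.
lon, lat = da.meshgrid(x, y)
print(lon.shape, lat.shape)

# Values are only materialised when .compute() is called.
print(lon.compute())
print(lat.compute())
```

The resulting dask arrays have the same (y, x) shape as the numpy version, so they could be assigned to the dataset with the same `(('y', 'x'), lon)` pattern.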

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
279686053 https://github.com/pydata/xarray/pull/1260#issuecomment-279686053 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDI3OTY4NjA1Mw== fmaussion 10050469 2017-02-14T11:43:27Z 2017-02-14T11:43:27Z MEMBER

I made some progress with the lazy indexing, and I'd be glad to have some first rough feedback on whether this is going in the right direction or not.

We have a decision to make regarding the API. I think that creating the optional lon and lat coords automatically is not a good idea:

- in some cases, the x and y coordinates are already lons and lats and the 2D coords are obsolete
- for big data files this is going to take ages and take a lot of memory
- my initial idea to make them lazily evaluated might work, but in an ugly way: computing both lons and lats needs to be done in one single operation, and I'm not sure how this can be done in an elegant way
- additionally, there is no way to make them show up as coordinates (as per @shoyer's comment: https://github.com/pydata/xarray/pull/1260#issuecomment-279101252)

The current implementation delegates this task to a utility function (get_latlon_coords_from_crs). It is currently very rasterio specific, but could be made more general -- API to be defined.
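To put a number on the memory concern, the following back-of-the-envelope sketch computes the size of the two 2D coordinate arrays for a few grid sizes. The function name `latlon_nbytes` and the grid sizes are illustrative assumptions; float64 storage (8 bytes per value) is assumed.

```python
def latlon_nbytes(ny, nx, itemsize=8):
    """Bytes needed for 2D lon and lat arrays of shape (ny, nx)."""
    # Two arrays (lon and lat), each with ny * nx values of itemsize bytes.
    return 2 * ny * nx * itemsize


for ny, nx in [(1000, 1000), (10000, 10000), (40000, 40000)]:
    gigabytes = latlon_nbytes(ny, nx) / 1e9
    print('{} x {}: {:.2f} GB'.format(ny, nx, gigabytes))
```

A 10000 x 10000 raster already needs 1.6 GB for the two coordinate arrays alone, before any band data is read.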

Thoughts?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
279076276 https://github.com/pydata/xarray/pull/1260#issuecomment-279076276 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDI3OTA3NjI3Ng== fmaussion 10050469 2017-02-10T21:52:35Z 2017-02-10T21:52:35Z MEMBER

thanks @shoyer , I think I can work on this a bit further now and I'll get back to you if I have more questions.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
279072956 https://github.com/pydata/xarray/pull/1260#issuecomment-279072956 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDI3OTA3Mjk1Ng== fmaussion 10050469 2017-02-10T21:37:45Z 2017-02-10T21:37:45Z MEMBER

> Can you clarify what you mean by an optional coordinate?

Yes sorry, I meant to have them listed as coordinates without * instead of as data variables.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158
279064452 https://github.com/pydata/xarray/pull/1260#issuecomment-279064452 https://api.github.com/repos/pydata/xarray/issues/1260 MDEyOklzc3VlQ29tbWVudDI3OTA2NDQ1Mg== fmaussion 10050469 2017-02-10T21:00:16Z 2017-02-10T21:00:16Z MEMBER

Before I get into more detail about what needs to be done with rasterio itself, I'd like to get some xarray internals ready first. No need to do a full review yet, but I'd appreciate help with the following points:

  1. The order of the variables and coordinates is random. The current __repr__ can look like this:

         <xarray.Dataset>
         Dimensions:  (band: 1, x: 4, y: 3)
         Coordinates:
           * x        (x) float64 1.0 1.5 2.0 2.5
           * band     (band) int64 1
           * y        (y) float64 2.0 1.0 0.0
         Data variables:
             lat      (y, x) float64 2.0 2.0 2.0 2.0 1.0 1.0 1.0 1.0 0.0 0.0 0.0 0.0
             raster   (band, y, x) float32 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 ...
             lon      (y, x) float64 1.0 1.5 2.0 2.5 1.0 1.5 2.0 2.5 1.0 1.5 2.0 2.5
         Attributes:
             crs:        CRS({'wktext': True, 'proj': 'longlat', 'no_defs': True, 'ellps': 'WGS84'})
             transform:  [1.0, 0.5, 0.0, 2.0, 0.0, -1.0]

     How is this possible?

  2. Is it possible to make the x and y coords lazy? Could you point me to a place in the code where this has been done already?

  3. I'd like to have the lon and lat variables listed as optional coordinates, and also make them lazy. Any hint?

Thanks a lot for your help, I'm afraid this is going to need a few iterations ;-)
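For reference, the x and y coordinates in the __repr__ above can be reproduced from the transform attribute with a few lines of numpy. This sketch reads the six numbers in GDAL-style order, [x0, dx, rot_x, y0, rot_y, dy], which is an assumption that happens to be consistent with the coordinates shown; the function name `coords_from_transform` is hypothetical and this is not the PR's code.

```python
import numpy as np


def coords_from_transform(transform, nx, ny):
    """1D x and y coordinates from a GDAL-style geotransform.

    Assumes transform = [x0, dx, rot_x, y0, rot_y, dy] with zero rotation.
    """
    x0, dx, rot_x, y0, rot_y, dy = transform
    if rot_x != 0 or rot_y != 0:
        raise ValueError('rotated grids cannot be described by 1D coordinates')
    # Coordinates start at the origin given by the transform.
    x = x0 + dx * np.arange(nx)
    y = y0 + dy * np.arange(ny)
    return x, y


x, y = coords_from_transform([1.0, 0.5, 0.0, 2.0, 0.0, -1.0], nx=4, ny=3)
print(x)
print(y)
```

Because both coordinates are a cheap function of the transform and the grid size, they are natural candidates for being computed only when needed.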

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add RasterIO backend 206905158

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);