html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/790#issuecomment-211109010,https://api.github.com/repos/pydata/xarray/issues/790,211109010,MDEyOklzc3VlQ29tbWVudDIxMTEwOTAxMA==,2443309,2016-04-17T20:34:31Z,2016-04-17T20:34:31Z,MEMBER,"Thanks for the comments @IamJeffG. I haven't had any time recently to mess around with this so I haven't made any progress since the original notebook. > It's not good to assume a negative y-step size. Rarely, I will come across a dataset that breaks convention with a positive y coordinate, meaning the first pixel is the lower-left corner, but at least the dataset is self-consistent. Rasterio works beautifully even with these black sheep, so we don't want an xarray reader to force the assumption. Agreed. My notebook was just a quick example of how this could work and it would certainly benefit from some generalization when applying this as an xarray backend. > In a past life I made side library that wraps rasterio's API to take and return xarray.DataArrays. It provides IO/clip/warp/rasterize operations on DataArrays, which themselves are annotated with the CRS and affine transforms as attributes. Interesting. Any chance that's available for public viewing? > Even if xarray's new rasterio backend only provides a reader ... I only want to expose the reader and the necessary metadata to use the georeferenced dataset. Warping and other projection transformations would need to be handled downstream. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140063713 https://github.com/pydata/xarray/issues/790#issuecomment-198474766,https://api.github.com/repos/pydata/xarray/issues/790,198474766,MDEyOklzc3VlQ29tbWVudDE5ODQ3NDc2Ng==,2443309,2016-03-18T18:04:52Z,2016-03-18T18:04:52Z,MEMBER,"@shoyer - that's what I was thinking too. 
In fact, that's more or less what I did in this example, although this is an eager implementation: https://anaconda.org/jhamman/rasterio_to_xarray/notebook ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140063713 https://github.com/pydata/xarray/issues/790#issuecomment-198428399,https://api.github.com/repos/pydata/xarray/issues/790,198428399,MDEyOklzc3VlQ29tbWVudDE5ODQyODM5OQ==,1217238,2016-03-18T16:04:40Z,2016-03-18T16:04:40Z,MEMBER,"Because each point can be computed separately, we _could_ straightforwardly add latitude/longitude as lazily computed 2D arrays (under ""coordinates""), similarly to how we currently handle on-the-fly data rescaling. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140063713 https://github.com/pydata/xarray/issues/790#issuecomment-197939841,https://api.github.com/repos/pydata/xarray/issues/790,197939841,MDEyOklzc3VlQ29tbWVudDE5NzkzOTg0MQ==,2443309,2016-03-17T15:44:48Z,2016-03-17T15:44:48Z,MEMBER,"As for 1) I'm open to having more discussion on decoding the coordinates. My contention here is that they are useful, even in their unstructured format, since they permit visualization out of the box. I'll ping @perrygeo for more on this. 2) I don't really want to get into this because there isn't a standard treatment in geotiffs so it would, at best, be a guess on our end. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140063713 https://github.com/pydata/xarray/issues/790#issuecomment-197601132,https://api.github.com/repos/pydata/xarray/issues/790,197601132,MDEyOklzc3VlQ29tbWVudDE5NzYwMTEzMg==,10050469,2016-03-16T23:22:10Z,2016-03-16T23:22:10Z,MEMBER,"Hi @jhamman , this is close to how I would've done it, but I am maybe not the most qualified (probably the gis specialists from rasterio would be more helpful). But still, a couple of remarks from my side: - I wouldn't necessarily do the `try_to_get_latlon_coords` systematically. When the raster coords are lat-lon, the new coords are redundant. And when the coords are x-y, the lat-lon info are only partly useful (since the grid will be unstructured in lat-lon). Furthermore, I am not sure if `+init=EPSG:4326` is the only lat-lon proj available (there are surely more - at least if you leave the wgs-84 area) - as mentioned by perrygeo in your rasterio post, the data model of geotiffs is not always clear. The pixel coordinates are very likely to be at the top-left corner of the pixel (as I assume in my small `salem` library). Most netcdf datasets we are using in the meteo/climate community are pixel-centered. I don't know if this is something that `xarray` wants to consider, but this becomes important if you want to make accurate projections. (in practice, the two concepts are equivalent for most applications, but you have to know what is what: in my small library I called those representations `center_grid` and `corner_grid`: https://github.com/fmaussion/salem/blob/master/salem/gis.py#L101 ) To your questions: 1. I agree that returning a dataset is a good idea. I don't know if `raster` is a good name, but I have no other idea right now 2. I don't know. 
The projection was always enough for me :flushed: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140063713 https://github.com/pydata/xarray/issues/790#issuecomment-197573068,https://api.github.com/repos/pydata/xarray/issues/790,197573068,MDEyOklzc3VlQ29tbWVudDE5NzU3MzA2OA==,2443309,2016-03-16T22:05:34Z,2016-03-16T22:05:34Z,MEMBER,"@fmaussion - Here's an example of the basic functionality I'm thinking of implementing: https://anaconda.org/jhamman/rasterio_to_xarray/notebook A few things to think about: 1. I've given each array the `raster` name. Does that make sense? This allows us to return a `Dataset` instead of a `DataArray`. 2. Which attributes do we want to copy over from the rasterio dataset? It is not entirely clear which attributes in the `rasterio._io.RasterReader` object should become `attrs`. 3. I have not implemented lazy or windowed reading yet but it should be pretty straightforward using the `window` argument to `src.read()`. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140063713 https://github.com/pydata/xarray/issues/790#issuecomment-196021945,https://api.github.com/repos/pydata/xarray/issues/790,196021945,MDEyOklzc3VlQ29tbWVudDE5NjAyMTk0NQ==,2443309,2016-03-13T19:06:03Z,2016-03-13T19:06:03Z,MEMBER,"@fmaussion - As for (1), I like your idea of leaving out the projection of the coordinates. That certainly makes things easier from the perspective of the backend. A `band` dimension in (2) seems pretty manageable. I'm not concerned about the GDAL dependency (3). I would love to see more robust conda support for GDAL but that's another issue. This would be an optional backend, similar to PyNIO, which isn't broadly available on conda. We could sort out the CI issues. 
So, if we took the simplest approach for implementing a `rasterio` backend, `open_dataset` would always return a `Dataset` with a single unprojected `DataArray` (name to be determined). The other big question is what to call the dimensions, since that is not explicitly provided in all raster formats. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140063713 https://github.com/pydata/xarray/issues/790#issuecomment-195955210,https://api.github.com/repos/pydata/xarray/issues/790,195955210,MDEyOklzc3VlQ29tbWVudDE5NTk1NTIxMA==,10050469,2016-03-13T13:13:56Z,2016-03-13T13:15:07Z,MEMBER,"Hi @jhamman , I tend to agree with your doubts. I'll still comment on your cons: To (1): I also think that xarray should avoid opening the projection can of worms. But the minimum that xarray could do with rasterio is to read corner coordinates, dx and dy and define the two coordinates ""x"" and ""y"" out of it, without taking care of whether these are meters, degrees of arc or whatever. As long as the other rasterio file attributes are available as attributes of the `DataArray` or `Dataset` objects, users can do their own mixture. To (2): some geotiff files also have more than one band. I don't know if these bands are named or have metadata, so maybe xarray will have to take decisions about these names too (most probably 1, 2, 3...). I'll add a (3): rasterio depends on GDAL, which is huge and every now and then causes trouble on conda. This might also cause trouble for the continuous integration of xarray. Altogether this might be more complicated than it is worth, but maybe the rasterio folks have an interest in this and might provide more support. If the idea for xarray accessors is implemented (https://github.com/pydata/xarray/issues/706#issuecomment-169099306) this will allow more specific libraries like mine to do their own rasterio support at low cost. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140063713 https://github.com/pydata/xarray/issues/790#issuecomment-195676319,https://api.github.com/repos/pydata/xarray/issues/790,195676319,MDEyOklzc3VlQ29tbWVudDE5NTY3NjMxOQ==,2443309,2016-03-12T07:01:12Z,2016-03-12T07:01:12Z,MEMBER,"Thanks @fmaussion. This was a helpful illustration of how this could be done. The `salem` `GeoTiff` and `Grid` objects include all and more than I was hoping to implement in xarray. However, after a bit more looking into this, I have mixed feelings about whether this would work in `xarray`. A brief summary of the pros/cons as I see them now. Pros: - Rasterio supports a wide range of raster formats (e.g. GeoTiff, ArcInfo ASCII Grid, etc.) - Combined with `pyproj`, coordinate variables can be inferred - Supports windowed reading (and writing), this would fit in well with the chunking approach already taken by `xarray`. - Supports lazy loading of array values, this would fit in well with the loading policies of the other `xarray` backends. Cons: - Would require `xarray` to adopt conventions for projecting arrays, naming (coordinates, dimensions, arrays), and handling of raster attributes. I can imagine ways this could be done, but it may take us down a path we don't want to go. - The `xarray` backends generally return `Dataset`s, however, `rasterio` returns individual arrays that would better be applied to `DataArray`s. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140063713 https://github.com/pydata/xarray/issues/790#issuecomment-195254611,https://api.github.com/repos/pydata/xarray/issues/790,195254611,MDEyOklzc3VlQ29tbWVudDE5NTI1NDYxMQ==,10050469,2016-03-11T08:25:04Z,2016-03-11T08:25:04Z,MEMBER,":+1: Rasterio shines at reading georeferencing metadata out of any file, and I guess it would be no big deal to treat the various info as attributes in an xarray dataset. It is also possible to do lazy reading out of rasterio files. (example with a geotiff file: https://github.com/fmaussion/salem/blob/master/salem/datasets.py#L263) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140063713