home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

45 rows where issue = 341331807 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 10

  • djhoese 19
  • snowman2 10
  • dcherian 3
  • karimbahgat 3
  • fmaussion 3
  • shoyer 2
  • wy2136 2
  • philippjfr 1
  • benbovy 1
  • Geosynopsis 1

author_association 3

  • CONTRIBUTOR 29
  • MEMBER 9
  • NONE 7

issue 1

  • Add CRS/projection information to xarray objects · 45 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1279429113 https://github.com/pydata/xarray/issues/2288#issuecomment-1279429113 https://api.github.com/repos/pydata/xarray/issues/2288 IC_kwDOAMm_X85MQon5 dcherian 2448579 2022-10-14T20:27:26Z 2022-10-14T20:27:26Z MEMBER

I've proposed adding CRSIndex to rioxarray to experiment with propagating CRS info in existing workflows: https://github.com/corteva/rioxarray/issues/588

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
1211392529 https://github.com/pydata/xarray/issues/2288#issuecomment-1211392529 https://api.github.com/repos/pydata/xarray/issues/2288 IC_kwDOAMm_X85INGIR dcherian 2448579 2022-08-10T23:19:48Z 2022-08-10T23:19:48Z MEMBER

could be moved into some PandasMetaIndex helper class that would encapsulate one or more PandasIndex

Agreed.

reuse the CRS-related logic with other kinds of index structures (like kd-trees). I've been thinking a bit about the general issue of flexible geospatial xarray indexes but I'm not sure yet how best it could be solved.

Yeah I don't think the PandasMetaIndex is a good pathway for CRSIndex, since we'd want other underlying tree-like structures.

be supported with https://github.com/pydata/xarray/pull/6800

Hadn't seen that. That would be great!

the variables argument of Index.create_variables should be optional or not.

I actually didn't understand what I was supposed to do with variables :) . It would be nice to document in PandasIndex

There's some discussion in https://github.com/pydata/xarray/issues/4366 about adding a new .drop_indexes() (or drop_xindexes()?) method.

My takeaway was that on the API side, we should prioritize adding set_xindex and drop_indexes. These are critical for experiments.

Re: reduction to scalar, I agree it seems tricky.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
1211331828 https://github.com/pydata/xarray/issues/2288#issuecomment-1211331828 https://api.github.com/repos/pydata/xarray/issues/2288 IC_kwDOAMm_X85IM3T0 benbovy 4160723 2022-08-10T22:05:58Z 2022-08-10T22:05:58Z MEMBER

That's great @dcherian! Some comments (notably regarding your notes in your linked notebook):

A lot of boilerplate code in your CRSIndex class (and in RasterIndex) could be moved into some PandasMetaIndex helper class that would encapsulate one or more PandasIndex. Then CRSIndex could subclass it and just implement the CRS-related logic. This would be useful for many other indexes I think.

But as you suggest it, it would be nice if we could also reuse the CRS-related logic with other kinds of index structures (like kd-trees). I've been thinking a bit about the general issue of flexible geospatial xarray indexes but I'm not sure yet how best it could be solved.

Can we support passing a CRS object directly using **kwargs? Not at the moment. This is the data model: Indexes are constructed from a subset of Coordinate variables. all necessary information should be in a coordinate variable

This should be supported with #6800 (I need to re-submit a PR targeting main, see #6849). For the case of CRSIndex, I think it might be useful to support both ways (i.e., either set the CRS via **kwargs or get it from a coordinate variable).

Index.create_variables. set_xindex tries to pass a variables kwarg but other methods don't. Seems like a bug.

Yes this should be clarified, i.e., whether the variables argument of Index.create_variables should be optional or not. I think we can make it required but I need to check.

How do we delete an existing index?

There's some discussion in #4366 about adding a new .drop_indexes() (or drop_xindexes()?) method.

error message with join='exact' is very generic.

That's a tricky one to improve.

GroupBy doesn't propagate index.

Maybe related to #6836 ?

spatial_ref is confusing. A scalar index?

I agree. The index should probably be dropped in that case (i.e., reduction of both the x and y dimensions), leaving the spatial_ref coordinate as a non-indexed coordinate. But this may be tricky to handle in general (i.e., for any index). Ideally this should also be consistent with the behavior of the index for other operations like scalar selection.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
1205808926 https://github.com/pydata/xarray/issues/2288#issuecomment-1205808926 https://api.github.com/repos/pydata/xarray/issues/2288 IC_kwDOAMm_X85H3y8e snowman2 8699967 2022-08-04T21:56:28Z 2022-08-04T21:56:28Z CONTRIBUTOR

That's nice @dcherian :+1:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
1205793138 https://github.com/pydata/xarray/issues/2288#issuecomment-1205793138 https://api.github.com/repos/pydata/xarray/issues/2288 IC_kwDOAMm_X85H3vFy dcherian 2448579 2022-08-04T21:35:50Z 2022-08-04T21:35:50Z MEMBER

Here is an alpha version of a CRSIndex heavily drawing on @benbovy's RasterIndex https://github.com/dcherian/crsindex/blob/main/crsindex.ipynb

As a non-expert, I very arbitrarily chose to propagate CRS info using a spatial_ref scalar variable. When assigned to a dataset we get

so spatial_ref is bolded but is not a dimension, and is associated with an index.

{
    "total_count": 5,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 5,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
499498940 https://github.com/pydata/xarray/issues/2288#issuecomment-499498940 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQ5OTQ5ODk0MA== djhoese 1828519 2019-06-06T13:42:42Z 2019-06-06T13:44:00Z CONTRIBUTOR

@Geosynopsis Cool. Your library is the third library that does something similar to what's discussed here (at least recently created ones). I'm glad there are so many people who need this functionality. The packages are: My un-started geoxarray project where I've tried to move these types of conversations (https://github.com/geoxarray/geoxarray) and rioxarray (https://github.com/corteva/rioxarray) which combines xarray and rasterio and started by @snowman2. Given what your project is trying to do maybe you could add the geopandas functionality on to rioxarray instead of a separate package? Let's discuss in an issue on rioxarray if possible, feel free to start it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
499495979 https://github.com/pydata/xarray/issues/2288#issuecomment-499495979 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQ5OTQ5NTk3OQ== Geosynopsis 8723085 2019-06-06T13:34:47Z 2019-06-06T13:34:47Z NONE

May be late on the wagon, we were playing around with xarray dataset to address some general geospatial problems. We have published the preliminary library at xgeo. I would love to collaborate together if you guys have interest.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
419672386 https://github.com/pydata/xarray/issues/2288#issuecomment-419672386 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxOTY3MjM4Ng== karimbahgat 6413369 2018-09-08T20:58:41Z 2018-09-08T20:58:41Z NONE

Agree about the documentation, so have vastly improved the documentation now, with a full list of examples for all functionality. Hopefully that makes it clearer how to use PyCRS and if it would be a good fit for your project. Have also added doctests of all the readme examples with Travis CI for both Py2 and 3. Appreciate all the issues raised and suggestions 👍 Thinking of releasing this as the first major stable version 1.0.0 on pypi for more reliablity, once the issues have been cleared up.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
417152163 https://github.com/pydata/xarray/issues/2288#issuecomment-417152163 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxNzE1MjE2Mw== djhoese 1828519 2018-08-30T00:37:51Z 2018-08-30T00:37:51Z CONTRIBUTOR

@karimbahgat Thanks for the info and questions. As for xarray, it is a generic container format (array + dimensions + coordinates for those dimensions + attributes) but resembles the format of data stored in netcdf files. It can technically hold any N-dimensional data. This issue in particular is what is a good "standard" way for multiple libraries to represent CRS information in xarray's objects.

I think the lack of documentation is pycrs is my biggest hurdle right now as I don't know how I'm supposed to use the library, but I want to. It may also be that my use cases for CRS information are different than yours, but the structure of the package is not intuitive to me. But again a simple example of passing a PROJ.4 string to something and getting a CRS object would solve all that. I'll make some issues on pycrs when I get a chance (add travis/appveyor tests, add documentation, base classes for certain things, etc). For geotiff's CRS I think with most geotiff-reading libraries you load the CRS info as a PROJ.4 string.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
416811165 https://github.com/pydata/xarray/issues/2288#issuecomment-416811165 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxNjgxMTE2NQ== karimbahgat 6413369 2018-08-29T03:21:49Z 2018-08-29T03:21:49Z NONE

Also, please let me know about your ideas for features you would like to see in a crs library, either here or raise as an issue. Creating library specific objects might be a good idea, if you want to give some use cases and reasonings. In what ways are library soecific crs objects difficult to create? Don’t they provide shortcuts for creating from proj4 strings?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
416809348 https://github.com/pydata/xarray/issues/2288#issuecomment-416809348 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxNjgwOTM0OA== karimbahgat 6413369 2018-08-29T03:09:58Z 2018-08-29T03:09:58Z NONE

Hi. This project seems very interesting. Am i right in understanding it primarily provides a type of data structure/object that allows labelled access to gis raster data (plus additional dimensions, in most cases time i guess)? Maybe you could clarify which aspects of the concersation you wanted feedback on (it was impressingly long)? If youre thinking about how to represent crs in xarray, you’re right that PyCRS already does this. For a project that needs a container for crs objects from many possible sources, possibly with the option to modify the crs, plus flexibly converting to different representation formats, PyCRS does exactly that. Plus it’s as lightweight as it gets. It maps each parameter with an object, and all parameters and logical subgroupings can be accessed and changed from the toplevel CRS object. I need to add some more documentation on the structure of the CRS object, but just have a look at the code for now. One question: what types of sources/formats do xarray objects come from, is it primarily from netcdf files or also other ones? And what type of crs format does netcdf use? PyCRS can load from the main formats (proj4, esri/ogc wkt, various codes), but i have not yet added the geotiff specification. As for maintenance, it’s true that it hasnt been updated for a bit, though i just recently am trying to change that. However, for this particular project, most of my inactivity is honestly that it simply works. Crs format conversion should be a deterministic process, as they all refer to the same theoretical concepts. I’ve done some basic testing on converring between a variety of popular crs formats, and found that it gets most of them right. The downside is i’m not sure how widely it’s used so far, and i couldve gotten some of the parameter conversions wrong, so it may be that if you start using it users may find mistakes. The upside is that there are only so many parameters, and correcting mistakes should be an easy onetime fix.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
415982335 https://github.com/pydata/xarray/issues/2288#issuecomment-415982335 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxNTk4MjMzNQ== djhoese 1828519 2018-08-25T16:52:47Z 2018-08-25T16:52:47Z CONTRIBUTOR

I wouldn't mind that. There seems like there is already a package that handles this: https://github.com/karimbahgat/PyCRS

@karimbahgat I'd love your input on this issue as a whole too.

There are a couple things that I had in mind for a "pycrs" library to handle that the PyCRS library doesn't do (converting to other libraries' CRS objects), but maybe that is a good thing.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
415849716 https://github.com/pydata/xarray/issues/2288#issuecomment-415849716 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxNTg0OTcxNg== snowman2 8699967 2018-08-24T18:50:44Z 2018-08-24T18:50:44Z CONTRIBUTOR

I was thinking just a "simple" library with a CRS object with utilities to create the object from WKT, PROJ.4, CF projection parameters, etc... And could take the object and convert it to any of the other formats. I wouldn't add any transform/resampling code and instead be a codebase that can be used by libraries that do those things. This way it would be useful to many different projects (geoxarray, pyresample, cartopy, metpy, ...). Thoughts?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
415844110 https://github.com/pydata/xarray/issues/2288#issuecomment-415844110 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxNTg0NDExMA== djhoese 1828519 2018-08-24T18:29:51Z 2018-08-24T18:29:51Z CONTRIBUTOR

The question is what is the purpose of that new package? I wouldn't mind a new package like that, but then that becomes something like what was discussed in https://github.com/pangeo-data/pangeo/issues/356. That package should probably cover CRS objects and Grid definitions. Then libraries like geoxarray, pyresample, cartopy, and metpy could all use that library. If that library ends up covering resampling/transforming at all I'd say I'll just absorb that logic in to pyresample and provide some new interfaces to things.

However, what does geoxarray become if it doesn't have that CRS logic in it. Just an xarray accessor? I guess that makes sense from a "Write programs that do one thing and do it well" philosophy point of view.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
415842467 https://github.com/pydata/xarray/issues/2288#issuecomment-415842467 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxNTg0MjQ2Nw== snowman2 8699967 2018-08-24T18:23:57Z 2018-08-24T18:23:57Z CONTRIBUTOR

Sounds good 👍. I see the CRS code potentially being useful outside of geoxarray. What are your thoughts on moving the CRS specific code into its own package?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
415633883 https://github.com/pydata/xarray/issues/2288#issuecomment-415633883 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxNTYzMzg4Mw== djhoese 1828519 2018-08-24T02:41:00Z 2018-08-24T02:41:00Z CONTRIBUTOR

FYI I've started a really basic layout of the CRS object in geoxarray: https://github.com/geoxarray/geoxarray

It doesn't actually do anything yet, but I copied all the utilities from pyresample that are useful (convert PROJ.4 to cartopy CRS, proj4 str to dict, etc). I decided that the CRS object should use the CF conventions naming for projection parameters based on a conversation I had with @dopplershift. The main factor being they are much more human readable than the PROJ.4 names. The issue with this is that I have much more experience dealing with PROJ.4 parameters.

I can probably also get a lot of information from metpy's CF plotting code: https://github.com/Unidata/MetPy/blob/master/metpy/plots/mapping.py

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
412410121 https://github.com/pydata/xarray/issues/2288#issuecomment-412410121 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxMjQxMDEyMQ== shoyer 1217238 2018-08-13T05:12:05Z 2018-08-13T05:12:05Z MEMBER

Is there a way in xarray to associate these three arrays in a DataArray so that slicing is handled automatically but also not put the arrays in the coordinates?

Not yet, unfortunately, but this is what https://github.com/pydata/xarray/pull/2302 is trying to solve.

I could always add this logic myself to geoxarray's version of to_netcdf.

I think this would be the preferred approach.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
412302416 https://github.com/pydata/xarray/issues/2288#issuecomment-412302416 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxMjMwMjQxNg== djhoese 1828519 2018-08-11T21:24:05Z 2018-08-11T21:24:05Z CONTRIBUTOR

Would anyone that is watching this thread hate if I made geoxarray python 3.6+? I doubt there are any features that are needed in 3.6, but also am not going to support python 2.

Additionally, @shoyer @fmaussion and any other xarray-dev, I've been thinking about the case where I have 2D image data and 2D longitude and latitude arrays (one lon/lat pair for each image pixel). Is there a way in xarray to associate these three arrays in a DataArray so that slicing is handled automatically but also not put the arrays in the coordinates? As mentioned above I don't want to put these lon/lat arrays in the .coords because they have to be fully computed if they are dask arrays (or at least that is my understanding). For my use cases this could mean a good chunk of memory being dedicated to these coordinates. From what I can tell my options are .coords or a .Dataset with all 3.

Similarly, is there any concept like a "hidden" coordinate where utilities like to_netcdf ignore the coordinate and don't write it? Maybe something like .coords['_crs'] = "blah blah blah"? I could always add this logic myself to geoxarray's version of to_netcdf.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
410770936 https://github.com/pydata/xarray/issues/2288#issuecomment-410770936 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxMDc3MDkzNg== wy2136 8202276 2018-08-06T16:38:51Z 2018-08-06T16:38:51Z NONE

@djhoese Sure. It would be great to have collaborations in the future. The packages and libraries you guys are working on are awesome. We really need them in our research.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
410308698 https://github.com/pydata/xarray/issues/2288#issuecomment-410308698 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxMDMwODY5OA== djhoese 1828519 2018-08-03T16:36:55Z 2018-08-03T16:36:55Z CONTRIBUTOR

@wy2136 Very cool. We have the ability in satpy (via pyresample) to create cartopy CRS objects and therefore cartopy plots from our xarray DataArray objects: https://github.com/pytroll/pytroll-examples/blob/master/satpy/Cartopy%20Plot.ipynb

It would be nice if we could work together in the future since it looks like you do a lot of the same stuff. When I make an official "geoxarray" library I think I'm going to make a "geo" accessor for some of these same operations (see above conversations).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
410276752 https://github.com/pydata/xarray/issues/2288#issuecomment-410276752 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxMDI3Njc1Mg== wy2136 8202276 2018-08-03T14:47:31Z 2018-08-03T14:47:31Z NONE

Sorry for the confusion from the geoxarray package I have been using. It's currently still a simple personal package. I use it to register the geo accessor to do some calculations related to data on the longitude-latitude grids (area mean, for example) and make quick plots based on the geoplots package (https://github.com/wy2136/geoplots).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
410068839 https://github.com/pydata/xarray/issues/2288#issuecomment-410068839 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQxMDA2ODgzOQ== djhoese 1828519 2018-08-02T21:09:38Z 2018-08-02T21:09:38Z CONTRIBUTOR

For the user base I think if we can cover as many groups as possible that would be best. I know there are plenty of people who need to describe CRS information in their data, but don't use geotiffs and therefore don't really need rasterio/gdal. The group I immediately thought of was the metpy group which is why I talked to @dopplershift in the first place. The immediate need for this group (based on his scipy talk) will be people reading NetCDF files and putting the data on a cartopy plot. I think @dopplershift and I agreed that when it comes problems building/distributing software dealing with this type of data the cause is almost always gdal/libgdal. I'm in favor of making it optional if possible.

For the to_netcdf stuff I think anything that needs to be "adjusted" before writing to a NetCDF file can be handled by required users to call my_data.geo.to_netcdf(...). I'm not a huge fan of the accessor adding information automatically that the user didn't specifically request. Side effects on your data just for importing a module is not good. I will try to put together a package skeleton and lay out some of the stuff in my head in the next month, but am still catching up on work after SciPy and a week of vacation so not sure when exactly I'll get to it.

I just did a search for "geoxarray" on github and @wy2136's repositories came up where they are importing a geoxarray package. @wy2136, is there another geoxarray project that we are not aware of? Do you have anything else to add to this discussion?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
409566229 https://github.com/pydata/xarray/issues/2288#issuecomment-409566229 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwOTU2NjIyOQ== snowman2 8699967 2018-08-01T13:00:26Z 2018-08-01T13:00:26Z CONTRIBUTOR

Lots of good thoughts there.

I think a lot depends on who you plan on having for a user base. I like rasterio.crs.CRS, but it does require GDAL. I know GDAL is a heavy dependency (solves runtime problems and it a source of installation problems), but if your users are already using it, then it isn't too big of an ask. If not, you could look into something like pycrs. But, it looks like it hasn't been touched for a while (never a good sign). Or, there are other options, (requiring a more work & maintenance) such as re-creating the rasterio.crs.CRS object and using cython with a copy of only the spatial reference code from GDAL in the repo (might be better to create your own repo/package if you head in this direction). Or maybe someone could ask really nicely for the GDAL maintainers to consider making the spatial reference code a package on it's own (not likely, but would be really nice).

My preference to have the CRS object something created/retrieved by the accessor based on information in the file. If it is not, users will have to remove the CRS object when using to_netcdf() as the object will not be serializable and to_netcdf() will throw errors. I think allowing the user to set the CRS on the file (by adding the crs to .coords and grid_mapping to data_vars) by passing in a valid projection string (WKT, proj.4) geo.set_crs() would be a good idea. This way, when the user calls, to_netcdf() it will happily do what they want. In addition, the CRS will still be available later with the accessor when loading in with xr.open_dataset.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
408613922 https://github.com/pydata/xarray/issues/2288#issuecomment-408613922 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwODYxMzkyMg== djhoese 1828519 2018-07-28T15:06:04Z 2018-07-28T15:06:04Z CONTRIBUTOR

I was talking with @dopplershift the other day on gitter and he brought up a very important point: no matter how CRS information is represented the user should be able to access the individual parameters (reference longitude, datum, etc). This lead me to think that a new CRS class is probably needed, even though I wanted to avoid it, because it would likely be one of the easiest ways to provide access to the individual parameters. There are already cartopy CRS objects that IMO are difficult to create and rasterio CRS objects that require gdal which is a pretty huge dependency to require users to install just to describe their data. That said, I think no matter how it is coded I don't want to duplicate all the work that has been done in rasterio/gdal for handling WKT and converting between different CRS formats.

The other thing I've been pondering during idle brain time is: is it better for this library to require an xarray object to have projection information described in one and only one way (a CRS object instance for example) or does the xarray accessor handling multiple forms of this projection information. Does having a CRS object in .coords allow some functionality that a simple string would not have? Does not having a required .coords CRS element stop the accessor from adding one later? In the latter case of the accessor parsing existing attrs/coords/dims of the xarray object, I was thinking it could handle a PROJ.4 string and the CF "grid_mapping" specification to start. The main functionality that would be available here is that with little to no work a user could import geoxarray and have access to this to whatever functionality can be provided in a .geo accessor. Or if they load a netcdf file with xr.open_dataset there is no extra work required for a user to supply that data to another library that uses geoxarray.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
407857617 https://github.com/pydata/xarray/issues/2288#issuecomment-407857617 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNzg1NzYxNw== snowman2 8699967 2018-07-25T18:49:57Z 2018-07-25T18:49:57Z CONTRIBUTOR

The example I gave was just demonstrating that the dimension is not required for the crs coordinate variable.

I agree with the functionality that would support standardizing the crs as a coordinate variable on load for the case you specified. That way, the read in file will always behave the same and writing to netCDF it would be output correctly.

This all sounds like it is heading in a good direction. 👍

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
407753531 https://github.com/pydata/xarray/issues/2288#issuecomment-407753531 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNzc1MzUzMQ== djhoese 1828519 2018-07-25T13:26:45Z 2018-07-25T13:26:45Z CONTRIBUTOR

I was talking about open_dataset not reading standard CF files in the way we want, at least not the way it is now. I understand that setting the CRS in .coords will write out the CRS when you use to_netcdf. The issue is that a standard CF netcdf file created by someone else that strictly follows the CF standard will not be read in the same way. Put another way, you could not load a CF NetCDF file with open_dataset and write it out with to_netcdf and get the same output.

Also note that having the grid_mapping as a coordinate in xarray object results in it being listed in the coordinates attribute in the output netcdf file which is technically not part of the CF standard.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
407748105 https://github.com/pydata/xarray/issues/2288#issuecomment-407748105 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNzc0ODEwNQ== snowman2 8699967 2018-07-25T13:08:25Z 2018-07-25T13:08:25Z CONTRIBUTOR

It is not in the dimension, it is the coordinate attribute in the variable. That is handled automatically by xarray when writing to_netcdf if you add the crs to the coords.

From the ncdump: ``` dimensions: x = 65 ; y = 31 ; variables: double x(x) ; x:_FillValue = NaN ; x:long_name = "x coordinate of projection" ; x:standard_name = "projection_x_coordinate" ; x:units = "m" ; double y(y) ; y:_FillValue = NaN ; y:long_name = "y coordinate of projection" ; y:standard_name = "projection_y_coordinate" ; y:units = "m" ; int64 time ; time:units = "seconds since 2015-04-03T17:55:19" ; time:calendar = "proleptic_gregorian" ; int64 spatial_ref ; spatial_ref:spatial_ref = "PROJCS[\"UTM Zone 15, Northern Hemisphere\",GEOGCS[\"WGS 84\",DATUM[\"WGS_1984\",SPHEROID[\"WGS 84\",6378137,298.257223563,AUTHORITY[\"EPSG\",\"7030\"]],AUTHORITY[\"EPSG\",\"6326\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.0174532925199433,AUTHORITY[\"EPSG\",\"9122\"]],AUTHORITY[\"EPSG\",\"4326\"]],PROJECTION[\"Transverse_Mercator\"],PARAMETER[\"latitude_of_origin\",0],PARAMETER[\"central_meridian\",-93],PARAMETER[\"scale_factor\",0.9996],PARAMETER[\"false_easting\",500000],PARAMETER[\"false_northing\",0],UNIT[\"Meter\",1]]" ; double ndvi(y, x) ; ndvi:_FillValue = NaN ; ndvi:grid_mapping = "spatial_ref" ; ndvi:coordinates = "spatial_ref time" ;

// global attributes: :creation_date = "2018-04-11 13:14:55.401183" ; ```

It would definitely be a good idea to ensure that the crs variable is a coordinate, so I agree that having support for that would be a good idea.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
407742230 https://github.com/pydata/xarray/issues/2288#issuecomment-407742230 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNzc0MjIzMA== djhoese 1828519 2018-07-25T12:46:57Z 2018-07-25T12:50:04Z CONTRIBUTOR

The files I have created have the crs coordinate variable inside

Ok so the netcdf files that you have created and are reading with xarray.open_dataset have grid_mapping set to "crs" for your data variables, right? Do you also include a special "crs" dimension? I believe having this dimension would cause xarray to automatically consider "crs" a coordinate, but this is not CF standard from what I can tell. As I mentioned in your other issue the CF standard files I have for GOES-16 ABI L1B data do not have this "crs" dimension (or similarly named dimension) which means that the variable specified by the grid_mapping attribute is not considered a coordinate for the associated DataArray/Dataset.

This means that to properly associate a CRS with a DataArray/Dataset this new library would require its own version of open_dataset to assign these things correctly based on grid_mapping. Since the library would require users to use this function instead of xarray's then I don't think it would be out of the question for it to also have a custom to_netcdf method if we chose to use a non-CF representation of the CRS information. Not saying I feel strongly about it, just pointing out that it isn't a huge leap to require users to use the new/custom methods.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
407614809 https://github.com/pydata/xarray/issues/2288#issuecomment-407614809 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNzYxNDgwOQ== snowman2 8699967 2018-07-25T02:39:34Z 2018-07-25T02:39:34Z CONTRIBUTOR

That is interesting, I am definitely not an expert with non-uniform datasets. From the satellite datasets I have used, the 2D latitude and longitude coordinates are stored in the datasets and are not super useful. I usually have to use other ways to recreate the grid coordinates in the original projection (ex. SMAP uses the EASE Grid 2.0 but it stores the latitude/longitude of the points in the file) or reproject & flatten the coordinates. I have had to do this with weather data and made an xarray extension pangaea to handle it. So, that is what I was referring to when I misunderstood your question.

For your example of adding a crs attribute, ...

The files I have created have the crs coordinate variable inside the netCDF file already and it is always there when I load it in with xarray.open_dataset(). The method set_crs() could be used to add the crs coordinate variable and grid_mapping attributes to the dataset in the proper way so that it would be there on xarray.open_dataset() after dumping it to the file with to_netcdf().

The CF stuff is supported by rasterio, GDAL, QGIS and that is why I like it. If there is another way that is as well supported, I am not opposed to it.

In your example of methods is to_projection a remapping/resampling operation? If not, how does it differ from set_crs?

The to_projection() method would be a reproject/resampling operation.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
407564039 https://github.com/pydata/xarray/issues/2288#issuecomment-407564039 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNzU2NDAzOQ== djhoese 1828519 2018-07-24T21:52:01Z 2018-07-24T21:52:50Z CONTRIBUTOR

Regarding non-uniform datasets, I think we have a small misunderstanding. I'm talking about things like data from polar-orbiting satellites where the original data is only geolocated by longitude/latitude values per pixel and the spacing between these pixels is not uniform so you need every original longitude and latitude coordinate to properly geolocate the data (data, longitude, and latitude arrays all have the same shape). When it comes to the topics in this issue this is an problem because you would expect the lat/lon arrays to be set as coordinates but if you are dealing with dask arrays that means that these values are now fully computed (correct me if I'm wrong).

For your example of adding a crs attribute, I understand that that is how one could do it, but I'm saying it is not already done in xarray's open_dataset. In my opinion this is one of the biggest downsides of the CF way of specifying projections, they are a special case that doesn't fit the rest of the NetCDF model well (a scalar with all valuable data in the attributes that is indirectly specified on data variables).

In your example of methods is to_projection a remapping/resampling operation? If not, how does it differ from set_crs?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
407546587 https://github.com/pydata/xarray/issues/2288#issuecomment-407546587 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNzU0NjU4Nw== snowman2 8699967 2018-07-24T20:47:23Z 2018-07-24T20:48:17Z CONTRIBUTOR

In your own projects and use of raster-like data, do you ever deal with non-uniform/non-projected data?

I have dealt with non-uniform data in the geographic projection. I have found it easiest to deal with it if you can determine the original projection and project the coordinates back to that projection so it is uniform. But, I am by no means an expert in this arena. Most if the time I work "normal" data.

How do you prefer to handle/store individual lon/lat values for each pixel?

rasterio/GDAL/QGIS all seem to use the centroid.

Also it looks like xarray would have to be updated to add the "crs" coordinate since currently it is not considered a coordinate variable. So a new library may need to have custom to_netcdf/open_dataset methods, right?

Actually, it is not difficult to add as it stands: python ds.coords['crs'] = 0 ds.coords['crs'].attrs = dict(spatial_ref="PROJCS["UTM Zone 15, Northern Hemisphere",GEOGCS["WGS 84",D...") But, if a crs does not already exist on the dataset, I guess that a function that adds the crs properly would be useful so it can also add the grid_mapping to all of the variables.

Example: python ds.geo.set_crs("+init=epsg:4326")

I think that minor modifications will be needed once the crs is set properly on the xarray dataset. Because after that, the to_netcdf() will automatically produce georeferenced datasets.

I could see the first pass of the extension/library simply performing: 1. ds.geo.crs 2. ds.geo.set_crs() 3. ds.geo.to_projection()

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
407522046 https://github.com/pydata/xarray/issues/2288#issuecomment-407522046 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNzUyMjA0Ng== djhoese 1828519 2018-07-24T19:21:05Z 2018-07-24T19:21:05Z CONTRIBUTOR

@snowman2 Awesome. Thanks for the info, this is really good stuff to know. In your own projects and use of raster-like data, do you ever deal with non-uniform/non-projected data? How do you prefer to handle/store individual lon/lat values for each pixel? Also it looks like xarray would have to be updated to add the "crs" coordinate since currently it is not considered a coordinate variable. So a new library may need to have custom to_netcdf/open_dataset methods, right?

It kind of seems like a new library may be needed for this although I was hoping to avoid it. All of the conversions we've talked about could be really useful to a lot of people. I'm not aware of an existing library that handles these conversions as one of its main purposes and they always end up as a "nice utility" that helps the library as a whole. It seems like a library to solve this issue should be able to do the following:

  1. Store CRS information in xarray objects
  2. Write properly geolocated netcdf and geotiff files from xarray objects.
  3. Read netcdf and geotiff files as properly described xarray objects.
  4. Convert CRS information from one format to another: WKT, EPSG (if available), PROJ.4 str/dict, rasterio CRS, cartopy CRS
  5. Optionally (why not) be able to resample datasets to other projections.

Beyond reading/writing NetCDF and geotiff files I would be worried that this new library could easily suffer from major scope creep. Especially since this is one of the main purposes of the satpy library, even if it is dedicated to satellite imagery right now. @snowman2 I'm guessing the data cube project has similar use cases. If the reading/writing is limited to a specific set of formats then I could see pyresample being a playground for this type of functionality. The main reason for a playground versus a new from-scratch package would be the use of existing utilities in pyresample assuming resampling is a major feature of this new specification. Yet another braindump...complete.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
407404131 https://github.com/pydata/xarray/issues/2288#issuecomment-407404131 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNzQwNDEzMQ== snowman2 8699967 2018-07-24T13:21:31Z 2018-07-24T13:32:42Z CONTRIBUTOR

Here is an example of how it would look on a dataset: <xarray.Dataset> Dimensions: (x: 65, y: 31) Coordinates: * x (x) float64 ... * y (y) float64 ... time datetime64[ns] ... crs int64 ... Data variables: ndvi (y, x) float64 ... Attributes: Here is how the crs or spatial_ref coodinate variable would look: <xarray.DataArray 'crs' ()> array(0) Coordinates: time datetime64[ns] ... crs int64 0 Attributes: spatial_ref: PROJCS["UTM Zone 15, Northern Hemisphere",GEOGCS["WGS 84",D... And here is how it would look on the variables: <xarray.DataArray 'ndvi' (y: 31, x: 65)> array([[ ...]]) Coordinates: * x (x) float64 ... * y (y) float64 ... time datetime64[ns] ... crs int64 0 Attributes: grid_mapping: crs

@djhoese Whether or not we use the CF convention is not what I am concerned about. What I think would benefit the most people is with the file format to be able to do to_netcdf() with the file and be able to have it read in with standard GIS tools such as rasterio, GDAL, and QGIS. With this schema, this is possible.

Another benefit is that it keeps the crs or spatial_ref with it when you do your operations as it is a coordinate of the variable.

Also, as a side note if you use the center pixel coordinates, then GDAL, rasterio, and QGIS are able to read in the file and determine it's affine/transform without a problem.

For the new library, if you have a crs method attached to it, it isn't too difficult to convert it to whatever format you need it to be in. With the rasterio.crs.CRS.from_string you can "Make a CRS from an EPSG, PROJ, or WKT string" and you can also get back the EPSG, PROJ, or WKT string with a simple method call.

For example, using the recommended method to extend xarray, you could add a crs property:

python from rasterio.crs import CRS ........ @property def crs(self): """:obj:`rasterio.crs.CRS`: Projection from `xarray.DataArray` """ if self._crs is not None: return self._crs try: # look in grid_mapping self._crs = CRS.from_string(self._obj.coords[self._obj.grid_mapping].spatial_ref) except AttributeError: raise ValueError("Spatial reference not found.") return self._crs

And if you call your extension geo, all you would need to get the CRS would be: python ds.geo.crs To get proj.4 string: python ds.geo.crs.to_string() To get WKT string: python ds.geo.crs.wkt To get EPSG code: python ds.geo.crs.to_epsg()

{
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
407394381 https://github.com/pydata/xarray/issues/2288#issuecomment-407394381 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNzM5NDM4MQ== djhoese 1828519 2018-07-24T12:47:55Z 2018-07-24T12:47:55Z CONTRIBUTOR

@snowman2 I thought about that too, but here are the reasons I came up with for why this might not be the best idea:

  1. CF conventions change over time and depending on which version of the standard you are using, things can be represented differently. This would tie a geoxarray-like library to a specific version of the CF standard which may be confusing and would require adjustments when writing to a NetCDF file to match the user's desired version of the standard.
  2. Using a CF standard CRS description would require conversion to something more useful for just about every use case (that I can think of) that isn't saving to a netcdf file. For example, a PROJ.4 string can be passed to pyproj.Proj or in the near future cartopy to convert to a cartopy CRS object.
  3. If we have to add more information to the crs coordinate to make it more useful like a PROJ.4 string then we end up with multiple representations of the same thing, making maintenance of the information harder.

The result of this github issue should either be a new package that solves all (90+%) of these topics or an easy to implement, easy to use, geolocation description best practice so that libraries can more easily communicate. I think with the CF standard CRS object we would definitely need a new library to provide all the utilities for converting to and from various things.

Lastly, I don't know if I trust CF to be the one source of truth for stuff like this. If I've missed some other obvious benefits of this or if working with WKT or the CF standard CRS attributes isn't actually that complicated let me know.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
407249087 https://github.com/pydata/xarray/issues/2288#issuecomment-407249087 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNzI0OTA4Nw== snowman2 8699967 2018-07-24T01:19:26Z 2018-07-24T01:19:26Z CONTRIBUTOR

I am really excited about this discussion. I know of other libraries that have done the same thing and have written internal libraries myself.

If possible, I would hope that we could follow the CF convention on this as it makes the output netCDF file compatible with QGIS, GDAL, and rasterio when written using to_netcdf()

To do so, you add the crs coordinate to the dataset/dataarray. int crs; And then, you add the spatial_ref attribute to the crs which is a crs WKT string.

Next, you add the grid_mapping attribute to all associated variables that contains crs as the grid_mapping.

See an example here.

After that, you could store all kinds of information inside the crs variable such as the proj4 string, the affine, etc.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
406704673 https://github.com/pydata/xarray/issues/2288#issuecomment-406704673 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNjcwNDY3Mw== djhoese 1828519 2018-07-20T19:30:53Z 2018-07-20T19:30:53Z CONTRIBUTOR

I've thought about this a little more and I agree with @fmaussion that this doesn't need to be added to xarray. I think if "we", developers who work with projected datasets, can agree that "crs" in an xarray objects coordinates is a PROJ.4 string then that's half the battle of passing them between libraries. If not a PROJ.4 string, other ideas (dict?)?

I initially had the idea to start a new geoxarray type library but the more I thought about what features I would want in it, it started looking a lot like a new interface on pyresample via an xarray accessor. If not an accessor then a subclass but that defeats the purpose (easy collaboration between libraries). I'd also like to use the name "geo" for the accessor but have a feeling that won't jive well with everyone so I will likely fall back to "pyresample".

One thing that just came to mind while typing this that is another difficulty is that there will still be the need to have an object like pyresample's AreaDefinition to represent a geographic region (projection, extents, size). These could then be passed to things like a .resample method as a target projection or slicing based on another projection's coordinates.

When I started typing this I thought I had it all laid out in my head, not anymore. 😢

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
405616219 https://github.com/pydata/xarray/issues/2288#issuecomment-405616219 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNTYxNjIxOQ== djhoese 1828519 2018-07-17T15:06:54Z 2018-07-17T15:06:54Z CONTRIBUTOR

@shoyer I haven't read all of #1092 but that is another related issue for satpy where some satellite data formats use groups in NetCDF files which makes it difficult to use xr.open_dataset to access all the variables inside the file.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
405420045 https://github.com/pydata/xarray/issues/2288#issuecomment-405420045 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNTQyMDA0NQ== shoyer 1217238 2018-07-17T00:18:58Z 2018-07-17T00:18:58Z MEMBER

But I guess that is intended behavior and if the crs is a coordinate then joining things from different projections would not be allowed and raise an exception. However that is exactly what satpy wants/needs to handle in some cases (satellite datasets at different resolutions, multiple 'regions' of from the same overall instrument, two channels from the same instrument with slightly shifted geolocation, etc).

I think it would make more sense to think about using multiple xarray.Dataset objects for these use cases, possibly in some sort of hierarchical collection. The notion of xarray.Dataset is pretty closely tied to a single grid. The discussion in https://github.com/pydata/xarray/issues/1092 is definitely worth reading.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
405263631 https://github.com/pydata/xarray/issues/2288#issuecomment-405263631 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNTI2MzYzMQ== djhoese 1828519 2018-07-16T14:20:43Z 2018-07-16T14:20:43Z CONTRIBUTOR

@fmaussion I guess you're right. And that set of attributes to keep during certain operations would be very nice in my satpy library. We currently have to do a lot of special handling of that.

The one thing that a crs coordinate (PROJ.4 dict or str) doesn't handle is specifying what other coordinates define the X/Y projection coordinates. This logic also helps with non-uniform datasets where a longitude and latitude coordinate are needed. Of course, a downstream library could just define some type of standard for this. However, there are edge cases where I think the default handling of these coordinates by xarray would be bad. For example, satpy doesn't currently use Dataset objects directly and only uses DataArrays because of how coordinates have to be handled in a Dataset:

```In [3]: a = xr.DataArray(np.zeros((5, 10), dtype=np.float32), coords={'y': np.arange(5.), 'x': np.arange(10.)}, dims=('y', 'x'))

In [4]: b = xr.DataArray(np.zeros((5, 10), dtype=np.float32), coords={'y': np.arange(2., 7.), 'x': np.arange(2., 12.)}, dims=('y', 'x'))

In [6]: ds = xr.Dataset({'a': a, 'b': b})

In [7]: ds.coords Out[7]: Coordinates: * y (y) float64 0.0 1.0 2.0 3.0 4.0 5.0 6.0 * x (x) float64 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 ```

But I guess that is intended behavior and if the crs is a coordinate then joining things from different projections would not be allowed and raise an exception. However that is exactly what satpy wants/needs to handle in some cases (satellite datasets at different resolutions, multiple 'regions' of from the same overall instrument, two channels from the same instrument with slightly shifted geolocation, etc). I'm kind of just thinking out loud here, but I'll think about this more in my idle brain cycles today.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
405166458 https://github.com/pydata/xarray/issues/2288#issuecomment-405166458 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNTE2NjQ1OA== fmaussion 10050469 2018-07-16T07:23:53Z 2018-07-16T07:23:53Z MEMBER

Correct me if I'm wrong, but from the xarray side it would already be enough if there is a way in xarray to have a special crs attribute which is preserved by all operations. What the crs really is shouldn't bother xarray, and it can be specific to the downstream library [1]. With a crs and xarray's coordinates the geoloc can be sorted out in almost all cases, right?

The solution currently is to store crs as a scalar coordinate as mentioned above.

What could also be a possibility (without knowing how the internals would look like) is to have a registry of "special attributes" names which would always be preserved by xarray's operations. This registry would live in xarray and can be updated by downstream libraries and/or accessors. (xref: https://github.com/pydata/xarray/issues/1614 ).

[1] : my preference goes for a simple PROJ4 string

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
405106421 https://github.com/pydata/xarray/issues/2288#issuecomment-405106421 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNTEwNjQyMQ== fmaussion 10050469 2018-07-15T17:42:21Z 2018-07-16T07:18:44Z MEMBER

This has been discussed from time to time, although not in such an extensive way: thanks!

For me, the long list of discussion points that you mention is a clear argument for a dedicated package which will have to solve all these issues. A good example of difficulty to tackle is the fact that geo libraries often do not use the same standards (see the incompatibility between pyproj and cartopy.crs objects as a striking example).

Regarding the xarray side: I also brought this up for my own geo-lib a while ago, and it seems that the cleanest solution to carry a crs info is to store it as (scalar) coordinate instead of an attribute, which are lost in arithmetic computations (see @shoyer 's comment).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
405113116 https://github.com/pydata/xarray/issues/2288#issuecomment-405113116 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNTExMzExNg== fmaussion 10050469 2018-07-15T19:39:28Z 2018-07-15T19:39:28Z MEMBER

geopandas would be a great template for a geoxarray package ;-)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
405110142 https://github.com/pydata/xarray/issues/2288#issuecomment-405110142 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNTExMDE0Mg== djhoese 1828519 2018-07-15T18:48:01Z 2018-07-15T18:48:01Z CONTRIBUTOR

Also I should add the geopandas library as another reference.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
405109909 https://github.com/pydata/xarray/issues/2288#issuecomment-405109909 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNTEwOTkwOQ== djhoese 1828519 2018-07-15T18:43:43Z 2018-07-15T18:44:33Z CONTRIBUTOR

@fmaussion Note that I am the one who started the PROJ.4 CRS in cartopy pull request (https://github.com/SciTools/cartopy/pull/1023) and that it was this work that I copied to pyresample for my own pyresample work since I didn't want to wait for everything to be flushed out in cartopy. You can see an example of the to_cartopy_crs method here: https://github.com/pytroll/pytroll-examples/blob/master/satpy/Cartopy%20Plot.ipynb

It's also these cartopy CRS issues that make me think that Cartopy CRS objects aren't the right solution for this type of logic as a "how to represent CRS objects". In my experience (see: my cartopy PR :wink:) and watching and talking with people at SciPy 2018 is that multiple projects have work arounds for passing their CRS/projection information to cartopy.

In my biased experience/opinion PROJ.4 is or can be used in quite a few libraries/fields. If PROJ.4 or something that accepts PROJ.4 isn't used then we might as well come up with a new standard way of defining projections...just kidding.

Side note: FYI the geotiff format does not currently accept the sweep axis parameter +sweep that PROJ.4 needs to properly describe the geos projection used by GOES-16 ABI satellite instrument data. I've contacted some of the geotiff library people at some point and from what I remember it was a dead end without a lot of work behind fixing it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807
405105396 https://github.com/pydata/xarray/issues/2288#issuecomment-405105396 https://api.github.com/repos/pydata/xarray/issues/2288 MDEyOklzc3VlQ29tbWVudDQwNTEwNTM5Ng== philippjfr 1550771 2018-07-15T17:23:39Z 2018-07-15T17:23:39Z NONE

This would also be very helpful for geoviews.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add CRS/projection information to xarray objects 341331807

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 18.326ms · About: xarray-datasette