
issue_comments


8 rows where author_association = "MEMBER" and issue = 537772490 sorted by updated_at descending


user 3

  • benbovy 5
  • dcherian 2
  • rabernat 1

issue 1

  • Idea: functionally-derived non-dimensional coordinates · 8 ✖

author_association 1

  • MEMBER · 8 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
926829182 https://github.com/pydata/xarray/issues/3620#issuecomment-926829182 https://api.github.com/repos/pydata/xarray/issues/3620 IC_kwDOAMm_X843Pkp- benbovy 4160723 2021-09-24T18:13:25Z 2021-09-24T18:13:25Z MEMBER

@djhoese not yet but hopefully soon! Most of the work on explicit indexes is currently happening in #5692, which once merged (probably after the next release) will provide all the infrastructure for custom indexes. This is quite a big internal refactoring (bigger than I initially thought) that we cannot avoid as we're changing Xarray's core data model. After that, we'll need to update some public API (Xarray object constructors, .set_index(), etc.) so that Xarray will accept custom index classes. This should take much less work than #5692, though.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Idea: functionally-derived non-dimensional coordinates 537772490
865648105 https://github.com/pydata/xarray/issues/3620#issuecomment-865648105 https://api.github.com/repos/pydata/xarray/issues/3620 MDEyOklzc3VlQ29tbWVudDg2NTY0ODEwNQ== benbovy 4160723 2021-06-22T06:53:28Z 2021-06-22T06:53:28Z MEMBER

@djhoese you're right, I thought it was better to do all the internal refactoring first but we maybe shouldn't wait too long before updating set_index and the DataArray / Dataset constructors so that you and others can start playing with custom indexes.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  Idea: functionally-derived non-dimensional coordinates 537772490
856081652 https://github.com/pydata/xarray/issues/3620#issuecomment-856081652 https://api.github.com/repos/pydata/xarray/issues/3620 MDEyOklzc3VlQ29tbWVudDg1NjA4MTY1Mg== benbovy 4160723 2021-06-07T16:25:13Z 2021-06-08T07:34:09Z MEMBER

> In your opinion will this type of CRSIndex/WCSIndex work need #5322? If so, will it also require (or benefit from) the additional internal xarray refactoring you mention in #5322?

Yes, CRSIndex/WCSIndex will need to provide an implementation for the query method added in #5322. However, this could be "as simple as" internally using PandasIndex for each 1-d coordinate in case of raster/grid data, maybe with an additional check that the values provided to .sel are in the same CRS (for example in the case of advanced indexing where xarray.DataArray or xarray.Variable objects are passed as arguments).

What will probably be trickier is finding a common way to handle CRS across different kinds of indexes (e.g., regular gridded data vs. irregular data), probably via some class inheritance hierarchy or using mixins.

> I can really see this becoming super easy for CRS-based dataset users where libraries like geoxarray (or xoak) "know" the common types of schemes/structures that might exist in the scientific field and have a simple .geo.set_index that figures out most of the parameters for .set_index by default.

In case we load such data from a file/store, thanks to the Xarray backend system, maybe we won't even need a .geo.set_index but we'll be able to build the right index(es) when opening the dataset!
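As a rough illustration of the CRS check described above, here is a minimal pure-Python sketch. It does not use Xarray's index API: `ToyCRSIndex`, `Labels`, and the `query` signature are hypothetical names invented for this example, and the per-coordinate lookup is a plain dictionary standing in for a `PandasIndex`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Labels:
    """Hypothetical container: label values tagged with the CRS they are expressed in."""
    values: tuple
    crs: str


class ToyCRSIndex:
    """Hypothetical stand-in for a CRS-aware index over 1-d coordinates."""

    def __init__(self, coords, crs):
        self.crs = crs
        # One label -> position lookup per 1-d coordinate
        # (a real index could delegate this part to a PandasIndex).
        self._lookups = {
            name: {label: pos for pos, label in enumerate(labels)}
            for name, labels in coords.items()
        }

    def query(self, **labels):
        """Return integer positions per coordinate, refusing labels in another CRS."""
        positions = {}
        for name, requested in labels.items():
            if isinstance(requested, Labels):
                if requested.crs != self.crs:
                    raise ValueError(
                        f"labels for {name!r} use CRS {requested.crs!r}, "
                        f"but the index uses {self.crs!r}"
                    )
                wanted = requested.values
            else:
                # Bare values carry no CRS, so there is nothing to check.
                wanted = tuple(requested)
            positions[name] = [self._lookups[name][value] for value in wanted]
        return positions


index = ToyCRSIndex({"x": [0.0, 10.0, 20.0], "y": [5.0, 15.0]}, crs="EPSG:4326")
same_crs = index.query(x=Labels((10.0, 20.0), "EPSG:4326"), y=[15.0])
```

Passing `Labels` tagged with a different CRS raises `ValueError` instead of silently selecting the wrong positions; whether a real index should raise or convert is a design choice this sketch does not settle.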

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Idea: functionally-derived non-dimensional coordinates 537772490
855720969 https://github.com/pydata/xarray/issues/3620#issuecomment-855720969 https://api.github.com/repos/pydata/xarray/issues/3620 MDEyOklzc3VlQ29tbWVudDg1NTcyMDk2OQ== benbovy 4160723 2021-06-07T08:29:15Z 2021-06-08T07:32:45Z MEMBER

We could also imagine

```python
# returns a new dataset with both pixel and world (possibly lazy) coordinates
new_dataset = dataset.astro.append_world({'x': 'xw', 'y': 'yw', 'z': 'zw'})

# so that we can directly select data either using the pixel coordinates...
new_dataset.sel(x=..., y=..., z=...)

# ...or using the world coordinates
new_dataset.sel(xw=..., yw=..., zw=...)

# the WCS index would be attached to both pixel and world dataset coordinates
new_dataset
<xarray.Dataset>
Dimensions:  (x: 100, y: 100, z: 100)
Coordinates:
  * x        (x) float64 ...
  * xw       (x) float64 ...
  * y        (y) float64 ...
  * yw       (y) float64 ...
  * z        (z) float64 ...
  * zw       (z) float64 ...
Data variables:
    field    (x, y, z) float64 ....
Indexes:
    x, y, z, xw, yw, zw    WCSIndex
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Idea: functionally-derived non-dimensional coordinates 537772490
855710036 https://github.com/pydata/xarray/issues/3620#issuecomment-855710036 https://api.github.com/repos/pydata/xarray/issues/3620 MDEyOklzc3VlQ29tbWVudDg1NTcxMDAzNg== benbovy 4160723 2021-06-07T08:16:35Z 2021-06-07T08:16:35Z MEMBER

This looks like a nice use case for Xarray's forthcoming custom index feature.

How I see CRS/WCS-aware Xarray datasets with custom indexes:

  • A set of coordinate(s) and their attributes hold data or metadata relevant for public use and that could be easily (de)serialized

  • A custom index (CRSIndex or WCSIndex) provides CRS/WCS-aware implementations of common Xarray operations such as alignment (merge/concat) and data selection (sel), via Xarray.Index's equals, union, intersection and query methods added in #5102 and #5322 (not yet ready for use outside of Xarray). Such a custom index may also be used to hold data that is tricky to propagate by other means, e.g., internal information like "functional" coordinate parameters or a crs object. Xarray indexes should definitely provide more flexibility than coordinate data, attributes, or accessor attributes for propagating this kind of information.

  • Xarray accessors may be used to extend the Dataset/DataArray public API. They could use the information stored in the CRSIndex/WCSIndex, e.g., add a crs read-only property that returns the crs object stored in CRSIndex, or add some extract_crs_parameters method to extract the parameters and store them in Dataset/DataArray attributes, similarly to what @djhoese suggests in his comment above.

For this use case a possible workflow would then be something like this:

```python
# create or open an Xarray dataset with x, y, z "pixel" (possibly lazy) coordinates
# and set a WCS index
dataset = (
    xr.Dataset(...)
    .set_index(['x', 'y', 'z'], WCSIndex, wcs_params={...})
)

# select data using pixel coordinates
dataset.sel(x=..., y=..., z=...)

# select data using world coordinates (via the "astro" accessor,
# which may access methods/attributes of the WCS index)
dataset.astro.sel_world(x=..., y=..., z=...)

# return a new dataset where the x,y,z "pixel" coordinates are replaced by the "world" coordinates
# (again using the WCS index, and propagating it to the returned dataset)
world_dataset = dataset.astro.pixel_to_world(['x', 'y', 'z'])

# select data using world coordinates
world_dataset.sel(x=..., y=..., z=...)

# select data using pixel coordinates (via the "astro" accessor)
world_dataset.astro.sel_pixel(x=..., y=..., z=...)

# this could be reverted
pixel_dataset = world_dataset.astro.world_to_pixel(['x', 'y', 'z'])
assert pixel_dataset.identical(dataset)

# depending on the implementation in WCSIndex, would either raise an error
# or implicitly convert to either pixel or world coordinates
xr.merge([world_dataset, another_pixel_dataset])
```
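The pixel/world round trip in this workflow can be illustrated with a toy, pure-Python sketch. It is not the proposed Xarray or Astropy API: `LinearWCS` and `ToyWCSIndex` are hypothetical names, and the transform is assumed to be a simple 1-d linear mapping with FITS-style `crpix`, `crval` and `cdelt` parameters, whereas a real WCS can be an arbitrary function. The point is only that the index stores the transform parameters rather than a world-coordinate array.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearWCS:
    """Hypothetical 1-d linear pixel <-> world transform."""
    crpix: float  # reference pixel
    crval: float  # world value at the reference pixel
    cdelt: float  # world increment per pixel

    def pixel_to_world(self, pixel):
        return self.crval + self.cdelt * (pixel - self.crpix)

    def world_to_pixel(self, world):
        return self.crpix + (world - self.crval) / self.cdelt


class ToyWCSIndex:
    """Hypothetical index that keeps only the transform, never a world array."""

    def __init__(self, wcs, size):
        self.wcs = wcs
        self.size = size

    def sel_pixel(self, pixel):
        # Nearest-pixel lookup; positions outside the axis are rejected.
        position = round(pixel)
        if not 0 <= position < self.size:
            raise KeyError(pixel)
        return position

    def sel_world(self, world):
        # World selection is pixel selection after inverting the transform.
        return self.sel_pixel(self.wcs.world_to_pixel(world))


wcs = LinearWCS(crpix=0.0, crval=500.0, cdelt=0.5)
index = ToyWCSIndex(wcs, size=100)

position = index.sel_world(510.0)
round_trip = wcs.world_to_pixel(wcs.pixel_to_world(42.0))
```

Because both selection paths go through the same stored parameters, converting to world coordinates and back is lossless here, which is the property the `identical` assertion in the workflow relies on.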

{
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 2,
    "rocket": 0,
    "eyes": 0
}
  Idea: functionally-derived non-dimensional coordinates 537772490
565671981 https://github.com/pydata/xarray/issues/3620#issuecomment-565671981 https://api.github.com/repos/pydata/xarray/issues/3620 MDEyOklzc3VlQ29tbWVudDU2NTY3MTk4MQ== dcherian 2448579 2019-12-14T02:18:15Z 2019-12-14T02:19:17Z MEMBER

I should have said "discrete lazily evaluated form (which we support through dask)". I think we already have what you want in principle (caveats at the end).

Here's an example:

```python
import dask.array
import numpy as np
import xarray as xr

xr.set_options(display_style="html")


def arbitrary_function(dataset):
    return dataset["a"] * dataset["wavelength"] * dataset.attrs["wcs_param"]


ds = xr.Dataset()

# construct a dask array.
# In practice this could represent an on-disk dataset,
# with data reads only occurring when necessary
ds["a"] = xr.DataArray(
    dask.array.ones((10,)),
    dims=["wavelength"],
    coords={"wavelength": np.arange(10)},
)

# some coordinate system parameter
ds.attrs["wcs_param"] = 1.0

# complicated pixel to world function
# no compute happens since we are working with dask arrays
# so this is quite cheap.
ds.coords["azimuth"] = arbitrary_function(ds)
ds
```

So you can carry around your coordinate system parameters in the .attrs dictionary and the non-dimensional coordinate azimuth is only evaluated when needed, e.g. when plotting:

```python
# Both 'a' and 'azimuth' are computed now, since actual values are required to plot
ds.a.plot(x="azimuth")
```

In practice, there are a few limitations. @djhoese and @snowman2 may have useful perspective here.

  1. xarray tends to compute "non-dimensional coordinates" more than necessary. The more egregious examples have been fixed (#3068, #3311, #3454, #3453) but there may still be some places where fixes are needed (#3588).
  2. there's some discussion about carrying around Earth-specific coordinate system parameters here: https://github.com/pydata/xarray/issues/2288; https://github.com/pydata/xarray/issues/2996.

Additional info:

  1. https://docs.dask.org/en/latest/array.html
  2. https://xarray.pydata.org/en/stable/dask.html
  3. https://blog.dask.org/2019/06/20/load-image-data

PS: If it helps, I'd be happy to chat over skype for a half hour getting you oriented with how we do things.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Idea: functionally-derived non-dimensional coordinates 537772490
565625792 https://github.com/pydata/xarray/issues/3620#issuecomment-565625792 https://api.github.com/repos/pydata/xarray/issues/3620 MDEyOklzc3VlQ29tbWVudDU2NTYyNTc5Mg== dcherian 2448579 2019-12-13T22:05:07Z 2019-12-13T22:05:07Z MEMBER

It would also be good to hear about "sub-pixel metadata" → this seems to be the main reason why you want to carry around the analytic rather than the discrete evaluated form (which we basically support through dask). Is that right or am I missing something?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Idea: functionally-derived non-dimensional coordinates 537772490
565622345 https://github.com/pydata/xarray/issues/3620#issuecomment-565622345 https://api.github.com/repos/pydata/xarray/issues/3620 MDEyOklzc3VlQ29tbWVudDU2NTYyMjM0NQ== rabernat 1197350 2019-12-13T21:53:55Z 2019-12-13T21:53:55Z MEMBER

Thanks for reaching out Erik! We’d love to find a way to better support Astro data in xarray. Before digging deeper, I just want to ask a clarification question. When you say “arbitrary complex mathematical functions”: what are the arguments / inputs to these functions?

Presumably they have to be evaluated at some point, i.e., for plotting. Can you describe what happens when the time comes to turn the arbitrary functions into actual numbers?

Your answer will help us respond more accurately to your question.

On Dec 13, 2019, at 3:51 PM, Erik Tollerud notifications@github.com wrote:

 @Cadair and I are from the solar and astrophysics communities, respectively (particularly SunPy and Astropy). In our fields, we have a concept of something called "World Coordinate Systems" (WCS) which basically are arbitrary mappings from pixel coordinates (which is often but not necessarily the same as the index) to physical coordinates. (For more on this and associated Python/Astropy APIs, see this document). If we are reading correctly, this concept maps roughly onto the xarray concept of "Non-dimension coordinates".

However, a critical difference is this: WCS are usually expressed as arbitrary complex mathematical functions, rather than coordinate arrays, as it is crucial to some of the science cases to carry sub-pixel or other coordinate-related metadata around along with the WCS.

So our question is: is it in-scope for xarray non-dimensional coordinates to be extended to be functional instead of having to be arrays? I.e., have the coordinate arrays be generated on-the-fly from some function instead of being realized as arrays at creation-time. We have thought about several ways this might be specified and are willing to do some trial implementations, but are first asking here if it is likely to be

  • Easy
  • Hard
  • Impossible
  • PR will immediately be rejected on philosophical grounds, regardless?

Thanks!
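To make the question concrete, here is a minimal pure-Python sketch of a coordinate that is generated on the fly from a function rather than stored as an array. It is not an xarray feature: `FunctionalCoordinate`, its `at` method, and the `wavelength` function with its `start` and `step` parameters are all hypothetical and chosen only for illustration.

```python
class FunctionalCoordinate:
    """Hypothetical coordinate defined by a function of the pixel index."""

    def __init__(self, func, size, **params):
        self.func = func
        self.size = size
        self.params = params

    def __len__(self):
        return self.size

    def __getitem__(self, key):
        # Values are generated only for the pixels that are requested.
        if isinstance(key, slice):
            return [
                self.func(i, **self.params)
                for i in range(*key.indices(self.size))
            ]
        if not -self.size <= key < self.size:
            raise IndexError(key)
        return self.func(key % self.size, **self.params)

    def at(self, pixel):
        """Evaluate at a fractional (sub-pixel) position."""
        return self.func(pixel, **self.params)


def wavelength(pixel, start, step):
    # Stand-in for an arbitrary pixel -> world function.
    return start + step * pixel


coord = FunctionalCoordinate(wavelength, size=1000, start=400.0, step=0.25)
first_three = coord[:3]
sub_pixel = coord.at(10.5)
```

Integer and slice indexing behave like a realized array, while `at` answers sub-pixel queries that a precomputed array of per-pixel values could not represent directly.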


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Idea: functionally-derived non-dimensional coordinates 537772490


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);