issue_comments

6 rows where issue = 718716799 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
708686170 https://github.com/pydata/xarray/issues/4501#issuecomment-708686170 https://api.github.com/repos/pydata/xarray/issues/4501 MDEyOklzc3VlQ29tbWVudDcwODY4NjE3MA== chrisroat 1053153 2020-10-14T22:07:38Z 2020-10-14T22:07:38Z CONTRIBUTOR

My mental model of what's happening may not be correct. I did want sel(), isel(), and squeeze() to all operate the same way (and maybe someday even work on non-dim coordinates!). Replacing squeeze() with isel() in my initial example gives the same failure, in a case where I would want it to work:

```
import numpy as np
import xarray as xr

arr1 = xr.DataArray(np.zeros((1,5)), dims=['y', 'x'], coords={'e': ('y', [10])})
arr2 = arr1.isel(y=0).expand_dims('y')
xr.testing.assert_identical(arr1, arr2)
```

```
AssertionError: Left and right DataArray objects are not identical

Differing coordinates:
L   e        (y) int64 10
R   e        int64 10
```

The non-dim coordinate e has forgotten that it was associated with y. I'd prefer that this association remained.

Where it gets really interesting is in the following example where the non-dim coordinate moves from one dim to another. I understand the logic here (since the isel() were done in a way that correlates 'y' and 'z'). In my proposal, this would not happen without explicit user intervention -- which may actually be desired here (it's sorta surprising):

```
import numpy as np
import xarray as xr

arr = xr.DataArray(np.zeros((2, 2, 5)), dims=['z', 'y', 'x'], coords={'e': ('y', [10, 20])})
print(arr.coords)
print()

arr0 = arr.isel(z=0,y=0)
arr1 = arr.isel(z=1,y=1)

arr_concat = xr.concat([arr0, arr1], 'z')
print(arr_concat.coords)
```

```
Coordinates:
    e        (y) int64 10 20

Coordinates:
    e        (z) int64 10 20
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Scalar non-dimension coords forget their heritage 718716799
708588777 https://github.com/pydata/xarray/issues/4501#issuecomment-708588777 https://api.github.com/repos/pydata/xarray/issues/4501 MDEyOklzc3VlQ29tbWVudDcwODU4ODc3Nw== shoyer 1217238 2020-10-14T18:41:21Z 2020-10-14T18:41:32Z MEMBER

If we changed isel() to only modify data variables, then we would be in trouble with something like isel(x=slice(3)). The coordinate system would be inconsistent if we slice only the data variables but not the coordinates.

There's a bit of a conflict here between two desirable properties:

  1. Methods on a Dataset only modify data variables, leaving coordinates unchanged
  2. Methods on a Dataset keep the entire coordinate system for the Dataset consistent, including coordinates
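To illustrate the second property, here is a minimal sketch of how `isel(x=slice(3))` behaves today: the data variable and every coordinate along `x` are sliced together. The dataset and the names `a` and `label` are made up for the example and do not come from the issue.

```python
import numpy as np
import xarray as xr

# Hypothetical toy dataset: one data variable plus two coordinates along 'x'.
ds = xr.Dataset(
    data_vars={"a": ("x", np.arange(5.0))},
    coords={
        "x": np.arange(5),                     # dimension coordinate
        "label": ("x", [10, 20, 30, 40, 50]),  # non-dimension coordinate
    },
)

sliced = ds.isel(x=slice(3))

# The data variable and both coordinates are sliced together,
# so their lengths along 'x' still agree.
print(sliced["a"].shape, sliced.coords["x"].shape, sliced.coords["label"].shape)
```

If only `a` were sliced, `x` and `label` would still have length 5 and would no longer line up with the data.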
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Scalar non-dimension coords forget their heritage 718716799
708561579 https://github.com/pydata/xarray/issues/4501#issuecomment-708561579 https://api.github.com/repos/pydata/xarray/issues/4501 MDEyOklzc3VlQ29tbWVudDcwODU2MTU3OQ== chrisroat 1053153 2020-10-14T17:51:15Z 2020-10-14T17:51:15Z CONTRIBUTOR

> One problem with this -- at least for now -- is that xarray currently doesn't allow coordinates on DataArray objects to have dimensions that don't appear on the DataArray itself.

Ah, then that would be the desire here.

> It might also be surprising that this would make squeeze('y') inconsistent with isel(y=0)

The suggestion here is that both of these would behave the same. The MCVE was just for the squeeze case, but I expect that isel and sel would both allow non-dim coords to maintain the reference to their original dim (even if it becomes a non-dim coord itself).
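For reference, a small sketch of the current behavior, reusing the array from the original report: `squeeze('y')` and `isel(y=0)` already return identical results, and in both the coordinate `e` ends up scalar. This only shows the present state, not the proposed change.

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(
    np.zeros((1, 5)),
    dims=["y", "x"],
    coords={"y": [42], "e": ("y", [10])},
)

squeezed = arr.squeeze("y")
indexed = arr.isel(y=0)

# Both routes drop the 'y' dimension in the same way.
xr.testing.assert_identical(squeezed, indexed)

# 'e' no longer carries any dimension after either operation.
print(squeezed.coords["e"].dims)
```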

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Scalar non-dimension coords forget their heritage 718716799
708167473 https://github.com/pydata/xarray/issues/4501#issuecomment-708167473 https://api.github.com/repos/pydata/xarray/issues/4501 MDEyOklzc3VlQ29tbWVudDcwODE2NzQ3Mw== shoyer 1217238 2020-10-14T05:33:59Z 2020-10-14T05:33:59Z MEMBER

> What I think would be more useful:
>
>     <xarray.DataArray (x: 5)>
>     array([0., 0., 0., 0., 0.])
>     Coordinates:
>         y        int64 42
>         e        (y) int64 10   <---- Note the (y)
>     Dimensions without coordinates: x

Thanks for clarifying!

One problem with this -- at least for now -- is that xarray currently doesn't allow coordinates on DataArray objects to have dimensions that don't appear on the DataArray itself.

It might also be surprising that this would make squeeze('y') inconsistent with isel(y=0)
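The restriction on coordinate dimensions can be seen with a short sketch: building a DataArray whose coordinate `e` refers to a dimension `y` that the array itself lacks is rejected. The exact exception message varies between xarray versions, so the example only catches `ValueError` and prints whatever it gets.

```python
import numpy as np
import xarray as xr

try:
    # 'e' is defined along 'y', but the array only has the dimension 'x'.
    xr.DataArray(np.zeros(5), dims=["x"], coords={"e": ("y", [10])})
except ValueError as err:
    rejected = True
    print(type(err).__name__, err)
else:
    rejected = False

print("rejected:", rejected)
```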

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Scalar non-dimension coords forget their heritage 718716799
706640485 https://github.com/pydata/xarray/issues/4501#issuecomment-706640485 https://api.github.com/repos/pydata/xarray/issues/4501 MDEyOklzc3VlQ29tbWVudDcwNjY0MDQ4NQ== chrisroat 1053153 2020-10-11T02:37:00Z 2020-10-11T02:37:00Z CONTRIBUTOR

I'm not a huge fan of adding arguments for a case that rarely comes up (I presume). 

One difference in your example is that the 'e' coord is never based on 'y', so I would not want it expanded -- so I'd still like that test to pass.

The case I'm interested in is where the non-dimension coords are based on existing dimension coords that get squeezed.

So in this example:

```
import numpy as np
import xarray as xr

arr1 = xr.DataArray(np.zeros((1,5)), dims=['y', 'x'], coords={'y': [42], 'e': ('y', [10])})
arr1.squeeze()
```

The squeezed array looks like:

    <xarray.DataArray (x: 5)>
    array([0., 0., 0., 0., 0.])
    Coordinates:
        y        int64 42
        e        int64 10
    Dimensions without coordinates: x

What I think would be more useful:

    <xarray.DataArray (x: 5)>
    array([0., 0., 0., 0., 0.])
    Coordinates:
        y        int64 42
        e        (y) int64 10   <---- Note the (y)
    Dimensions without coordinates: x

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Scalar non-dimension coords forget their heritage 718716799
706639713 https://github.com/pydata/xarray/issues/4501#issuecomment-706639713 https://api.github.com/repos/pydata/xarray/issues/4501 MDEyOklzc3VlQ29tbWVudDcwNjYzOTcxMw== shoyer 1217238 2020-10-11T02:26:42Z 2020-10-11T02:26:42Z MEMBER

Hi @chrisroat, thanks for the clear bug report!

It would indeed be nice if squeeze followed by expand_dims preserved the original inputs, but I don't think that is possible in general -- the squeeze operation removes information.

For example, this array does currently satisfy your desired property, but wouldn't if we made the change you request:

    import numpy as np
    import xarray as xr

    arr1 = xr.DataArray(np.zeros((1,5)), dims=['y', 'x'], coords={'e': 10})
    arr2 = arr1.squeeze('y').expand_dims('y')
    xr.testing.assert_identical(arr1, arr2)  # passes

I suspect our best option for achieving this behavior would be to add another optional argument to expand_dims, e.g., perhaps

  • expand_dims(..., expand_coords=False): don't expand coordinates (default behavior)
  • expand_dims(..., expand_coords=True): expand all coordinates
  • expand_dims(..., expand_coords=['e']): only expand the coordinate 'e'
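Until something like that exists, the `expand_coords=['e']` case can be approximated with calls xarray already has. The sketch below is only an illustration: `expand_dims_with_coords` is a hypothetical helper, not part of xarray, and it assumes the named coordinates are scalar after the squeeze.

```python
import numpy as np
import xarray as xr


def expand_dims_with_coords(arr, dim, coord_names):
    """Hypothetical helper: expand `dim`, then attach the named scalar coords to it."""
    expanded = arr.expand_dims(dim)
    new_coords = {}
    for name in coord_names:
        coord = expanded.coords[name]
        if coord.ndim != 0:
            raise ValueError(f"coordinate {name!r} is not scalar")
        # Turn the 0-d value into a length-1 array along the new dimension.
        new_coords[name] = (dim, np.atleast_1d(coord.values))
    return expanded.assign_coords(new_coords)


arr1 = xr.DataArray(np.zeros((1, 5)), dims=["y", "x"], coords={"e": ("y", [10])})
arr2 = expand_dims_with_coords(arr1.squeeze("y"), "y", ["e"])

# The round trip now restores 'e' as a coordinate along 'y'.
xr.testing.assert_identical(arr1, arr2)
```

The caller still has to say which coordinates belong to the new dimension, because the squeezed array no longer records that.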

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Scalar non-dimension coords forget their heritage 718716799


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);