home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

9 rows where issue = 977544678 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 6

  • dcherian 3
  • benbovy 2
  • mjwillson 1
  • max-sixty 1
  • jbusecke 1
  • TomNicholas 1

author_association 3

  • MEMBER 7
  • CONTRIBUTOR 1
  • NONE 1

issue 1

  • Shoudn't `assert_allclose` transpose datasets? · 9 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1128770505 https://github.com/pydata/xarray/issues/5733#issuecomment-1128770505 https://api.github.com/repos/pydata/xarray/issues/5733 IC_kwDOAMm_X85DR6vJ mjwillson 4502 2022-05-17T11:48:26Z 2022-05-17T11:48:26Z NONE

+1 for a check_dim_order option to .equals, assert_equal that can be disabled. (Ideally I think the default would be not to check dim order, but that ship has sailed now).

Or failing that, it would at least be nice to have xarray.testing.assert_equal_modulo_dim_order etc. When writing tests I usually don't care about dimension order and it's frustrating to have to manually do e.g. xarray.testing.assert_allclose(a, b.transpose(a.dims)).

As pointed out, most of the xarray API is dimension-order-invariant and so it's odd to have no supported way to do comparisons in a dimension-order-invariant way.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Shoudn't `assert_allclose` transpose datasets? 977544678
908436153 https://github.com/pydata/xarray/issues/5733#issuecomment-908436153 https://api.github.com/repos/pydata/xarray/issues/5733 IC_kwDOAMm_X842JaK5 benbovy 4160723 2021-08-30T15:24:17Z 2021-08-30T15:24:17Z MEMBER

@jbusecke I agree with your point of view that "xarray-style" comparison is more practical in a scientific workflow. Especially if dimension order is irrelevant for most (all?) xarray operations, ignoring the order for assert_allclose / assert_equal too makes sense and is consistent.

However, it might be also dangerous / harmful if the workflow includes data conversion between labeled vs. unlabelled formats. There's a risk of checking for equality with xarray, then later converting to numpy and assuming that arrays are equal without feeling the need to check again. If dimension sizes are the same this might lead to very subtle bugs.

Since it is easy to ignore or forget about default values, having a check_dim_order option that defaults to True is probably safer IMHO, although slightly less convenient. No strong views, though.

@dcherian I like your idea but I'm not sure what's best between your code snippet and checking equality of aligned dimensions datasets only if non-dimension-aligned are not equal.

{
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 2,
    "rocket": 0,
    "eyes": 0
}
  Shoudn't `assert_allclose` transpose datasets? 977544678
908433305 https://github.com/pydata/xarray/issues/5733#issuecomment-908433305 https://api.github.com/repos/pydata/xarray/issues/5733 IC_kwDOAMm_X842JZeZ dcherian 2448579 2021-08-30T15:20:29Z 2021-08-30T15:20:29Z MEMBER

what the default comparison behaviour should be

I don't think we can change this because it's very backwards incompatible and affects tests in downstream packages.

But :+1: to adding a flag allowing users to opt out of dimension order checking.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Shoudn't `assert_allclose` transpose datasets? 977544678
908406439 https://github.com/pydata/xarray/issues/5733#issuecomment-908406439 https://api.github.com/repos/pydata/xarray/issues/5733 IC_kwDOAMm_X842JS6n TomNicholas 35968931 2021-08-30T14:48:23Z 2021-08-30T14:48:23Z MEMBER

I think we definitely need a flag, because we have two conflicting use cases: 1) Developers who use asserts in test suites and might well care about differences like python xr.testing.assert_allclose(ds1.data, ds2.data) # ok np.testing.assert_allclose(ds1.data.values, ds2.data.values) # fails 2) Users for whom xarray allows to forget about dimension order, and would expect transposed data to be equal.

The easiest way to cover both is to have an option to switch between them.

In my opinion the only question is what the default comparison behaviour should be. I like the idea of adding a check_dim_order kwarg to assert_equal, assert_allclose, and assert_identical, but have the flag default to False for the first two functions and True for the last one.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Shoudn't `assert_allclose` transpose datasets? 977544678
908389845 https://github.com/pydata/xarray/issues/5733#issuecomment-908389845 https://api.github.com/repos/pydata/xarray/issues/5733 IC_kwDOAMm_X842JO3V jbusecke 14314623 2021-08-30T14:27:01Z 2021-08-30T14:27:01Z CONTRIBUTOR

Strictly speaking, the values are different I guess. However I think this error would be clearer if it said that the dimension order was different but the values are equal once the dimensions are transposed.

I guess this comes down a bit to a philosophical question related to @benbovy s comment above. You can either make this operation be similar to the numpy equivalent (with some more xarray specific checks) or it can check whether the values at a certain combo of labels are the same/close.

The latter would be the way I think about data in xarray as a user. To me the removal of axis logic (via labels) is one of the biggest draws for myself, but importantly I also pitch this as one of the big reasons to switch to xarray for beginners.

I would argue that a 'strict' (numpy style) comparision is less practical in a scientific workflow and we do have the numpy implementation to achieve that functionality. So I would ultimately argue that xarray should check closeness between values at certain label positions by default.

However, this might be very opinionated on my end, and a better error message would already be a massive improvement.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Shoudn't `assert_allclose` transpose datasets? 977544678
907901984 https://github.com/pydata/xarray/issues/5733#issuecomment-907901984 https://api.github.com/repos/pydata/xarray/issues/5733 IC_kwDOAMm_X842HXwg dcherian 2448579 2021-08-29T23:50:59Z 2021-08-29T23:50:59Z MEMBER

I have a related question

``` python import xarray as xr

da = xr.DataArray([[1, 1, 1], [2, 2, 2]], dims=("x", "y"))

xr.testing.assert_identical(da, da.transpose()) AssertionError: Left and right DataArray objects are not identical Differing dimensions: (x: 2, y: 3) != (y: 3, x: 2) Differing values: L array([[1, 1, 1], [2, 2, 2]]) R array([[1, 2], [1, 2], [1, 2]]) ```

Strictly speaking, the values are different I guess. However I think this error would be clearer if it said that the dimension order was different but the values are equal once the dimensions are transposed.

I.e. we could

python if set(a.dims) == set(b.dims): a = a.transpose(b.dims) # check values and raise if actually different else: # current behaviour

Is this a good idea?

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Shoudn't `assert_allclose` transpose datasets? 977544678
904683190 https://github.com/pydata/xarray/issues/5733#issuecomment-904683190 https://api.github.com/repos/pydata/xarray/issues/5733 IC_kwDOAMm_X8417F62 dcherian 2448579 2021-08-24T14:16:45Z 2021-08-24T14:16:45Z MEMBER

What about a check_dim_order option? Also, it would be useful if information about non-matching dimension order was shown more explicitly in the assertion error message.

This sounds good to me. We should also have a check_attrs kwarg since that's another thing that only identical checks.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Shoudn't `assert_allclose` transpose datasets? 977544678
904552118 https://github.com/pydata/xarray/issues/5733#issuecomment-904552118 https://api.github.com/repos/pydata/xarray/issues/5733 IC_kwDOAMm_X8416l62 benbovy 4160723 2021-08-24T11:21:22Z 2021-08-24T11:21:22Z MEMBER

Is there any operation in xarray where dimension order matters? If yes, I'm not sure if it's a good idea to allow transposed dimension pass assert_allclose or assert_equal. But even otherwise, I'd find it a bit weird for a "numpy equivalent" function to have:

python xr.testing.assert_allclose(ds1.data, ds2.data) # ok np.testing.assert_allclose(ds1.data.values, ds2.data.values) # fails

What about a check_dim_order option? Also, it would be useful if information about non-matching dimension order was shown more explicitly in the assertion error message.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Shoudn't `assert_allclose` transpose datasets? 977544678
904240540 https://github.com/pydata/xarray/issues/5733#issuecomment-904240540 https://api.github.com/repos/pydata/xarray/issues/5733 IC_kwDOAMm_X8415Z2c max-sixty 5635139 2021-08-24T01:09:19Z 2021-08-24T01:09:19Z MEMBER

I agree this is another station on the journey between "byte identical" and "practically similar", which we've discovered over the past couple of years... :)

I would lean towards having it fail for assert_identical, with no view on all_close; definitely open to it coercing transpose.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Shoudn't `assert_allclose` transpose datasets? 977544678

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 388.295ms · About: xarray-datasette