issue_comments


12 rows where issue = 505493879 sorted by updated_at descending


user 5

  • chrisroat 3
  • shoyer 3
  • dcherian 3
  • pmallas 2
  • jhamman 1

author_association 2

  • MEMBER 7
  • CONTRIBUTOR 5

issue 1

  • xarray.DataArray.where always returns array of float64 regardless of input dtype · 12
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
788532882 https://github.com/pydata/xarray/issues/3390#issuecomment-788532882 https://api.github.com/repos/pydata/xarray/issues/3390 MDEyOklzc3VlQ29tbWVudDc4ODUzMjg4Mg== shoyer 1217238 2021-03-02T02:42:37Z 2021-03-02T02:42:37Z MEMBER

> Shall we raise a warning in where advising the more-efficient syntax? Or shall we skip the call to where_method?

I'm not sure that either of these is a good idea.

The problem with raising a warning is that this is well-defined behavior. It may not always be useful, but well-defined yet useless behavior arises all the time in programs, so it's annoying to raise a warning for a special case.

The problem with skipping where_method is that we then end up with a potentially inconsistent dtype, depending on the selection. These sorts of special cases can be quite frustrating to program around.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.DataArray.where always returns array of float64 regardless of input dtype 505493879
788507019 https://github.com/pydata/xarray/issues/3390#issuecomment-788507019 https://api.github.com/repos/pydata/xarray/issues/3390 MDEyOklzc3VlQ29tbWVudDc4ODUwNzAxOQ== dcherian 2448579 2021-03-02T01:42:09Z 2021-03-02T01:42:09Z MEMBER

> It seems rather unlikely to me to have an example of where with drop=True where the condition is exactly aligned with the grid, such that there are no missing values.

Actually, this is a really common pattern:

    ds = xr.tutorial.open_dataset('air_temperature')
    ds.where(ds.time.dt.hour.isin([0, 12]), drop=True)

The efficient way to do this is

    ds.loc[{"time": ds.time.dt.hour.isin([0, 12])}]

or

    ds.sel(time=ds.time.dt.hour.isin([0, 12]))

At this point (https://github.com/pydata/xarray/blob/48378c4b11c5c2672ff91396d4284743165b4fbe/xarray/core/common.py#L1270-L1273), cond is all True, and applying where is basically a totally useless copy since the isel has already copied the data.

Shall we raise a warning in where advising the more-efficient syntax? Or shall we skip the call to where_method?
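The comparison above can be sketched with a synthetic dataset in place of the tutorial file, so that it runs offline. The variable name `air`, the `int16` dtype and the 6-hourly time axis are assumptions made for illustration, not properties of the real tutorial data.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the tutorial dataset (assumption): two days of
# 6-hourly integer data, so timestamps fall at hours 0, 6, 12 and 18.
time = np.datetime64("2000-01-01T00", "ns") + np.arange(8) * np.timedelta64(6, "h")
ds = xr.Dataset(
    {"air": (("time", "x"), np.arange(24, dtype=np.int16).reshape(8, 3))},
    coords={"time": time},
)

mask = ds.time.dt.hour.isin([0, 12])

by_where = ds.where(mask, drop=True)  # masks first, then drops; integers are upcast
by_sel = ds.sel(time=mask)            # plain boolean indexing; dtype is untouched
by_loc = ds.loc[{"time": mask}]       # alternative spelling of the selection above

print(by_where["air"].dtype, by_sel["air"].dtype, by_loc["air"].dtype)
```

All three results should hold the same values; only the `where` variant changes the dtype, because it has to make room for a NaN fill value even though nothing ends up masked.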

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.DataArray.where always returns array of float64 regardless of input dtype 505493879
650903191 https://github.com/pydata/xarray/issues/3390#issuecomment-650903191 https://api.github.com/repos/pydata/xarray/issues/3390 MDEyOklzc3VlQ29tbWVudDY1MDkwMzE5MQ== chrisroat 1053153 2020-06-29T04:52:11Z 2020-06-29T04:52:11Z CONTRIBUTOR

> > What about the case of no missing values, when other wouldn't be needed? Could the same dtype be returned then? This is my case, since I'm re-purposing where to do sel for non-dimension coordinates.
>
> Could you give a concrete example of what this would look like?
>
> It seems rather unlikely to me to have an example of where with drop=True where the condition is exactly aligned with the grid, such that there are no missing values.
>
> I guess it could happen if you're trying to index out exactly one element along a dimension?

That's exactly right. I am just selecting one slice of a data array, using data.where(data.coords['stain'] == 'DAPI').

> In the long term, the cleaner solution for this will be some form of support for more flexible / multi-dimensional indexing.

Agreed. Once I actually get things running, I'll be ready to try and contribute fixes for all my TODOs that reference xarray github issues. :)
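One way to pick a slice by a non-dimension coordinate without going through `where` is to turn the label comparison into integer positions and index with `isel`. This is only a sketch: the array layout, the `channel` dimension name and the stain labels other than `DAPI` are invented for the example.

```python
import numpy as np
import xarray as xr

# Hypothetical image stack: "stain" labels each channel but is not an index.
data = xr.DataArray(
    np.arange(12, dtype=np.uint16).reshape(3, 2, 2),
    dims=("channel", "y", "x"),
    coords={"stain": ("channel", ["DAPI", "GFP", "RFP"])},
)

# Convert the label comparison into integer positions, then index positionally.
positions = np.flatnonzero(data.coords["stain"].values == "DAPI")
dapi = data.isel(channel=positions)

print(dapi.dtype, dict(dapi.sizes))
```

Because this is pure indexing, no fill value is involved and the `uint16` dtype carries through unchanged.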

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.DataArray.where always returns array of float64 regardless of input dtype 505493879
650899323 https://github.com/pydata/xarray/issues/3390#issuecomment-650899323 https://api.github.com/repos/pydata/xarray/issues/3390 MDEyOklzc3VlQ29tbWVudDY1MDg5OTMyMw== shoyer 1217238 2020-06-29T04:34:49Z 2020-06-29T04:34:49Z MEMBER

> What about the case of no missing values, when other wouldn't be needed? Could the same dtype be returned then? This is my case, since I'm re-purposing where to do sel for non-dimension coordinates.

Could you give a concrete example of what this would look like?

It seems rather unlikely to me to have an example of where with drop=True where the condition is exactly aligned with the grid, such that there are no missing values.

I guess it could happen if you're trying to index out exactly one element along a dimension?

In the long term, the cleaner solution for this will be some form of support for more flexible / multi-dimensional indexing.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.DataArray.where always returns array of float64 regardless of input dtype 505493879
650889746 https://github.com/pydata/xarray/issues/3390#issuecomment-650889746 https://api.github.com/repos/pydata/xarray/issues/3390 MDEyOklzc3VlQ29tbWVudDY1MDg4OTc0Ng== chrisroat 1053153 2020-06-29T03:49:27Z 2020-06-29T03:49:27Z CONTRIBUTOR

What about the case of no missing values, when other wouldn't be needed? Could the same dtype be returned then? This is my case, since I'm re-purposing where to do sel for non-dimension coordinates.

I'm capable of just recasting for my use case, if this is becoming an idea that would be difficult to maintain/document.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.DataArray.where always returns array of float64 regardless of input dtype 505493879
649964438 https://github.com/pydata/xarray/issues/3390#issuecomment-649964438 https://api.github.com/repos/pydata/xarray/issues/3390 MDEyOklzc3VlQ29tbWVudDY0OTk2NDQzOA== shoyer 1217238 2020-06-26T04:53:32Z 2020-06-26T04:53:32Z MEMBER

The trouble with returning the same dtype for uint16 values is that there's no easy way to have a missing value for uint16.

I don't entirely remember why we don't allow other in where if drop=True, but indeed that seems like a clean solution.

I suspect it might have something to do with alignment. But as long as other is already aligned with the result of aligning self and other (e.g., if other is a scalar, which is probably typical), then it should be fine to allow the other argument.
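To make the scalar-`other` case concrete, here is a hand-rolled sketch that trims the array to the region where the condition holds and then fills the remaining masked cells with a scalar of the input dtype. It is an illustration of the idea only, not xarray's own implementation of `drop=True`; the helper names `keep` and `fill` are made up for the example.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(12, dtype=np.uint16).reshape(3, 4), dims=("y", "x"))
cond = da > 5  # the first row (y=0) is entirely False

# For each dimension, keep the positions where cond is True somewhere
# along the other dimensions (a hand-rolled stand-in for drop=True).
keep = {
    dim: np.flatnonzero(cond.any(dim=[d for d in cond.dims if d != dim]).values)
    for dim in cond.dims
}

# Fill with a scalar of the input dtype, so no promotion to float is needed.
fill = np.uint16(0)
trimmed = da.isel(keep).where(cond.isel(keep), other=fill)

print(trimmed.dtype, trimmed.shape)
```

A scalar fill is trivially aligned with any result, which is why this case avoids the alignment concern raised above.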

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.DataArray.where always returns array of float64 regardless of input dtype 505493879
649861589 https://github.com/pydata/xarray/issues/3390#issuecomment-649861589 https://api.github.com/repos/pydata/xarray/issues/3390 MDEyOklzc3VlQ29tbWVudDY0OTg2MTU4OQ== chrisroat 1053153 2020-06-25T23:08:52Z 2020-06-25T23:37:47Z CONTRIBUTOR

If drop=True, would it be problematic to return the same dtype or allow other?

My use case is a simple slicing of a dataset -- no missing values. The use of where is due to one of the selections being on a non-dimension coordinate (#2028).

I can work around this using astype, but I will say I was mildly surprised by this feature. I now understand why it's there. Our code is old and the data is intermediate and never deeply inspected -- I only noticed this when we started using a memory-intensive algorithm and was surprised by how much space was taken by our supposed uint16 data. :)
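A minimal sketch of the `astype` workaround and the memory effect described here. The array shape and stain labels are assumptions for illustration; the exact floating dtype that `where` produces may depend on the xarray version, so the example prints it rather than assuming it.

```python
import numpy as np
import xarray as xr

# Hypothetical uint16 image stack with a non-dimension "stain" coordinate.
data = xr.DataArray(
    np.ones((3, 64, 64), dtype=np.uint16),
    dims=("channel", "y", "x"),
    coords={"stain": ("channel", ["DAPI", "GFP", "RFP"])},
)

selected = data.where(data.coords["stain"] == "DAPI", drop=True)

# Recasting is only safe when nothing was actually masked, so check first.
if bool(selected.isnull().any()):
    raise ValueError("selection contains NaN; casting back would corrupt data")
recast = selected.astype(data.dtype)

print(selected.dtype, selected.nbytes)  # floating point, larger footprint
print(recast.dtype, recast.nbytes)      # back to uint16
```

The NaN check matters because casting NaN to an unsigned integer silently yields an arbitrary value.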

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.DataArray.where always returns array of float64 regardless of input dtype 505493879
542945576 https://github.com/pydata/xarray/issues/3390#issuecomment-542945576 https://api.github.com/repos/pydata/xarray/issues/3390 MDEyOklzc3VlQ29tbWVudDU0Mjk0NTU3Ng== dcherian 2448579 2019-10-17T00:32:08Z 2019-10-17T00:32:08Z MEMBER

Looks great. You did well!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.DataArray.where always returns array of float64 regardless of input dtype 505493879
542923697 https://github.com/pydata/xarray/issues/3390#issuecomment-542923697 https://api.github.com/repos/pydata/xarray/issues/3390 MDEyOklzc3VlQ29tbWVudDU0MjkyMzY5Nw== pmallas 6051395 2019-10-16T22:52:48Z 2019-10-16T22:52:48Z CONTRIBUTOR

@dcherian Ok, I think I proposed a change correctly - never done this before.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.DataArray.where always returns array of float64 regardless of input dtype 505493879
541106974 https://github.com/pydata/xarray/issues/3390#issuecomment-541106974 https://api.github.com/repos/pydata/xarray/issues/3390 MDEyOklzc3VlQ29tbWVudDU0MTEwNjk3NA== dcherian 2448579 2019-10-11T15:16:48Z 2019-10-11T15:16:48Z MEMBER

@pmallas it would be nice to update the docstring to make that clear if you are up for it

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.DataArray.where always returns array of float64 regardless of input dtype 505493879
541089865 https://github.com/pydata/xarray/issues/3390#issuecomment-541089865 https://api.github.com/repos/pydata/xarray/issues/3390 MDEyOklzc3VlQ29tbWVudDU0MTA4OTg2NQ== pmallas 6051395 2019-10-11T14:33:02Z 2019-10-11T14:33:02Z CONTRIBUTOR

Yes, I read the return type as the 'same type as caller' and at first I expected the array type to be the same. I soon realized that means a DataArray or Dataset. And for your output array to support nan values, it has to be float. My bad - sorry for the clutter.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.DataArray.where always returns array of float64 regardless of input dtype 505493879
540834161 https://github.com/pydata/xarray/issues/3390#issuecomment-540834161 https://api.github.com/repos/pydata/xarray/issues/3390 MDEyOklzc3VlQ29tbWVudDU0MDgzNDE2MQ== jhamman 2443309 2019-10-10T23:06:14Z 2019-10-10T23:06:14Z MEMBER

@pmallas - it looks like you figured this out but I'll just report on what was likely the confusion here.

Xarray's where methods use np.nan as the default other argument, which causes the type to be cast to a float. If you want to maintain an integer type, you'll need to specify another value for other.

xref: http://xarray.pydata.org/en/stable/computation.html#missing-values, http://xarray.pydata.org/en/stable/generated/xarray.DataArray.where.html
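A short sketch of the behavior described above. The fill value `0` is an arbitrary choice for the example; it is passed as `np.uint16(0)` so that the fill has the same dtype as the data and no promotion is involved.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(6, dtype=np.uint16), dims="x")

# Default: masked entries become NaN, which forces a floating-point result.
with_nan = da.where(da > 2)

# Explicit fill of the same dtype: no NaN is needed, so uint16 is kept.
with_fill = da.where(da > 2, other=np.uint16(0))

print(with_nan.dtype, with_fill.dtype)
```

The trade-off is that an explicit fill value is indistinguishable from real data with the same value, so it should be chosen with the downstream analysis in mind.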

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.DataArray.where always returns array of float64 regardless of input dtype 505493879

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);