issue_comments

11 rows where issue = 173494017 sorted by updated_at descending

user 3

  • shoyer 6
  • joonro 4
  • darothen 1

author_association 2

  • MEMBER 6
  • NONE 5

issue 1

  • Return a scalar instead of DataArray when the return value is a scalar · 11 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
242996797 https://github.com/pydata/xarray/issues/987#issuecomment-242996797 https://api.github.com/repos/pydata/xarray/issues/987 MDEyOklzc3VlQ29tbWVudDI0Mjk5Njc5Nw== shoyer 1217238 2016-08-28T20:19:18Z 2016-08-28T20:19:18Z MEMBER

Thanks @joonro, you are very kind!

I'm going to close this issue since I think we resolved the original question.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Return a scalar instead of DataArray when the return value is a scalar 173494017
242953090 https://github.com/pydata/xarray/issues/987#issuecomment-242953090 https://api.github.com/repos/pydata/xarray/issues/987 MDEyOklzc3VlQ29tbWVudDI0Mjk1MzA5MA== joonro 1063143 2016-08-28T02:58:24Z 2016-08-28T02:58:24Z NONE

@shoyer I think I saw ... a long time ago and must have forgotten about it. Thank you so much for reminding me - I was really hoping for something like ... for a while.

Btw, I must say that not only is xarray so useful for much of my research, but the devs' responses on the issues have also been superb. Definitely one of the most pleasant experiences I have had with developers. Thank you.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Return a scalar instead of DataArray when the return value is a scalar 173494017
242950283 https://github.com/pydata/xarray/issues/987#issuecomment-242950283 https://api.github.com/repos/pydata/xarray/issues/987 MDEyOklzc3VlQ29tbWVudDI0Mjk1MDI4Mw== shoyer 1217238 2016-08-28T01:22:45Z 2016-08-28T01:22:45Z MEMBER

@joonro Yes, this does get messy. We'll eventually support indexing like X[X > 0] directly, which will help significantly.

In the meantime, you can still break things up onto multiple lines by saving temporary variables:

condition = X.loc[..., 'variable'].values > 0
X.loc[..., 'variable'].values[condition] = Y.loc[..., 'variable'].values[condition]

Using abbreviations like ... for :, :, : (assuming 'variable' is along the last axis) can also help.
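As a rough sketch of the same idea without `.values`: assuming a reasonably recent xarray in which `xr.where` is available (an assumption, it is not mentioned in this thread), the slices can be named once and combined directly. The data, the dimension names `i` and `j`, and the label `"a"` are all made up for illustration. Unlike the assignment above, this returns a new array rather than modifying `X` in place.

```python
import numpy as np
import xarray as xr

# Illustrative data (not from the thread): the last axis is labelled "variable".
coords = {"variable": ["a", "b"]}
X = xr.DataArray(np.array([[[-1.0, 2.0], [3.0, -4.0]]]),
                 dims=("i", "j", "variable"), coords=coords)
Y = xr.DataArray(np.full((1, 2, 2), 9.0),
                 dims=("i", "j", "variable"), coords=coords)

# Name each slice once instead of repeating the indexing expression.
x_var = X.sel(variable="a")
y_var = Y.sel(variable="a")

# Take Y where X is positive and keep X elsewhere; X itself is left untouched.
result = xr.where(x_var > 0, y_var, x_var)
```
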

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Return a scalar instead of DataArray when the return value is a scalar 173494017
242940333 https://github.com/pydata/xarray/issues/987#issuecomment-242940333 https://api.github.com/repos/pydata/xarray/issues/987 MDEyOklzc3VlQ29tbWVudDI0Mjk0MDMzMw== joonro 1063143 2016-08-27T20:55:59Z 2016-08-27T20:59:09Z NONE

Sure. My actual usage is usually much more complicated, but basically, with

import numpy as np
import xarray as xr
X = xr.DataArray(np.random.normal(size=(10, 10)), coords=[range(10), range(10)])

if I want to choose only values larger than 0 from X, it seems I cannot do X[X > 0]; I have to do X.values[X.values > 0]. You can see how this can quickly get long if I'm doing this for assignment with multidimensional DataArrays - something like

X.loc[:, :, :, 'variable'].values[X.loc[:, :, :, 'variable'].values > 0] = Y.loc[:, :, :, 'variable'].values[Y.loc[:, :, :, 'variable'].values > 0]

Maybe I'm mistaken and there is a way to do this more nicely, but I haven't been able to figure it out.

Thank you!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Return a scalar instead of DataArray when the return value is a scalar 173494017
242938039 https://github.com/pydata/xarray/issues/987#issuecomment-242938039 https://api.github.com/repos/pydata/xarray/issues/987 MDEyOklzc3VlQ29tbWVudDI0MjkzODAzOQ== shoyer 1217238 2016-08-27T20:07:45Z 2016-08-27T20:07:45Z MEMBER

Can you give an example of how you need to use .values in xarray operations? Within xarray, we should be able to remove the need to use that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Return a scalar instead of DataArray when the return value is a scalar 173494017
242937958 https://github.com/pydata/xarray/issues/987#issuecomment-242937958 https://api.github.com/repos/pydata/xarray/issues/987 MDEyOklzc3VlQ29tbWVudDI0MjkzNzk1OA== joonro 1063143 2016-08-27T20:06:12Z 2016-08-27T20:06:12Z NONE

Thanks a lot for the discussions. I agree it is very important to be consistent and explicit. Another thing was that sometimes .values makes a line of code really long - especially when I want to index a DataArray with another DataArray with some conditions, as I often have to use .values for each of them.

Currently I do not have a good idea about how to improve this - I will report back if one occurs to me. Thanks again!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Return a scalar instead of DataArray when the return value is a scalar 173494017
242937188 https://github.com/pydata/xarray/issues/987#issuecomment-242937188 https://api.github.com/repos/pydata/xarray/issues/987 MDEyOklzc3VlQ29tbWVudDI0MjkzNzE4OA== shoyer 1217238 2016-08-27T19:48:42Z 2016-08-27T19:48:42Z MEMBER

@darothen Let's discuss this over in #988.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Return a scalar instead of DataArray when the return value is a scalar 173494017
242912131 https://github.com/pydata/xarray/issues/987#issuecomment-242912131 https://api.github.com/repos/pydata/xarray/issues/987 MDEyOklzc3VlQ29tbWVudDI0MjkxMjEzMQ== darothen 4992424 2016-08-27T11:34:28Z 2016-08-27T11:34:28Z NONE

@joonro, I think there's a strong case to be made about returning a DataArray with some metadata appended. Referring to the latest draft of the CF Metadata Conventions, there is a clear way to indicate when operations such as mean, max, or min have been applied to a variable by using the cell_methods attribute.

It might be more prudent to add this attribute whenever we apply these operations to a DataArray (or perhaps variable-wise when applied to a Dataset). That way, there is a clear reason to not return a scalar - the documentation of what operations were applied to produce that final result.

I can whip up a working example/pull request if people think this is a direction to go. I'd probably build a decorator which handles inspection of the operator name and arguments and uses that to add the cell_methods attribute, that way people can add the same functionality to homegrown methods/operators.
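A minimal sketch of what such a decorator might look like. Everything here is hypothetical: the name `record_cell_method`, the `(array, dim)` calling convention, and the use of the wrapped function's name as the CF method name are illustrative assumptions, not an existing xarray API.

```python
import functools
import xarray as xr

def record_cell_method(func):
    """Hypothetical decorator: record a reduction in a CF-style cell_methods attribute."""
    @functools.wraps(func)
    def wrapper(array, dim):
        result = func(array, dim)
        # CF writes each entry as "<dimension>: <method>"; entries are space-separated.
        entry = f"{dim}: {func.__name__}"
        previous = array.attrs.get("cell_methods")
        result.attrs["cell_methods"] = f"{previous} {entry}" if previous else entry
        return result
    return wrapper

@record_cell_method
def mean(array, dim):
    # The function name doubles as the method name recorded above.
    return array.mean(dim)

temps = xr.DataArray([[1.0, 2.0], [3.0, 4.0]], dims=("time", "x"))
averaged = mean(temps, "time")
```

With this sketch, `averaged.attrs["cell_methods"]` holds `"time: mean"`. A real implementation would need to map function names onto the method names that CF actually permits.
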

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Return a scalar instead of DataArray when the return value is a scalar 173494017
242800119 https://github.com/pydata/xarray/issues/987#issuecomment-242800119 https://api.github.com/repos/pydata/xarray/issues/987 MDEyOklzc3VlQ29tbWVudDI0MjgwMDExOQ== shoyer 1217238 2016-08-26T17:34:37Z 2016-08-26T17:34:37Z MEMBER

> I wonder if it is reasonable to return a scalar when there is neither coords nor attrs associated with the return value, or it would be too much ad-hoc thing. For example, in the original example the return value was <xarray.DataArray ()>, which does not have any useful information.

This is a bad path to go down :). Now your code might suddenly break when you add a metadata field!

In principle, we could pick some subset of operations for which to always do this and others for which to never do this (e.g., aggregating out all dimensions, but not indexing out all dimensions), but I think this inconsistency would be even more surprising. It's pretty easy to see how this could lead to bugs, too. At least now you know you always need to type .values or .item()!
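For illustration, a short sketch of the consistent behaviour described above, using made-up data: both aggregating and indexing out every dimension give a zero-dimensional DataArray, and `.item()` or `.values` extracts the underlying value.

```python
import xarray as xr

arr = xr.DataArray([1.0, 2.0, 3.0], dims="x")

total = arr.sum()   # aggregating out all dimensions: still a DataArray
first = arr[0]      # indexing out all dimensions: also a DataArray

value = total.item()   # built-in Python float
raw = total.values     # zero-dimensional NumPy array
```
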

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Return a scalar instead of DataArray when the return value is a scalar 173494017
242796865 https://github.com/pydata/xarray/issues/987#issuecomment-242796865 https://api.github.com/repos/pydata/xarray/issues/987 MDEyOklzc3VlQ29tbWVudDI0Mjc5Njg2NQ== joonro 1063143 2016-08-26T17:22:06Z 2016-08-26T17:22:06Z NONE

I see - thanks a lot for the quick response. I knew there was a good reason for this.

I wonder if it is reasonable to return a scalar when there is neither coords nor attrs associated with the return value, or it would be too much ad-hoc thing. For example, in the original example the return value was <xarray.DataArray ()>, which does not have any useful information.

I think this might be reasonable because I only get into this issue when I'm doing an array-wide operation and I know I'm going to get an aggregate scalar and forget to use .values.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Return a scalar instead of DataArray when the return value is a scalar 173494017
242789058 https://github.com/pydata/xarray/issues/987#issuecomment-242789058 https://api.github.com/repos/pydata/xarray/issues/987 MDEyOklzc3VlQ29tbWVudDI0Mjc4OTA1OA== shoyer 1217238 2016-08-26T16:51:02Z 2016-08-26T16:52:17Z MEMBER

I agree that this can be annoying. The downside in making this switch is that we would lose xarray specific fields like coords and attrs that are currently preserved, e.g.,

```
>>> array = xr.DataArray([1, 2, 3], coords=[('x', ['a', 'b', 'c'])])
>>> array
<xarray.DataArray (x: 3)>
array([1, 2, 3])
Coordinates:
  * x        (x) |S1 'a' 'b' 'c'
>>> array[0]
<xarray.DataArray ()>
array(1)
Coordinates:
    x        |S1 'a'
>>> array[0].coords['x'].item()
'a'
```

Also, strictly from a simplicity point of view for xarray, it's nice for every function to return fixed types.

NumPy solved this problem by creating its own scalar types (e.g., np.float64) that define fields like shape and dtype while also subclassing Python's builtin numeric types. We could do the same, but this could lead to a different set of subtle cross-compatibility issues.
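The NumPy behaviour mentioned here can be checked directly. This small sketch uses `np.float64`, which subclasses Python's `float`; the variable names are illustrative only.

```python
import numpy as np

x = np.float64(1.5)

is_python_float = isinstance(x, float)  # np.float64 subclasses Python's float
shape = x.shape                         # array-like field: empty tuple for a scalar
dtype = x.dtype                         # array-like field: float64
plain = x.item()                        # convert to a built-in float
```
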

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Return a scalar instead of DataArray when the return value is a scalar 173494017

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);