issue_comments

23 rows where user = 2272878 sorted by updated_at descending

issue 6

  • Implement idxmax and idxmin functions 11
  • Add additional str accessor methods for DataArray 7
  • Allow for All-NaN in argmax, argmin 2
  • `rolling.mean` gives negative values on non-negative array. 1
  • _indexes of DataArray are not deep copied 1
  • xr.testing.assert_equal does not test for dtype 1

user 1

  • toddrjen · 23

author_association 1

  • CONTRIBUTOR 23
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
792225152 https://github.com/pydata/xarray/pull/4622#issuecomment-792225152 https://api.github.com/repos/pydata/xarray/issues/4622 MDEyOklzc3VlQ29tbWVudDc5MjIyNTE1Mg== toddrjen 2272878 2021-03-07T06:20:08Z 2021-03-07T06:20:08Z CONTRIBUTOR

All tests now pass as well.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add additional str accessor methods for DataArray 753097418
791878249 https://github.com/pydata/xarray/pull/4622#issuecomment-791878249 https://api.github.com/repos/pydata/xarray/issues/4622 MDEyOklzc3VlQ29tbWVudDc5MTg3ODI0OQ== toddrjen 2272878 2021-03-06T05:41:39Z 2021-03-06T05:41:39Z CONTRIBUTOR

The version here should be complete, in that all planned features are implemented, although of course there may be additional changes. So I removed the [WIP] part and updated whats-new.rst and others. I squashed my commits down and force-pushed to get a clean look at things. Please take a look and tell me what you think.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add additional str accessor methods for DataArray 753097418
752865024 https://github.com/pydata/xarray/pull/4622#issuecomment-752865024 https://api.github.com/repos/pydata/xarray/issues/4622 MDEyOklzc3VlQ29tbWVudDc1Mjg2NTAyNA== toddrjen 2272878 2020-12-31T06:43:42Z 2020-12-31T06:43:42Z CONTRIBUTOR

The latest version I just pushed should have the requested changes. It also has cat, join, +, *, %. I have also implemented broadcasting for many (but not all) of the functions I plan to implement it for so you can see some examples of how it works.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add additional str accessor methods for DataArray 753097418
751952795 https://github.com/pydata/xarray/pull/4622#issuecomment-751952795 https://api.github.com/repos/pydata/xarray/issues/4622 MDEyOklzc3VlQ29tbWVudDc1MTk1Mjc5NQ== toddrjen 2272878 2020-12-29T05:34:47Z 2020-12-29T05:34:47Z CONTRIBUTOR

@keewis Thanks for the suggestions. I will add everything to the relevant documentation when I have everything completed and the changes are agreed upon.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add additional str accessor methods for DataArray 753097418
750419141 https://github.com/pydata/xarray/issues/4727#issuecomment-750419141 https://api.github.com/repos/pydata/xarray/issues/4727 MDEyOklzc3VlQ29tbWVudDc1MDQxOTE0MQ== toddrjen 2272878 2020-12-23T18:23:20Z 2020-12-23T18:23:20Z CONTRIBUTOR

My concern with assert_identical is the name. It implies, to me, that there is no difference at all between the two objects. It was highly unexpected for me that it didn't do that. I think at the very least it should be clarified in the documentation that it doesn't do that.

If the default for assert_identical isn't changed, I wonder whether a new function might be worthwhile. I am concerned that having to append check_dtype=True to every test would hurt test clarity. And there is also the problem with

Also, just checking dtype won't be sufficient in all cases. Consider this:

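```python
import numpy as np
import xarray as xr

# Both arrays have object dtype, but one holds a float and the other an int.
# np.object was removed from recent NumPy releases; the builtin object is equivalent.
a = xr.DataArray(np.array(1.0, dtype=object))
b = xr.DataArray(np.array(1, dtype=object))

xr.testing.assert_identical(a, b)
```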

I think for the purpose of testing being able to make sure the result is exactly what you expect is important.
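The object-dtype case above comes down to plain Python equality: `1.0 == 1` is true even though the types differ. A minimal sketch of a stricter comparison, using only the standard library, might look like the following. The helper name `assert_strictly_identical` is hypothetical and is not part of xarray.

```python
def assert_strictly_identical(actual, expected):
    """Hypothetical helper: fail unless both the type and the value match."""
    if type(actual) is not type(expected):
        raise AssertionError(
            f"type mismatch: {type(actual).__name__} != {type(expected).__name__}"
        )
    if actual != expected:
        raise AssertionError(f"value mismatch: {actual!r} != {expected!r}")


# A value-only comparison cannot tell a float from an int.
values_compare_equal = (1.0 == 1)

# The strict comparison rejects the same pair because the types differ.
try:
    assert_strictly_identical(1.0, 1)
    strict_check_failed = False
except AssertionError:
    strict_check_failed = True
```

Applied element by element to an object array, a check of this kind would catch the float/int mix-up that a dtype comparison alone misses.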

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xr.testing.assert_equal does not test for dtype 773750763
749887341 https://github.com/pydata/xarray/pull/4622#issuecomment-749887341 https://api.github.com/repos/pydata/xarray/issues/4622 MDEyOklzc3VlQ29tbWVudDc0OTg4NzM0MQ== toddrjen 2272878 2020-12-23T02:24:39Z 2020-12-23T02:24:39Z CONTRIBUTOR

@mathause One possibility might be to make xr.testing.assert_identical match dtypes. I can see different dtypes being "equal", but not "identical".

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add additional str accessor methods for DataArray 753097418
749311511 https://github.com/pydata/xarray/pull/4622#issuecomment-749311511 https://api.github.com/repos/pydata/xarray/issues/4622 MDEyOklzc3VlQ29tbWVudDc0OTMxMTUxMQ== toddrjen 2272878 2020-12-22T03:04:48Z 2020-12-22T03:04:48Z CONTRIBUTOR

@mathause

Sorry for the delay, I have been swamped at work. I probably won't have any time to work on this before Christmas.

I have finished implementing the cat and join methods, and I implemented +, *, and % operator support.

I am currently working on improving the vectorization of some of the functions. The idea is that some arguments, such as the regular expression pattern or the number of repetitions in rep, can be given as an array-like, with its dimensions being broadcast against the original DataArray.

This can be useful, for example, if a DataArray combines data of different formats along a dimension (ideally this wouldn't be the case but people don't always have that much control over the data they get). Or it could be used to create an ASCII bar chart where the number of symbols is equal to the value in an array element.
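As a rough illustration of the bar-chart idea, here is a one-dimensional, pure-Python stand-in for a broadcasting repeat. The function name `broadcast_repeat` is made up for this sketch; it only handles a scalar or a flat list on each side and is not the xarray implementation.

```python
def broadcast_repeat(strings, counts):
    """Repeat strings elementwise; either argument may be a scalar or a list.

    Hypothetical 1-D sketch of broadcasting, not the xarray `rep` accessor.
    """
    if isinstance(strings, str) and isinstance(counts, int):
        return strings * counts
    # Stretch a scalar argument to the length of the other argument.
    if isinstance(strings, str):
        strings = [strings] * len(counts)
    if isinstance(counts, int):
        counts = [counts] * len(strings)
    if len(strings) != len(counts):
        raise ValueError("strings and counts must have the same length")
    return [s * n for s, n in zip(strings, counts)]


# One bar per value: the number of symbols equals the value.
values = [3, 1, 4]
bars = broadcast_repeat("#", values)
```

Here `bars` holds one string per element of `values`, so printing them line by line gives a simple ASCII bar chart.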

> > However, this could lead to conflicts if the DataArray already has a dimension with that name, which would be a particular problem if people chained together multiple such operations.
>
> That should raise a KeyError, no?

Yes, but I think it would be strange if using the default parameters once works fine, but using them twice or more in a row raises an exception. I think the defaults should either work generally or not be defaults at all. That is just my opinion. More fundamentally, it is inconsistent with how xarray works elsewhere, so I think it would be unexpected.

  • Some of the tests could probably be simplified, to make them easier to read. E.g. when you try to raise an error.

Please point out the specific cases if you haven't already done so.

  • we usually add a match to the pytest.raises. This also helps to understand what you are testing.

I will add this.

  • assert_equal should raise an error if the dtype does not match, so you should not need to add all the assert result.dtype == expected.dtype.

It doesn't work with an object dtype:

```python
import numpy as np
import xarray as xr

a = xr.DataArray(np.array("a", dtype=np.str_))
b = a.astype(np.object_)

a.dtype == b.dtype  # False
a.equals(b)  # True
xr.testing.assert_equal(a, b)
```

This does not raise an exception on my machine at least. I ran into several cases where I was incorrectly getting object dtypes and the tests weren't catching it, hence the dtype checks.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add additional str accessor methods for DataArray 753097418
737625900 https://github.com/pydata/xarray/pull/4622#issuecomment-737625900 https://api.github.com/repos/pydata/xarray/issues/4622 MDEyOklzc3VlQ29tbWVudDczNzYyNTkwMA== toddrjen 2272878 2020-12-03T02:42:24Z 2020-12-03T02:42:24Z CONTRIBUTOR
  • I'd set a default for the name of new dimensions e.g. group_dim: Hashable = "group". I think that's a good choice in most cases.

I thought about doing this at first. However, this could lead to conflicts if the DataArray already has a dimension with that name, which would be a particular problem if people chained together multiple such operations. So I checked what default name xarray uses elsewhere, and it doesn't seem to use default names for the most part (the main exception being DataArray creation). So I think that, in order to avoid unexpected behavior, and to keep consistency, not automatically choosing a name is a better option.
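The chaining problem can be shown with a toy model in which dimensions are just a tuple of names. Everything here is hypothetical: `add_dim`, the default name "group", and the choice of `ValueError` are assumptions for illustration only and do not reflect what xarray actually raises.

```python
def add_dim(dims, new_dim="group"):
    """Toy model: append a new dimension name, refusing duplicates."""
    if new_dim in dims:
        raise ValueError(f"dimension {new_dim!r} already exists")
    return dims + (new_dim,)


# The first call with the default name works.
dims = add_dim(("x",))

# A second call with the same default collides with the first result.
try:
    add_dim(dims)
    second_call_failed = False
except ValueError:
    second_call_failed = True

# With explicit names, chaining is unambiguous.
chained = add_dim(add_dim(("x",), "match"), "group")
```

Requiring an explicit name, as argued above, turns the second-call failure into something the caller has to think about up front.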

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add additional str accessor methods for DataArray 753097418
605542337 https://github.com/pydata/xarray/pull/3871#issuecomment-605542337 https://api.github.com/repos/pydata/xarray/issues/3871 MDEyOklzc3VlQ29tbWVudDYwNTU0MjMzNw== toddrjen 2272878 2020-03-29T01:18:37Z 2020-03-29T01:18:37Z CONTRIBUTOR

I have gone over it one more time and made a few documentation fixes. Please take one more look before merging.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement idxmax and idxmin functions 584837010
605375373 https://github.com/pydata/xarray/pull/3871#issuecomment-605375373 https://api.github.com/repos/pydata/xarray/issues/3871 MDEyOklzc3VlQ29tbWVudDYwNTM3NTM3Mw== toddrjen 2272878 2020-03-28T01:32:00Z 2020-03-28T01:32:00Z CONTRIBUTOR

Here is a new commit with the discussed changes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement idxmax and idxmin functions 584837010
604798935 https://github.com/pydata/xarray/pull/3871#issuecomment-604798935 https://api.github.com/repos/pydata/xarray/issues/3871 MDEyOklzc3VlQ29tbWVudDYwNDc5ODkzNQ== toddrjen 2272878 2020-03-27T03:41:56Z 2020-03-27T03:41:56Z CONTRIBUTOR

I think I have implemented all the requested changes and all tests are passing. Please take a look.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement idxmax and idxmin functions 584837010
604791067 https://github.com/pydata/xarray/issues/3899#issuecomment-604791067 https://api.github.com/repos/pydata/xarray/issues/3899 MDEyOklzc3VlQ29tbWVudDYwNDc5MTA2Nw== toddrjen 2272878 2020-03-27T03:06:32Z 2020-03-27T03:06:32Z CONTRIBUTOR

My pull request #3871 has a fix already.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 1,
    "eyes": 0
}
  _indexes of DataArray are not deep copied 588821932
604775165 https://github.com/pydata/xarray/issues/3884#issuecomment-604775165 https://api.github.com/repos/pydata/xarray/issues/3884 MDEyOklzc3VlQ29tbWVudDYwNDc3NTE2NQ== toddrjen 2272878 2020-03-27T01:57:44Z 2020-03-27T01:57:44Z CONTRIBUTOR

@shoyer xarray uses bottleneck for that if it can in xarray.nputils, so copying the numpy method would result in a performance hit. However, xarray maintains a wrapper around the numpy/bottleneck version in xarray.nanops where this could perhaps be implemented.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow for All-NaN in argmax, argmin 587062505
604462268 https://github.com/pydata/xarray/pull/3871#issuecomment-604462268 https://api.github.com/repos/pydata/xarray/issues/3871 MDEyOklzc3VlQ29tbWVudDYwNDQ2MjI2OA== toddrjen 2272878 2020-03-26T14:28:04Z 2020-03-26T14:28:04Z CONTRIBUTOR

I figured out what is going wrong. I will make a commit with a fix and include it in this pull request later today.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement idxmax and idxmin functions 584837010
604226585 https://github.com/pydata/xarray/pull/3871#issuecomment-604226585 https://api.github.com/repos/pydata/xarray/issues/3871 MDEyOklzc3VlQ29tbWVudDYwNDIyNjU4NQ== toddrjen 2272878 2020-03-26T04:46:07Z 2020-03-26T04:46:07Z CONTRIBUTOR

I am not sure why the tests are suddenly failing. The tests were all working, then I rebased on the latest master and they are failing and I can't figure out why.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement idxmax and idxmin functions 584837010
604220730 https://github.com/pydata/xarray/issues/3884#issuecomment-604220730 https://api.github.com/repos/pydata/xarray/issues/3884 MDEyOklzc3VlQ29tbWVudDYwNDIyMDczMA== toddrjen 2272878 2020-03-26T04:22:18Z 2020-03-26T04:22:18Z CONTRIBUTOR

The problem I had when implementing idxmin and idxmax is that this behavior is defined by numpy, not by xarray, and bottleneck follows the same behavior, with xarray generally delegating the computation to one of these. So you would need to somehow work around the behavior of numpy in xarray or get a fix implemented both in numpy and bottleneck.
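One way to picture the "work around it in xarray" option is a NaN-skipping argmax that returns a fill value instead of failing when every element is NaN. This is a pure-Python sketch; the function name and the `fill_value` parameter are assumptions here, and the real numpy and bottleneck functions behave differently (as far as I know, `numpy.nanargmax` raises `ValueError` on an all-NaN slice).

```python
import math


def nanargmax_or_fill(values, fill_value=None):
    """Return the position of the largest non-NaN value, or fill_value if none exists."""
    best_index = fill_value
    best_value = None
    for index, value in enumerate(values):
        if math.isnan(value):
            continue  # skip missing values
        if best_value is None or value > best_value:
            best_index, best_value = index, value
    return best_index


nan = float("nan")
position = nanargmax_or_fill([nan, 2.0, 5.0, nan])
all_nan_result = nanargmax_or_fill([nan, nan], fill_value=-1)
```

A wrapper like this sidesteps the upstream behavior, at the cost of not using the fast compiled routines.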

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow for All-NaN in argmax, argmin 587062505
604219688 https://github.com/pydata/xarray/pull/3871#issuecomment-604219688 https://api.github.com/repos/pydata/xarray/issues/3871 MDEyOklzc3VlQ29tbWVudDYwNDIxOTY4OA== toddrjen 2272878 2020-03-26T04:17:55Z 2020-03-26T04:17:55Z CONTRIBUTOR

Please see the newest version with the promote argument changed to fill_value.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement idxmax and idxmin functions 584837010
603900793 https://github.com/pydata/xarray/pull/3871#issuecomment-603900793 https://api.github.com/repos/pydata/xarray/issues/3871 MDEyOklzc3VlQ29tbWVudDYwMzkwMDc5Mw== toddrjen 2272878 2020-03-25T15:19:01Z 2020-03-25T15:19:01Z CONTRIBUTOR

That could work.

The corner case we would need to decide on is again promotion.

What happens if the fill value is a "higher" type in the numeric tower than the original type? What if it is lower?

  1. We could try to always convert to the fill dtype (or more often the dtype equivalent to the Python native type), and raise an exception if it doesn't work.
  2. We could promote the fill value or original data, whichever is "lower".

What if someone tries to use a string type for numeric data or vice versa? If we do option 1 that is easy. Otherwise we probably need to use numpy casting rules?

What about an object dtype fill value?

What about a date/time-related dtype?
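To make the two options concrete, here is a sketch that uses Python's built-in numeric tower (bool, int, float, complex) in place of numpy dtypes. The function names are hypothetical, and only plain Python scalars are handled; numpy's actual casting rules are more involved.

```python
# Python's numeric tower, from "lowest" to "highest".
_TOWER = [bool, int, float, complex]


def _rank(value):
    return _TOWER.index(type(value))


def fill_option_1(values, fill_value):
    """Option 1: convert the data to the fill value's type, raising if that is lossy."""
    target = type(fill_value)
    converted = []
    for value in values:
        try:
            candidate = target(value)
        except TypeError as err:  # e.g. complex -> float
            raise ValueError(f"cannot convert {value!r} to {target.__name__}") from err
        if candidate != value:
            raise ValueError(f"converting {value!r} to {target.__name__} loses information")
        converted.append(candidate)
    return converted, fill_value


def fill_option_2(values, fill_value):
    """Option 2: promote whichever side is lower in the tower."""
    highest = max([_rank(fill_value), *(_rank(v) for v in values)])
    target = _TOWER[highest]
    return [target(v) for v in values], target(fill_value)


ints_as_floats, _ = fill_option_1([1, 2], 0.0)       # data converted to float
promoted_data, promoted_fill = fill_option_2([1.5], 0)  # fill value promoted to float
```

Option 1 gives a predictable result type at the price of possible errors; option 2 never fails for numeric input, but the result type depends on both arguments.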

On Mon, Mar 23, 2020, 23:49 Stephan Hoyer notifications@github.com wrote:

@shoyer commented on this pull request.

In xarray/core/dataset.py https://github.com/pydata/xarray/pull/3871#discussion_r396887940:

@@ -5914,5 +5921,169 @@ def pad(

         return self._replace_vars_and_dims(variables)
+    def idxmin(
+        self,
+        dim: Hashable = None,
+        axis: int = None,
+        skipna: bool = None,
+        promote: bool = None,

Just to throw out another API option: what about having a fill_value argument instead of promote? The default (fill_value=dtypes.NA) would do type promotion for integer dtypes and always fill with NA. Other values (e.g., fill_value=0) could be used to avoid type promotion with an integer coordinate.

Advantages:

  • No special cases to keep track of.
  • Consistent with other xarray methods that take a fill_value argument.

Disadvantages:

  • No built-in way to raise an error instead of promotion (but users could do this themselves pretty easily)
  • No built-in way to "only promote if necessary" (but this is a weird non-type stable API that doesn't work great with Dask, anyways)


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement idxmax and idxmin functions 584837010
602134329 https://github.com/pydata/xarray/pull/3871#issuecomment-602134329 https://api.github.com/repos/pydata/xarray/issues/3871 MDEyOklzc3VlQ29tbWVudDYwMjEzNDMyOQ== toddrjen 2272878 2020-03-22T01:45:05Z 2020-03-22T01:45:05Z CONTRIBUTOR

I fixed the extra space in the docstring and moved the business logic to computation.py.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement idxmax and idxmin functions 584837010
601982212 https://github.com/pydata/xarray/pull/3871#issuecomment-601982212 https://api.github.com/repos/pydata/xarray/issues/3871 MDEyOklzc3VlQ29tbWVudDYwMTk4MjIxMg== toddrjen 2272878 2020-03-21T02:41:23Z 2020-03-21T02:41:23Z CONTRIBUTOR

@max-sixty

> To what extent should this support non-index coordinates?

I am not familiar with non-index coordinates, what are those?

Do you mean non-dimension coordinates? Does that even make sense in a general way? If they are 1D and tied to just one dimension coordinate that could be done, but if they are not tied to any dimension or tied to multiple dimensions or otherwise not 1D I am not sure what it would mean to take the idxmin/idxmax of them.
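For the one case that does seem well defined, a 1D coordinate tied to a single dimension, the operation reduces to "find the position of the maximum, then look up the label at that position". The sketch below uses plain lists; `idxmax_1d` and the sample data are illustrative assumptions, not the xarray implementation.

```python
def idxmax_1d(values, coord):
    """Return the coordinate label at the position of the largest value."""
    if len(values) != len(coord):
        raise ValueError("values and coord must have the same length")
    position = max(range(len(values)), key=values.__getitem__)
    return coord[position]


values = [0.1, 0.9, 0.3]
x = [10, 20, 30]          # dimension coordinate
labels = ["a", "b", "c"]  # a second 1D coordinate along the same dimension

label_from_dim_coord = idxmax_1d(values, x)
label_from_other_coord = idxmax_1d(values, labels)
```

Both lookups share the same position, which is why the 1D case is straightforward; a coordinate spanning several dimensions has no single position to look up.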

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement idxmax and idxmin functions 584837010
601979282 https://github.com/pydata/xarray/pull/3871#issuecomment-601979282 https://api.github.com/repos/pydata/xarray/issues/3871 MDEyOklzc3VlQ29tbWVudDYwMTk3OTI4Mg== toddrjen 2272878 2020-03-21T02:16:27Z 2020-03-21T02:16:27Z CONTRIBUTOR

@keewis @max-sixty The new commit with the requested changes has been pushed to this branch (except for the map one, pending ongoing discussion). Please take a look.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement idxmax and idxmin functions 584837010
601947199 https://github.com/pydata/xarray/pull/3871#issuecomment-601947199 https://api.github.com/repos/pydata/xarray/issues/3871 MDEyOklzc3VlQ29tbWVudDYwMTk0NzE5OQ== toddrjen 2272878 2020-03-20T23:06:45Z 2020-03-20T23:06:45Z CONTRIBUTOR

@keewis @max-sixty The map thing is purely a convenience function. I know there are other ways to do it, but since I thought this would be a useful feature for users in its own right, I did it that way. But of course I can do it another way if you disagree.

The one complication is that using DataArray.idxmax and DataArray.idxmin assumes that the Dataset would only ever contain DataArray objects. That may be mostly the case now, but I didn't want to bake that into the code. I could do it using a lambda or nested function, but as I said I thought this approach had other benefits to users.
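The trade-off can be sketched with a dict standing in for a Dataset: a generic map-style helper applies any callable to each variable, so the per-variable logic is not hard-wired to one method. The names `map_variables`, `dataset`, and `times` are hypothetical and only illustrate the pattern.

```python
def map_variables(dataset, func):
    """Apply func to every variable of a dict-based stand-in for a Dataset."""
    return {name: func(values) for name, values in dataset.items()}


times = ["mon", "tue", "wed"]
dataset = {
    "temperature": [12.0, 19.5, 15.0],
    "rain": [0.0, 0.2, 3.1],
}

# The per-variable logic is passed in as a lambda instead of a method name.
peak_days = map_variables(dataset, lambda values: times[values.index(max(values))])
```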

I will address the other comments inline.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement idxmax and idxmin functions 584837010
597253762 https://github.com/pydata/xarray/issues/3855#issuecomment-597253762 https://api.github.com/repos/pydata/xarray/issues/3855 MDEyOklzc3VlQ29tbWVudDU5NzI1Mzc2Mg== toddrjen 2272878 2020-03-10T18:51:13Z 2020-03-10T18:51:13Z CONTRIBUTOR

Submitted to bottleneck.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `rolling.mean` gives negative values on non-negative array. 578736255

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
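As a rough sketch of how this schema supports the listing above ("rows where user = 2272878 sorted by updated_at descending"), the table can be recreated in an in-memory SQLite database with Python's standard `sqlite3` module. Two rows reuse ids and timestamps from this page; the third row, with user id 1, is made up to show the filter at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The referenced users/issues tables are not created here, so keep
# foreign-key enforcement off for this sketch.
conn.execute("PRAGMA foreign_keys = OFF")
conn.execute(
    """
    CREATE TABLE [issue_comments] (
       [html_url] TEXT,
       [issue_url] TEXT,
       [id] INTEGER PRIMARY KEY,
       [node_id] TEXT,
       [user] INTEGER REFERENCES [users]([id]),
       [created_at] TEXT,
       [updated_at] TEXT,
       [author_association] TEXT,
       [body] TEXT,
       [reactions] TEXT,
       [performed_via_github_app] TEXT,
       [issue] INTEGER REFERENCES [issues]([id])
    )
    """
)
conn.execute("CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user])")

conn.executemany(
    "INSERT INTO issue_comments (id, [user], updated_at, body) VALUES (?, ?, ?, ?)",
    [
        (597253762, 2272878, "2020-03-10T18:51:13Z", "Submitted to bottleneck."),
        (792225152, 2272878, "2021-03-07T06:20:08Z", "All tests now pass as well."),
        (1, 1, "2020-01-01T00:00:00Z", "made-up row from another user"),
    ],
)

# ISO 8601 timestamps stored as TEXT sort correctly as strings.
rows = conn.execute(
    "SELECT id, updated_at FROM issue_comments "
    "WHERE [user] = ? ORDER BY updated_at DESC",
    (2272878,),
).fetchall()
```

The index on `[user]` is what keeps this kind of per-user filter cheap on the full table.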