issue_comments


7 rows where issue = 182667672 sorted by updated_at descending


Issue: center=True for xarray.DataArray.rolling() (pydata/xarray#1046, issue id 182667672) · 7 comments
731523316 · stale[bot] (NONE) · 2020-11-21T07:36:23Z
https://github.com/pydata/xarray/issues/1046#issuecomment-731523316

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity.

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically.

255246621 · chunweiyuan (CONTRIBUTOR) · 2016-10-20T22:32:23Z
https://github.com/pydata/xarray/issues/1046#issuecomment-255246621

Let me exhaust a few other ideas first. I'll definitely share my thoughts here first before making any commit. Thanks.

254347566 · jhamman (MEMBER) · 2016-10-17T22:00:32Z
https://github.com/pydata/xarray/issues/1046#issuecomment-254347566

I'm fine with this approach for now. It would be great if we could convince bottleneck to help us out with a keyword argument of some kind.

253929482 · shoyer (MEMBER) · 2016-10-14T21:56:42Z
https://github.com/pydata/xarray/issues/1046#issuecomment-253929482

@chunweiyuan I agree, this seems worth doing, and I think you have a pretty sensible approach here. For large arrays (especially with ndim > 1), this should add only minimal performance overhead. If you can fit this into the existing framework for rolling that would be awesome!

253681067 · chunweiyuan (CONTRIBUTOR) · 2016-10-14T00:53:54Z
https://github.com/pydata/xarray/issues/1046#issuecomment-253681067

My opinion is that the nan has got to go. If we want to (1) maintain pandas-consistency and (2) use bottleneck without mucking it up, then I think we need to add some logic in either rolling.reduce() or rolling._center_result().

So here's my failed attempt:

```python
def reverse_and_roll_1d(data, window_size, min_periods=1):
    """Implements a concept to fix the end-of-array problem with
    xarray.core.rolling._center_shift(), by:

    1.) take a slice of the back end of the array
    2.) flip it
    3.) compute the centered-window arithmetic
    4.) flip it again
    5.) replace the back end of the default result with (4)

    :param DataArray data: 1-D data array, with dim name 'x'.
    :param int window_size: size of window.
    """
    # first, the default way of computing the centered window
    r = data.rolling(x=window_size, center=True, min_periods=min_periods)
    avg = r.mean()
    # now we need to fix the back end of the array
    rev_start = len(data.x)  # an index
    rev_end = (len(data.x) - window_size - 1
               if len(data.data) > window_size
               else None)  # another index
    tail_slice = slice(rev_start, rev_end, -1)  # back end of array, flipped
    r2 = data[dict(x=tail_slice)].rolling(
        x=window_size, center=True, min_periods=min_periods)
    avg[dict(x=slice(-window_size + 1, None))] = \
        r2.mean()[dict(x=slice(window_size - 2, None, -1))]  # replacement
    return avg
```

This algorithm is consistently 8 times slower than pd.DataFrame.rolling(), for various 1d array sizes.

I'm open to ideas as well :)
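The reverse-and-roll idea can be checked with plain numpy and pandas. The helpers below are hypothetical illustrations, not xarray's code: `xarray_style_centered_mean` only emulates the bottleneck-plus-shift behavior discussed in this thread, and `reverse_and_roll` applies the flip-and-patch step to its tail.

```python
import numpy as np
import pandas as pd

def xarray_style_centered_mean(a, window, min_count=1):
    # emulate the bottleneck-based approach discussed in this thread:
    # a trailing-window mean shifted left by window // 2, which leaves
    # NaN at the tail of the array
    trailing = pd.Series(a).rolling(window, min_periods=min_count).mean().to_numpy()
    shift = window // 2
    out = np.full(len(a), np.nan)
    out[:len(a) - shift] = trailing[shift:]
    return out

def reverse_and_roll(a, window, min_count=1):
    # reverse-and-roll: recompute on the flipped array, so that the
    # broken tail becomes the correctly handled head, then flip back
    # and patch the tail of the default result
    out = xarray_style_centered_mean(a, window, min_count)
    tail = xarray_style_centered_mean(a[::-1], window, min_count)[::-1]
    shift = window // 2
    out[len(a) - shift:] = tail[len(a) - shift:]
    return out

a = np.arange(8, dtype=float)
ref = pd.Series(a).rolling(3, center=True, min_periods=1).mean().to_numpy()
print(reverse_and_roll(a, 3))  # matches ref everywhere, including the tail
```

The second pass over the flipped tail is what makes this roughly twice the work of a single rolling computation, on top of the indexing overhead.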

253408063 · jhamman (MEMBER) · 2016-10-13T03:58:32Z
https://github.com/pydata/xarray/issues/1046#issuecomment-253408063

We do try to stay consistent with pandas except for the last position. Here's the unit test where we verify that behavior.

Using x=0 from your example in Pandas:

```python
In [1]: import pandas as pd

In [2]: data = pd.Series([0, 3, 6])

In [3]: data.rolling(3, center=True, min_periods=1).mean()
Out[3]:
0    1.5
1    3.0
2    4.5
dtype: float64
```

If I remember correctly, and my brain is a bit like mush right now so I could be wrong, bottleneck and pandas handle this case differently, so we had to make a decision. We chose to use bottleneck (for speed) but to do our best to stay consistent with pandas. Back to your example, this time just with bottleneck:

```python
In [4]: import bottleneck as bn

In [5]: bn.move_mean(data, 3, min_count=1)
Out[5]: array([ 0. ,  1.5,  3. ])
```

So, as you can see, bottleneck does something totally different that wouldn't otherwise work with center=True unless we did our little shift trick. I'm not really sure of the best way to correct for this difference in the last position except to either a) try to push a center=True option into bottleneck (may not be possible), or b) write a bunch of logic on our end to bridge the gap between the two (may be laborious). Of course, I'm open to ideas.
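The shift trick and the resulting divergence can be reproduced with plain numpy; the `trailing_mean` helper below is a made-up stand-in for `bn.move_mean`, not bottleneck itself.

```python
import numpy as np

def trailing_mean(a, window, min_count=1):
    # stand-in for bn.move_mean: mean of the trailing window ending at i
    out = np.full(len(a), np.nan)
    for i in range(len(a)):
        vals = a[max(0, i - window + 1):i + 1]
        if len(vals) >= min_count:
            out[i] = vals.mean()
    return out

data = np.array([0.0, 3.0, 6.0])
window = 3

bn_style = trailing_mean(data, window)  # like bn.move_mean(data, 3, min_count=1)

# the shift trick: move the trailing means left by window // 2 so each
# value sits at the center of its window; the tail positions become NaN
shift = window // 2
centered = np.full(len(data), np.nan)
centered[:len(data) - shift] = bn_style[shift:]

print(bn_style)   # [0.  1.5 3. ]  -- bottleneck's answer
print(centered)   # [1.5 3.  nan] -- shifted; pandas gives 4.5 in the last slot
```

The NaN in the last position is exactly the gap between the two libraries that this thread is about.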

253405068 · shoyer (MEMBER) · 2016-10-13T03:37:55Z
https://github.com/pydata/xarray/issues/1046#issuecomment-253405068

I think we mostly tried to make this consistent with pandas. To be honest I don't entirely understand the logic myself.

Cc @jhamman


Table schema:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
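As a sketch, the schema and the query behind this page can be exercised with Python's built-in sqlite3 module; the in-memory database and sample rows are illustrative, and the foreign-key clauses are omitted since the referenced tables aren't created here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
""")

# two illustrative rows from this page
conn.executemany(
    "INSERT INTO issue_comments (id, [user], updated_at, issue) VALUES (?, ?, ?, ?)",
    [
        (253405068, 1217238, "2016-10-13T03:37:55Z", 182667672),
        (731523316, 26384082, "2020-11-21T07:36:23Z", 182667672),
    ],
)

# the query behind this page: comments on one issue, newest update first
rows = conn.execute(
    "SELECT id FROM issue_comments WHERE issue = ? ORDER BY updated_at DESC",
    (182667672,),
).fetchall()
print(rows)  # [(731523316,), (253405068,)]
```

The `idx_issue_comments_issue` index is what makes the `WHERE issue = ?` filter cheap on a large comments table.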
Powered by Datasette · About: xarray-datasette