home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

19 rows where author_association = "MEMBER" and issue = 120038291 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 2

  • shoyer 11
  • jhamman 8

issue 1

  • Feature/rolling · 19 ✖

author_association 1

  • MEMBER · 19 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
186492825 https://github.com/pydata/xarray/pull/668#issuecomment-186492825 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE4NjQ5MjgyNQ== shoyer 1217238 2016-02-20T02:37:52Z 2016-02-20T02:37:52Z MEMBER

Woot!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
186488866 https://github.com/pydata/xarray/pull/668#issuecomment-186488866 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE4NjQ4ODg2Ng== shoyer 1217238 2016-02-20T02:26:58Z 2016-02-20T02:26:58Z MEMBER

OK, this looks good to me. Merge when you're ready!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
186065185 https://github.com/pydata/xarray/pull/668#issuecomment-186065185 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE4NjA2NTE4NQ== jhamman 2443309 2016-02-19T05:40:25Z 2016-02-19T05:40:25Z MEMBER

Okay, _full_like (formerly empty_like) has been made private. I think it will be a useful feature for a bunch of applications but its not the point here. I'd like to build out functionality akin to what @shoyer indicated (zeros_like, ones_like, and missing_like). Actually, that can all be done by _full_like as it stands but it sounds like we need to put some more thought into the API before making it public.

An example of how we differ from Pandas in the last position with center=True. This stems from how we "center" the data using shift.

``` Python In [7]: arr Out[7]: <xarray.DataArray (y: 5)> array([ 2.5, 3. , 3.5, 4. , 4.5]) Coordinates: x int64 1 * y (y) int64 0 1 2 3 4

In [8]: arr.rolling(y=3, center=True, min_periods=1).mean() Out[8]: <xarray.DataArray (y: 5)> array([ 2.75, 3. , 3.5 , 4. , nan]) Coordinates: x int64 1 * y (y) int64 0 1 2 3 4

In [9]: pd.rolling_mean(arr.to_series(), 3, center=True, min_periods=1) Out[9]: y 0 2.75 1 3.00 2 3.50 3 4.00 4 4.25 dtype: float64 ```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
185817586 https://github.com/pydata/xarray/pull/668#issuecomment-185817586 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE4NTgxNzU4Ng== shoyer 1217238 2016-02-18T17:07:11Z 2016-02-18T17:07:11Z MEMBER

I'd also still love to see an explicit example where our behavior differs from pandas (in the last position if center=True) so we can try to figure out what's going on. This might actually be a bug on the pandas side :).

Generally this PR is looking very close. We could differ some of the API design work by keeping empty_like private for now, but I'm also happy to hash this out here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
185816575 https://github.com/pydata/xarray/pull/668#issuecomment-185816575 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE4NTgxNjU3NQ== shoyer 1217238 2016-02-18T17:03:37Z 2016-02-18T17:03:37Z MEMBER

What is the full set of functions like empty_like that we would want for the public API? zeros_like, ones_like, full_like, maybe missing_like?

empty_like (without a fill value) doesn't make a lot of sense for dask.array, but all the rest of these do, and it would be nice to have the xarray functions work by copying dask arrays by making new dask arrays.

One possibility, instead of making a separate missing_like, is to make xr.empty_like always fill with NaN. Users can always drop into numpy directly if they really want to make an array and not set the values -- this very rarely makes an actual difference for performance since memory allocation is so slow.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
185365318 https://github.com/pydata/xarray/pull/668#issuecomment-185365318 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE4NTM2NTMxOA== jhamman 2443309 2016-02-17T19:28:05Z 2016-02-17T21:24:33Z MEMBER

@shoyer -

This could use another review from you. Failing tests have been fixed. There's a bunch more functionality that can be built out in future pull requests. This provides the basic rolling functionality we were going for.

One thing to note, with center=True, we differ from pandas in the last position in some instances. I made a note in my unit test that we're okay with that for now.

~~Lastly, this will need a squash and I'll do that once we're settled on the features.~~

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
167563505 https://github.com/pydata/xarray/pull/668#issuecomment-167563505 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE2NzU2MzUwNQ== shoyer 1217238 2015-12-28T12:49:03Z 2015-12-28T12:49:03Z MEMBER

@jhamman how are we doing here? Are you waiting on a review from me?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
162437327 https://github.com/pydata/xarray/pull/668#issuecomment-162437327 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE2MjQzNzMyNw== jhamman 2443309 2015-12-07T07:29:49Z 2015-12-07T07:29:49Z MEMBER

I'll give this a test, but it looks like you have all the pieces to me....

I'm getting this TypeError:

Python E TypeError: move_sum() takes at least 2 positional arguments (0 given)

which makes me think the injection is failing to pass the arguments in

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
162031347 https://github.com/pydata/xarray/pull/668#issuecomment-162031347 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE2MjAzMTM0Nw== shoyer 1217238 2015-12-04T17:40:47Z 2015-12-04T17:40:47Z MEMBER

How did pandas land on this. To me it makes more sense as an argument to init but I'll go with whatever pandas decided for consistency.

Still unresolved, though Jeff Reback agrees with you. It's being discussed in the rolling PR currently.

Also: what about changing the default min_count to 0? I think that would be more consistent with pandas, which skips over missing values by default.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
162030626 https://github.com/pydata/xarray/pull/668#issuecomment-162030626 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE2MjAzMDYyNg== shoyer 1217238 2015-12-04T17:37:26Z 2015-12-04T17:37:26Z MEMBER

I wanted consistency between reduce, _bottleneck_reduce and iter.

Agreed, this would be nice. But if min_count=0, this won't be the case, because you will average over partial windows at the start of the rolling iteration. For example, you apply the aggregation function to windows of size [1, 2, 3, 3, 3, 3]. And the labels are also not consistent.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
162020564 https://github.com/pydata/xarray/pull/668#issuecomment-162020564 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE2MjAyMDU2NA== jhamman 2443309 2015-12-04T16:55:25Z 2015-12-04T16:55:25Z MEMBER

For iteration, what about only iterating over full windows? Thinking about how I might use iteration, I think this might be more useful than returning some shrunk windows.

I did consider this at first and it wouldn't be all that hard to implement but I chose not to go this route because I wanted consistency between reduce, _bottleneck_reduce and __iter__. In theory, all three of these should provide the same answer:

``` Python rolling_obj = da.rolling(time=4)

rolling_obj.mean() # bottleneck move_mean rolling_obj.reduce(np.nanmean) # numpy nanmean over each window concat([da.mean(dim='time') for _, da in rolling_obj], dim=rolling_obj.window_labels) # manual mean via iterable - same as reduce ```

I think you've done a pretty reasonable job of interpreting min_periods for iteration, but I would still vote for defining it only as an argument to the aggregation methods and not worrying about it for iteration.

How did pandas land on this. To me it makes more sense as an argument to __init__ but I'll go with whatever pandas decided for consistency.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
161808000 https://github.com/pydata/xarray/pull/668#issuecomment-161808000 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE2MTgwODAwMA== shoyer 1217238 2015-12-03T22:29:43Z 2015-12-03T22:29:43Z MEMBER

@shoyer - would you mind taking a look at what I've just tried (and failed) in ops.py and common.py? I think I'm missing a big piece of the injection puzzle.

I'll give this a test, but it looks like you have all the pieces to me....

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
161804311 https://github.com/pydata/xarray/pull/668#issuecomment-161804311 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE2MTgwNDMxMQ== shoyer 1217238 2015-12-03T22:22:00Z 2015-12-03T22:22:00Z MEMBER

For iteration, what about only iterating over full windows? Thinking about how I might use iteration, I think this might be more useful than returning some shrunk windows.

Concretely, this means that if you iterate over xray.DataArray(range(6), dims='x').rolling(x=3), results would have labels from 1 through 5.

I think you've done a pretty reasonable job of interpreting min_periods for iteration, but I would still vote for defining it only as an argument to the aggregation methods and not worrying about it for iteration. It keeps things a bit simpler and easier to understand. OTOH, if you can think of use cases for min_periods with iteration, I could be convinced :).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
161792935 https://github.com/pydata/xarray/pull/668#issuecomment-161792935 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE2MTc5MjkzNQ== jhamman 2443309 2015-12-03T21:41:09Z 2015-12-03T21:41:09Z MEMBER

sounds good. Thanks. That's got to slow down work at a tech company :zzz:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
161791877 https://github.com/pydata/xarray/pull/668#issuecomment-161791877 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE2MTc5MTg3Nw== shoyer 1217238 2015-12-03T21:36:50Z 2015-12-03T21:36:50Z MEMBER

Internet at work today is only working 20% of the time. I'm happy to take a look once things get back online :).

On Thu, Dec 3, 2015 at 1:05 PM, Joe Hamman notifications@github.com wrote:

@shoyer - would you mind taking a look at what I've just tried (and failed) in ops.py and common.py? I think I'm missing a big piece of the injection puzzle.

Reply to this email directly or view it on GitHub: https://github.com/xray/xray/pull/668#issuecomment-161784342

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
161784342 https://github.com/pydata/xarray/pull/668#issuecomment-161784342 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE2MTc4NDM0Mg== jhamman 2443309 2015-12-03T21:05:49Z 2015-12-03T21:05:49Z MEMBER

@shoyer - would you mind taking a look at what I've just tried (and failed) in ops.py and common.py? I think I'm missing a big piece of the injection puzzle.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
161782889 https://github.com/pydata/xarray/pull/668#issuecomment-161782889 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE2MTc4Mjg4OQ== shoyer 1217238 2015-12-03T20:59:28Z 2015-12-03T20:59:28Z MEMBER

How do you suggest we handle the bottleneck dependency? That is the reason for the failing tests at the moment.

You can either add a try/except around a top level import of bottleneck, or only import bottleneck locally inside functions which need it. I think I would prefer the later approach because it results in more intelligible error messages (ImportError: No module named bottleneck rather than NameError: name 'bottleneck' is not defined).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
161550342 https://github.com/pydata/xarray/pull/668#issuecomment-161550342 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE2MTU1MDM0Mg== jhamman 2443309 2015-12-03T08:33:33Z 2015-12-03T08:33:33Z MEMBER

@shoyer - I made some more progress here tonight. How do you suggest we handle the bottleneck dependency? That is the reason for the failing tests at the moment.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291
161470610 https://github.com/pydata/xarray/pull/668#issuecomment-161470610 https://api.github.com/repos/pydata/xarray/issues/668 MDEyOklzc3VlQ29tbWVudDE2MTQ3MDYxMA== jhamman 2443309 2015-12-02T23:54:32Z 2015-12-02T23:54:32Z MEMBER

@shoyer - thanks for the first look. I'll give it another hack.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/rolling 120038291

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 18.91ms · About: xarray-datasette