home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

15 rows where author_association = "MEMBER" and issue = 655389649 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 3

  • fujiisoup 9
  • max-sixty 4
  • dcherian 2

issue 1

  • nd-rolling · 15 ✖

author_association 1

  • MEMBER · 15 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
670821295 https://github.com/pydata/xarray/pull/4219#issuecomment-670821295 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY3MDgyMTI5NQ== fujiisoup 6815844 2020-08-08T04:18:08Z 2020-08-08T04:18:08Z MEMBER

@max-sixty thanks for the review. merged

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
670819822 https://github.com/pydata/xarray/pull/4219#issuecomment-670819822 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY3MDgxOTgyMg== max-sixty 5635139 2020-08-08T04:02:09Z 2020-08-08T04:02:09Z MEMBER

Great! Thanks @fujiisoup !

Ready to go from my POV

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
670705764 https://github.com/pydata/xarray/pull/4219#issuecomment-670705764 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY3MDcwNTc2NA== fujiisoup 6815844 2020-08-07T20:45:01Z 2020-08-07T20:45:01Z MEMBER

Thanks @max-sixty .

You are completely correct. As the test pass, I was fooling myself.

The reason was that the dataset I was using for the test does not have time and x simultaneously. So I was not testing the 2d-rolling but just 1d-rolling.

Fixed. Now it correctly fails for mean and std, but passes with max and sum

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
670625085 https://github.com/pydata/xarray/pull/4219#issuecomment-670625085 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY3MDYyNTA4NQ== dcherian 2448579 2020-08-07T17:26:06Z 2020-08-07T17:26:06Z MEMBER

I think it's more like da.rolling(x=2).construct("xr").rolling(y=2).construct("yr").std(["xr", "yr"])

but that gives array([[0., 5., 5.], [0., 5., 5.], [0., 5., 5.]])

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
670594286 https://github.com/pydata/xarray/pull/4219#issuecomment-670594286 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY3MDU5NDI4Ng== max-sixty 5635139 2020-08-07T16:13:28Z 2020-08-07T16:13:28Z MEMBER

I'm still a bit confused. How does the "roll over each dimension in turn" approach equal the "roll over both dimension together" approach with a function like std? Here's a proposed counter example:

```python import xarray as xr import numpy as np

da = xr.DataArray(np.asarray([[0,10,0],[0,10,0], [0,10,0]]), dims=list('xy')) print(da)

<xarray.DataArray (x: 3, y: 3)> array([[ 0, 10, 0], [ 0, 10, 0], [ 0, 10, 0]]) Dimensions without coordinates: x, y

x_std = da.rolling(dict(x=2)).std() print(x_std)

<xarray.DataArray (x: 3, y: 3)> array([[nan, nan, nan], [ 0., 0., 0.], [ 0., 0., 0.]]) Dimensions without coordinates: x, y

x_then_y_std = x_std.rolling(dict(y=2)).std() print(x_then_y_std)

<xarray.DataArray (x: 3, y: 3)> array([[nan, nan, nan], [nan, 0., 0.], [nan, 0., 0.]]) Dimensions without coordinates: x, y

combined_std = da.rolling(dict(x=2, y=2)).std() print(combined_std)

<xarray.DataArray (x: 3, y: 3)> array([[nan, nan, nan], [nan, 5., 5.], [nan, 5., 5.]]) Dimensions without coordinates: x, y

```

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
667411555 https://github.com/pydata/xarray/pull/4219#issuecomment-667411555 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY2NzQxMTU1NQ== fujiisoup 6815844 2020-07-31T22:25:25Z 2020-07-31T22:25:25Z MEMBER

Thanks @max-sixty for the review ;) I'll work for the update in a few days.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
666841275 https://github.com/pydata/xarray/pull/4219#issuecomment-666841275 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY2Njg0MTI3NQ== fujiisoup 6815844 2020-07-31T00:42:23Z 2020-07-31T00:42:23Z MEMBER

Could anyone kindly review this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
658403527 https://github.com/pydata/xarray/pull/4219#issuecomment-658403527 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY1ODQwMzUyNw== fujiisoup 6815844 2020-07-14T20:44:12Z 2020-07-14T20:44:12Z MEMBER

I got an error for typechecking, only in CI but not in local, from the code that I didn't change.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
657902895 https://github.com/pydata/xarray/pull/4219#issuecomment-657902895 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY1NzkwMjg5NQ== fujiisoup 6815844 2020-07-14T00:49:38Z 2020-07-14T00:49:38Z MEMBER

A possible improvement will be nan-reduction methods for nd-rolling. Currently, we just use numpy nan-reductions, which is memory consuming for strided arrays.

This issue can be solved by replacing nan by appropriate values and applying nonnan-reduction methods, e.g., python da.rolling(x=2, y=3).construct(x='xw', y='yw').sum(['xw', 'yw']) should be the same with python da.rolling(x=2, y=3).construct(x='xw', y='yw', fill_value=0).sum(['xw', 'yw'], skipna=False) and the latter is much more memory efficient.

I'd like to leave this improvement to future PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
657897529 https://github.com/pydata/xarray/pull/4219#issuecomment-657897529 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY1Nzg5NzUyOQ== fujiisoup 6815844 2020-07-14T00:27:51Z 2020-07-14T00:27:51Z MEMBER

I think now it is ready for review, though I'm sure tests miss a lot of edge cases. Maybe we can fix them if pointed out.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
657614995 https://github.com/pydata/xarray/pull/4219#issuecomment-657614995 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY1NzYxNDk5NQ== dcherian 2448579 2020-07-13T15:05:26Z 2020-07-13T15:05:26Z MEMBER

I think the dictionary is OK. We can allow scalars for center, stride etc. in which case the scalar applies to all dimensions. This would preserve backward compatibility.

scalar for min_counts also seems OK to me. We can update it if someone requests it.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
657270068 https://github.com/pydata/xarray/pull/4219#issuecomment-657270068 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY1NzI3MDA2OA== fujiisoup 6815844 2020-07-12T20:18:28Z 2020-07-12T20:18:28Z MEMBER

Another API concern. We now use min_periods, in which we implicitly assume one-dimension cases.

With nd-dimension, I think min_counts argument is more appropriate like bottleneck, which will limit the lower bound of the number of missing entries in the n-dimensional window.

Even if we leave it, we may disallow nd-argument of min_periods but keep it a scalar.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
657269189 https://github.com/pydata/xarray/pull/4219#issuecomment-657269189 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY1NzI2OTE4OQ== fujiisoup 6815844 2020-07-12T20:09:34Z 2020-07-12T20:09:34Z MEMBER

Hi @max-sixty

One alternative is to allow fluent args, like: ...but does that then seem like the second rolling is operating on the result of the first?

I couldn't think of it until just now. But yes, it sounds to me like a repeated rolling operation.

I'm being slow, but where is the nd-rolling algo? I had thought bottleneck didn't support more than one dimension?

No. With nd-rolling, we need to use numpy reductions. Its skipna=True operation is currently slow, but it can be improved replacing nan before the stride-trick.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
657257847 https://github.com/pydata/xarray/pull/4219#issuecomment-657257847 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY1NzI1Nzg0Nw== max-sixty 5635139 2020-07-12T18:23:19Z 2020-07-12T18:23:28Z MEMBER

Re the API, I think the dict is probably the best option, although it does complicate as the arguments become differently typed depending on one vs multiple dimensions.

One alternative is to allow fluent args, like: python ( da .rolling(x=3, center=True) .rolling(y=5, min_periods=2) .mean() ) ...but does that then seem like the second rolling is operating on the result of the first?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649
657257600 https://github.com/pydata/xarray/pull/4219#issuecomment-657257600 https://api.github.com/repos/pydata/xarray/issues/4219 MDEyOklzc3VlQ29tbWVudDY1NzI1NzYwMA== max-sixty 5635139 2020-07-12T18:20:38Z 2020-07-12T18:20:38Z MEMBER

This looks very promising; I'm surprised it was possible without more code. I'm being slow, but where is the nd-rolling algo? I had thought bottleneck didn't support more than one dimension? https://bottleneck.readthedocs.io/en/latest/bottleneck.move.html, and that we'd have to implement our own in numbagg (which would be very possible)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  nd-rolling 655389649

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 14.201ms · About: xarray-datasette