
issue_comments


4 rows where issue = 1030768250 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
948691519 https://github.com/pydata/xarray/issues/5877#issuecomment-948691519 https://api.github.com/repos/pydata/xarray/issues/5877 IC_kwDOAMm_X844i-I_ mathause 10194086 2021-10-21T14:45:47Z 2021-10-21T14:45:47Z MEMBER

AFAIK bottleneck uses a less precise algorithm for sums than numpy (pydata/bottleneck#379). However, I don't know why this yields 0 at the beginning but not at the end.

A slightly more minimal example:

```python
import bottleneck as bn
import numpy as np
import pandas as pd

data = np.array(
    [
        0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
        0.31, 0.91999996, 8.3, 1.42, 0.03, 1.22, 0.09999999,
        0.14, 0.13, 0.0, 0.12, 0.03, 2.53, 0.0,
        0.19999999, 0.19999999, 0.0, 0.0, 0.0, 0.0, 0.0,
        0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    ],
    dtype="float32",
)

bn.move_sum(data, window=3)
pd.Series(data).rolling(3).mean()
np.convolve(data, np.ones(3), 'valid') / 3
```
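One way zeros can survive at the start but not at the end is a moving sum that keeps a single running accumulator (add the newest value, subtract the oldest). The sketch below is a pure-NumPy illustration of that idea, not bottleneck's actual implementation; the helper names `running_move_sum` and `direct_move_sum` and the `toy` input are made up for this example. Before any non-zero value enters the window the accumulator is exactly zero, but once rounding error has been absorbed it is carried along into the trailing zeros.

```python
import numpy as np


def running_move_sum(a, window):
    """Moving sum with one running accumulator: add the newest value, drop the oldest."""
    a = np.asarray(a)
    out = np.full(a.shape, np.nan, dtype=a.dtype)
    acc = a.dtype.type(0)  # accumulator kept in the input dtype (float32 here)
    for i in range(a.size):
        acc = acc + a[i]
        if i >= window:
            acc = acc - a[i - window]
        if i >= window - 1:
            out[i] = acc
    return out


def direct_move_sum(a, window):
    """Moving sum that re-adds every window from scratch."""
    a = np.asarray(a)
    out = np.full(a.shape, np.nan, dtype=a.dtype)
    for i in range(window - 1, a.size):
        out[i] = a[i - window + 1 : i + 1].sum()
    return out


# Contrived float32 input: 2**24 + 1 is not representable in float32,
# so the 1.0 is lost when it is added but is still subtracted later.
toy = np.array([0, 0, 0, 2**24, 1, 0, 0, 0, 0], dtype="float32")

running = running_move_sum(toy, window=3)
direct = direct_move_sum(toy, window=3)

print(running[2], direct[2])    # 0.0 0.0   (leading zeros stay exact)
print(running[-1], direct[-1])  # -1.0 0.0  (trailing zeros carry a residue)
```

The same two helpers can be run on `data` from the example above to compare the two strategies on the values from this issue.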

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Rolling() gives values different from pd.rolling() 1030768250
947906426 https://github.com/pydata/xarray/issues/5877#issuecomment-947906426 https://api.github.com/repos/pydata/xarray/issues/5877 IC_kwDOAMm_X844f-d6 chiaral 8453445 2021-10-20T17:59:13Z 2021-10-20T17:59:13Z CONTRIBUTOR

Yup - just followed your suggestion and:

1) conda removed bottleneck and it removed xarray and pandas as well
2) conda installed xarray which installed xarray, pandas, and pytz

and now the xr.rolling(time=3).sum() yields:

array([ nan, nan, 0. , 0. , 0. , 0. , 0. , 0.31 , 1.23 , 9.530001 , 10.64 , 9.75 , 2.67 , 1.35 , 1.46 , 0.36999997, 0.26999998, 0.25 , 0.14999999, 2.68 , 2.56 , 2.73 , 0.39999998, 0.39999998, 0.19999999, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ], dtype=float32)

could you elaborate more on the issue? is this because of some bouncing between precisions across packages? But why do I have zeros at the beginning of the rolling sum and non zeros after having calculated a sum? it is not consistent in the behaviour.

Thanks tho!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Rolling() gives values different from pd.rolling() 1030768250
947893467 https://github.com/pydata/xarray/issues/5877#issuecomment-947893467 https://api.github.com/repos/pydata/xarray/issues/5877 IC_kwDOAMm_X844f7Tb mathause 10194086 2021-10-20T17:41:04Z 2021-10-20T17:41:04Z MEMBER

Thanks for the report. Without testing anything I suspect that this is due to the use of float32 data and/or bottleneck - see also #1346. You can test this by uninstalling bottleneck (there is an option to disable bottleneck, but it's not yet released, see #5560).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Rolling() gives values different from pd.rolling() 1030768250
947195221 https://github.com/pydata/xarray/issues/5877#issuecomment-947195221 https://api.github.com/repos/pydata/xarray/issues/5877 IC_kwDOAMm_X844dQ1V chiaral 8453445 2021-10-20T00:02:58Z 2021-10-20T00:02:58Z CONTRIBUTOR

Adding a few extra observations:

`ds_ex.rolling(time=3).mean().pr.values` and `df_ex.rolling(window=3).mean().values.T` have a similar behaviour, in that once again xr.rolling() doesn't have zero where it should, but pd.rolling does.

But when I switch to other operations, like var or std, the behaviour is the opposite, i.e. `ds_ex.rolling(time=3).std().pr.values` gives

array([ nan, nan, 0. , 0. , 0. , 0. , 0. , 0.1461354 , 0.38218665, 3.631293 , 3.367307 , 3.6156974 , 0.61356837, 0.54522127, 0.5188016 , 0.01698606, 0.06376763, 0.05906381, 0.05098677, 1.157881 , 1.1856455 , 1.148419 , 0.09427918, 0.09427918, 0.09427926, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ], dtype=float32)

whereas `df_ex.rolling(window=3).std().values.T` gives

array([[ nan, nan, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 1.78978585e-01, 4.68081166e-01, 4.44740760e+00, 4.12409195e+00, 4.42830679e+00, 7.51465227e-01, 6.67757461e-01, 6.35400157e-01, 2.08166670e-02, 7.81024957e-02, 7.23417792e-02, 6.24499786e-02, 1.41810905e+00, 1.45211339e+00, 1.40652052e+00, 1.15470047e-01, 1.15470047e-01, 1.15470047e-01, 9.60572442e-08, 9.60572442e-08, 9.60572442e-08, 9.60572442e-08, 9.60572442e-08, 9.60572442e-08, 9.60572442e-08, 9.60572442e-08, 9.60572442e-08, 9.60572442e-08]])
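Separately from the zero/non-zero question, the two std arrays above differ by a constant factor of about 1.2247, which is sqrt(3/2). That would be consistent with the two results using different delta degrees of freedom (`ddof=0` versus `ddof=1`); this is an inference from the numbers, not something confirmed in the thread. A minimal check on the first window that contains a non-zero value, `[0.0, 0.0, 0.31]`:

```python
import numpy as np

# First 3-element window that contains a non-zero value.
window = np.array([0.0, 0.0, 0.31])

std_population = np.std(window, ddof=0)  # divide by N
std_sample = np.std(window, ddof=1)      # divide by N - 1

print(std_population)  # close to 0.1461354, the xarray value above
print(std_sample)      # close to 0.178978585, the pandas value above
```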

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Rolling() gives values different from pd.rolling() 1030768250


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);