
issue_comments


5 rows where author_association = "MEMBER", issue = 1334835539 and user = 6628425 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1250058482 https://github.com/pydata/xarray/issues/6906#issuecomment-1250058482 https://api.github.com/repos/pydata/xarray/issues/6906 IC_kwDOAMm_X85KgmDy spencerkclark 6628425 2022-09-17T12:00:26Z 2022-09-17T12:20:41Z MEMBER

I was able to reproduce this issue in a Docker container using the s390x Debian image. After a little experimentation I narrowed it down to the following minimal example:

```
>>> import numpy as np
>>> import pandas as pd
>>> np.__version__
'1.23.3'
>>> pd.__version__
'1.4.4'
>>> pd.Series(np.array([1]).astype("<M8[h]"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/dist-packages/pandas/core/series.py", line 451, in __init__
    data = sanitize_array(data, index, dtype, copy)
  File "/usr/local/lib/python3.9/dist-packages/pandas/core/construction.py", line 570, in sanitize_array
    subarr = _try_cast(data, dtype, copy, raise_cast_failure)
  File "/usr/local/lib/python3.9/dist-packages/pandas/core/construction.py", line 729, in _try_cast
    return sanitize_to_nanoseconds(arr, copy=copy)
  File "/usr/local/lib/python3.9/dist-packages/pandas/core/dtypes/cast.py", line 1717, in sanitize_to_nanoseconds
    values = conversion.ensure_datetime64ns(values)
  File "pandas/_libs/tslibs/conversion.pyx", line 257, in pandas._libs.tslibs.conversion.ensure_datetime64ns
  File "pandas/_libs/tslibs/np_datetime.pyx", line 120, in pandas._libs.tslibs.np_datetime.check_dts_bounds
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 8220291319602-05-05 16:00:00
```

This confirms it is an upstream issue. Interestingly, if we use the native byte order (big-endian on this architecture) for the dtype, this example works:

```
>>> pd.Series(np.array([1]).astype("M8[h]"))
0   1970-01-01 01:00:00
dtype: datetime64[ns]
```

or, more explicitly:

```
>>> pd.Series(np.array([1]).astype(">M8[h]"))
0   1970-01-01 01:00:00
dtype: datetime64[ns]
```

It appears the inverse of this issue (a big-endian dtype leading to a failure on a little-endian system) came up once in pandas: https://github.com/pandas-dev/pandas/issues/29684. @amckinstry I'm not sure what it will take to fix this issue in pandas, but you are welcome to open an issue there. They may also have a difficult time reproducing and testing this, however (https://github.com/pandas-dev/pandas/pull/30976#issuecomment-573989082).
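The byte-order sensitivity observed above suggests a user-side workaround: normalize the array's dtype to the machine's native byte order before handing it to pandas. A minimal sketch of that idea using NumPy's `dtype.newbyteorder` (the helper name `to_native_byteorder` is hypothetical, not part of any library; only the storage byte order changes, not the logical values):

```python
import numpy as np

# Hypothetical helper: convert an array with an explicit (possibly
# non-native) byte order to the platform's native byte order.
def to_native_byteorder(arr: np.ndarray) -> np.ndarray:
    if arr.dtype.byteorder in ("=", "|"):
        return arr  # already native, or byte order is irrelevant
    # astype with a native-order dtype byte-swaps the data as needed
    return arr.astype(arr.dtype.newbyteorder("="))

# A datetime64 array with an explicit byte-order flag:
arr = np.array([1], dtype=">M8[h]")
native = to_native_byteorder(arr)
assert native[0] == arr[0]  # same logical value either way
assert native.dtype.byteorder in ("=", "|")
```

This sidesteps the failing conversion path regardless of which endianness the host uses, since pandas only ever sees native-order input.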

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  regression in cftime on s390  1334835539
1241339308 https://github.com/pydata/xarray/issues/6906#issuecomment-1241339308 https://api.github.com/repos/pydata/xarray/issues/6906 IC_kwDOAMm_X85J_VWs spencerkclark 6628425 2022-09-08T23:38:29Z 2022-09-08T23:38:29Z MEMBER

Interesting. Thanks for checking that #6988 indeed solves this. I went ahead and merged it, but when I get a chance I’ll keep trying to track down the root cause of this issue.

1237014040 https://github.com/pydata/xarray/issues/6906#issuecomment-1237014040 https://api.github.com/repos/pydata/xarray/issues/6906 IC_kwDOAMm_X85Ju1YY spencerkclark 6628425 2022-09-05T13:18:21Z 2022-09-05T13:18:21Z MEMBER

Thanks @amckinstry. I guess my last try to produce a pandas minimal example might be:

```
>>> import numpy as np
>>> import pandas as pd
>>> pd.Series(np.array([np.int64(1000000).astype("<M8[h]")]))
0   2084-01-29 16:00:00
dtype: datetime64[ns]
```

or potentially more simply:

```
>>> import numpy as np
>>> import pandas as pd
>>> pd.Series(np.int64(1000000).astype("<M8[h]"))
0   2084-01-29 16:00:00
dtype: datetime64[ns]
```

Somewhere something is going wrong in converting a non-nanosecond-precision datetime value to a nanosecond-precision one (maybe the cast to a `pd.Timestamp` in my earlier example was short-circuiting this).

I think #6988 should likely work around this issue at least on the xarray side, since it passes `datetime64[ns]` values into the `DataArray` constructor immediately. It also seems like the function where the error occurs (`ensure_datetime64ns`) was recently eliminated in favor of an updated implementation in pandas, so I wonder if this will still be an issue there going forward.
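The workaround described here amounts to converting to nanosecond precision up front, so pandas never has to run its internal precision-conversion path on a non-native-order input. A sketch of that idea (assuming only NumPy and pandas; xarray's actual implementation in #6988 may differ):

```python
import numpy as np
import pandas as pd

# Hour-precision values with an explicit byte-order flag, as in the
# failing example above.
hours = np.array([1000000], dtype="<M8[h]")

# Cast to datetime64[ns] ourselves before constructing the Series,
# so pandas receives data it can use as-is.
ns = hours.astype("datetime64[ns]")
series = pd.Series(ns)
series.iloc[0]  # Timestamp('2084-01-29 16:00:00')
```

Note that 1,000,000 hours past the epoch is about 3.6e18 nanoseconds, comfortably inside the int64 range that `datetime64[ns]` uses, so this cast cannot itself overflow here.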

1225514875 https://github.com/pydata/xarray/issues/6906#issuecomment-1225514875 https://api.github.com/repos/pydata/xarray/issues/6906 IC_kwDOAMm_X85JC997 spencerkclark 6628425 2022-08-24T10:13:38Z 2022-08-24T10:13:38Z MEMBER

Thanks for trying that. Maybe it has to do with casting to a `pd.Series`. Could you maybe also try:

```
>>> import numpy as np
>>> import pandas as pd
>>> pd.Series(pd.Timestamp(np.int64(1000000).astype("<M8[h]")))
0   2084-01-29 16:00:00
dtype: datetime64[ns]
```

1221528092 https://github.com/pydata/xarray/issues/6906#issuecomment-1221528092 https://api.github.com/repos/pydata/xarray/issues/6906 IC_kwDOAMm_X85Izwoc spencerkclark 6628425 2022-08-21T11:37:47Z 2022-08-21T11:37:47Z MEMBER

Apologies for taking a while to look into this. I have not been able to set up an environment to reproduce these test failures, which makes this tricky. It seems like the tests are failing in the setup step, where a `DataArray` of random times is generated:

```
data = xr.DataArray(
    np.random.randint(1, 1000000, size=(4, 5)).astype("<M8[h]"),
    dims=("x", "y"),
)
```

In principle this should not generate any times larger than 1,000,000 hours since 1970-01-01, i.e. 2084-01-29T16:00:00, which should be representable with a nanosecond-precision pandas Timestamp.

Trying to narrow things down, I guess my first question would be: does the following fail in this environment? Is this maybe a pandas issue?

```
>>> import numpy as np
>>> import pandas as pd
>>> pd.Timestamp(np.int64(1000000).astype("<M8[h]"))
Timestamp('2084-01-29 16:00:00')
```

I think these tests could be simplified some to remove the randomness, but that's probably a separate issue.
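The bound claimed above is easy to check directly: `randint(1, 1000000)` is exclusive of its upper bound, so 999,999 hours is the true maximum the test can generate, and even the full 1,000,000 hours lands well inside the nanosecond-precision Timestamp range (roughly years 1677 to 2262). A quick sanity check:

```python
import numpy as np
import pandas as pd

# The largest time the test setup could conceivably produce:
bound = np.int64(1000000).astype("<M8[h]")
print(bound)  # 2084-01-29T16

# Well within pandas' nanosecond Timestamp range:
assert pd.Timestamp.min < pd.Timestamp(bound) < pd.Timestamp.max
```

So on a correctly functioning platform, none of the randomly generated values should ever trigger an out-of-bounds error; the failure has to come from the conversion machinery, not the data.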



CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);