issues

3 rows where state = "open" and user = 57914115 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1947869312 PR_kwDOAMm_X85dCo6P 8324 Implement cftime vectorization as discussed in PR #8322 antscloud 57914115 open 0     0 2023-10-17T17:01:25Z 2023-10-23T05:11:11Z   CONTRIBUTOR   0 pydata/xarray/pulls/8324

As discussed in #8322, here is the test for implementing the vectorization.

Only this test seems to fail in test_coding_times.py: https://github.com/pydata/xarray/blob/f895dc1a748b41d727c5e330e8d664a8b8780800/xarray/tests/test_coding_times.py#L1061-L1071

I don't really understand why, though. Let me know if you have an idea.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8324/reactions",
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 2,
    "eyes": 0
}
    xarray 13221727 pull
1941775048 I_kwDOAMm_X85zvSLI 8302 Rust-based cftime Implementation for Python antscloud 57914115 open 0     2 2023-10-13T11:33:20Z 2023-10-22T16:35:50Z   CONTRIBUTOR      

Is your feature request related to a problem?

I developed a Rust-based project with Python bindings to parse the CF time conventions and handle datetime operations.

You can find the project on GitHub at https://github.com/antscloud/cftime-rs.

Something like this was missing in the Rust ecosystem for dealing with NetCDF files. As the Rust project reached its first working version, I wanted to explore the maturin ecosystem and Rust as a backend for Python code. I ended up creating a new cftime implementation for Python that has a significant performance improvement (somewhere between 4 and 20 times faster, depending on the use case) over cftime's original Cython code.

There are surely missing features compared to cftime, and it needs to be tested more, but I think it could be interesting as a replacement for some xarray operations (mainly for speed), given some of the issues with the topic-cftime label.

Please let me know if the xarray team could be interested. If you are, I can open a pull request to see whether it is possible, where it breaks the unit tests, and whether it's worth it.

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8302/reactions",
    "total_count": 2,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1947508727 PR_kwDOAMm_X85dBaso 8322 Implementation of rust based cftime antscloud 57914115 open 0     1 2023-10-17T14:00:45Z 2023-10-17T22:20:31Z   CONTRIBUTOR   0 pydata/xarray/pulls/8322

As discussed in #8302, here is a first attempt to implement cftime_rs.

There are a lot of tests and I struggle to understand all the processing in coding/times.py. However, with this first attempt I've been able to make test_cf_datetime pass (ignoring one test):

https://github.com/pydata/xarray/blob/8423f2c47306cc3a4a52990818964f278179491f/xarray/tests/test_coding_times.py#L127-L131

Also, there are some key differences between cftime and cftime-rs:

- A long int is used to represent the timestamp internally, so cftime-rs will not overflow as soon as numpy, Python, or cftime. It can go from approximately 291,672,107,014 BC to 291,672,107,014 AD, depending on the calendar.
- There is no only_use_python_datetimes argument. Instead, there are 4 distinct functions:
  - date2num()
  - num2date()
  - num2pydate()
  - pydate2num()
- These functions only take a one-dimensional Python list and return a one-dimensional list, so a conversion must be done beforehand.
- There are not multiple datetime types (they are hidden); instead there is a single object, PyCFDatetime.
- There is no conda repository at the moment.
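The one-dimensional restriction means any N-dimensional input has to be flattened before the call and reshaped afterwards. Here is a minimal sketch of that conversion, assuming only what is stated above (list in, list out). The helper `apply_1d` and the stand-in converter `fake_num2date` are hypothetical names for illustration; the real cftime-rs signatures are not shown here, so no cftime-rs call is made.

```python
import numpy as np


def apply_1d(convert_1d, values):
    """Apply a list-in/list-out converter to an array of any shape.

    `convert_1d` is any callable that takes a 1-D Python list and returns
    a 1-D list of the same length (hypothetical stand-in for a function
    with the list-only interface described above).
    """
    arr = np.asarray(values)
    flat = arr.ravel().tolist()        # flatten to a plain 1-D Python list
    out = convert_1d(flat)             # the converter only ever sees 1-D data
    result = np.empty(len(out), dtype=object)
    for i, item in enumerate(out):     # fill element by element so that
        result[i] = item               # numpy never unpacks the items
    return result.reshape(arr.shape)   # restore the original shape


# Hypothetical converter used only to exercise the wrapper.
def fake_num2date(numbers):
    return [f"day+{n}" for n in numbers]


converted = apply_1d(fake_num2date, [[0, 1], [2, 3]])
```

The flatten and reshape steps copy the data, which is the price of keeping the converter's interface limited to plain lists.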

Finally, and regardless of this PR, I guess there could be a speed improvement from vectorizing operations, replacing this: https://github.com/pydata/xarray/blob/df0ddaf2e68a6b033b4e39990d7006dc346fcc8c/xarray/coding/times.py#L622-L649

with something like this:

https://github.com/pydata/xarray/blob/8423f2c47306cc3a4a52990818964f278179491f/xarray/coding/times.py#L631-L670

We can use numpy instead of list comprehensions. It takes a bit more memory, though.
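As a generic illustration of that trade-off (not the code from the linked lines), the sketch below computes day offsets from a reference date twice: once with a list comprehension over Python datetimes, and once with numpy `datetime64` arithmetic. The function names and sample dates are assumptions made up for this example.

```python
import datetime as dt

import numpy as np


def offsets_loop(dates, ref):
    # One Python-level subtraction per element.
    return [(d - ref).total_seconds() / 86400.0 for d in dates]


def offsets_vectorized(dates, ref):
    # Convert once to a datetime64 array, then do a single array operation.
    # The intermediate array is the extra memory mentioned above.
    arr = np.array(dates, dtype="datetime64[us]")
    ref64 = np.datetime64(ref, "us")
    return (arr - ref64) / np.timedelta64(1, "D")


ref = dt.datetime(2000, 1, 1)
dates = [dt.datetime(2000, 1, 1), dt.datetime(2000, 1, 2, 12)]

loop_result = offsets_loop(dates, ref)
vector_result = offsets_vectorized(dates, ref)
```

Both versions return the same offsets here. The vectorized one only works for dates that fit numpy's `datetime64` range and the standard calendar, which is why cftime objects need a different approach.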

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8322/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 1,
    "eyes": 0
}
    xarray 13221727 pull
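
Each row's reactions value is a JSON object like the ones shown above, stored as text. A small sketch of reading one, using the reactions object from issue #8302 as input; the helper name `reaction_counts` is hypothetical.

```python
import json

REACTION_KEYS = ("+1", "-1", "laugh", "hooray", "confused", "heart", "rocket", "eyes")


def reaction_counts(reactions_text):
    """Parse a reactions JSON string into (per-reaction counts, total_count)."""
    data = json.loads(reactions_text)
    counts = {key: data.get(key, 0) for key in REACTION_KEYS}
    return counts, data["total_count"]


# Reactions object copied from the row for issue #8302.
sample = """
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8302/reactions",
    "total_count": 2,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
"""

counts, total = reaction_counts(sample)
nonzero = {key: value for key, value in counts.items() if value}
```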

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
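
The filter described at the top of the page (state = "open" and user = 57914115, sorted by updated_at descending) can be sketched against this schema with Python's built-in sqlite3 module. The table below is a trimmed-down copy with only a few of the columns and no foreign keys, which is an assumption made to keep the example self-contained. The three rows are the ones listed on this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced version of the [issues] table: only the columns the query needs.
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER,
       [state] TEXT,
       [updated_at] TEXT,
       [type] TEXT
    )
    """
)

rows = [
    (1947869312, 8324, "Implement cftime vectorization as discussed in PR #8322",
     57914115, "open", "2023-10-23T05:11:11Z", "pull"),
    (1941775048, 8302, "Rust-based cftime Implementation for Python",
     57914115, "open", "2023-10-22T16:35:50Z", "issue"),
    (1947508727, 8322, "Implementation of rust based cftime",
     57914115, "open", "2023-10-17T22:20:31Z", "pull"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

query = """
SELECT [number], [type], [updated_at]
FROM [issues]
WHERE [state] = :state AND [user] = :user
ORDER BY [updated_at] DESC
"""
result = conn.execute(query, {"state": "open", "user": 57914115}).fetchall()
numbers = [row[0] for row in result]
```

Because updated_at is stored as ISO 8601 text in a single time zone, plain string ordering matches chronological ordering, so no date parsing is needed for the sort.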
Powered by Datasette · Queries took 4401.443ms · About: xarray-datasette