issues


1 row where user = 8098361, sorted by updated_at descending

id: 659142789
node_id: MDU6SXNzdWU2NTkxNDI3ODk=
number: 4236
title: Allow passing args to preprocess function in open_mfdataset
user: prs247au (8098361)
state: closed
locked: 0
assignee: (none)
milestone: (none)
comments: 7
created_at: 2020-07-17T10:52:14Z
updated_at: 2023-09-12T15:59:45Z
closed_at: 2023-09-12T15:59:45Z
author_association: NONE
active_lock_reason: (none)
draft: (none)
pull_request: (none)

body:

For a set of netCDF files I'm opening with `open_mfdataset`, I'd also like to pass a couple of extra arguments to the `preprocess` function. At the moment the Dataset seems to be the only argument the `preprocess` function accepts.

The netCDF files have dimensions (time, lat, lon). It's the time dimension I'd like to cut down during a parallel load, e.g. using a start/end dayofyear. Each file covers a different year and I'd like to slice out a particular dayofyear range within each. I tried calculating dayofyear inside the `preprocess` function and doing `.sel`, which works perfectly fine, but not with the `parallel=True` option. Without the parallel option, loading is much slower. However, if `parallel=True`, a pickling error occurs, possibly because I'm using other functions like `dateparse` or `timedelta` inside the `preprocess` function to calculate the dayofyear (which itself is derived from an ipywidget). I don't really understand the pickling error; it says:

```
D:\Anaconda3\lib\pickle.py in save_global(self, obj, name)
    963             raise PicklingError(
    964                 "Can't pickle %r: it's not the same object as %s.%s" %
--> 965                 (obj, module_name, name))
    966
    967         if self.proto >= 2:

PicklingError: Can't pickle <built-in function input>: it's not the same object as builtins.input
```

I also tried setting the start/end dayofyear as globals outside the `preprocess` function and using them inside the function, but changing those globals (integers) after the function is defined doesn't seem to alter the references to them inside the function. I don't think this solution is very elegant anyway.
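
This particular message usually means a built-in was rebound at runtime: pickle stores built-ins by name, then checks that the name still resolves to the same object. Environments like Jupyter rebind `builtins.input`, so a captured reference to the original built-in no longer matches. A minimal reproduction of the mechanism (an illustration, not a diagnosis of the author's exact setup):

```
import builtins
import pickle

real_input = builtins.input            # keep a reference to the original builtin
builtins.input = lambda prompt="": ""  # simulate the rebinding a notebook kernel does

try:
    pickle.dumps(real_input)           # pickled by name, but the name now points elsewhere
except pickle.PicklingError as err:
    print(err)  # Can't pickle <built-in function input>: it's not the same object as builtins.input
finally:
    builtins.input = real_input        # restore the original builtin
```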

Many other packages have functions that use callbacks with the option of passing additional arguments. Has this been considered for `open_mfdataset`?
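
A common workaround is to bind the extra arguments with `functools.partial`, so the callable handed to `open_mfdataset` still takes only the Dataset and, because it closes over plain integers rather than widgets, pickles cleanly. A minimal sketch; the file paths and dayofyear bounds below are hypothetical:

```
import functools
import glob

import xarray as xr

def select_dayofyear(ds, start_doy, end_doy):
    # Keep only time steps whose day-of-year falls in [start_doy, end_doy].
    doy = ds["time"].dt.dayofyear
    return ds.sel(time=(doy >= start_doy) & (doy <= end_doy))

# partial() fixes the extra arguments up front, yielding a one-argument
# callable that pickles as a reference to select_dayofyear plus two ints.
preprocess = functools.partial(select_dayofyear, start_doy=100, end_doy=150)

ds = xr.open_mfdataset(
    sorted(glob.glob("data/year_*.nc")),  # hypothetical per-year files
    preprocess=preprocess,
    parallel=True,
)
```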

reactions:

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4236/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
performed_via_github_app: (none)
state_reason: completed
repo: xarray (13221727)
type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
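
For reference, the "1 row where user = 8098361, sorted by updated_at descending" view above corresponds to a query along these lines (the exact SQL Datasette generates may differ):

```
SELECT *
FROM [issues]
WHERE [user] = 8098361
ORDER BY [updated_at] DESC;
```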