issues


3 rows where comments = 4, state = "open" and user = 2448579 sorted by updated_at descending

#8965 Support concurrent loading of variables
id: 2259316341 · node_id: I_kwDOAMm_X86Gqm51 · user: dcherian (2448579) · state: open · locked: 0 · comments: 4 · created_at: 2024-04-23T16:41:24Z · updated_at: 2024-04-29T22:21:51Z · author_association: MEMBER

Is your feature request related to a problem?

Today, if users want to concurrently load multiple variables in a DataArray or Dataset, they have to use dask.

It struck me that it'd be pretty easy for .load to gain an executor kwarg that accepts anything following the concurrent.futures executor interface, and use it to parallelize this loop.

https://github.com/pydata/xarray/blob/b0036749542145794244dee4c4869f3750ff2dee/xarray/core/dataset.py#L853-L857
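
A minimal sketch of the idea, assuming any object implementing the concurrent.futures executor interface is accepted; load_with_executor is a hypothetical helper, not an existing xarray API, and the file path in the usage comment is illustrative:

# Hypothetical helper, not xarray API: load each variable of a Dataset through
# any executor implementing the concurrent.futures interface.
from concurrent.futures import ThreadPoolExecutor

import xarray as xr

def load_with_executor(ds: xr.Dataset, executor=None) -> xr.Dataset:
    """Load all variables into memory, fanning the per-variable work out."""
    if executor is None:
        executor = ThreadPoolExecutor(max_workers=4)
    # Variable.load() reads the backend data into memory in place, so mapping
    # it over the variables parallelizes the per-variable loop linked above.
    list(executor.map(lambda v: v.load(), ds.variables.values()))
    return ds

# ds = load_with_executor(xr.open_dataset("example.nc"))  # path is illustrative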

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8965/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue
#8523 tree-reduce the combine for `open_mfdataset(..., parallel=True, combine="nested")`
id: 2027147099 · node_id: I_kwDOAMm_X854089b · user: dcherian (2448579) · state: open · locked: 0 · comments: 4 · created_at: 2023-12-05T21:24:51Z · updated_at: 2023-12-18T19:32:39Z · author_association: MEMBER

Is your feature request related to a problem?

When parallel=True and a distributed client is active, Xarray reads every file in parallel, constructs a Dataset per file with indexed coordinates loaded, and then sends all of that back to the "head node" for the combine.

Instead we can tree-reduce the combine (example) by switching to dask.bag instead of dask.delayed, skipping the overhead of shipping 1000s of copies of an indexed coordinate back to the head node.

  1. The downside is that the dask graph is "worse", but perhaps that shouldn't stop us.
  2. I think this is only feasible for combine="nested".

cc @TomNicholas
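
A rough sketch of what "tree-reduce the combine" could look like, written with plain dask.delayed for clarity rather than the dask.bag switch the issue proposes; the pairwise xr.concat along a single "time" dimension is an illustrative assumption, not how open_mfdataset actually combines:

# Illustrative only: combine per-file Datasets pairwise in rounds so that no
# single task (or the client) ever receives every dataset at once.
import dask
import xarray as xr

@dask.delayed
def _open(path):
    return xr.open_dataset(path)

@dask.delayed
def _combine_pair(a, b):
    # Assumption: files concatenate along one "time" dimension.
    return xr.concat([a, b], dim="time")

def tree_combine(paths):
    nodes = [_open(p) for p in paths]
    while len(nodes) > 1:
        # Combine neighbours; carry an odd trailing node into the next round.
        nodes = [
            _combine_pair(nodes[i], nodes[i + 1]) if i + 1 < len(nodes) else nodes[i]
            for i in range(0, len(nodes), 2)
        ]
    return nodes[0]

# combined = tree_combine(sorted(paths)).compute()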

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8523/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue
#7924 Migrate from nbsphinx to myst, myst-nb
id: 1760733017 · node_id: I_kwDOAMm_X85o8qdZ · user: dcherian (2448579) · state: open · locked: 0 · comments: 4 · created_at: 2023-06-16T14:17:41Z · updated_at: 2023-06-20T22:07:42Z · author_association: MEMBER

Is your feature request related to a problem?

I think we should switch to MyST markdown for our docs. I've been using MyST markdown and MyST-NB in docs in other projects and it works quite well.

Advantages:
  1. We get HTML reprs in the docs (example), which is a big improvement. (#6620)
  2. I think many find markdown a lot easier to write than RST.

There's a tool to migrate RST to MyST (RTD's migration guide).
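
For a sense of scale, a hedged sketch of the Sphinx configuration change such a migration might involve (extension and option names here assume a current myst-nb release; the neighbouring extensions are placeholders):

# Possible doc/conf.py change; "myst_nb" also pulls in the MyST markdown parser.
extensions = [
    # "nbsphinx",          # removed
    "myst_nb",             # MyST markdown pages + executed notebooks
    "sphinx.ext.autodoc",  # ...other existing extensions stay unchanged
]
nb_execution_mode = "auto"  # execute only notebooks that lack stored outputs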

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7924/reactions",
    "total_count": 5,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
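
The filter behind this page (comments = 4, state = "open", user = 2448579, newest updated_at first) can be reproduced against this schema with Python's sqlite3 module; the database filename below is an assumption:

# Assumes a local copy of the underlying SQLite database; "github.db" is a placeholder.
import sqlite3

conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT number, title, comments, updated_at
    FROM issues
    WHERE comments = 4 AND state = 'open' AND [user] = 2448579
    ORDER BY updated_at DESC
    """
).fetchall()
for number, title, comments, updated_at in rows:
    print(f"#{number}  {title}  (updated {updated_at})")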