issues

2 rows where comments = 13, state = "open" and user = 1217238 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
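id: 1376109308 | node_id: I_kwDOAMm_X85SBcL8 | number: 7045 | title: Should Xarray stop doing automatic index-based alignment? | user: shoyer (1217238) | state: open | locked: 0 | comments: 13 | created_at: 2022-09-16T15:31:03Z | updated_at: 2023-08-23T07:42:34Z | author_association: MEMBER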

What is your issue?

I am increasingly thinking that automatic index-based alignment in Xarray (copied from pandas) may have been a design mistake. Almost every time I work with datasets with different indexes, I find myself writing code to explicitly align them:

  1. Automatic alignment is hard to predict. The implementation is complicated, and the exact mode of automatic alignment (outer vs inner vs left join) depends on the specific operation. It's also no longer possible to predict the shape (or even the dtype) resulting from most Xarray operations purely from input shape/dtype.
  2. Automatic alignment brings unexpected performance penalties. In some domains (analytics) this is OK, but in others (e.g., numerical modeling or deep learning) this is a complete deal-breaker.
  3. Automatic alignment is not useful for float indexes, because exact matches are rare. In practice, this makes it less useful in Xarray's usual domains than it is for pandas.
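
The first and third points can be illustrated with a small sketch. The arrays and names below are made up for illustration and do not come from the issue; the snippet only assumes xarray's default inner join for arithmetic and the existing `xr.align` function.

```python
import numpy as np
import xarray as xr

# Two integer arrays whose "x" indexes only partly overlap (illustrative data).
a = xr.DataArray(np.array([1, 2, 3]), dims="x", coords={"x": [0, 1, 2]})
b = xr.DataArray(np.array([10, 20, 30]), dims="x", coords={"x": [1, 2, 3]})

# Arithmetic aligns automatically, with an inner join by default,
# so the result is shorter than either input.
total = a + b
print(total.sizes["x"], total.dtype)

# An explicit outer join pads missing labels with NaN,
# which promotes the integer data to floating point.
a_outer, b_outer = xr.align(a, b, join="outer")
print(a_outer.sizes["x"], a_outer.dtype)

# Float labels that look equal are often not bit-for-bit equal,
# so the inner join silently drops them.
p = xr.DataArray([1.0], dims="t", coords={"t": [0.1 + 0.2]})
q = xr.DataArray([1.0], dims="t", coords={"t": [0.3]})
print((p + q).sizes["t"])
```

Neither the output length nor the output dtype here can be read off the input shapes and dtypes alone; both depend on the index labels.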

Would it be insane to consider changing Xarray's behavior to stop doing automatic alignment? I imagine we could roll this out slowly, first with warnings and then with an option for disabling it.
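
For arithmetic, something close to this can already be approximated with the existing `arithmetic_join` option. The sketch below is not the rollout proposed above, only an illustration of what strict behavior might feel like; the helper name `add_strict` is hypothetical.

```python
import xarray as xr

a = xr.DataArray([1.0, 2.0, 3.0], dims="x", coords={"x": [0, 1, 2]})
b = xr.DataArray([10.0, 20.0, 30.0], dims="x", coords={"x": [1, 2, 3]})


def add_strict(left, right):
    """Add two arrays, raising instead of aligning if their indexes differ."""
    # "exact" makes binary arithmetic raise when indexes are not equal.
    with xr.set_options(arithmetic_join="exact"):
        return left + right


try:
    add_strict(a, b)
    mismatch_rejected = False
except ValueError:
    # Mismatched indexes are reported as an error rather than silently joined.
    mismatch_rejected = True

# Arrays that already share an index are unaffected.
same_index = add_strict(a, a)
print(mismatch_rejected, same_index.values)
```

Callers who do want a join then have to ask for it explicitly, for example with `xr.align(a, b, join="inner")` before the operation.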

If you think this is a good or bad idea, consider responding to this issue with a 👍 or 👎 reaction.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7045/reactions",
    "total_count": 13,
    "+1": 9,
    "-1": 2,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 2
}
repo: xarray (13221727) | type: issue
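id: 294241734 | node_id: MDU6SXNzdWUyOTQyNDE3MzQ= | number: 1887 | title: Boolean indexing with multi-dimensional key arrays | user: shoyer (1217238) | state: open | locked: 0 | comments: 13 | created_at: 2018-02-04T23:28:45Z | updated_at: 2021-04-22T21:06:47Z | author_association: MEMBER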

Originally from https://github.com/pydata/xarray/issues/974

For boolean indexing:

  • da[key], where key is a boolean labelled array (with any number of dimensions), is made equivalent to da.where(key.reindex_like(da), drop=True). This matches the existing behavior if key is a 1D boolean array. For multi-dimensional arrays, even though the result is now multi-dimensional, this coupled with automatic skipping of NaNs means that da[key].mean() gives the same result as in NumPy.
  • da[key] = value, where key is a boolean labelled array, can be made equivalent to da = da.where(*align(key.reindex_like(da), value.reindex_like(da))) (that is, the three-argument form of where).
  • da[key_0, ..., key_n], where all of key_i are boolean arrays, gets handled in the usual way. It is an IndexingError to supply multiple labelled keys if any of them are not already aligned with the corresponding index coordinates (and share the same dimension name). If they want alignment, we suggest users simply write da[key_0 & ... & key_n].
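
A minimal sketch of the read case, using today's `where(..., drop=True)` as a stand-in for the proposed `da[key]`. The array contents are illustrative and not taken from the issue.

```python
import numpy as np
import xarray as xr

values = np.arange(6.0).reshape(2, 3)
da = xr.DataArray(values, dims=("x", "y"), coords={"x": [0, 1], "y": [0, 1, 2]})

# A 2D boolean labelled array: True where the value is even.
key = da % 2 == 0

# Stand-in for the proposed da[key]: masked-out entries become NaN, and
# rows/columns that are entirely masked would be dropped.
selected = da.where(key.reindex_like(da), drop=True)

# Reductions skip NaN by default, so the mean agrees with NumPy's
# flattened boolean indexing.
xr_mean = float(selected.mean())
np_mean = float(values[key.values].mean())
print(selected.shape, xr_mean, np_mean)
```

Unlike NumPy, the result keeps its dimensions and labels; only the reduction makes the two agree.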

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1887/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) | type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
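
The filter described at the top of this page (comments = 13, state = "open", user = 1217238, sorted by updated_at descending) can be reproduced against this schema roughly as follows. This is a sketch: the exact SQL Datasette generated is not shown on the page, the table is cut down to the columns the filter needs, and the third row is a hypothetical non-matching entry added only to show the filter at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced version of the issues table: only the columns used by the filter.
conn.executescript(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER,
       [state] TEXT,
       [comments] INTEGER,
       [updated_at] TEXT
    );
    CREATE INDEX [idx_issues_user] ON [issues] ([user]);
    """
)

rows = [
    (1376109308, 7045, "Should Xarray stop doing automatic index-based alignment?",
     1217238, "open", 13, "2023-08-23T07:42:34Z"),
    (294241734, 1887, "Boolean indexing with multi-dimensional key arrays",
     1217238, "open", 13, "2021-04-22T21:06:47Z"),
    # Hypothetical row (not from the page) that the filter should exclude.
    (1, 1, "Placeholder closed issue", 1217238, "closed", 13, "2020-01-01T00:00:00Z"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
matches = conn.execute(
    """
    SELECT [number], [title] FROM [issues]
    WHERE [comments] = ? AND [state] = ? AND [user] = ?
    ORDER BY [updated_at] DESC
    """,
    (13, "open", 1217238),
).fetchall()

for number, title in matches:
    print(number, title)
```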
Powered by Datasette · About: xarray-datasette