issues


3 rows where type = "issue" and user = 2560426 sorted by updated_at descending




Facets: type = issue (3) · state = open (3) · repo = xarray (3)
id: 713834297 · node_id: MDU6SXNzdWU3MTM4MzQyOTc= · number: 4482
title: Allow skipna in .dot()
user: heerad (2560426) · state: open · locked: 0 · comments: 13
created_at: 2020-10-02T18:52:41Z · updated_at: 2020-10-20T22:21:14Z · author_association: NONE

Is your feature request related to a problem? Please describe.
Right now there's no efficient way to do a dot product that skips over NaN elements.

Describe the solution you'd like
I want to be able to treat the summation in dot as a nansum, controlled by a skipna option. Either this can be implemented directly, or an additional ufunc, xarray.ufuncs.nan_to_num, can be added and called on the inputs to dot. Unfortunately, using numpy's nan_to_num triggers eager execution.

Describe alternatives you've considered
It's possible to implement this by hand, but it ends up being extremely inefficient in one of my use cases:
  • (x*y).sum('dot_prod_dim', skipna=True) takes 30 seconds
  • x.dot(y) takes 1 second
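
A minimal workaround sketch, not from the issue: filling NaNs with zero on both inputs makes the contraction inside dot behave like a nansum, since a zero factor contributes nothing to the sum, and both fillna and dot should stay lazy on dask-backed arrays. The array shapes and dimension names below are illustrative.

import numpy as np
import xarray as xr

# Illustrative inputs sharing the contracted dimension, with a NaN in x.
x = xr.DataArray(np.random.rand(4, 3), dims=("sample", "dot_prod_dim"))
y = xr.DataArray(np.random.rand(3, 5), dims=("dot_prod_dim", "feature"))
x[0, 1] = np.nan

# Treating NaN as 0 removes its contribution to the sum over dot_prod_dim,
# which matches (x * y).sum("dot_prod_dim", skipna=True) when x has NaNs.
result = x.fillna(0).dot(y.fillna(0))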

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4482/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue
id: 712052219 · node_id: MDU6SXNzdWU3MTIwNTIyMTk= · number: 4474
title: Implement rolling_exp for dask arrays
user: heerad (2560426) · state: open · locked: 0 · comments: 7
created_at: 2020-09-30T15:31:50Z · updated_at: 2020-10-15T16:32:03Z · author_association: NONE

Is your feature request related to a problem? Please describe.
I use dask-based chunking on my arrays regularly and would like to leverage the efficient numbagg implementation of move_exp_nanmean() with rolling_exp().

Describe the solution you'd like
It's possible to compute a rolling exponential mean as a function of the rolling exponential means of contiguous, non-overlapping subsets (chunks). You just need to first "un-normalize" the rolling_exp of each chunk in order to split it into its corresponding numerator and denominator (see the ewm definition here under adjust=True). The normalization factor (denominator) to multiply back into the chunk's move_exp_nanmean() in order to un-normalize it (numerator) is just the move_exp_nanmean() of 1's, replaced with NA wherever the underlying data was also NA.

Then, scale each chunk's numerator and denominator series (derived from its move_exp_nanmean() series as above) down according to how many "lags ago" they were, sum the rescaled numerators and denominators across chunks, and finally divide the summed numerators by the summed denominators (a sketch of this recombination follows the issue body).

Describe alternatives you've considered
I implemented my own inefficient weighted rolling mean using xarray's rolling(). This requires a lot of duplicate computation as the window gets shifted.
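
A minimal numpy sketch of the recombination described above, not from the issue and not numbagg's or xarray's API: each chunk yields a NaN-aware numerator and denominator for an adjust=True exponential mean, the state carried in from earlier chunks is decayed by one extra factor of (1 - alpha) per step, and the chunked result agrees with computing over the whole series in one pass.

import numpy as np

def ewm_chunk(y, alpha):
    # Per-chunk NaN-aware numerator and denominator of an adjust=True
    # exponential mean: NaN observations contribute nothing, but the decay
    # still advances by one lag.
    num = np.empty(len(y))
    den = np.empty(len(y))
    n = d = 0.0
    for i, v in enumerate(y):
        valid = not np.isnan(v)
        n = (v if valid else 0.0) + (1 - alpha) * n
        d = (1.0 if valid else 0.0) + (1 - alpha) * d
        num[i], den[i] = n, d
    return num, den

def ewm_from_chunks(chunks, alpha):
    # Recombine contiguous, non-overlapping chunks: the carried numerator and
    # denominator decay by (1 - alpha) for every additional lag inside the
    # current chunk, then the rescaled pieces are summed and divided.
    out, carry_num, carry_den = [], 0.0, 0.0
    for y in chunks:
        num, den = ewm_chunk(np.asarray(y, dtype=float), alpha)
        decay = (1 - alpha) ** np.arange(1, len(y) + 1)
        num = num + decay * carry_num
        den = den + decay * carry_den
        out.append(num / den)
        carry_num, carry_den = num[-1], den[-1]
    return np.concatenate(out)

y = np.array([1.0, np.nan, 2.0, 3.0, np.nan, 4.0, 5.0])
one_pass = ewm_from_chunks([y], alpha=0.5)
chunked = ewm_from_chunks([y[:3], y[3:]], alpha=0.5)
assert np.allclose(one_pass, chunked)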

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4474/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue
id: 712189206 · node_id: MDU6SXNzdWU3MTIxODkyMDY= · number: 4475
title: Preprocess function for save_mfdataset
user: heerad (2560426) · state: open · locked: 0 · comments: 9
created_at: 2020-09-30T18:47:06Z · updated_at: 2020-10-15T16:32:03Z · author_association: NONE

Is your feature request related to a problem? Please describe.
I would like to supply a preprocess argument to save_mfdataset that gets applied to each dataset before it is written to disk, similar to the preprocess option that open_mfdataset provides. Specifically, I have a dataset that I want to split by unique values along a dimension, apply some further logic to each sub-dataset, and then save each sub-dataset to a different file. Currently I'm able to split and save using the following code from the API docs:

years, datasets = zip(*ds.groupby("time.year"))
paths = ["%s.nc" % y for y in years]
xr.save_mfdataset(datasets, paths)

What's missing is the ability to apply further logic to each of the sub-datasets given by the groupby object. If I iterate through datasets here and chain further operations onto each element, the calculations execute serially even though ds is dask-backed:

save_mfdataset([ds.foo() for ds in datasets], paths)

Describe the solution you'd like
Instead, I'd like the ability to do:

xr.save_mfdataset(datasets, paths, preprocess=lambda ds: ds.foo())

Describe alternatives you've considered
Not sure.
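
A hedged workaround sketch, not from the issue: the per-group logic can be applied inside the list comprehension and the writes requested with compute=False, so a single dask.compute() drives all of the saves together. Here preprocess() stands in for the hypothetical ds.foo(), and the small dataset is only an illustrative stand-in for the one in the issue.

import dask
import numpy as np
import pandas as pd
import xarray as xr

# Small dask-backed stand-in for the dataset described in the issue.
times = pd.date_range("2019-12-30", periods=6, freq="D")
ds = xr.Dataset({"a": ("time", np.arange(6.0))}, coords={"time": times}).chunk({"time": 3})

# Stand-in for the hypothetical per-group transformation ds.foo().
def preprocess(d):
    return d.mean("time")

years, datasets = zip(*ds.groupby("time.year"))
paths = ["%s.nc" % y for y in years]

# compute=False returns a dask.delayed object instead of writing immediately,
# so the (still lazy) preprocessed groups are all written by one compute call.
delayed = xr.save_mfdataset([preprocess(d) for d in datasets], paths, compute=False)
dask.compute(delayed)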

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4475/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
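
For reference, a minimal sketch of the query behind this page, assuming the table lives in a local SQLite file named github.db (the filename is an assumption); the filter and ordering match the "3 rows where type = "issue" and user = 2560426 sorted by updated_at descending" description above.

import sqlite3

conn = sqlite3.connect("github.db")  # assumed filename for the backing database
rows = conn.execute(
    """
    SELECT id, number, title, state, updated_at
    FROM issues
    WHERE type = 'issue' AND user = 2560426
    ORDER BY updated_at DESC
    """
).fetchall()
for row in rows:
    print(row)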