issues
3 rows where repo = 13221727, state = "closed" and user = 5308236 sorted by updated_at descending
Columns: id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type

id: 365678022 | node_id: MDU6SXNzdWUzNjU2NzgwMjI= | number: 2452 | title: DataArray.sel extremely slow
user: mschrimpf 5308236 | state: closed | locked: 0 | comments: 5 | author_association: NONE
created_at: 2018-10-01T23:09:47Z | updated_at: 2018-10-02T16:15:00Z | closed_at: 2018-10-02T15:58:21Z
state_reason: completed | repo: xarray 13221727 | type: issue
reactions: { "url": "https://api.github.com/repos/pydata/xarray/issues/2452/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }

body:

Problem description

Code Sample, a copy-pastable example if possible

```python
import timeit

setup = """
import itertools
import numpy as np
import xarray as xr
import string

a = list(string.printable)
b = list(string.ascii_lowercase)
d = xr.DataArray(np.random.rand(len(a), len(b)), coords={'a': a, 'b': b}, dims=['a', 'b'])
d.load()
"""
run = """
for _a, _b in itertools.product(a, b):
    d.sel(a=_a, b=_b)
"""
running_times = timeit.repeat(run, setup, repeat=3, number=10)
print("xarray", running_times)  # e.g. [14.792144000064582, 15.19372400001157, 15.345327000017278]
```

Expected Output

I would have expected the above code to run in milliseconds. However, it takes over 10 seconds!

Adding an additional

For reference, a naive dict-indexing implementation in Python takes 0.01 seconds:

```python
setup = """
import itertools
import numpy as np
import string

a = list(string.printable)
b = list(string.ascii_lowercase)
d = np.random.rand(len(a), len(b))
indexers = {'a': {coord: index for (index, coord) in enumerate(a)},
            'b': {coord: index for (index, coord) in enumerate(b)}}
"""
run = """
for _a, _b in itertools.product(a, b):
    index_a, index_b = indexers['a'][_a], indexers['b'][_b]
    item = d[index_a][index_b]
"""
running_times = timeit.repeat(run, setup, repeat=3, number=10)
print("dicts", running_times)  # e.g. [0.015355999930761755, 0.01466800004709512, 0.014295000000856817]
```

Output of
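
A possible way around the slow per-element `.sel` calls above (a sketch, not taken from the issue thread): resolve each label's integer position once through the DataArray's underlying pandas indexes via `get_index`, then read items directly from the NumPy data. This assumes the coordinate labels along 'a' and 'b' are unique.

```python
# Sketch: look up label positions through the pandas indexes,
# instead of building a new DataArray with every d.sel(a=..., b=...) call.
import itertools
import string

import numpy as np
import xarray as xr

a = list(string.printable)
b = list(string.ascii_lowercase)
d = xr.DataArray(np.random.rand(len(a), len(b)), coords={'a': a, 'b': b}, dims=['a', 'b'])

index_a = d.get_index('a')  # pandas.Index over the 'a' labels
index_b = d.get_index('b')
values = d.values           # plain NumPy array, cheap to index repeatedly

for _a, _b in itertools.product(a, b):
    item = values[index_a.get_loc(_a), index_b.get_loc(_b)]
```

This mirrors the dict-based comparison in the report while reusing xarray's own index objects instead of hand-built dictionaries.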

id: 363629186 | node_id: MDU6SXNzdWUzNjM2MjkxODY= | number: 2438 | title: Efficient workaround to group by multiple dimensions
user: mschrimpf 5308236 | state: closed | locked: 0 | comments: 3 | author_association: NONE
created_at: 2018-09-25T15:11:38Z | updated_at: 2018-10-02T15:56:53Z | closed_at: 2018-10-02T15:56:53Z
state_reason: completed | repo: xarray 13221727 | type: issue
reactions: { "url": "https://api.github.com/repos/pydata/xarray/issues/2438/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }

body:

Grouping by multiple dimensions is not yet supported (#324):

An inefficient solution is to run the for loops manually:

```python
a, b = np.unique(d['a'].values), np.unique(d['b'].values)
result = xr.DataArray(np.zeros([len(a), len(b)]), coords={'a': a, 'b': b}, dims=['a', 'b'])
for a, b in itertools.product(a, b):
    cells = d.sel(a=a, b=b)
    merge = cells.mean()
    result.loc[{'a': a, 'b': b}] = merge

result = <xarray.DataArray (a: 2, b: 2)>
array([[2., 3.],
       [5., 6.]])
Coordinates:
  * a        (a) <U1 'x' 'y'
  * b        (b) int64 0 1
```

This is however horribly slow for larger arrays. Is there a more efficient / straight-forward work-around?

Output of
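
One frequently suggested workaround for a two-dimension groupby, shown here as a sketch rather than as the thread's resolution: round-trip through pandas, whose groupby accepts several index levels at once. The construction of `d` below is an assumption about the issue's setup, with duplicate coordinate labels along 'a' and 'b' that should be averaged together.

```python
# Sketch: emulate a groupby over two dimensions via pandas, then convert back to xarray.
import numpy as np
import xarray as xr

# Assumed setup: duplicate labels along both dims, to be averaged per (a, b) pair.
d = xr.DataArray(
    np.arange(12.0).reshape(3, 4),
    coords={'a': ['x', 'x', 'y'], 'b': [0, 1, 0, 1]},
    dims=['a', 'b'],
)

# to_series() yields a MultiIndex over (a, b); pandas can group on both levels at once.
result = d.to_series().groupby(level=['a', 'b']).mean().to_xarray()
print(result)  # DataArray with dims (a: 2, b: 2), one mean per (a, b) label pair
```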

id: 319085244 | node_id: MDU6SXNzdWUzMTkwODUyNDQ= | number: 2095 | title: combine complementary DataArrays
user: mschrimpf 5308236 | state: closed | locked: 0 | comments: 1 | author_association: NONE
created_at: 2018-05-01T01:02:26Z | updated_at: 2018-05-02T01:34:53Z | closed_at: 2018-05-02T01:34:52Z
state_reason: completed | repo: xarray 13221727 | type: issue
reactions: { "url": "https://api.github.com/repos/pydata/xarray/issues/2095/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }

body:

I have a list of DataArrays with three dimensions. For each item in the list, two of the dimensions are a single value but the combination of all items would yield the full combinatorial values.

Code Sample

```python
import itertools
import numpy as np
import xarray as xr
```

Expected Output

I then want to combine these complementary

The following do not work:

```python
# does not automatically infer dimensions and fails with
# "ValueError: conflicting sizes for dimension 'concat_dim': length 2 on 'concat_dim' and length 6 on <this-array>"
ds = xr.concat(ds, dim=['dim1', 'dim2'])
```

Output of
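
A sketch of one way to assemble such complementary pieces, not taken from the issue itself: `xr.combine_by_coords` (added to xarray after this 2018 report) infers where each piece belongs from its coordinate values and can combine along several dimensions at once. The construction of `parts` below is an illustrative stand-in for the truncated code sample, and the variable name `v` is arbitrary.

```python
# Sketch: combine pieces that each cover a single (dim1, dim2) cell into one array.
import itertools

import numpy as np
import xarray as xr

# Assumed setup: one piece per (dim1, dim2) pair, each of size 1 x 1 x 4.
parts = [
    xr.DataArray(
        np.random.rand(1, 1, 4),
        coords={'dim1': [i], 'dim2': [j], 'dim3': np.arange(4)},
        dims=['dim1', 'dim2', 'dim3'],
    )
    for i, j in itertools.product(range(2), range(3))
]

# Wrapping each piece in a named Dataset keeps this compatible with releases
# that do not accept bare DataArrays in combine_by_coords.
combined = xr.combine_by_coords([p.to_dataset(name='v') for p in parts])['v']
print(combined.sizes)  # expected: dim1: 2, dim2: 3, dim3: 4
```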

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
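
The filtered view described at the top of the page ("3 rows where repo = 13221727, state = 'closed' and user = 5308236 sorted by updated_at descending") can be reproduced against this schema; a minimal sketch using Python's sqlite3 module, assuming the database lives in a local file named github.db (the filename is an assumption):

```python
# Sketch: rerun the page's filter against the issues table with sqlite3.
import sqlite3

conn = sqlite3.connect("github.db")  # assumed filename for the SQLite database
rows = conn.execute(
    """
    SELECT id, number, title, created_at, updated_at, closed_at
    FROM issues
    WHERE repo = ? AND state = ? AND user = ?
    ORDER BY updated_at DESC
    """,
    (13221727, "closed", 5308236),
).fetchall()

for row in rows:
    print(row)
```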