issues
3 rows where comments = 13, state = "closed" and user = 5635139 sorted by updated_at descending
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
1913983402 | I_kwDOAMm_X85yFRGq | 8233 | numbagg & flox | max-sixty 5635139 | closed | 0 | 13 | 2023-09-26T17:33:32Z | 2023-10-15T07:48:56Z | 2023-10-09T15:40:29Z | MEMBER | What is your issue? I've been doing some work recently on our old friend numbagg, improving the ewm routines & adding some more. I'm keen to get numbagg back in shape, doing the things that it does best, and trimming anything it doesn't. I notice that it has grouped calcs. Am I correct to think that flox does this better? I haven't been up with the latest. flox looks like it's particularly focused on dask arrays, whereas numpy_groupies, one of the inspirations for this, was applicable to numpy arrays too. At least from the xarray perspective, are we OK to deprecate these numbagg functions, and direct folks to flox? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8233/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue
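The "grouped calcs" the issue refers to can be sketched with plain NumPy, in the style of numpy_groupies (a minimal illustrative sketch; numbagg and flox use more sophisticated, compiled strategies — `group_mean` here is a hypothetical helper, not an API of either library):

```python
import numpy as np

def group_mean(values, labels):
    """Grouped mean via bincount -- the kind of reduction that
    numpy_groupies, numbagg, and flox all accelerate."""
    counts = np.bincount(labels)
    sums = np.bincount(labels, weights=values)
    return sums / counts

values = np.array([1.0, 2.0, 3.0, 4.0])
labels = np.array([0, 0, 1, 1])
group_mean(values, labels)  # array([1.5, 3.5])
```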
365973662 | MDU6SXNzdWUzNjU5NzM2NjI= | 2459 | Stack + to_array before to_xarray is much faster than a simple to_xarray | max-sixty 5635139 | closed | 0 | 13 | 2018-10-02T16:13:26Z | 2020-07-02T20:39:01Z | 2020-07-02T20:39:01Z | MEMBER | I was seeing some slow performance around `to_xarray`. To reproduce: create a series with a MultiIndex, ensuring the MultiIndex isn't a simple product:

```python
s = pd.Series(
    np.random.rand(100000),
    index=pd.MultiIndex.from_product([
        list('abcdefhijk'),
        list('abcdefhijk'),
        pd.date_range(start='2000-01-01', periods=1000, freq='B'),
    ]))
cropped = s[::3]
cropped.index = pd.MultiIndex.from_tuples(cropped.index, names=list('xyz'))
cropped.head()
```

```
x  y  z
a  a  2000-01-03    0.993989
      2000-01-06    0.850518
      2000-01-11    0.068944
      2000-01-14    0.237197
      2000-01-19    0.784254
dtype: float64
```

Two approaches for getting this into xarray:

1 - Simple:

```python
current_version = cropped.to_xarray()
```

```
<xarray.DataArray (x: 10, y: 10, z: 1000)>
array([[[0.993989,      nan, ...,      nan, 0.721663],
        [     nan,      nan, ..., 0.58224 ,      nan],
        ...,
        [     nan, 0.369382, ...,      nan,      nan],
        [0.98558 ,      nan, ...,      nan, 0.403732]],
...
Coordinates:
  * x        (x) object 'a' 'b' 'c' 'd' 'e' 'f' 'h' 'i' 'j' 'k'
  * y        (y) object 'a' 'b' 'c' 'd' 'e' 'f' 'h' 'i' 'j' 'k'
  * z        (z) datetime64[ns] 2000-01-03 2000-01-04 ... 2003-10-30 2003-10-31
```

This takes 536 ms.

2 - unstack in pandas first, and then use `to_array`. This takes 17.3 ms.

To confirm these are identical:

```python
proposed_version_adj = (
    proposed_version
    .assign_coords(y=proposed_version['y'].astype(object))
    .transpose(*current_version.dims)
)
proposed_version_adj.equals(current_version)
# True
```

Problem description: a default operation is much slower than a (potentially) equivalent operation that's not the default. I need to look more at what's causing the issues. I think it's to do with the

Output of

|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2459/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue
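The "unstack in pandas first" path the issue's title describes can be sketched in pandas alone (a sketch, not the issue's verbatim code — the key step is that unstacking one level densifies the sparse MultiIndex, after which the xarray conversion is cheap; the final `.to_xarray().to_array(...)` call is commented out since it needs xarray installed):

```python
import numpy as np
import pandas as pd

# Rebuild the issue's sparse-MultiIndex series
s = pd.Series(
    np.random.rand(100000),
    index=pd.MultiIndex.from_product([
        list('abcdefhijk'),
        list('abcdefhijk'),
        pd.date_range(start='2000-01-01', periods=1000, freq='B'),
    ]))
cropped = s[::3]
cropped.index = pd.MultiIndex.from_tuples(cropped.index, names=list('xyz'))

# Slow path: convert the sparse MultiIndex directly
# current_version = cropped.to_xarray()

# Fast path (sketch): unstack one level in pandas first, NaN-filling the
# gaps, then convert the now-dense frame
unstacked = cropped.unstack('z')
# proposed_version = unstacked.to_xarray().to_array('z')  # requires xarray

unstacked.shape  # (100, 1000): one row per (x, y) pair, one column per z
```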
115210260 | MDU6SXNzdWUxMTUyMTAyNjA= | 645 | Display of PeriodIndex | max-sixty 5635139 | closed | 0 | 13 | 2015-11-05T05:01:22Z | 2015-12-30T05:59:05Z | 2015-12-30T05:59:05Z | MEMBER | Not the greatest issue, but: coordinates that are given as a `PeriodIndex` display as raw `int64` values, while `to_series()` converts them back to dates. Or correct me if I'm making some basic mistake.

```python
In [23]: data_array = xray.DataArray(
    data=pd.Series(np.random.rand(20),
                   index=pd.period_range(start='2000', periods=20, name='Date'))
)
data_array
Out[23]:
<xray.DataArray (Date: 20)>
array([ 0.95861189,  0.3607297 ,  0.9890032 ,  0.77674314,  0.39461886,
        0.98425749,  0.79044973,  0.81376587,  0.07091318,  0.02757213,
        0.87366025,  0.0496346 ,  0.45433931,  0.3339866 ,  0.67261248,
        0.91684965,  0.60889737,  0.33469611,  0.94966724,  0.50328461])
Coordinates:
  * Date     (Date) int64 10957 10958 10959 10960 10961 10962 10963 10964 ...

In [25]: data_array.to_series()
Out[25]:
Date
2000-01-01    0.958612
2000-01-02    0.360730
2000-01-03    0.989003
2000-01-04    0.776743
2000-01-05    0.394619
2000-01-06    0.984257
2000-01-07    0.790450
2000-01-08    0.813766
2000-01-09    0.070913
2000-01-10    0.027572
2000-01-11    0.873660
2000-01-12    0.049635
2000-01-13    0.454339
2000-01-14    0.333987
2000-01-15    0.672612
2000-01-16    0.916850
2000-01-17    0.608897
2000-01-18    0.334696
2000-01-19    0.949667
2000-01-20    0.503285
Freq: D, dtype: float64
```

|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/645/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | xarray 13221727 | issue |
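The `int64` values in the coordinate display above are period ordinals (for daily frequency, days since 1970-01-01), which pandas can round-trip; a quick check of where 10957 comes from:

```python
import pandas as pd

idx = pd.period_range(start='2000-01-01', periods=3, freq='D', name='Date')
ordinals = idx.asi8          # the int64 values shown in the coordinate repr
print(list(ordinals))        # [10957, 10958, 10959]

# Converting an ordinal back to a Period recovers the date
p = pd.Period(ordinal=10957, freq='D')
print(p)                     # 2000-01-01
```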
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
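The query at the top of the page (3 rows where comments = 13, state = "closed" and user = 5635139, sorted by updated_at descending) can be reproduced against this schema with the stdlib `sqlite3` module (a sketch against an in-memory copy with an abbreviated column set; the live instance runs the equivalent SQL against the full table):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Abbreviated copy of the schema above -- only the columns the query touches
conn.execute("""
    CREATE TABLE issues (
        id INTEGER PRIMARY KEY, number INTEGER, title TEXT,
        [user] INTEGER, state TEXT, comments INTEGER, updated_at TEXT
    )""")
rows = [
    (1913983402, 8233, "numbagg & flox", 5635139, "closed", 13,
     "2023-10-15T07:48:56Z"),
    (365973662, 2459, "Stack + to_array before to_xarray ...", 5635139,
     "closed", 13, "2020-07-02T20:39:01Z"),
    (115210260, 645, "Display of PeriodIndex", 5635139, "closed", 13,
     "2015-12-30T05:59:05Z"),
]
conn.executemany("INSERT INTO issues VALUES (?,?,?,?,?,?,?)", rows)

result = conn.execute("""
    SELECT number, title FROM issues
    WHERE comments = 13 AND state = 'closed' AND [user] = 5635139
    ORDER BY updated_at DESC
""").fetchall()
print([r[0] for r in result])  # [8233, 2459, 645]
```

ISO-8601 timestamps stored as TEXT sort correctly with plain string comparison, which is why `ORDER BY updated_at DESC` works without any date parsing.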