
issues


5 rows where comments = 7, type = "issue" and user = 5635139 sorted by updated_at descending


Columns: id · node_id · number · title · user · state · locked · assignee · milestone · comments · created_at · updated_at (sorted descending) · closed_at · author_association · active_lock_reason · draft · pull_request · body · reactions · performed_via_github_app · state_reason · repo · type
#8522 · Test failures on `main` · max-sixty (5635139) · closed · 7 comments · created 2023-12-05T19:22:01Z · updated 2023-12-06T18:48:24Z · closed 2023-12-06T17:28:13Z · MEMBER · id 2026963757 (I_kwDOAMm_X8540QMt)

What is your issue?

Any ideas what could be causing these? I can't immediately reproduce locally.

https://github.com/pydata/xarray/actions/runs/7105414268/job/19342564583

```
Error: TestDataArray.test_computation_objects[int64-method_groupby_bins-data]

AssertionError: Left and right DataArray objects are not close

Differing values:
L
  <Quantity([[     nan      nan 1.       1.      ]
             [2.       2.       3.       3.      ]
             [4.       4.       5.       5.      ]
             [6.       6.       7.       7.      ]
             [8.       8.       9.       9.333333]], 'meter')>
R
  <Quantity([[0.       0.       1.       1.      ]
             [2.       2.       3.       3.      ]
             [4.       4.       5.       5.      ]
             [6.       6.       7.       7.      ]
             [8.       8.       9.       9.333333]], 'meter')>
```
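For reference, the failure mode here is that the left array has NaN where the right has 0.0, and NaN never compares "close" to a number; a minimal NumPy illustration (not the test itself):

```python
import numpy as np

left = np.array([np.nan, 1.0, 2.0])
right = np.array([0.0, 1.0, 2.0])

# NaN compares unequal to every number, so a NaN on one side only makes
# the arrays "not close" regardless of tolerance; equal_nan=True only
# helps when BOTH sides are NaN at the same position.
np.allclose(left, right, equal_nan=True)   # False
np.allclose(left, left, equal_nan=True)    # True
```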

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8522/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 1,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
#3514 · Should we cache some small properties? · max-sixty (5635139) · open · 7 comments · created 2019-11-12T19:28:21Z · updated 2019-11-16T04:32:11Z · MEMBER · id 521754870 (MDU6SXNzdWU1MjE3NTQ4NzA=)

I was doing some profiling on isel, and I see there are some properties that (I think) never change but are called frequently. Should we cache these on their objects?

Pandas uses `cache_readonly` for these cases.

Here's a case: we call `LazilyOuterIndexedArray.shape` frequently when doing a simple indexing operation. Each call takes ~150µs, while an attribute lookup on a Python object takes ~50ns (i.e. ~3000x faster). IIUC the result of that property should never change.

I don't think this is the solution to performance issues, and caching adds some complexity. Could these be easy & small wins, though?
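As a sketch of the idea, assuming a property whose value genuinely never changes, the stdlib `functools.cached_property` (Python 3.8+) computes once and then serves plain attribute lookups; `LazyArray` here is a toy stand-in, not the xarray class:

```python
from functools import cached_property

class LazyArray:
    """Toy stand-in for an array wrapper whose shape never changes."""

    def __init__(self, nrows, ncols):
        self._nrows = nrows
        self._ncols = ncols
        self.shape_computations = 0  # instrumentation for the example

    @cached_property
    def shape(self):
        # Runs once per instance; the result is stored in the instance
        # __dict__, so later accesses are plain attribute reads.
        self.shape_computations += 1
        return (self._nrows, self._ncols)

a = LazyArray(10, 3)
a.shape, a.shape, a.shape  # computed only on the first access
a.shape_computations       # → 1
```

pandas' `cache_readonly` implements essentially the same caching-descriptor pattern.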

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3514/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue
#1651 · ENH: Forward & back fill methods · max-sixty (5635139) · closed · 7 comments · created 2017-10-23T21:39:18Z · updated 2018-02-09T17:36:30Z · closed 2018-02-09T17:36:29Z · MEMBER · id 267826297 (MDU6SXNzdWUyNjc4MjYyOTc=)

I think with `np.flip` and `bn.push`, this should be simple. They're both fairly new, so this would require version checks / bumping the minimum versions.

One small issue, which I wonder if anyone has come across: bottleneck returns the numpy array rather than the DataArray. Is that because it's not operating through the correct numpy interface?

Forward fill:

```python
array.values = bn.push(array.values, axis=array.get_axis_num(axis_name))
```

Backfill:

```python
axis = array.get_axis_num(axis_name)

# reverse for bfill
array = np.flip(array, axis=axis)

# fill
array.values = bn.push(array.values, axis=axis)

# reverse back to original
result = np.flip(array, axis=axis)
```
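The flip-and-push recipe can also be sketched in pure NumPy, without a bottleneck dependency; `ffill` and `bfill` are hypothetical helper names for illustration, not xarray API:

```python
import numpy as np

def ffill(a):
    """Forward-fill NaNs in a 1-D array (pure-NumPy stand-in for bn.push)."""
    a = np.asarray(a, dtype=float)
    # Index of the most recent non-NaN position at each element; leading
    # NaNs fall back to index 0, which is itself NaN, so they stay NaN.
    idx = np.where(~np.isnan(a), np.arange(a.size), 0)
    np.maximum.accumulate(idx, out=idx)
    return a[idx]

def bfill(a):
    """Backfill = reverse, forward-fill, reverse back."""
    return ffill(np.asarray(a)[::-1])[::-1]

a = np.array([np.nan, 1.0, np.nan, 3.0, np.nan])
ffill(a)  # → [nan, 1., 1., 3., 3.]
bfill(a)  # → [1., 1., 3., 3., nan]
```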

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1651/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
#859 · BUG: Rolling on Dataset · max-sixty (5635139) · closed · 7 comments · created 2016-05-24T19:35:32Z · updated 2017-03-31T03:10:45Z · closed 2017-03-31T03:10:45Z · MEMBER · id 156591025 (MDU6SXNzdWUxNTY1OTEwMjU=)

This looks like it's available with dir / tab complete, but actually isn't:

```python
In [13]: xr.DataArray(np.random.rand(10,3)).to_dataset('dim_1').rolling
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-13-438d3638a0d0> in <module>()
----> 1 xr.DataArray(np.random.rand(10,3)).to_dataset('dim_1').rolling

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/xarray/core/common.py in __getattr__(self, name)
    135                 return source[name]
    136         raise AttributeError("%r object has no attribute %r" %
--> 137                              (type(self).__name__, name))
    138
    139     def __setattr__(self, name, value):

AttributeError: 'Dataset' object has no attribute 'rolling'
```

I think this could be easy to implement as an .apply operation? (Indeed, that could be a reasonable path for a whole host of operations - i.e. try to apply them to each array in the ds?)

Also, as a very narrow point, I'm not sure why .rolling_cls is public? It should probably be private.

Finally, the Rolling implementation is pretty sweet. I've been getting my hands dirty in the pandas one recently, and it's impressive that we can have something as well featured as that with so few lines of code 👍
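The "apply to each array in the ds" idea can be sketched without xarray at all; `dataset_apply` and `rolling_mean` below are hypothetical helpers standing in for per-variable dispatch, not xarray API:

```python
import numpy as np

# A toy "Dataset": a dict of named 1-D arrays sharing one dimension.
ds = {"a": np.arange(5.0), "b": np.arange(5.0) * 2}

def dataset_apply(ds, func):
    """Apply a per-array function to every data variable."""
    return {name: func(arr) for name, arr in ds.items()}

def rolling_mean(a, window=2):
    """Trailing rolling mean; the first window-1 entries are NaN."""
    out = np.full(a.shape, np.nan)
    out[window - 1:] = np.convolve(a, np.ones(window) / window, mode="valid")
    return out

rolled = dataset_apply(ds, rolling_mean)
rolled["a"]  # → [nan, 0.5, 1.5, 2.5, 3.5]
```

A real Dataset-level `.rolling` would dispatch to the per-DataArray method rather than a free function, but the shape of the fan-out is the same.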

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/859/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
#692 · Transpose modifies dtype of index, when a PeriodIndex · max-sixty (5635139) · closed · 7 comments · created 2015-12-31T07:11:56Z · updated 2016-01-03T18:41:04Z · closed 2016-01-02T01:48:48Z · MEMBER · id 124441012 (MDU6SXNzdWUxMjQ0NDEwMTI=)

This is very peculiar & specific, but also fairly impactful for us.

If you:

- Create a Dataset with a coord that is a PeriodIndex
- Transpose that coord
- Add a variable to the Dataset that needs to be reindexed

...then the dtype of the index changes from object to int64. This then causes other arrays added to that dataset to show up as NaNs throughout.

Here's an example. Note the `dtype('O')` at the end of each output.

```python
In [61]: series = pd.Series(np.random.rand(10),
    ...:                    index=pd.period_range(start='2000', periods=10, name='date'))
    ...: ds = xray.Dataset({'number 1': series})
    ...: ds['number 2'] = ds['number 1']
    ...: ds, ds.date.dtype
Out[61]:
(<xray.Dataset>
 Dimensions:  (date: 10)
 Coordinates:
   * date      (date) object 10957 10958 10959 10960 10961 10962 10963 10964 ...
 Data variables:
     number 1  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 2  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...,
 dtype('O'))

In [62]: ds = ds.transpose('date')
    ...: ds, ds.date.dtype
Out[62]:
(<xray.Dataset>
 Dimensions:  (date: 10)
 Coordinates:
   * date      (date) object 10957 10958 10959 10960 10961 10962 10963 10964 ...
 Data variables:
     number 1  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 2  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...,
 dtype('O'))

In [63]: ds['number 3'] = ds['number 1']
    ...: ds, ds.date.dtype
Out[63]:
(<xray.Dataset>
 Dimensions:  (date: 10)
 Coordinates:
   * date      (date) object 10957 10958 10959 10960 10961 10962 10963 10964 ...
 Data variables:
     number 1  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 2  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 3  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...,
 dtype('O'))

In [64]: ds['number 4'] = ds['number 1'][:5]
    ...: ds, ds.date.dtype
Out[64]:
(<xray.Dataset>
 Dimensions:  (date: 10)
 Coordinates:
   * date      (date) int64 10957 10958 10959 10960 10961 10962 10963 10964 ...
 Data variables:
     number 1  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 2  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 3  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 4  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 nan nan nan ...,
 dtype('int64'))
```
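The all-NaN fallout is ordinary label alignment: once the index dtype degrades from Period objects to their int64 ordinals, no labels match on reindex. A plain-Python sketch of the mechanism (the `Period` class here is a toy, not pandas'):

```python
# Toy Period: hashes/compares by ordinal, but only equals other Periods.
class Period:
    def __init__(self, ordinal):
        self.ordinal = ordinal
    def __hash__(self):
        return hash(("period", self.ordinal))
    def __eq__(self, other):
        return isinstance(other, Period) and self.ordinal == other.ordinal

index = [Period(o) for o in range(10957, 10962)]    # object-dtype index
values = {p: float(i) for i, p in enumerate(index)}

# After the dtype change, the dataset's index holds raw int ordinals, so
# reindexing the original values against it matches nothing: every
# aligned value comes back NaN.
corrupted = [p.ordinal for p in index]
aligned = [values.get(label, float("nan")) for label in corrupted]
# aligned → [nan, nan, nan, nan, nan]
```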

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/692/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
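The filtered view at the top of this page is just a query over this schema; a minimal `sqlite3` sketch with a cut-down table (the third row is made up for contrast, the first two reuse real issue ids from this page):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE issues (id INTEGER PRIMARY KEY, user INTEGER, "
    "comments INTEGER, type TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO issues VALUES (?, ?, ?, ?, ?)",
    [
        (2026963757, 5635139, 7, "issue", "2023-12-06T18:48:24Z"),
        (521754870, 5635139, 7, "issue", "2019-11-16T04:32:11Z"),
        (1, 42, 3, "pull", "2020-01-01T00:00:00Z"),  # filtered out below
    ],
)
rows = conn.execute(
    "SELECT id FROM issues "
    "WHERE comments = 7 AND type = 'issue' AND user = 5635139 "
    "ORDER BY updated_at DESC"
).fetchall()
# rows → [(2026963757,), (521754870,)]
```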
Powered by Datasette · Queries took 35.26ms · About: xarray-datasette