issues


6 rows where type = "issue" and user = 5572303 sorted by updated_at descending

#1046 center=True for xarray.DataArray.rolling()
id: 182667672 · node_id: MDU6SXNzdWUxODI2Njc2NzI= · user: chunweiyuan (5572303) · state: open · locked: 0 · comments: 8 · created_at: 2016-10-13T00:37:25Z · updated_at: 2024-04-04T21:06:57Z · author_association: CONTRIBUTOR

The logic behind setting center=True confuses me. Say window size = 3. The default behavior (center=False) sets the window to go from i-2 to i, so I would've expected center=True to set the window from i-1 to i+1. But that's not what I see.

For example, this is what data looks like:

```
import numpy as np
import xarray as xr

data = xr.DataArray(np.arange(27).reshape(3, 3, 3),
                    coords=[('x', ['a', 'b', 'c']), ('y', [-2, 0, 2]), ('z', [0, 1, 2])])

data
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])
Coordinates:
  * x        (x) |S1 'a' 'b' 'c'
  * y        (y) int64 -2 0 2
  * z        (z) int64 0 1 2
```

Now, if I set the y-window size to 3, center=False, and min_periods=1, I get

```
r = data.rolling(y=3, center=False, min_periods=1)
r.mean()
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[  0. ,   1. ,   2. ],
        [  1.5,   2.5,   3.5],
        [  3. ,   4. ,   5. ]],

       [[  9. ,  10. ,  11. ],
        [ 10.5,  11.5,  12.5],
        [ 12. ,  13. ,  14. ]],

       [[ 18. ,  19. ,  20. ],
        [ 19.5,  20.5,  21.5],
        [ 21. ,  22. ,  23. ]]])
Coordinates:
  * x        (x) |S1 'a' 'b' 'c'
  * y        (y) int64 -2 0 2
  * z        (z) int64 0 1 2
```

This essentially gives me a "trailing window" of size 3, meaning the window goes from i-2 to i. This is not explained in the docs, but it can be understood empirically.

On the other hand, setting center = True gives

```
r = data.rolling(y=3, center=True, min_periods=1)
r.mean()
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[  1.5,   2.5,   3.5],
        [  3. ,   4. ,   5. ],
        [  nan,   nan,   nan]],

       [[ 10.5,  11.5,  12.5],
        [ 12. ,  13. ,  14. ],
        [  nan,   nan,   nan]],

       [[ 19.5,  20.5,  21.5],
        [ 21. ,  22. ,  23. ],
        [  nan,   nan,   nan]]])
Coordinates:
  * x        (x) |S1 'a' 'b' 'c'
  * y        (y) int64 -2 0 2
  * z        (z) int64 0 1 2
```

In other words, it just pushes every cell up the y-dim by 1, using nan to represent things coming off the edge of the universe. If you look at _center_result() of xarray/core/rolling.py, that's exactly what it does with .shift().
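That description can be checked directly from the two outputs above; here is an illustrative snippet (reusing the `data` array defined in the first block, and assuming an offset of 1 for a window of 3):

```
# Illustrative check of the shift behaviour described above, reusing `data`
# from the first code block; the offset of 1 corresponds to a window of 3.
trailing = data.rolling(y=3, center=False, min_periods=1).mean()
centered = data.rolling(y=3, center=True, min_periods=1).mean()

# Shifting the trailing result up by one along y (nan fills the vacated row)
# reproduces the center=True result exactly.
print(centered.equals(trailing.shift(y=-1)))  # True
```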

I would've expected center=True to change the window to go from i-1 to i+1, in which case, with min_periods=1, r.mean() would not contain any nan values.
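For comparison, here is a minimal 1-D sketch (using pandas rather than xarray, purely for illustration) of that expected centered-window behaviour:

```
# A minimal sketch (pandas, 1-D) of the expected centered-window behaviour:
# with window=3, center=True and min_periods=1, no nan appears in the result.
import pandas as pd

s = pd.Series([0.0, 1.0, 2.0])

# Trailing window: element i averages s[i-2:i+1].
print(s.rolling(window=3, center=False, min_periods=1).mean().tolist())  # [0.0, 0.5, 1.0]

# Centered window: element i averages s[i-1:i+2]; the edge windows shrink
# instead of producing nan, because min_periods=1.
print(s.rolling(window=3, center=True, min_periods=1).mean().tolist())   # [0.5, 1.0, 1.5]
```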

Could someone explain the logical flow to me?

Much obliged,

Chun

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1046/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue
#1420 .equals() on a coordinate takes attributes into comparison
id: 230529125 · node_id: MDU6SXNzdWUyMzA1MjkxMjU= · user: chunweiyuan (5572303) · state: closed · locked: 0 · comments: 6 · created_at: 2017-05-22T21:48:44Z · updated_at: 2019-05-23T03:11:33Z · closed_at: 2019-05-23T03:11:33Z · author_association: CONTRIBUTOR

Is the following the right behavior?

```
import xarray as xr

da = xr.DataArray(range(3), [('x', range(2000, 2003))])
ws = xr.DataArray([1 for i in range(3)], [('x', range(2000, 2003))])

da.coords['x'].equals(ws.coords['x'])
True

da['some_attr'] = 0
da.coords['x'].equals(ws.coords['x'])
False
```

I'm just trying to see if the coordinates are the same, but somehow the DataArray's attribute becomes part of the comparison. I'd expect that behavior from `.identical()`, but not from `.equals()`.
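For reference, a small sketch of the distinction being appealed to: `.equals()` compares dimensions, coordinates and values, while `.identical()` additionally compares names and attributes.

```
# Sketch of the equals()/identical() distinction (attrs are set explicitly
# here, which is not exactly the situation in the snippet above):
import xarray as xr

a = xr.DataArray([1, 2, 3], dims='x', attrs={'units': 'm'})
b = xr.DataArray([1, 2, 3], dims='x', attrs={'units': 'km'})

print(a.equals(b))     # True  - equals() ignores attributes
print(a.identical(b))  # False - identical() also compares attributes (and names)
```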

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1420/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
#1371 Weighted quantile
id: 221366244 · node_id: MDU6SXNzdWUyMjEzNjYyNDQ= · user: chunweiyuan (5572303) · state: open · locked: 0 · comments: 8 · created_at: 2017-04-12T19:29:04Z · updated_at: 2019-03-20T22:34:22Z · author_association: CONTRIBUTOR

For our work we frequently need to compute weighted quantiles. This is especially important when we need to weigh data from recent years more heavily in making predictions.

I've put together a function (called weighted_quantile) largely based on the source code of np.percentile. It allows one to input weights along a single dimension, as a dict w_dict. Below are some manual tests:

When all weights = 1, it's identical to using np.nanpercentile:

```
ar0
<xarray.DataArray (x: 3, y: 4)>
array([[3, 4, 8, 1],
       [5, 3, 7, 9],
       [4, 9, 6, 2]])
Coordinates:
  * x        (x) |S1 'a' 'b' 'c'
  * y        (y) int64 0 1 2 3

ar0.quantile(q=[0.25, 0.5, 0.75], dim='y')
<xarray.DataArray (quantile: 3, x: 3)>
array([[ 2.5 ,  4.5 ,  3.5 ],
       [ 3.5 ,  6.  ,  5.  ],
       [ 5.  ,  7.5 ,  6.75]])
Coordinates:
  * x         (x) |S1 'a' 'b' 'c'
  * quantile  (quantile) float64 0.25 0.5 0.75

weighted_quantile(da=ar0, q=[0.25, 0.5, 0.75], dim='y', w_dict={'y': [1, 1, 1, 1]})
<xarray.DataArray (quantile: 3, x: 3)>
array([[ 2.5 ,  4.5 ,  3.5 ],
       [ 3.5 ,  6.  ,  5.  ],
       [ 5.  ,  7.5 ,  6.75]])
Coordinates:
  * x         (x) |S1 'a' 'b' 'c'
  * quantile  (quantile) float64 0.25 0.5 0.75
```

Now different weights:

```
weighted_quantile(da=ar0, q=[0.25, 0.5, 0.75], dim='y', w_dict={'y': [1, 2, 3, 4.0]})
<xarray.DataArray (quantile: 3, x: 3)>
array([[ 3.25    ,  5.666667,  4.333333],
       [ 4.      ,  7.      ,  5.333333],
       [ 6.      ,  8.      ,  6.75    ]])
Coordinates:
  * x         (x) |S1 'a' 'b' 'c'
  * quantile  (quantile) float64 0.25 0.5 0.75
```

Also handles nan values like np.nanpercentile:

```
ar
<xarray.DataArray (x: 2, y: 2, z: 2)>
array([[[ nan,   3.],
        [ nan,   5.]],

       [[  8.,   1.],
        [ nan,   0.]]])
Coordinates:
  * x        (x) |S1 'a' 'b'
  * y        (y) int64 0 1
  * z        (z) int64 8 9

da_stacked = ar.stack(mi=['x', 'y'])

out = weighted_quantile(da=ar, q=[0.25, 0.5, 0.75], dim=['x', 'y'], w_dict={'x': [1, 1]})
out
<xarray.DataArray (quantile: 3, z: 2)>
array([[ 8.  ,  0.75],
       [ 8.  ,  2.  ],
       [ 8.  ,  3.5 ]])
Coordinates:
  * z         (z) int64 8 9
  * quantile  (quantile) float64 0.25 0.5 0.75

da_stacked.quantile(q=[0.25, 0.5, 0.75], dim='mi')
<xarray.DataArray (quantile: 3, z: 2)>
array([[ 8.  ,  0.75],
       [ 8.  ,  2.  ],
       [ 8.  ,  3.5 ]])
Coordinates:
  * z         (z) int64 8 9
  * quantile  (quantile) float64 0.25 0.5 0.75
```

Lastly, different interpolation schemes are consistent:

```
out = weighted_quantile(da=ar, q=[0.25, 0.5, 0.75], dim=['x', 'y'], w_dict={'x': [1, 1]},
                        interpolation='nearest')
out
<xarray.DataArray (quantile: 3, z: 2)>
array([[ 8.,  1.],
       [ 8.,  3.],
       [ 8.,  3.]])
Coordinates:
  * z         (z) int64 8 9
  * quantile  (quantile) float64 0.25 0.5 0.75

da_stacked.quantile(q=[0.25, 0.5, 0.75], dim='mi', interpolation='nearest')
<xarray.DataArray (quantile: 3, z: 2)>
array([[ 8.,  1.],
       [ 8.,  3.],
       [ 8.,  3.]])
Coordinates:
  * z         (z) int64 8 9
  * quantile  (quantile) float64 0.25 0.5 0.75
```

We wonder if it's ok to make this part of xarray. If so, the most logical place to implement it would seem to be in Variable.quantile(). Another option is to make it a utility function, to be called as xr.weighted_quantile().
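The weighted_quantile code itself is not shown in this issue. As a rough, hedged illustration of the idea, a minimal 1-D weighted quantile can be written by interpolating along normalized cumulative weights; the normalization below is one possible convention, chosen so that uniform weights reproduce np.percentile's default linear interpolation, and it need not match the author's choices for non-uniform weights.

```
# A minimal, illustrative 1-D weighted quantile (not the author's function).
import numpy as np

def weighted_quantile_1d(values, q, weights):
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    # Plotting positions of the sorted samples on [0, 1], normalised so that
    # uniform weights reduce to np.percentile's linear interpolation.
    pos = np.cumsum(weights) - 0.5 * weights
    pos = (pos - pos[0]) / (pos[-1] - pos[0])
    return np.interp(q, pos, values)

# With uniform weights this agrees with np.percentile (compare the first
# y-row of ar0 above, [3, 4, 8, 1]):
x = np.array([3.0, 4.0, 8.0, 1.0])
print(weighted_quantile_1d(x, [0.25, 0.5, 0.75], np.ones(4)))  # [2.5 3.5 5. ]
print(np.percentile(x, [25, 50, 75]))                          # [2.5 3.5 5. ]
```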

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1371/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue
#1059 kwarg defaults (join="inner") for DataArray._binary_ops()
id: 185017914 · node_id: MDU6SXNzdWUxODUwMTc5MTQ= · user: chunweiyuan (5572303) · state: closed · locked: 0 · comments: 5 · created_at: 2016-10-25T04:45:28Z · updated_at: 2019-01-25T16:50:56Z · closed_at: 2019-01-25T16:50:56Z · author_association: CONTRIBUTOR

Currently the default is join="inner". However, there can be applications where the majority of binary operations require join="outer", not "inner".

Would it be advisable to place these default values in some config object the user can set at the beginning of the run script? Or perhaps one already exists but I've failed to locate it.
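As an aside, later xarray releases expose this kind of run-level switch through xr.set_options(arithmetic_join=...). A short sketch of how such a hook is used, shown only to illustrate the config-object idea rather than the API available when this issue was filed:

```
# Illustrative use of a run-level default for the binary-op join
# (xr.set_options(arithmetic_join=...) in newer xarray versions).
import xarray as xr

a = xr.DataArray([1, 2, 3], [('x', [0, 1, 2])])
b = xr.DataArray([10, 20, 30], [('x', [1, 2, 3])])

print((a + b).sizes)  # join='inner' (default): only the overlapping labels 1 and 2 survive

with xr.set_options(arithmetic_join='outer'):
    print((a + b).sizes)  # join='outer': labels 0..3, with nan where either side is missing
```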

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1059/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
#1067 Indexing turns 1D coordinates into scalar coordinates
id: 186036077 · node_id: MDU6SXNzdWUxODYwMzYwNzc= · user: chunweiyuan (5572303) · state: closed · locked: 0 · comments: 1 · created_at: 2016-10-28T22:29:41Z · updated_at: 2019-01-22T19:22:17Z · closed_at: 2019-01-22T19:22:17Z · author_association: CONTRIBUTOR

Starting with

```
import numpy as np
import xarray as xr

arr = xr.DataArray(np.arange(0, 7.5, 0.5).reshape(3, 5), dims=('x', 'y'))
```

this will drop the index on x:

```
arr[0, :]
```

but this won't:

```
arr[slice(0, 1), :]
```

A layman would expect both to return the same thing. Is there a reason for this design choice, or could I file a PR for it?
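A short sketch of the difference, with an explicit coordinate added on x (the coordinate values 10, 20, 30 are only for illustration) so the scalar-vs-indexed distinction shows up in the repr:

```
# Integer indexing reduces x to a scalar coordinate; a length-1 slice keeps it
# as an indexed (dimension) coordinate. Coordinate values here are illustrative.
import numpy as np
import xarray as xr

arr = xr.DataArray(np.arange(0, 7.5, 0.5).reshape(3, 5),
                   dims=('x', 'y'), coords={'x': [10, 20, 30]})

print(arr[0, :].coords['x'])            # scalar coordinate x = 10; dimension x is gone
print(arr[slice(0, 1), :].coords['x'])  # length-1 dimension coordinate; x is kept
```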

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1067/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
#1072 Allow concat() to drop/replace duplicate index labels?
id: 186680248 · node_id: MDU6SXNzdWUxODY2ODAyNDg= · user: chunweiyuan (5572303) · state: closed · locked: 0 · comments: 26 · created_at: 2016-11-01T23:59:56Z · updated_at: 2017-01-23T22:41:22Z · closed_at: 2017-01-23T22:41:22Z · author_association: CONTRIBUTOR

Right now:

```
import xarray as xr

dim = 'x'  # dim is used below but not defined in the original snippet; 'x' matches the concat call
coords_l, coords_r = [0, 1, 2], [1, 2, 3]
missing_3 = xr.DataArray([11, 12, 13], [(dim, coords_l)])
missing_0 = xr.DataArray([21, 22, 23], [(dim, coords_r)])
together = xr.concat([missing_3, missing_0], dim='x')

together
<xarray.DataArray 'missing_3' (x: 6)>
array([11, 12, 13, 21, 22, 23])
Coordinates:
  * x        (x) int64 0 1 2 1 2 3

together.sel(x=1)
<xarray.DataArray 'missing_3' (x: 2)>
array([12, 21])
Coordinates:
  * x        (x) int64 1 1
```

Would it be OK to introduce a kwarg ("replace"?) that replaces cells of identical coordinates from right to left?

That would render:

```
together
<xarray.DataArray 'missing_3' (x: 4)>
array([11, 21, 22, 23])
Coordinates:
  * x        (x) int64 0 1 2 3
```

Some people might even want to drop all cells with coordinate collisions (probably not us). If that's the case, the kwarg would need to be ternary.
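For what it's worth, here is a hedged sketch of one way to get the "right replaces left" result with the existing API, via combine_first (which outer-joins the indexes and takes values from the calling array wherever its labels exist). This is a workaround, not the proposed concat kwarg:

```
# Workaround sketch: combine_first gives the "replace from the right" result.
import xarray as xr

missing_3 = xr.DataArray([11, 12, 13], [('x', [0, 1, 2])])
missing_0 = xr.DataArray([21, 22, 23], [('x', [1, 2, 3])])

replaced = missing_0.combine_first(missing_3)
print(replaced['x'].values)  # [0 1 2 3]
print(replaced.values)       # [11. 21. 22. 23.]  (floats: alignment introduces nan internally)
```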

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1072/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
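For reference, the filter and ordering described at the top of this page (type = "issue" and user = 5572303, sorted by updated_at descending) can be reproduced against this schema with a short sqlite3 snippet; the database file name github.db is an assumption.

```
# Reproduce this page's row selection against the issues table defined above.
# The database file name is assumed; adjust it to the actual Datasette source.
import sqlite3

conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT number, title, state, comments, updated_at
    FROM issues
    WHERE type = 'issue' AND user = 5572303
    ORDER BY updated_at DESC
    """
).fetchall()

for number, title, state, comments, updated_at in rows:
    print(number, state, updated_at, title)
```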