issue_comments

15 rows where author_association = "CONTRIBUTOR", issue = 186680248 and user = 5572303 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
271461749 https://github.com/pydata/xarray/issues/1072#issuecomment-271461749 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI3MTQ2MTc0OQ== chunweiyuan 5572303 2017-01-10T01:37:22Z 2017-01-10T01:37:22Z CONTRIBUTOR

I'm curious: if you put a from .computation import apply_ufunc inside ops.py, wouldn't you get a circular ImportError? It seems difficult to get out of this circle...
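A common way around this kind of cycle is to defer the import to call time. The sketch below only illustrates that pattern, reusing the module names from this discussion; it is not xarray's actual code, and the apply_ufunc call is just a stand-in.

```
# ops.py -- illustrative sketch of a deferred import, not xarray's real layout
import numpy as np

def fillna(data, other, join="left"):
    # Importing inside the function breaks the cycle: computation.py is free to
    # import ops.py at module load time, because this line only executes when
    # fillna() is actually called.
    from .computation import apply_ufunc
    return apply_ufunc(lambda a, b: np.where(np.isnan(a), b, a),
                       data, other, join=join)
```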

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
270810132 https://github.com/pydata/xarray/issues/1072#issuecomment-270810132 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI3MDgxMDEzMg== chunweiyuan 5572303 2017-01-06T01:47:20Z 2017-01-06T01:47:20Z CONTRIBUTOR

Ok I'll give it a shot. Will touch base when I run into roadblocks.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
270797694 https://github.com/pydata/xarray/issues/1072#issuecomment-270797694 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI3MDc5NzY5NA== chunweiyuan 5572303 2017-01-06T00:25:54Z 2017-01-06T00:25:54Z CONTRIBUTOR

So I took a quick look at the commits in #964. It's not entirely clear to me how one can easily add a join argument to fillna. Should I keep my current commits, just submit a PR to master once #964 is merged in, and then see how it goes from there?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
260716231 https://github.com/pydata/xarray/issues/1072#issuecomment-260716231 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI2MDcxNjIzMQ== chunweiyuan 5572303 2016-11-15T17:57:19Z 2016-11-15T17:57:19Z CONTRIBUTOR

Checking in on how to move forward from here...I feel it's pretty close to a PR...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
260431105 https://github.com/pydata/xarray/issues/1072#issuecomment-260431105 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI2MDQzMTEwNQ== chunweiyuan 5572303 2016-11-14T19:11:55Z 2016-11-14T19:11:55Z CONTRIBUTOR

Currently, ds0.combine_first(ds1) gives exactly the same result as xr.merge([ds0, ds1]). But for data arrays it still offers something new.

Either 1.) my combine_first should do something different with datasets, 2.) we don't need combine_first for datasets, or 3.) we should change xr.merge so that, when applied to data arrays, it creates a new data array with an outer join and fillna (by chaining combine_first down the list of data arrays).
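Option 3 might look roughly like the sketch below, assuming a working combine_first on data arrays (as prototyped elsewhere in this thread); it only illustrates the chaining idea, not how xr.merge is actually implemented.

```
from functools import reduce

def merge_data_arrays(arrays):
    # Earlier arrays take priority; each step outer-joins the indexes and
    # fills missing values from the next array in the list.
    return reduce(lambda acc, arr: acc.combine_first(arr), arrays)
```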

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
260411626 https://github.com/pydata/xarray/issues/1072#issuecomment-260411626 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI2MDQxMTYyNg== chunweiyuan 5572303 2016-11-14T18:02:47Z 2016-11-14T18:02:47Z CONTRIBUTOR

I suppose I can save one line by getting rid of the duplicate f = _func_slash_method_wrapper(fillna), but beyond that I'm not sure of the best way to refactor this.

If the behaviors check out, then this branch might be ready for a PR.
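For reference, one way to drop the duplicated f = _func_slash_method_wrapper(fillna) from the snippet quoted in the next comment would be a sketch like this (same calls, just sharing the wrapper):

```
# sketch: reuse a single wrapped fillna for both injected methods
f = _func_slash_method_wrapper(fillna)
setattr(cls, '_fillna', cls._binary_op(f, join='left', fillna=True))
setattr(cls, '_combine_first', cls._binary_op(f, join='outer', fillna=True))
```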

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
260166034 https://github.com/pydata/xarray/issues/1072#issuecomment-260166034 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI2MDE2NjAzNA== chunweiyuan 5572303 2016-11-13T04:14:38Z 2016-11-13T04:14:38Z CONTRIBUTOR

So these are my _fillna and _combine_first in ops.inject_binary_ops:

```
f = _func_slash_method_wrapper(fillna)
method = cls._binary_op(f, join='left', fillna=True)
setattr(cls, '_fillna', method)

f = _func_slash_method_wrapper(fillna)
method = cls._binary_op(f, join='outer', fillna=True)
setattr(cls, '_combine_first', method)
```

Within dataarray.py and dataset.py, combine_first(self, other) simply returns self._combine_first(other). This code path produces the test results you saw earlier.

Given this construct, I'm not sure how to do the refactor you mentioned. Perhaps a few more pointers in the right direction? :)
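As a reference point, the thin public wrapper described above would be little more than this sketch (the docstring is mine, not from the codebase):

```
# dataarray.py / dataset.py -- sketch of the public method delegating to the
# injected _combine_first
def combine_first(self, other):
    """Prefer values from self; fill missing entries from other, with the
    indexes outer-joined."""
    return self._combine_first(other)
```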

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
260137121 https://github.com/pydata/xarray/issues/1072#issuecomment-260137121 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI2MDEzNzEyMQ== chunweiyuan 5572303 2016-11-12T17:48:08Z 2016-11-12T17:48:08Z CONTRIBUTOR

Hmm, so what would be the expected behavior of ds.combine_first?

If I have

```
ds0
<xarray.Dataset>
Dimensions:  (x: 2, y: 2)
Coordinates:
  * x        (x) |S1 'a' 'b'
  * y        (y) int64 -1 0
Data variables:
    ds0      (x, y) int64 0 0 0 0

ds1
<xarray.Dataset>
Dimensions:  (x: 2, y: 2)
Coordinates:
  * x        (x) |S1 'b' 'c'
  * y        (y) int64 0 1
Data variables:
    ds1      (x, y) int64 1 1 1 1
```

I get

```
ds0.combine_first(ds1)
<xarray.Dataset>
Dimensions:  (x: 3, y: 3)
Coordinates:
  * x        (x) object 'a' 'b' 'c'
  * y        (y) int64 -1 0 1
Data variables:
    ds0      (x, y) float64 0.0 0.0 nan 0.0 0.0 nan nan nan nan
    ds1      (x, y) float64 nan nan nan nan 1.0 1.0 nan 1.0 1.0
```

and changing the order to ds1.combine_first(ds0) just flips the order of the data_vars, but the cell values of the data_vars remain the same.

This is done essentially by adding a _combine_first to ops.py that mimics _fillna, except with join='outer'.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
259842225 https://github.com/pydata/xarray/issues/1072#issuecomment-259842225 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI1OTg0MjIyNQ== chunweiyuan 5572303 2016-11-10T23:49:07Z 2016-11-10T23:49:07Z CONTRIBUTOR

Any suggestions for improvement?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
259272844 https://github.com/pydata/xarray/issues/1072#issuecomment-259272844 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI1OTI3Mjg0NA== chunweiyuan 5572303 2016-11-08T21:59:06Z 2016-11-08T22:10:54Z CONTRIBUTOR

So I spent some time looking at dataset._calculate_binary_op but couldn't quite come up with what I wanted. After banging my head against the wall a bit, this is what I have:

    def combine_first(left, right):
        la, ra = xr.align(left, right, join='outer', copy=False)  # should copy=True?
        la, ra = la.where(ra.isnull() | ra.notnull()), ra.where(la.isnull() | la.notnull())
        ra.values[la.notnull().values] = la.values[la.notnull().values]
        return ra

And it seems to work. My test cases are

```
l_2d
<xarray.DataArray (x: 2, y: 2)>
array([[1, 1],
       [1, 1]])
Coordinates:
  * x        (x) |S1 'a' 'b'
  * y        (y) int64 -2 0

r_2d
<xarray.DataArray (x: 2, y: 2)>
array([[0, 0],
       [0, 0]])
Coordinates:
  * x        (x) |S1 'b' 'c'
  * y        (y) int64 0 2

ar_1d
<xarray.DataArray (x: 3)>
array([4, 5, 6])
Coordinates:
  * x        (x) |S1 'a' 'b' 'd'
```

and here are the results:

```
combine_first(l_2d, r_2d)
<xarray.DataArray (x: 3, y: 3)>
array([[  1.,   1.,  nan],
       [  1.,   1.,   0.],
       [ nan,   0.,   0.]])
Coordinates:
  * x        (x) object 'a' 'b' 'c'
  * y        (y) int64 -2 0 2

combine_first(r_2d, l_2d)
<xarray.DataArray (x: 3, y: 3)>
array([[  1.,   1.,  nan],
       [  1.,   0.,   0.],
       [ nan,   0.,   0.]])
Coordinates:
  * x        (x) object 'a' 'b' 'c'
  * y        (y) int64 -2 0 2

combine_first(l_2d, ar_1d)
<xarray.DataArray (x: 3, y: 2)>
array([[ 1.,  1.],
       [ 1.,  1.],
       [ 6.,  6.]])
Coordinates:
  * x        (x) object 'a' 'b' 'd'
  * y        (y) int64 -2 0

combine_first(ar_1d, l_2d)
<xarray.DataArray (x: 3, y: 2)>
array([[ 4.,  4.],
       [ 5.,  5.],
       [ 6.,  6.]])
Coordinates:
  * x        (x) object 'a' 'b' 'd'
  * y        (y) int64 -2 0
```

I don't like the fact that I have to access .values, and the use of .where is slightly wonky. But this is definitely the cleanest working solution I have thus far. Any suggestions for improvement are welcome.
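For anyone wanting to reproduce these checks, the test arrays can be rebuilt from the reprs above roughly as follows (the constructor calls are a reconstruction, not the original test code):

```
import xarray as xr

l_2d = xr.DataArray([[1, 1], [1, 1]], coords=[('x', ['a', 'b']), ('y', [-2, 0])])
r_2d = xr.DataArray([[0, 0], [0, 0]], coords=[('x', ['b', 'c']), ('y', [0, 2])])
ar_1d = xr.DataArray([4, 5, 6], coords=[('x', ['a', 'b', 'd'])])

# combine_first as defined above: left values win, indexes are outer-joined
print(combine_first(l_2d, r_2d))
print(combine_first(ar_1d, l_2d))
```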

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
258718234 https://github.com/pydata/xarray/issues/1072#issuecomment-258718234 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI1ODcxODIzNA== chunweiyuan 5572303 2016-11-06T23:02:01Z 2016-11-06T23:15:30Z CONTRIBUTOR

Thanks for the reply. Would it make things easier if c, d = xr.align(a, b, join="outer") explicitly broadcast missing dimensions in all returned values? Currently, if b is missing a dimension, then d is also missing that dimension:

```
a
<xarray.DataArray (x: 2, y: 2)>
array([[1, 1],
       [1, 1]])
Coordinates:
  * x        (x) |S1 'a' 'b'
  * y        (y) int64 -2 0

b
<xarray.DataArray (x: 3)>
array([4, 5, 6])
Coordinates:
  * x        (x) |S1 'a' 'b' 'd'

c, d = xr.align(a, b, join="outer")

c
<xarray.DataArray (x: 3, y: 2)>
array([[  1.,   1.],
       [  1.,   1.],
       [ nan,  nan]])
Coordinates:
  * x        (x) object 'a' 'b' 'd'
  * y        (y) int64 -2 0

d
<xarray.DataArray (x: 3)>
array([4, 5, 6])
Coordinates:
  * x        (x) object 'a' 'b' 'd'
```
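If explicit broadcasting is what's wanted, one option (a sketch, not a proposed change to align itself) is to follow the align call with xr.broadcast, which adds the dimensions an array is missing by repeating its values along them:

```
import xarray as xr

# align reconciles only the indexes; broadcast then gives every output the
# union of the dimensions, repeating values along any newly added dimension.
c, d = xr.align(a, b, join="outer")
c, d = xr.broadcast(c, d)
# d now has dims (x, y), with its original values repeated along y
```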

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
258567735 https://github.com/pydata/xarray/issues/1072#issuecomment-258567735 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI1ODU2NzczNQ== chunweiyuan 5572303 2016-11-04T22:58:24Z 2016-11-04T22:58:24Z CONTRIBUTOR

Another test:

```
left
<xarray.DataArray (x: 2, y: 2)>
array([[1, 1],
       [1, 1]])
Coordinates:
  * x        (x) |S1 'a' 'b'
  * y        (y) int64 -2 0

right
<xarray.DataArray (x: 2, y: 2)>
array([[0, 0],
       [0, 0]])
Coordinates:
  * x        (x) |S1 'b' 'c'
  * y        (y) int64 0 2

combine_first(left, right)
<xarray.DataArray (x: 3, y: 3)>
array([[  1.,   1.,  nan],
       [  1.,   1.,   0.],
       [ nan,   0.,   0.]])
Coordinates:
  * x        (x) object 'a' 'b' 'c'
  * y        (y) int64 -2 0 2

combine_first(right, left)
<xarray.DataArray (x: 3, y: 3)>
array([[  1.,   1.,  nan],
       [  1.,   0.,   0.],
       [ nan,   0.,   0.]])
Coordinates:
  * x        (x) object 'a' 'b' 'c'
  * y        (y) int64 -2 0 2
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
258294667 https://github.com/pydata/xarray/issues/1072#issuecomment-258294667 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI1ODI5NDY2Nw== chunweiyuan 5572303 2016-11-03T22:35:47Z 2016-11-03T22:35:47Z CONTRIBUTOR

Here's my somewhat hacky stab at it, not optimized for speed/efficiency:

    def combine_first(left, right):
        """
        Takes 2 data arrays, performs an outer-concat, using values from the
        right array to fill in for missing coordinates in the left array.
        """
        l_aligned, r_aligned = xr.align(left, right, join="outer")
        temp = l_aligned + r_aligned  # hack
        temp.values[temp.notnull().values] = np.nan  # now template is all nan
        # insert non-nan values from right array first
        temp.values[r_aligned.notnull().values] = r_aligned.values[r_aligned.notnull().values]
        # insert values from left array, overwriting those from right array
        temp.values[l_aligned.notnull().values] = l_aligned.values[l_aligned.notnull().values]
        return temp

And the result:

```
ar1 = xr.DataArray([4,5,6],[('x',[1,2,3])])
ar2 = xr.DataArray([[7,8,9],[10,11,12],[13,14,15]],[('x',[1,12,13]),('y',[0,5,6])])

ar1
<xarray.DataArray (x: 3)>
array([4, 5, 6])
Coordinates:
  * x        (x) int64 1 2 3

ar2
<xarray.DataArray (x: 3, y: 3)>
array([[ 7,  8,  9],
       [10, 11, 12],
       [13, 14, 15]])
Coordinates:
  * x        (x) int64 1 12 13
  * y        (y) int64 0 5 6

temp = combine_first(ar1, ar2)
temp
<xarray.DataArray (x: 5, y: 3)>
array([[  4.,   5.,   6.],
       [  4.,   5.,   6.],
       [  4.,   5.,   6.],
       [ 10.,  11.,  12.],
       [ 13.,  14.,  15.]])
Coordinates:
  * x        (x) int64 1 2 3 12 13
  * y        (y) int64 0 5 6

temp = combine_first(ar2, ar1)
temp
<xarray.DataArray (x: 5, y: 3)>
array([[  7.,   8.,   9.],
       [  4.,   5.,   6.],
       [  4.,   5.,   6.],
       [ 10.,  11.,  12.],
       [ 13.,  14.,  15.]])
Coordinates:
  * x        (x) int64 1 2 3 12 13
  * y        (y) int64 0 5 6
```

Would this be the behavior we want from xarray's combine_first? It's not entirely analogous to the pandas function. Maybe rename it?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
258234611 https://github.com/pydata/xarray/issues/1072#issuecomment-258234611 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI1ODIzNDYxMQ== chunweiyuan 5572303 2016-11-03T18:35:13Z 2016-11-03T20:25:12Z CONTRIBUTOR

So I fooled around with Pandas' combine_first:

```
df1 = pd.DataFrame({'x':[1,2,3],'z':[4,5,6]}).set_index('x')
df1
   z
x
1  4
2  5
3  6

df2 = pd.DataFrame({'x':[1,12,13],'y':[0,5,6],'z':[7,8,9]}).set_index(['x','y'])
df2
      z
x  y
1  0  7
12 5  8
13 6  9

df1.combine_first(df2)
        z
x  y
1  0  4.0
12 5  8.0
13 6  9.0
```

and was surprised that the indexes were not "outer-joined". Is this the behavior xarray wants to emulate?

As a mock-up for xr.combine_first(arr1, arr2), I was thinking of using align([arr1, arr2], join="outer") to set up the template, and then going into the template to set the values correctly. Is that a sound approach? Many thanks.
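That align-then-fill idea can be sketched without manual value surgery by leaning on fillna, which accepts another DataArray as the fill value; this is only a sketch of the approach, not the implementation that was eventually merged:

```
import xarray as xr

def combine_first_sketch(arr1, arr2):
    # Outer-align both arrays onto the union of their index labels, then fill
    # arr1's missing entries (NaN) with the corresponding values from arr2.
    a1, a2 = xr.align(arr1, arr2, join="outer")
    return a1.fillna(a2)
```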

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248
258025537 https://github.com/pydata/xarray/issues/1072#issuecomment-258025537 https://api.github.com/repos/pydata/xarray/issues/1072 MDEyOklzc3VlQ29tbWVudDI1ODAyNTUzNw== chunweiyuan 5572303 2016-11-02T23:02:03Z 2016-11-02T23:02:03Z CONTRIBUTOR

Is combine_first already implemented? I can't find it in the source code, nor could I find ops.choose.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow concat() to drop/replace duplicate index labels? 186680248

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);