home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

6 rows where author_association = "MEMBER" and issue = 411755105 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 2

  • shoyer 3
  • TomNicholas 3

issue 1

  • Improved default behavior when concatenating DataArrays · 6 ✖

author_association 1

  • MEMBER · 6 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
469056156 https://github.com/pydata/xarray/pull/2777#issuecomment-469056156 https://api.github.com/repos/pydata/xarray/issues/2777 MDEyOklzc3VlQ29tbWVudDQ2OTA1NjE1Ng== shoyer 1217238 2019-03-03T19:39:05Z 2019-03-03T19:39:05Z MEMBER

I guess we should probably roll back the "name to scalar coordinates" part of this change.

@Zac-HD do you want to do that here or should we go with @TomNicholas's PR?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Improved default behavior when concatenating DataArrays 411755105
467993188 https://github.com/pydata/xarray/pull/2777#issuecomment-467993188 https://api.github.com/repos/pydata/xarray/issues/2777 MDEyOklzc3VlQ29tbWVudDQ2Nzk5MzE4OA== TomNicholas 35968931 2019-02-27T19:16:11Z 2019-02-28T10:05:39Z MEMBER

I've just submitted a PR which solves this issue in the way I just suggested instead #2792.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Improved default behavior when concatenating DataArrays 411755105
468088653 https://github.com/pydata/xarray/pull/2777#issuecomment-468088653 https://api.github.com/repos/pydata/xarray/issues/2777 MDEyOklzc3VlQ29tbWVudDQ2ODA4ODY1Mw== TomNicholas 35968931 2019-02-28T00:38:11Z 2019-02-28T00:38:11Z MEMBER

@Zac-HD there's actually another way to get the indexing behaviour you wanted with the current API:

python colors = "blue green red".split() das = [xr.DataArray(np.random.random((2, 2)), dims="x y".split(), coords={"band": k}) for k in colors] xr.concat(das, dim="band").plot.imshow(col="band") Here instead of using the name attribute to label each band I've used a scalar coordinate called "band", so that when you concat along "band" it will just stack along that coordinate.

This never touches the names so actually gives the desired output without the need for #2792:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Improved default behavior when concatenating DataArrays 411755105
467945628 https://github.com/pydata/xarray/pull/2777#issuecomment-467945628 https://api.github.com/repos/pydata/xarray/issues/2777 MDEyOklzc3VlQ29tbWVudDQ2Nzk0NTYyOA== TomNicholas 35968931 2019-02-27T17:04:39Z 2019-02-27T17:16:53Z MEMBER

@Zac-HD forgive me for this but I think this PR is unnecessary because what you need basically already exists in the API.

Going back to your original example, you could have got the same indexing by creating a DataArray to use as a coordinate to concatenate over: ```python colors = "blue green red".split() ds = xr.Dataset({ k: xr.DataArray(np.random.random((2, 2)), dims="x y".split(), name=k) for k in colors })

band = xr.DataArray(colors, name="band", dims=["band"]) xr.concat([ds.blue, ds.green, ds.red], dim=band).plot.imshow(col="band")

```

This still leaves the wrong label on the colorbar, but that could be fixed separately and has to do with concat using the attrs of the first dataset in the list for the final dataset (a similar problem to #2382). I think it would be easier to change that behaviour instead (perhaps to if all names the same, use that name, else name of result = None, but this also relates to #1614).

Creating a new coordinate using a DataArray is in the docstring for xr.concat:

If dimension is provided as a DataArray or Index, its name is used as the dimension to concatenate along and the values are added as a coordinate.

but I think there should be an example there too. (Also I think this is relevant to #1646)

I'm not entirely sure this is a deal-breaker but it makes me a little nervous

@shoyer I agree, although I like the idea then I think this could introduce all sorts of complex concatentation edge cases.

At the very least the new API should have symmetry properties something like:

```python da1 = DataArray(name='a', data=[[0]], dims=['x', 'y']) da2 = DataArray(name='b', data=[[1]], dims=['x', 'y']) da3 = DataArray(name='a', data=[[2]], dims=['x', 'y']) da4 = DataArray(name='b', data=[[3]], dims=['x', 'y'])

xr.manual_combine([[da1, da2], [da3, da4]], concat_dim=['x', 'y'])

should give the same result as

xr.manual_combine([[da1, da3], [da2, da4]], concat_dim=['y', 'x']) `` but with this PR I don't think it would. In the first case the x coord would be created with values['a', 'b']`, and no y coord would be created, while in the second case no y coord would be created, and the intermediate DataSet would be nameless, so then no x coord would be created either.

I think my suggestion for naming would pass this test because the result would be nameless and have no coords either way.

I might have got that wrong but I definitely think this kind of change should be carefully considered :confused:

(EDIT: I just added this example as a test to #2616)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Improved default behavior when concatenating DataArrays 411755105
467648344 https://github.com/pydata/xarray/pull/2777#issuecomment-467648344 https://api.github.com/repos/pydata/xarray/issues/2777 MDEyOklzc3VlQ29tbWVudDQ2NzY0ODM0NA== shoyer 1217238 2019-02-26T22:52:28Z 2019-02-26T22:52:28Z MEMBER

@pep8speaks seems to have gone hay-wire -- maybe you have a syntax error?

Thinking about this a little more, one hazard of converting names into index labels is that we lose the invariant that you get the same result regardless of order in which you call concat, e.g., something like these expressions could now give different results: xarray.concat([a, b], dim='x') vs xarray.concat([xarray.concat([a], dim='x'), xarray.concat([b], dim='x')], dim='x') If a and b have the same name, then you'd get an index with the two duplicate entries in the second case but not the first case.

I'm not entirely sure this is a deal-breaker but it makes me a little nervous reluctant. In particular, it might break some the invariants we're relying upon for the next version of open_mfdataset (https://github.com/pydata/xarray/pull/2616, cc @TomNicholas )

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Improved default behavior when concatenating DataArrays 411755105
466434007 https://github.com/pydata/xarray/pull/2777#issuecomment-466434007 https://api.github.com/repos/pydata/xarray/issues/2777 MDEyOklzc3VlQ29tbWVudDQ2NjQzNDAwNw== shoyer 1217238 2019-02-22T15:23:06Z 2019-02-22T15:23:06Z MEMBER

I'll check on the doc build failure, but I would rather not remove load_dataset yet -- we didn't deprecate it that long ago. On Fri, Feb 22, 2019 at 5:28 AM Zac Hatfield-Dodds notifications@github.com wrote:

@Zac-HD commented on this pull request.

In xarray/core/combine.py https://github.com/pydata/xarray/pull/2777#discussion_r259341037:

  • elif name != arr.name:
  • if compat == 'identical':
  • raise ValueError('array names not identical')
  • else:
  • arr = arr.rename(name)
  • datasets.append(arr._to_temp_dataset())
  • name = result_name(arrays)
  • names = [arr.name for arr in arrays]
  • if compat == 'identical' and len(set(names)) != 1:
  • raise ValueError(
  • "compat='identical', but array names {!r} are not identical"
  • .format(names if len(names) <= 10 else sorted(set(names)))
  • )
  • datasets = [arr.rename(name)._to_temp_dataset() for arr in arrays] +
  • if isinstance(dim, str) and len(set(names) - {None}) == len(names) \

Not for this - we're checking that names is a list of unique strings.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/pydata/xarray/pull/2777#discussion_r259341037, or mute the thread https://github.com/notifications/unsubscribe-auth/ABKS1nbwyPzYE3HzT30pWhWignDUrbBSks5vP_B_gaJpZM4bCN3h .

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Improved default behavior when concatenating DataArrays 411755105

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 11.74ms · About: xarray-datasette