issue_comments

Table actions
  • GraphQL API for issue_comments

4 rows where issue = 494210818 ("convert DataArray to DataSet before combine") and user = 35968931 (TomNicholas), sorted by updated_at descending. All 4 comments have author_association MEMBER.

652363949 · TomNicholas (MEMBER) · created 2020-07-01T11:30:44Z · updated 2020-07-01T11:30:44Z
https://github.com/pydata/xarray/pull/3312#issuecomment-652363949

@shoyer I just re-encountered this bug whilst doing actual work. I would like to get it fixed, but I need your input, in particular on:

Do we want consistency with arithmetic, or consistency with merge?

(if this is particularly complex we could perhaps discuss it in the bi-weekly meeting?)

Reactions: +1 × 1

568789721 · TomNicholas (MEMBER) · created 2019-12-24T18:39:21Z · updated 2019-12-24T18:39:21Z
https://github.com/pydata/xarray/pull/3312#issuecomment-568789721

Also, separately, I think I've discovered a related weird bug:

    da1 = xr.DataArray([0], name='a')
    da2 = xr.DataArray([1], name='b')
    xr.combine_by_coords([da1, da2])

returns

    /xarray/xarray/core/dataarray.py:682: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
      return key in self.data
    /xarray/xarray/core/dataarray.py:682: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
      return key in self.data
    Out[5]:
    <xarray.Dataset>
    Dimensions:  (dim_0: 1)
    Dimensions without coordinates: dim_0
    Data variables:
        a        (dim_0) int64 0
        b        (dim_0) int64 1

It shouldn't do this: it should have failed immediately, because there are no dimension coordinates, and otherwise single-element arrays should obviously behave the same as the other examples above. I think it's because it's reading the single-element data (0 or 1) as a boolean at some point...
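
For context, a minimal sketch of the early check this comment is asking for, assuming inputs are validated before combining (the function name and message wording are illustrative, not xarray's actual internals):

    import xarray as xr

    def ensure_dimension_coords(objs):
        # combine_by_coords infers concatenation order from dimension
        # coordinates, so inputs whose dimensions lack coordinates should
        # be rejected up front rather than silently merged.
        for obj in objs:
            for dim in obj.dims:
                if dim not in obj.coords:
                    raise ValueError(
                        f"Every dimension needs a dimension coordinate "
                        f"for inferring concatenation order, but {dim!r} has none"
                    )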

Reactions: none

568789678 · TomNicholas (MEMBER) · created 2019-12-24T18:39:01Z · updated 2019-12-24T18:39:01Z
https://github.com/pydata/xarray/pull/3312#issuecomment-568789678

If we gave all the DataArray objects the same name when converted into Dataset objects, then I think the result could always be a single DataArray?

I suppose so, but this seems like an odd way to handle it to me. You're throwing away data (the names) which in other circumstances would be used.

This would be consistent with how we do arithmetic with DataArrays: the names get ignored, and then we assign a name to the result only if there are no conflicting names on the inputs.
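
As a minimal sketch of that name rule (this keep-only-consistent-names behaviour is how xarray arithmetic propagates names):

    import xarray as xr

    a = xr.DataArray([1, 2], name='a')
    b = xr.DataArray([3, 4], name='b')
    a2 = xr.DataArray([5, 6], name='a')

    print((a + b).name)   # None: conflicting names are dropped
    print((a + a2).name)  # 'a': a single consistent name is kept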

Do we want consistency with arithmetic, or consistency with merge? I strongly feel it should be the latter, as combine wraps merge (in fact, combine_nested(dataarrays, concat_dim=None) == merge(dataarrays)). More generally, I think we should try to make the behaviour of all our "combining" functions (i.e. merge, concat, update, combine_nested, and combine_by_coords) name-aware.

Let me try to clarify by summarizing. Currently, merge and combine_nested both do the same thing for named DataArrays (return a Dataset) and for un-named DataArrays (throw an error):

    da1 = xr.DataArray(name='foo', data=np.random.rand(3, 3),
                       coords=[('x', [1, 2, 3]), ('y', [1, 2, 3])])
    da2 = xr.DataArray(name='foo2', data=np.random.rand(3, 3),
                       coords=[('x', [5, 6, 7]), ('y', [5, 6, 7])])
    merge([da1, da2])

and, with the same da1 and da2,

    xr.combine_nested([da1, da2], concat_dim=None)

both return

    <xarray.Dataset>
    Dimensions:  (x: 6, y: 6)
    Coordinates:
      * x        (x) int64 1 2 3 5 6 7
      * y        (y) int64 1 2 3 5 6 7
    Data variables:
        foo      (x, y) float64 0.5235 0.4114 0.7112 nan nan ... nan nan nan nan nan
        foo2     (x, y) float64 nan nan nan nan nan ... nan 0.08344 0.8844 0.7462

This all makes intuitive sense to me.
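
That equivalence can be checked directly (a minimal sketch, assuming an xarray version in which merge and combine_nested both accept named DataArrays, which is what this PR is about):

    import numpy as np
    import xarray as xr

    da1 = xr.DataArray(name='foo', data=np.random.rand(3, 3),
                       coords=[('x', [1, 2, 3]), ('y', [1, 2, 3])])
    da2 = xr.DataArray(name='foo2', data=np.random.rand(3, 3),
                       coords=[('x', [5, 6, 7]), ('y', [5, 6, 7])])

    # combine_nested with concat_dim=None should reduce to a pure merge
    xr.testing.assert_identical(
        xr.combine_nested([da1, da2], concat_dim=None),
        xr.merge([da1, da2]),
    )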

combine_by_coords is basically the same operation as combine_nested, just with automated rather than manual ordering. It will also merge different variables together, so it should do the same thing as merge and combine_nested: fill up the gaps in the hypercube with NaNs (as per #3649) and return a Dataset with two variables, named after the input DataArrays (and throw an error for un-named DataArrays).

However, as shown above, combine_by_coords is not consistent with merge or combine_nested, which this PR will fix.
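
Continuing with da1 and da2 from the sketch above, the post-fix expectation can be written as a test (that the two results match exactly is my reading of the comment, not a verified output):

    # After this PR, combining by coords should agree with a plain merge:
    xr.testing.assert_identical(
        xr.combine_by_coords([da1, da2]),
        xr.merge([da1, da2]),
    )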

This is all different from the arithmetic logic, but I think it makes far more intuitive sense. It's okay for arithmetic and combining logic to be different, as they are used in different contexts, and it's an unambiguous delineation: ignore names in arithmetic, use them in the top-level combining functions.

Also, to complete the consistency of the "combining" functions, I think we should make concat name-aware, as described in #3315.

In short: I propose that "combining" isn't arithmetic, and should be treated separately (and consistently across all types of combine functions).

Reactions: none

535013577 · TomNicholas (MEMBER) · created 2019-09-25T13:08:00Z · updated 2019-09-25T13:27:18Z
https://github.com/pydata/xarray/pull/3312#issuecomment-535013577

would it make sense for the combine methods to always return an object of the same type as the inputs? E.g., list of DataArray -> DataArray, list of Dataset -> Dataset?

I don't think so. Even for an input of only DataArrays, depending on the actual names and values in the DataArrays, the result of a combine could be a DataArray or a Dataset. So would it not be simpler to:

1) Promote all inputs to Datasets (or @friedrichknuth's "dict_like_objects")
2) Do the combining
3) If the result has only a single variable, demote it from Dataset to DataArray?

That way the result is always the simplest object that can hold the combination of that particular set of inputs, and the combining internals only have to handle Dataset objects.

EDIT: Oh wait, that won't work in the case of unnamed DataArrays, right?

EDIT2: Actually, _to_temp_dataset will handle that case too?
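
A minimal sketch of the promote/combine/demote flow proposed above, including the unnamed-DataArray case from the edits (the helper name and placeholder name are illustrative, not xarray internals):

    import xarray as xr

    def combine_via_datasets(objs, combine_func):
        TEMP_NAME = '__temporary_name__'
        # 1) Promote every input to a Dataset; unnamed DataArrays get a
        #    placeholder name (cf. DataArray._to_temp_dataset).
        promoted = []
        for obj in objs:
            if isinstance(obj, xr.DataArray):
                obj = obj.to_dataset(
                    name=obj.name if obj.name is not None else TEMP_NAME
                )
            promoted.append(obj)
        # 2) Do the combining on Datasets only.
        combined = combine_func(promoted)
        # 3) If only a single variable survives, demote back to a DataArray.
        if len(combined.data_vars) == 1:
            (name,) = combined.data_vars
            result = combined[name]
            if name == TEMP_NAME:
                result.name = None
            return result
        return combined

    # e.g. combine_via_datasets([da1, da2], xr.merge)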

Reactions: none

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
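
Given the schema above, this page's row selection can be reproduced directly against the exported SQLite database (a sketch; the file name github.db is an assumption):

    import sqlite3

    conn = sqlite3.connect('github.db')  # database file name is an assumption
    rows = conn.execute(
        '''
        select id, created_at, updated_at, body
        from issue_comments
        where issue = ? and user = ?
        order by updated_at desc
        ''',
        (494210818, 35968931),
    ).fetchall()
    for comment_id, created_at, updated_at, body in rows:
        print(comment_id, updated_at, body[:60])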