issue_comments

13 rows where issue = 911513701 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1255120923 https://github.com/pydata/xarray/issues/5436#issuecomment-1255120923 https://api.github.com/repos/pydata/xarray/issues/5436 IC_kwDOAMm_X85Kz6Ab keewis 14808389 2022-09-22T14:36:23Z 2022-09-22T17:27:13Z MEMBER

instead of using the Dataset constructor it's actually better to manually convert to a Dataset before passing to `xr.merge`:

```python
import xarray as xr

arr1 = xr.DataArray([0, 1], dims="x", attrs={"units": "m"}, name="a")
arr2 = xr.DataArray([2, 3, 4], dims="y", attrs={"units": "s"})

# `...` stands for whatever further merge options are needed
xr.merge([arr1.to_dataset(), arr2.to_dataset(name="b")], ...)
```

that way, the merge is configurable but we avoid the `promote_attrs=True`

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
1231701908 https://github.com/pydata/xarray/issues/5436#issuecomment-1231701908 https://api.github.com/repos/pydata/xarray/issues/5436 IC_kwDOAMm_X85JakeU spencerkclark 6628425 2022-08-30T13:53:11Z 2022-08-30T13:53:11Z MEMBER

I encountered the promote_attrs / merge issue recently in an example similar to @lanougue's above. I'm with @keewis that I would also tentatively support setting promote_attrs=False in merge, but I'm also not that familiar with that part of the codebase and the decisions that went into it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
872221809 https://github.com/pydata/xarray/issues/5436#issuecomment-872221809 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg3MjIyMTgwOQ== keewis 14808389 2021-07-01T12:54:40Z 2021-07-01T12:54:40Z MEMBER

> I want it to basically do nothing to the attrs

I think your workaround should work fine for the example you gave. As an alternative you could also use the Dataset constructor: `xr.Dataset({x0.name: x0, y0.name: y0})`

> Is it possible at the moment to pass promote_attrs=False to xr.merge()?

that's hard-coded, unfortunately.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
872210975 https://github.com/pydata/xarray/issues/5436#issuecomment-872210975 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg3MjIxMDk3NQ== caenrigen 31376402 2021-07-01T12:37:52Z 2021-07-01T12:37:52Z CONTRIBUTOR

@keewis thank you for the reply

> To fix this, I would vote for not using promote_attrs=True in merge (low confidence vote, though, I'm sure there was a reason for that). We could also try to allow specifying separate strategies for main object and variables, but that looks somewhat complicated (and I'm still not sure what syntax we could use for that).

Does this mean that my workaround is not fully working as I would like it to? (I want it to basically do nothing to the attrs)

Is it possible at the moment to pass promote_attrs=False to xr.merge()?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
872205035 https://github.com/pydata/xarray/issues/5436#issuecomment-872205035 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg3MjIwNTAzNQ== keewis 14808389 2021-07-01T12:28:40Z 2021-07-01T12:28:54Z MEMBER

> because it will not (?) affect the attributes of the variables

that's a misconception: every combine_attrs strategy will affect the attrs on both the variables and the main object. However, "override" was the hard-coded default for variables before the change.

To fix this, I would vote for not using promote_attrs=True in merge (low confidence vote, though, I'm sure there was a reason for that). We could also try to allow specifying separate strategies for main object and variables, but that looks somewhat complicated (and I'm still not sure what syntax we could use for that).

@pydata/xarray, any ideas / opinions?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
871681764 https://github.com/pydata/xarray/issues/5436#issuecomment-871681764 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg3MTY4MTc2NA== caenrigen 31376402 2021-06-30T19:49:24Z 2021-06-30T19:49:24Z CONTRIBUTOR

Hey guys! First of all thank you for the work on maintaining this package 😃

I am running into the same issue and it is partially blocking an open-source package (used in experimental quantum computing) from adopting the latest version of xarray.

For our typical dataset this happens:

```python
import xarray as xr
import numpy as np

x0 = xr.DataArray(
    data=np.ones(5),
    name="x0",
    attrs={"name": "x", "long_name": "X position", "units": "m", "batched": False},
)
y0 = xr.DataArray(
    data=np.ones(5),
    name="y0",
    attrs={"name": "y", "long_name": "Y position", "units": "m", "batched": True},
)
ds = xr.merge([x0, y0])
print(ds)
```

```bash
<xarray.Dataset>
Dimensions:  (dim_0: 5)
Dimensions without coordinates: dim_0
Data variables:
    x0       (dim_0) float64 1.0 1.0 1.0 1.0 1.0
    y0       (dim_0) float64 1.0 1.0 1.0 1.0 1.0
Attributes:
    name:       x
    long_name:  X position
    units:      m
    batched:    False
```

As default behavior this is totally unexpected, because the variables' attributes have nothing to do with the dataset itself. I am just trying to put all my data in a single container. (Am I using the wrong function for this?)

Second, `combine_attrs` has no option for "do nothing", which was the behavior in xarray 0.17.0. I noticed the current master branch allows passing a function, but I am not sure if that solves the problem. The "drop" option indeed acts in a very weird way; I tried it in the hope that it would drop the attrs on the resulting dataset instead of affecting the attributes of the variables.

My suggestion is to do nothing by default, i.e. `combine_attrs=None` simply does not try to combine the attributes in any way.

I understand there might be cases where combining the attributes makes sense, so maybe my suggestion only applies when only DataArray objects are merged into a Dataset (if I recall the "what's new" list correctly, this will already be enforced in 0.18.3).

Hope my use case helps this discussion.


In the meantime, the workaround seems to be to use `combine_attrs="override"` (because it will not (?) affect the attributes of the variables) and to wipe the attrs afterwards with `dataset.attrs = dict()`:

```python
import xarray as xr
import numpy as np

x0 = xr.DataArray(
    data=np.ones(5),
    name="x0",
    attrs={"name": "x", "long_name": "X position", "units": "m", "batched": False},
)
y0 = xr.DataArray(
    data=np.ones(5),
    name="y0",
    attrs={"name": "y", "long_name": "Y position", "units": "m", "batched": True},
)
ds = xr.merge([x0, y0])
ds.attrs = dict()  # wipe the attrs that were promoted to the dataset
print(ds)
print("===")
print(ds.x0)
```

```bash
<xarray.Dataset>
Dimensions:  (dim_0: 5)
Dimensions without coordinates: dim_0
Data variables:
    x0       (dim_0) float64 1.0 1.0 1.0 1.0 1.0
    y0       (dim_0) float64 1.0 1.0 1.0 1.0 1.0
===
<xarray.DataArray 'x0' (dim_0: 5)>
array([1., 1., 1., 1., 1.])
Dimensions without coordinates: dim_0
Attributes:
    name:       x
    long_name:  X position
    units:      m
    batched:    False
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
861792425 https://github.com/pydata/xarray/issues/5436#issuecomment-861792425 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg2MTc5MjQyNQ== lanougue 32069530 2021-06-15T20:00:29Z 2021-06-15T20:00:29Z NONE

Would an additional flag like `keep_attrs` not be feasible? It would be a boolean.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
854839898 https://github.com/pydata/xarray/issues/5436#issuecomment-854839898 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg1NDgzOTg5OA== keewis 14808389 2021-06-04T16:05:58Z 2021-06-04T16:05:58Z MEMBER

that makes sense. I'm not sure what syntax we can use for that, though. Maybe also accept a 2-tuple?
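The 2-tuple idea floated here could look roughly like the sketch below. It is purely hypothetical: `split_combine_attrs` is not an xarray function, and the tuple order (main object first, variables second) is an assumption made only for illustration.

```python
from typing import Tuple, Union

# Either one strategy name for everything, or (main_object, variables).
CombineAttrs = Union[str, Tuple[str, str]]


def split_combine_attrs(combine_attrs: CombineAttrs) -> Tuple[str, str]:
    """Return (strategy_for_main_object, strategy_for_variables).

    Hypothetical helper: a plain string applies to both levels, a 2-tuple
    sets them separately.
    """
    if isinstance(combine_attrs, str):
        return combine_attrs, combine_attrs
    if isinstance(combine_attrs, tuple) and len(combine_attrs) == 2:
        main, variables = combine_attrs
        return main, variables
    raise TypeError("expected a strategy name or a 2-tuple of strategy names")


# A single string keeps today's behaviour: one strategy for both levels.
print(split_combine_attrs("drop_conflicts"))
# A 2-tuple would drop the dataset attrs but keep the variables' attrs.
print(split_combine_attrs(("drop", "override")))
```

Accepting both forms would keep existing calls working, since a plain string would simply be expanded to the same strategy at both levels.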

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
854812439 https://github.com/pydata/xarray/issues/5436#issuecomment-854812439 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg1NDgxMjQzOQ== lanougue 32069530 2021-06-04T15:24:25Z 2021-06-04T15:24:25Z NONE

I understand, but I still believe that we should be able to control separately the attrs of the final dataset and the attrs of the merged DataArrays inside it (whichever way they are passed to the merge function).

Thanks for the pint-xarray suggestion! I didn't know about it. I will look into it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
854786005 https://github.com/pydata/xarray/issues/5436#issuecomment-854786005 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg1NDc4NjAwNQ== keewis 14808389 2021-06-04T14:48:49Z 2021-06-04T14:58:08Z MEMBER

It sounds like we should try to add a page to the user-guide section of the docs explaining what combine_attrs (and keep_attrs) do.

When merging objects, all the attrs of the objects to be merged (i.e. all Dataset / DataArray objects that were passed, and separately all variables with the same name) are passed to a function which merges the attrs. `combine_attrs="drop"` always makes it return an empty dict, while `"override"` always returns the first attrs dict from the list of dicts it was passed. `"drop_conflicts"` goes through all those dicts and keeps only those attributes whose values don't mismatch. In other words: that's by design, we simply changed the rules to also apply them to the variables. For examples of how they work:

Comparison of the different values for `combine_attrs`:

```python
In [6]: ds1 = xr.Dataset(attrs={"created": "2020-06-05", "a": 0})
   ...: ds2 = xr.Dataset(attrs={"created": "2021-06-05", "b": 1})
   ...:
   ...: xr.merge([ds1, ds2], combine_attrs="drop")
Out[6]:
<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*

In [7]: ds1 = xr.Dataset(attrs={"created": "2020-06-05", "a": 0})
   ...: ds2 = xr.Dataset(attrs={"created": "2021-06-05", "b": 1})
   ...:
   ...: xr.merge([ds1, ds2], combine_attrs="override")
Out[7]:
<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*
Attributes:
    created:  2020-06-05
    a:        0

In [8]: ds1 = xr.Dataset(attrs={"created": "2020-06-05", "a": 0})
   ...: ds2 = xr.Dataset(attrs={"created": "2020-06-05", "b": 1})
   ...:
   ...: xr.merge([ds1, ds2], combine_attrs="no_conflicts")
Out[8]:
<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*
Attributes:
    created:  2020-06-05
    a:        0
    b:        1

In [9]: ds1 = xr.Dataset(attrs={"created": "2020-06-05", "a": 0})
   ...: ds2 = xr.Dataset(attrs={"created": "2021-06-05", "b": 1})
   ...:
   ...: xr.merge([ds1, ds2], combine_attrs="identical")
---------------------------------------------------------------------------
MergeError                                Traceback (most recent call last)
<ipython-input-9-3fda11a2986d> in <module>
      2 ds2 = xr.Dataset(attrs={"created": "2021-06-05", "b": 1})
      3
----> 4 xr.merge([ds1, ds2], combine_attrs="identical")

.../xarray/core/merge.py in merge(objects, compat, join, fill_value, combine_attrs)
    883         dict_like_objects.append(obj)
    884
--> 885     merge_result = merge_core(
    886         dict_like_objects,
    887         compat,

.../xarray/core/merge.py in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value)
    645     )
    646
--> 647     attrs = merge_attrs(
    648         [var.attrs for var in coerced if isinstance(var, (Dataset, DataArray))],
    649         combine_attrs,

.../xarray/core/merge.py in merge_attrs(variable_attrs, combine_attrs)
    546             for attrs in variable_attrs[1:]:
    547                 if not dict_equiv(result, attrs):
--> 548                     raise MergeError(
    549                         f"combine_attrs='identical', but attrs differ. First is {str(result)} "
    550                         f", other is {str(attrs)}."

MergeError: combine_attrs='identical', but attrs differ. First is {'created': '2020-06-05', 'a': 0} , other is {'created': '2021-06-05', 'b': 1}.

In [10]: ds1 = xr.Dataset(attrs={"created": "2020-06-05", "a": 0})
    ...: ds2 = xr.Dataset(attrs={"created": "2021-06-05", "b": 1})
    ...:
    ...: xr.merge([ds1, ds2], combine_attrs="drop_conflicts")
Out[10]:
<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*
Attributes:
    a:        0
    b:        1
```
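The rules described in this comment can also be written down as a small pure-Python sketch that works on plain dicts. This is not xarray's implementation: `combine` and `AttrsConflict` are made-up names standing in for the attrs-merging function and for `MergeError`, and the logic is only meant to mirror the behaviour shown in the session above.

```python
from typing import Any, Dict, List


class AttrsConflict(ValueError):
    """Stand-in for xarray's MergeError in this sketch."""


def combine(attrs_list: List[Dict[str, Any]], strategy: str) -> Dict[str, Any]:
    """Combine a list of attrs dicts according to the named strategy."""
    if not attrs_list or strategy == "drop":
        return {}
    if strategy == "override":
        # The first attrs dict wins, everything else is ignored.
        return dict(attrs_list[0])
    if strategy == "identical":
        first = attrs_list[0]
        for other in attrs_list[1:]:
            if other != first:
                raise AttrsConflict(f"attrs differ: {first} vs {other}")
        return dict(first)
    if strategy == "no_conflicts":
        result: Dict[str, Any] = {}
        for attrs in attrs_list:
            for key, value in attrs.items():
                if key in result and result[key] != value:
                    raise AttrsConflict(f"conflicting values for {key!r}")
                result[key] = value
        return result
    if strategy == "drop_conflicts":
        result = {}
        conflicting = set()
        for attrs in attrs_list:
            for key, value in attrs.items():
                if key in conflicting:
                    continue  # already known to mismatch, stays dropped
                if key in result and result[key] != value:
                    del result[key]
                    conflicting.add(key)
                else:
                    result[key] = value
        return result
    raise ValueError(f"unknown strategy: {strategy!r}")


a = {"created": "2020-06-05", "a": 0}
b = {"created": "2021-06-05", "b": 1}
print(combine([a, b], "drop"))            # {}
print(combine([a, b], "override"))        # {'created': '2020-06-05', 'a': 0}
print(combine([a, b], "drop_conflicts"))  # {'a': 0, 'b': 1}
```

In the behaviour discussed in this issue, the same kind of function is applied twice: once to the attrs of the objects passed to the merge, and once per variable name to the attrs of the variables sharing that name.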

What I was referring to before was something different:

```python
In [2]: arr = xr.DataArray([0, 1], dims="x", attrs={"units": "m"})
   ...: arr
Out[2]:
<xarray.DataArray (x: 2)>
array([0, 1])
Dimensions without coordinates: x
Attributes:
    units:    m

In [3]: arr.to_dataset(name="a")
Out[3]:
<xarray.Dataset>
Dimensions:  (x: 2)
Dimensions without coordinates: x
Data variables:
    a        (x) int64 0 1

In [4]: arr.to_dataset(name="a", promote_attrs=True)
Out[4]:
<xarray.Dataset>
Dimensions:  (x: 2)
Dimensions without coordinates: x
Data variables:
    a        (x) int64 0 1
Attributes:
    units:    m
```

why does `xr.merge` convert with `promote_attrs=True`?
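To make the consequence of that conversion concrete, here is a toy model using plain dicts instead of xarray objects. Both `to_dataset_model` and `merge_override_model` are hypothetical helpers invented for this sketch; they only imitate the promotion step and an "override"-style merge of the dataset-level attrs.

```python
from typing import Any, Dict, List

Attrs = Dict[str, Any]


def to_dataset_model(name: str, var_attrs: Attrs, promote_attrs: bool = False) -> Dict[str, Any]:
    """Toy stand-in for converting a named array to a dataset."""
    return {
        # With promotion, the variable's attrs are copied to the dataset too.
        "attrs": dict(var_attrs) if promote_attrs else {},
        "variables": {name: dict(var_attrs)},
    }


def merge_override_model(datasets: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Toy merge where the first dataset's attrs win ("override"-like)."""
    merged: Dict[str, Any] = {
        "attrs": dict(datasets[0]["attrs"]) if datasets else {},
        "variables": {},
    }
    for ds in datasets:
        for name, attrs in ds["variables"].items():
            merged["variables"].setdefault(name, dict(attrs))
    return merged


promoted = merge_override_model([
    to_dataset_model("a", {"units": "m"}, promote_attrs=True),
    to_dataset_model("b", {"units": "s"}, promote_attrs=True),
])
plain = merge_override_model([
    to_dataset_model("a", {"units": "m"}),
    to_dataset_model("b", {"units": "s"}),
])
print(promoted["attrs"])  # {'units': 'm'}: the first variable's units leak to the dataset
print(plain["attrs"])     # {}: without promotion the dataset has no attrs
```

In this model the variables keep their own attrs either way; only the dataset-level attrs differ, which is the part that makes no sense for something like units.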

> (I do not understand your comment: how to keep the units on the data instead of in the attributes ?)

have a look at pint-xarray for that (still experimental, but I very much welcome feedback)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
854768921 https://github.com/pydata/xarray/issues/5436#issuecomment-854768921 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg1NDc2ODkyMQ== lanougue 32069530 2021-06-04T14:27:07Z 2021-06-04T14:27:07Z NONE

Ok, I understand your point of view. My question (or what you think could be a bug) thus becomes: why does the "drop" option remove attrs from the variables in the merged dataset while "drop_conflicts" and "override" keep them?

There should thus be some way to tell the merge whether or not to keep the attrs of each variable in the final dataset. (I do not understand your comment: how to keep the units on the data instead of in the attributes ?)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
854745944 https://github.com/pydata/xarray/issues/5436#issuecomment-854745944 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg1NDc0NTk0NA== keewis 14808389 2021-06-04T14:01:50Z 2021-06-04T14:05:55Z MEMBER

yes, I think so: it is strange to merge the data of a variable but not the attributes (or rather, that the merge strategy of variables is hard-coded to "override").

The only thing I would think of as a bug is that the converted datasets would copy the DataArray's attributes to ds.attrs in addition to keeping it on the variable (not sure if I'm missing something, though?). In the example you gave that does not make sense at all: the dataset object does not have units on its own. Edit: you can work around this by keeping the units on the data instead of in the attributes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701
854739959 https://github.com/pydata/xarray/issues/5436#issuecomment-854739959 https://api.github.com/repos/pydata/xarray/issues/5436 MDEyOklzc3VlQ29tbWVudDg1NDczOTk1OQ== lanougue 32069530 2021-06-04T13:52:44Z 2021-06-04T13:52:44Z NONE

@keewis, do you think this is the expected behaviour?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  bug or unclear definition of combine_attrs with xr.merge() 911513701

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);