html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2180#issuecomment-1113990595,https://api.github.com/repos/pydata/xarray/issues/2180,1113990595,IC_kwDOAMm_X85CZiXD,26384082,2022-04-30T13:37:47Z,2022-04-30T13:37:47Z,NONE,"In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity
If this issue remains relevant, please comment here or remove the `stale` label; otherwise it will be marked as closed automatically
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036
https://github.com/pydata/xarray/issues/2180#issuecomment-619325660,https://api.github.com/repos/pydata/xarray/issues/2180,619325660,MDEyOklzc3VlQ29tbWVudDYxOTMyNTY2MA==,26384082,2020-04-25T05:39:47Z,2020-04-25T05:39:47Z,NONE,"In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity
If this issue remains relevant, please comment here or remove the `stale` label; otherwise it will be marked as closed automatically
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036
https://github.com/pydata/xarray/issues/2180#issuecomment-391932929,https://api.github.com/repos/pydata/xarray/issues/2180,391932929,MDEyOklzc3VlQ29tbWVudDM5MTkzMjkyOQ==,1217238,2018-05-25T03:46:40Z,2018-05-25T03:46:40Z,MEMBER,"Looking at @crusaderky's example of different coordinate labels again, I finally remember why it works this way.
The logic of `ds.update(other)` is that (1) variables explicitly listed in `other` should take precedence over the original object and (2) mutating a Dataset should not change its dimensions or indexes.
This is pretty clearly expressed in the original code:
```
return merge_core([dataset, other], priority_arg=1,
                  indexes=dataset.indexes)
```
In @crusaderky's example with `fridge.update(shopping)`, `shopping` first gets reindexed to `fridge` (which means it ends up only holding NaN), and is then used to override the original dataset:
```
Dimensions: (fruit: 1)
Coordinates:
* fruit (fruit) object 'apples'
quality (fruit) object nan
Data variables:
fruits (fruit) float64 nan
```
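A minimal sketch of that reindexing step, using made-up stand-in data (the values and the `oranges` label are assumptions, not the original example) and the public `reindex_like` method rather than the internal `merge_core` call:

```python
import numpy as np
import xarray as xr

# Hypothetical stand-ins for the two datasets; the labels do not overlap.
fridge = xr.Dataset(
    data_vars={'fruits': ('fruit', [3.0])},
    coords={'fruit': ['apples']},
)
shopping = xr.Dataset(
    data_vars={'fruits': ('fruit', [5.0])},
    coords={'fruit': ['oranges']},
)

# Left join on indexes: shopping is conformed to fridge's `fruit` index.
# No label matches, so every value in the reindexed variable is NaN.
reindexed = shopping.reindex_like(fridge)
print(reindexed['fruits'].values)
```

Overriding `fridge` with `reindexed` is what leaves only NaN behind.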
It would probably make sense to keep values from the original variables rather than blindly replacing them with the new NaNs from `shopping`, but overall I do think the approach of ""right join on variables"" and ""left join on indexes"" makes sense for `update()`.
For most use cases, the true outer join makes more sense -- which is why `xarray.merge()` works that way.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036
https://github.com/pydata/xarray/issues/2180#issuecomment-391919607,https://api.github.com/repos/pydata/xarray/issues/2180,391919607,MDEyOklzc3VlQ29tbWVudDM5MTkxOTYwNw==,6815844,2018-05-25T02:06:47Z,2018-05-25T02:19:45Z,MEMBER,"For reference, the original issue in #2068 was
```python
In [4]: ds = xr.Dataset()
...: ds.coords['source'] = (['a', 'b', 'c'], np.random.random((2, 3, 4)))
...: ds.coords['unrelated'] = (['a', 'c'], np.random.random((2, 4)))
...: ds
...:
Out[4]:
Dimensions: (a: 2, b: 3, c: 4)
Coordinates:
source (a, b, c) float64 0.4158 0.07152 0.4258 0.4382 0.6616 0.142 ...
unrelated (a, c) float64 0.9318 0.03723 0.4226 0.9472 0.8753 0.7022 ...
Dimensions without coordinates: a, b, c
Data variables:
*empty*
In [5]: ds['dest-2'] = xr.ones_like(ds['source'].isel(c=0))
...: ds
...:
Out[5]:
Dimensions: (a: 2, b: 3)
Coordinates:
source (a, b) float64 0.4158 0.6616 0.1583 0.7821 0.221 0.2555
unrelated (a) float64 0.9318 0.8753
Dimensions without coordinates: a, b
Data variables:
dest-2 (a, b) float64 1.0 1.0 1.0 1.0 1.0 1.0
```
where `ds['unrelated']` drops dimension `c`.
We changed this behavior in #2087, but I think it was the wrong direction.
The previous behavior might be OK as long as `unrelated` is a coordinate variable.
EDIT:
I still find something strange in both the previous and the current behavior of `__setitem__` with coords.
Generally, as @crusaderky has pointed out, the right join will be a better choice.
But in the above example, dropping dimension `c` from `unrelated` also looks awkward.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036
https://github.com/pydata/xarray/issues/2180#issuecomment-391915849,https://api.github.com/repos/pydata/xarray/issues/2180,391915849,MDEyOklzc3VlQ29tbWVudDM5MTkxNTg0OQ==,6815844,2018-05-25T01:41:27Z,2018-05-25T01:41:27Z,MEMBER,"Thanks, @crusaderky.
The first behavior you pointed out is a bug I think.
I raised an issue in #2184, and maybe it should be discussed there.
For the second example,
> I think this should be a right join.
I agree with this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036
https://github.com/pydata/xarray/issues/2180#issuecomment-391914654,https://api.github.com/repos/pydata/xarray/issues/2180,391914654,MDEyOklzc3VlQ29tbWVudDM5MTkxNDY1NA==,6213168,2018-05-25T01:32:51Z,2018-05-25T01:33:25Z,MEMBER,"> If there are conflicts in dimension coordinate, should it be outer join?
Consider this example:
```
from xarray import Dataset

a = Dataset({
    'x': [10, 20],
    'd1': ('x', [100, 200]),
    'd2': ('x', [300, 400])
})
b = Dataset({
    'x': [15],
    'd1': ('x', [500]),
})
a.update(b)
```
In the above, with anything but an outer join you're destroying d2 - which doesn't even exist in the rhs dataset! A sane, desirable outcome should be
```
Dataset({
'x': [10, 20, 15],
'd1': ('x', [nan, nan, 500]),
'd2': ('x', [300, 400, nan])
})
```
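For comparison, a minimal sketch of an explicit outer join on the same data via `xarray.merge`. Note that this is not the same as the outcome above: with `compat='no_conflicts'` nothing is overridden, so the d1 values from `a` at x=10 and x=20 survive alongside the new value at x=15.

```python
import numpy as np
import xarray as xr

a = xr.Dataset(
    data_vars={'d1': ('x', [100, 200]), 'd2': ('x', [300, 400])},
    coords={'x': [10, 20]},
)
b = xr.Dataset(
    data_vars={'d1': ('x', [500])},
    coords={'x': [15]},
)

# Outer join on the `x` index: every label from either side is kept,
# and positions with no data are filled with NaN.
merged = xr.merge([a, b], join='outer', compat='no_conflicts')
print(merged)
```

Here d2 is preserved for x=10 and x=20 and is NaN only at the new label x=15.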
> If there are no conflicts in dimension coordinate, but there are conflicts in non dimension coordinate, whether left or right should be prioritized?
I think this should be a right join. I always think of non-index coords as N-to-1 properties of the index. For example,
```
a = Dataset(
    coords={
        'country': ('country', ['UK', 'France', 'Greece']),
        'currency': ('country', ['GBP', 'EUR', 'EUR']),
    },
    data_vars={
        'GDP': ('country', [1000, 2000, 3000]),
        'Debt': ('country', [100, 200, 300]),
    })
b = Dataset(  # Greece exits the Eurozone
    coords={
        'country': ('country', ['UK', 'France', 'Greece']),
        'currency': ('country', ['GBP', 'EUR', 'GRD']),
    },
    data_vars={
        'GDP': ('country', [1000, 2000, 150000]),
    })
a.update(b)
```
In the above example, I just broke the Debt variable - as I forgot to perform a currency conversion for the Greek debt, which has been silently changed from 300 EUR to 300 GRD.
However I can't see any elegant way to avoid this. I *definitely* would not like to duplicate the 'country' index.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036
https://github.com/pydata/xarray/issues/2180#issuecomment-391910682,https://api.github.com/repos/pydata/xarray/issues/2180,391910682,MDEyOklzc3VlQ29tbWVudDM5MTkxMDY4Mg==,6815844,2018-05-25T01:05:13Z,2018-05-25T01:30:47Z,MEMBER,"> So maybe we can leave the current behavior as is for now (but remove the warning).
Agreed.
~@shoyer, what do you think about the current `__setitem__` behavior with a conflicting `dimension coordinate`?
Should it be an outer join, as @crusaderky pointed out?~
EDIT:
I did not notice the above comment. I will raise an issue for this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036
https://github.com/pydata/xarray/issues/2180#issuecomment-391908821,https://api.github.com/repos/pydata/xarray/issues/2180,391908821,MDEyOklzc3VlQ29tbWVudDM5MTkwODgyMQ==,1217238,2018-05-25T00:50:58Z,2018-05-25T00:51:10Z,MEMBER,@crusaderky this behavior you show is indeed really strange. I don't know why alignment of dimensions works that way currently.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036
https://github.com/pydata/xarray/issues/2180#issuecomment-391908588,https://api.github.com/repos/pydata/xarray/issues/2180,391908588,MDEyOklzc3VlQ29tbWVudDM5MTkwODU4OA==,1217238,2018-05-25T00:49:15Z,2018-05-25T00:49:15Z,MEMBER,"OK, looking at this more carefully, `ds.update(other)` didn't actually change when `other` is a `Dataset`, because `ds[k] = ds[k].drop(coord_names)` doesn't actually drop coordinates from a Dataset. It just shows a warning now, due to iteration over a Dataset.
So maybe we can leave the current behavior as is for now (but remove the warning).
What did change is how we handle conflicts in `__setitem__` (which was intentional), and how we handle conflicts in `update` when the new value is a dictionary (which was *not* intentional, but at least remained consistent with `__setitem__`).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036
https://github.com/pydata/xarray/issues/2180#issuecomment-391908483,https://api.github.com/repos/pydata/xarray/issues/2180,391908483,MDEyOklzc3VlQ29tbWVudDM5MTkwODQ4Mw==,6815844,2018-05-25T00:48:27Z,2018-05-25T00:48:27Z,MEMBER,"#2087 changed the second behavior.
```python
In [1]: import xarray
...:
...: fridge = xarray.Dataset(
...: data_vars={
...: 'var1': ('fruit', [10]),
...: },
...: coords={
...: 'fruit': ('fruit', [1]),
...: 'quality': ('fruit', ['Red Velvet']),
...: })
...: shopping = xarray.Dataset(
...: data_vars={
...: 'var1': ('fruit', [20]),
...: },
...: coords={
...: 'fruit': ('fruit', [1]),
...: 'quality': ('fruit', ['Tangerine']),
...: })
...:
...: fridge['var1'] = shopping['var1']
...:
```
with v0.10.3
```python
In [2]: fridge
Out[2]:
Dimensions: (fruit: 1)
Coordinates:
* fruit (fruit) int64 1
quality (fruit)
Dimensions: (fruit: 1)
Coordinates:
* fruit (fruit) int64 1
quality (fruit)
Dimensions: (fruit: 1)
Coordinates:
* fruit (fruit) object 'apples'
quality (fruit) object nan
Data variables:
fruits (fruit) float64 nan
```
The above doesn't make any sense to me. I wanted to replace the fruits variable with brand new content, and instead I lost both the old and the new?!?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036
https://github.com/pydata/xarray/issues/2180#issuecomment-391902317,https://api.github.com/repos/pydata/xarray/issues/2180,391902317,MDEyOklzc3VlQ29tbWVudDM5MTkwMjMxNw==,6815844,2018-05-25T00:04:58Z,2018-05-25T00:04:58Z,MEMBER,"I think we should discuss *dimension coordinates* and *non-dimension coordinates* separately.
I guess @shoyer meant *non-dimension coordinates* here.
For *dimension coordinates*, it is always an outer join, if I understand correctly.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036
https://github.com/pydata/xarray/issues/2180#issuecomment-391900937,https://api.github.com/repos/pydata/xarray/issues/2180,391900937,MDEyOklzc3VlQ29tbWVudDM5MTkwMDkzNw==,6213168,2018-05-24T23:56:55Z,2018-05-24T23:56:55Z,MEMBER,"I'm of the strong opinion that _all_ joins should be outer joins unless the user explicitly says otherwise, as it's the approach least prone to do damage. I would humbly suggest considering the change for a future major release (0.11 / 0.12), with several minor releases before that printing futurewarnings.
This said, I think that changing from a right join (0.10.3) to a left join (0.10.4) will only cause breakages without providing any actual benefit in terms of user-friendliness, so we should retain the previous behaviour. A right join _vaguely_ makes more sense IMHO as it follows the general philosophy of ``dict.update()`` where rhs wins in case of collision.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036
https://github.com/pydata/xarray/issues/2180#issuecomment-391899432,https://api.github.com/repos/pydata/xarray/issues/2180,391899432,MDEyOklzc3VlQ29tbWVudDM5MTg5OTQzMg==,6815844,2018-05-24T23:47:52Z,2018-05-24T23:49:15Z,MEMBER,"I think `dataset.update(other)` should be equivalent to
```python
for key, value in other.items():
    dataset[key] = value
```
similar to Python's native `dict`.
Our `.items()` only iterates over data_vars, not coordinates.
So I think even in `dataset.update(other)` coordinates from other should be dropped if there is a conflict.
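
For reference, the plain `dict` equivalence this argument leans on can be checked with made-up keys and values (nothing here touches xarray):

```python
# Made-up example data; `b` collides, `c` is new.
left = {'a': 1, 'b': 2}
right = {'b': 20, 'c': 30}

# dict.update: the right-hand side wins on collision.
via_update = dict(left)
via_update.update(right)

# The item-by-item loop gives the same result.
via_loop = dict(left)
for key, value in right.items():
    via_loop[key] = value

print(via_update == via_loop)
```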
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036
https://github.com/pydata/xarray/issues/2180#issuecomment-391898293,https://api.github.com/repos/pydata/xarray/issues/2180,391898293,MDEyOklzc3VlQ29tbWVudDM5MTg5ODI5Mw==,1217238,2018-05-24T23:40:34Z,2018-05-24T23:40:41Z,MEMBER,cc @fujiisoup @crusaderky ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,326205036