html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2340#issuecomment-410488222,https://api.github.com/repos/pydata/xarray/issues/2340,410488222,MDEyOklzc3VlQ29tbWVudDQxMDQ4ODIyMg==,90008,2018-08-05T01:15:39Z,2018-08-05T01:15:49Z,CONTRIBUTOR,"Finishing up this line of thought: without the assumption that the relative order of dimensions is maintained across arrays in a set, this feature is impossible to implement as a neat function call. You would have to specify exactly how to expand each of the coordinates, which can get pretty long.
I wrote some code that I think would have worked if relative ordering were a valid assumption.
Here it is for reference: https://github.com/hmaarrfk/xarray/pull/1
To obtain the desired effect, you have to expand the dimensions of the coordinates individually:
```python
import xarray as xr
import numpy as np
# Set up an array with coordinates
n = np.arange(1, 13).reshape(3, 2, 2)
coords = {'y': np.arange(1, 4),
          'x': np.arange(1, 3),
          'xi': np.arange(2)}
# %%
z = xr.DataArray(n[..., 0]*2, dims=['y', 'x'])
a = xr.DataArray(n, dims=['y', 'x', 'xi'], coords={**coords, 'z': z})
sliced = a[0]
print(""The original xarray"")
print(a.z)
print(""The sliced xarray"")
print(sliced.z)
# %%
expanded = sliced.expand_dims('y', 0)
expanded['z'] = expanded.z.expand_dims('y', 0)
print(expanded)
```
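The same per-coordinate expansion can be wrapped in a small helper. This is only a sketch: `expand_dims_with_coords` and its `coord_axes` argument are names made up for illustration, not part of xarray, and the caller still has to state where the new axis goes for every coordinate, because the relative dimension order cannot be assumed.

```python
import numpy as np
import xarray as xr


def expand_dims_with_coords(arr, dim, axis, coord_axes):
    # Hypothetical helper (not an xarray API): expand the array itself,
    # then expand each listed coordinate at the axis given for it.
    expanded = arr.expand_dims(dim, axis)
    for name, coord_axis in coord_axes.items():
        expanded[name] = expanded[name].expand_dims(dim, coord_axis)
    return expanded


n = np.arange(1, 13).reshape(3, 2, 2)
z = xr.DataArray(n[..., 0] * 2, dims=['y', 'x'])
a = xr.DataArray(n, dims=['y', 'x', 'xi'],
                 coords={'y': np.arange(1, 4), 'x': np.arange(1, 3),
                         'xi': np.arange(2), 'z': z})

# Slice away 'y', then put it back on both the data and the 'z' coordinate.
restored = expand_dims_with_coords(a[0], 'y', 0, {'z': 0})
print(restored.dims)
print(restored.z.dims)
```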
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347558405
https://github.com/pydata/xarray/issues/2340#issuecomment-410420312,https://api.github.com/repos/pydata/xarray/issues/2340,410420312,MDEyOklzc3VlQ29tbWVudDQxMDQyMDMxMg==,90008,2018-08-04T03:39:49Z,2018-08-04T03:39:49Z,CONTRIBUTOR,"@shoyer Thank you for your first detailed response. It definitely took me a while to understand all the intricacies of your words.
In fact, if you use a Dataset instead of a DataArray, you can expand the dims of multiple arrays together.
For my particular case, a Dataset is probably more appropriate, but I liked the fact that the DataArray had one of the `sets` as the `main set` and that I could do math on that one.
Here is an example showing that, with a Dataset, I get the behaviour I expect:
```python
# %%
import xarray as xa
import numpy as np
n = np.zeros((3, 2))
data = xa.DataArray(n, dims=['y', 'x'], coords={'y':range(3), 'x':range(2)})
z = xa.DataArray(np.arange(6).reshape((3, 2)),
                 dims=['y', 'x'])
my_set = xa.Dataset({'mydata':data, 'z':z})
print('Original Data')
print('=============')
print(my_set)
# %%
my_slice = my_set[{'x': 0, 'y':1}]
print(""Sliced data"")
print(""==========="")
print(""z coordinate remembers it's own x value"")
print(f'x = {my_slice.z.x}')
# %%
expanded_slice = my_slice.expand_dims('x')
print(""expanded slice"")
print(""=============="")
print(""forgot that 'z' had 'x' coordinates"")
print(""but remembered it had a 'y' coordinate"")
print(f""z = {expanded_slice.z}"")
print(expanded_slice.z.x)
```
Output:
```python
Original Data
=============
Dimensions: (x: 2, y: 3, z2: 0)
Coordinates:
* y (y) int64 0 1 2
* x (x) int64 0 1
* z2 (z2) float64
Data variables:
mydata (y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0
z (y, x) int32 0 1 2 3 4 5
Sliced data
===========
z coordinate remembers it's own x value
x =
array(0, dtype=int64)
Coordinates:
y int64 1
x int64 0
expanded slice
==============
z =
array([2])
Coordinates:
y int64 1
* x (x) int64 0
array([0], dtype=int64)
Coordinates:
y int64 1
* x (x) int64 0
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347558405
https://github.com/pydata/xarray/issues/2340#issuecomment-410412290,https://api.github.com/repos/pydata/xarray/issues/2340,410412290,MDEyOklzc3VlQ29tbWVudDQxMDQxMjI5MA==,90008,2018-08-04T01:26:08Z,2018-08-04T03:21:50Z,CONTRIBUTOR,"I guess I probably erroneously assumed a ""scalar"" was a 1x1x1x1....x1 array. It really isn't. It is just a scalar.
When the number of dimensions of an `IndexVariable` hits `0`, it is automatically demoted to a `Variable` somewhere around here: https://github.com/pydata/xarray/blob/56381ef444c5e699443e8b4e08611060ad5c9507/xarray/core/variable.py#L465
I guess maybe:
1. Maybe what I'm looking for is a Dataset
2. Maybe sticky dimensions would be nice to have? We could maybe flag those dimensions as being sticky, and even if you select a single element, it would be maintained as a dimension. Is this already possible?
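On the second point, as far as I can tell one thing that already keeps a dimension around is indexing with a length-1 list (or a slice) instead of a bare integer. A minimal sketch, with made-up array contents:

```python
import numpy as np
import xarray as xr

z = xr.DataArray(np.arange(6).reshape(3, 2), dims=['y', 'x'])
data = xr.DataArray(np.zeros((3, 2)), dims=['y', 'x'],
                    coords={'y': range(3), 'x': range(2), 'z': z})

# An integer index drops the dimension; y becomes a scalar coordinate.
dropped = data.isel(y=1)
# A length-1 list keeps y as a real dimension of size 1,
# and the 2-D coordinate z keeps its y axis as well.
kept = data.isel(y=[1])

print(dropped.dims, dropped.z.dims)
print(kept.dims, kept.z.dims)
```

This is not a flag stored on the array, so it has to be repeated at every indexing step.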
My old thinking
@shoyer I feel like this solution should work: https://github.com/hmaarrfk/xarray/pull/1/files#diff-921db548d18a549f6381818ed08298c9R2144 But I kinda understand why it doesn't.
That solution moves the insertion to the ""dataset level"", then uses a list comprehension to create the right order for the arrays contained within the dataset.
It should work, but when I try it, xarray won't even produce the repr:
```python
Traceback (most recent call last):
File ""C:\Users\Mark\Miniconda3\envs\owl\lib\site-packages\IPython\core\formatters.py"", line 702, in __call__
printer.pretty(obj)
File ""C:\Users\Mark\Miniconda3\envs\owl\lib\site-packages\IPython\lib\pretty.py"", line 400, in pretty
return _repr_pprint(obj, self, cycle)
File ""C:\Users\Mark\Miniconda3\envs\owl\lib\site-packages\IPython\lib\pretty.py"", line 695, in _repr_pprint
output = repr(obj)
File ""c:\users\mark\git\xarray\xarray\core\formatting.py"", line 66, in __repr__
return ensure_valid_repr(self.__unicode__())
File ""c:\users\mark\git\xarray\xarray\core\dataset.py"", line 1190, in __unicode__
return formatting.dataset_repr(self)
File ""c:\users\mark\git\xarray\xarray\core\formatting.py"", line 455, in dataset_repr
summary.append(coords_repr(ds.coords, col_width=col_width))
File ""c:\users\mark\git\xarray\xarray\core\formatting.py"", line 350, in coords_repr
summarizer=summarize_coord, col_width=col_width)
File ""c:\users\mark\git\xarray\xarray\core\formatting.py"", line 332, in _mapping_repr
summary += [summarizer(k, v, col_width) for k, v in mapping.items()]
File ""c:\users\mark\git\xarray\xarray\core\formatting.py"", line 332, in
summary += [summarizer(k, v, col_width) for k, v in mapping.items()]
File ""c:\users\mark\git\xarray\xarray\core\formatting.py"", line 279, in summarize_coord
coord = var.variable.to_index_variable()
File ""c:\users\mark\git\xarray\xarray\core\variable.py"", line 406, in to_index_variable
encoding=self._encoding, fastpath=True)
File ""c:\users\mark\git\xarray\xarray\core\variable.py"", line 1624, in __init__
type(self).__name__)
ValueError: IndexVariable objects must be 1-dimensional
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347558405
https://github.com/pydata/xarray/issues/2340#issuecomment-410407863,https://api.github.com/repos/pydata/xarray/issues/2340,410407863,MDEyOklzc3VlQ29tbWVudDQxMDQwNzg2Mw==,90008,2018-08-04T00:32:20Z,2018-08-04T03:15:55Z,CONTRIBUTOR,"@shoyer thank you for the quick response.
My old thinking
I gave a really simplistic example, but `my_slice` is also technically a scalar, yet it still carries the information that it has many non-dimensional coordinates. It is still able to understand that it should expand along those dimensions and retain their values.
`z` should understand that it needs to keep its value for that non-dimensional coordinate when it gets promoted to a dimensional coordinate.
I'm not convinced that this is the expected behaviour. The xarray structure seems to understand that the coordinates have coordinates (though, as described below, I think there is a bug there too).
Something else peculiar that I found was:
Before slicing, `y` doesn't have additional coordinates:
```python
data['y']
Out[38]:
array([0, 1, 2])
Coordinates:
* y (y) int32 0 1 2
```
but after slicing:
```python
data[{'x':1}]['y']
Out[39]:
array([0, 1, 2])
Coordinates:
* y (y) int32 0 1 2
x int32 1
z (y) int32 1 3 5
```
All of a sudden, `y` has inherited all of the coordinates from the top level array.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347558405
https://github.com/pydata/xarray/issues/2340#issuecomment-410403455,https://api.github.com/repos/pydata/xarray/issues/2340,410403455,MDEyOklzc3VlQ29tbWVudDQxMDQwMzQ1NQ==,1217238,2018-08-03T23:50:21Z,2018-08-03T23:50:21Z,MEMBER,"Thanks for the very clear report!
I believe this is intentional behavior. Let me try to clarify how xarray determines coordinates for a variable like `expanded_slice.z`. Xarray requires that the dimensions of *coordinates* are a subset of the dimensions of the DataArray. So when you pull out a coordinate from a DataArray, the rule xarray uses for determining coordinates on the new DataArray object is to include every coordinate from the original DataArray for which all dimensions still exist on the new DataArray.
More specifically, let's look at what `expanded_slice` looks like:
```
array([0.])
Coordinates:
y int64 0
* x (x) int64 1
z int64 1
```
`expanded_slice.z` is a scalar DataArray, since `z` is a scalar:
```
array(1)
Coordinates:
y int64 0
z int64 1
```
The coordinate `x` from `expanded_slice` is lost, because it goes along the `x` dimension, which is not found on `expanded_slice.z`.
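The rule can be sketched in plain Python. This is a toy model, not xarray's actual implementation: `coords_for` is a hypothetical helper, and each coordinate is represented only by its tuple of dimension names.

```python
def coords_for(name, coord_dims):
    # Toy model of the rule (not xarray internals): keep every
    # coordinate whose dimensions all exist on the selected variable.
    target_dims = set(coord_dims[name])
    return {k: dims for k, dims in coord_dims.items()
            if set(dims) <= target_dims}


# Dimensions of each coordinate on expanded_slice.
expanded_slice_coords = {'y': (), 'x': ('x',), 'z': ()}

# z is a scalar, so only the scalar coordinates survive and x is dropped.
print(coords_for('z', expanded_slice_coords))  # {'y': (), 'z': ()}
# x has the x dimension, so scalar coordinates and x itself all survive.
print(coords_for('x', expanded_slice_coords))
```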
To answer your specific questions:
> is the relative order of dimensions maintained between data in the same dataset/dataarray?
No, not necessarily. It's possible to define coordinates with different dimension order than the DataArray itself (although this isn't recommended). For example:
```
>>> xarray.DataArray(np.zeros((2, 2)), dims=['x', 'y'], coords={'foo': (('y', 'x'), np.zeros((2, 2)))})
array([[0., 0.],
[0., 0.]])
Coordinates:
foo (y, x) float64 0.0 0.0 0.0 0.0
Dimensions without coordinates: x, y
```
> Can coordinates have MORE dimensions than the array itself?
No, this isn't allowed by xarray's data model.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,347558405