html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1914#issuecomment-435717326,https://api.github.com/repos/pydata/xarray/issues/1914,435717326,MDEyOklzc3VlQ29tbWVudDQzNTcxNzMyNg==,10928117,2018-11-04T23:07:56Z,2018-11-04T23:07:56Z,NONE,"@jcmgray I must have missed your reply to this issue; I only saw it just now. I love your code! I will definitely include xyzpy in my tools from now on ;-).","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297560256
https://github.com/pydata/xarray/issues/1914#issuecomment-396745650,https://api.github.com/repos/pydata/xarray/issues/1914,396745650,MDEyOklzc3VlQ29tbWVudDM5Njc0NTY1MA==,8982598,2018-06-12T21:48:31Z,2018-06-12T22:40:31Z,CONTRIBUTOR,"Indeed, this is exactly the kind of situation I wrote ``xyzpy`` for. As a quick demo:
```python
import numpy as np
import xyzpy as xyz
def some_function(x, y, z):
    return x * np.random.randn(3, 4) + y / z
# Define how to label the function's output
runner_opts = {
    'fn': some_function,
    'var_names': ['output'],
    'var_dims': {'output': ['a', 'b']},
    'var_coords': {'a': [10, 20, 30]},
}
runner = xyz.Runner(**runner_opts)
# set the parameters we want to explore (combos <-> cartesian product)
combos = {
    'x': np.linspace(1, 2, 11),
    'y': np.linspace(2, 3, 21),
    'z': np.linspace(4, 5, 31),
}
# run them
runner.run_combos(combos)
```
Should produce:
```
100%|###################| 7161/7161 [00:00<00:00, 132654.11it/s]
<xarray.Dataset>
Dimensions:  (a: 3, b: 4, x: 11, y: 21, z: 31)
Coordinates:
  * x        (x) float64 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0
  * y        (y) float64 2.0 2.05 2.1 2.15 2.2 2.25 2.3 2.35 2.4 2.45 2.5 ...
  * z        (z) float64 4.0 4.033 4.067 4.1 4.133 4.167 4.2 4.233 4.267 4.3 ...
  * a        (a) int32 10 20 30
Dimensions without coordinates: b
Data variables:
    output   (x, y, z, a, b) float64 0.6942 -0.3348 -0.9156 -0.517 -0.834 ...
```
And there are options for [merging successive, disjoint sets of data](http://xyzpy.readthedocs.io/en/latest/generate.html#aggregating-data-harvester) (``combos2, combos3, ...``) and [parallelizing/distributing the work](http://xyzpy.readthedocs.io/en/latest/gen_parallel.html).
There are also multiple ways to define a function's inputs/outputs (the easiest of which is just to return an ``xr.Dataset``), but do let me know if your use case is beyond them or unclear.","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",,297560256
https://github.com/pydata/xarray/issues/1914#issuecomment-396738702,https://api.github.com/repos/pydata/xarray/issues/1914,396738702,MDEyOklzc3VlQ29tbWVudDM5NjczODcwMg==,1217238,2018-06-12T21:23:09Z,2018-06-12T21:23:09Z,MEMBER,"[xyzpy](http://xyzpy.readthedocs.io) (by @jcmgray) looks like it might be a nice way to solve this problem, e.g., see http://xyzpy.readthedocs.io/en/latest/examples/complex%20output%20example.html
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297560256
https://github.com/pydata/xarray/issues/1914#issuecomment-396737241,https://api.github.com/repos/pydata/xarray/issues/1914,396737241,MDEyOklzc3VlQ29tbWVudDM5NjczNzI0MQ==,6897215,2018-06-12T21:18:18Z,2018-06-12T21:18:18Z,NONE,"[This StackOverflow question](https://stackoverflow.com/questions/40503807/take-a-1d-list-of-results-and-convert-it-to-a-n-d-xarray-dataarray) is related to this ""issue"". ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297560256
https://github.com/pydata/xarray/issues/1914#issuecomment-367677038,https://api.github.com/repos/pydata/xarray/issues/1914,367677038,MDEyOklzc3VlQ29tbWVudDM2NzY3NzAzOA==,10928117,2018-02-22T13:15:11Z,2018-02-22T13:15:11Z,NONE,"@shoyer Thanks for your suggestions and for linking the other issue. I think this one can also be labelled as a ""usage question"".","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297560256
https://github.com/pydata/xarray/issues/1914#issuecomment-367578341,https://api.github.com/repos/pydata/xarray/issues/1914,367578341,MDEyOklzc3VlQ29tbWVudDM2NzU3ODM0MQ==,1217238,2018-02-22T06:13:58Z,2018-02-22T06:13:58Z,MEMBER,"Issue #1773 has brought up a lot of the same questions: https://github.com/pydata/xarray/issues/1773
Clearly, we need better documentation here at the very least.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297560256
https://github.com/pydata/xarray/issues/1914#issuecomment-366884882,https://api.github.com/repos/pydata/xarray/issues/1914,366884882,MDEyOklzc3VlQ29tbWVudDM2Njg4NDg4Mg==,1217238,2018-02-20T07:02:37Z,2018-02-20T07:02:37Z,MEMBER,"`xarray.broadcast()` could also be helpful for generating a cartesian product. Something like `xarray.broadcast(*data.coords.values())` would get you three 3D DataArray objects.
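As a rough sketch of the `xarray.broadcast()` idea (the two small coordinates here are made up for illustration, and the arrays are passed explicitly rather than through `data.coords.values()`), broadcasting two 1-D coordinates gives two 2-D arrays whose paired entries enumerate the product:
```python
import numpy as np
import xarray as xr

data = xr.Dataset(coords={'x': np.linspace(-1, 1, 3), 'y': np.linspace(0, 10, 4)})
# Broadcast the two 1-D coordinates against each other; each result is 2-D.
xx, yy = xr.broadcast(data['x'], data['y'])
print(xx.dims, xx.shape)
# Pair up the broadcast values to enumerate every (x, y) combination.
pairs = list(zip(xx.values.ravel(), yy.values.ravel()))
print(len(pairs))
```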
`apply_ufunc` with `vectorize=True` could also achieve what you're looking for here:
```python
import xarray as xr
import numpy as np
data = xr.Dataset(coords={'x': np.linspace(-1, 1), 'y': np.linspace(0, 10), 'a': 1, 'b': 5})
def some_function(x, y):
    return float(x) * float(y)
xr.apply_ufunc(some_function, data['x'], data['y'], vectorize=True)
```
Results in:
```
<xarray.DataArray (x: 50, y: 50)>
array([[ -0.      ,  -0.204082,  -0.408163, ...,  -9.591837,  -9.795918, -10.      ],
       [ -0.      ,  -0.195752,  -0.391504, ...,  -9.200333,  -9.396085,
         -9.591837],
       [ -0.      ,  -0.187422,  -0.374844, ...,  -8.80883 ,  -8.996252,
         -9.183673],
       ...,
       [  0.      ,   0.187422,   0.374844, ...,   8.80883 ,   8.996252,
          9.183673],
       [  0.      ,   0.195752,   0.391504, ...,   9.200333,   9.396085,
          9.591837],
       [  0.      ,   0.204082,   0.408163, ...,   9.591837,   9.795918,  10.      ]])
Coordinates:
  * x        (x) float64 -1.0 -0.9592 -0.9184 -0.8776 -0.8367 -0.7959 ...
    a        int64 1
    b        int64 5
  * y        (y) float64 0.0 0.2041 0.4082 0.6122 0.8163 1.02 1.224 1.429 ...
```
You can even do this with dask arrays if you set `dask='parallelized'`.
That said, it does feel like there's some missing functionality here for the xarray equivalent of `ndenumerate`. I'm not entirely sure what the right API is, yet.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297560256
https://github.com/pydata/xarray/issues/1914#issuecomment-366833780,https://api.github.com/repos/pydata/xarray/issues/1914,366833780,MDEyOklzc3VlQ29tbWVudDM2NjgzMzc4MA==,10928117,2018-02-20T00:27:36Z,2018-02-20T00:27:36Z,NONE,"After preparing a list similar to ``[{'x': 0, 'y': 'a'}, {'x': 1, 'y': 'a'}, ...]``, interaction with the cluster is quite efficient. One can easily pass such a list to ``map_async`` of ``ipyparallel``.
Thanks for your suggestion, I need to try a few things. I also want to extend it to a function that computes several different things, each of which could be multi-valued, e.g.
```python
def dummy(x, y):
    ds = xr.Dataset(
        {'out1': ('n', [1*x, 2*x, 3*x]), 'out2': ('m', [x, y])},
        coords={'x': x, 'y': y, 'n': range(3), 'm': range(2)}
    )
    return ds
```
and then group such outputs together... OK, I know: I am going from a simple problem to a much more complicated one, but isn't that usually the case?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297560256
https://github.com/pydata/xarray/issues/1914#issuecomment-366825366,https://api.github.com/repos/pydata/xarray/issues/1914,366825366,MDEyOklzc3VlQ29tbWVudDM2NjgyNTM2Ng==,6815844,2018-02-19T23:21:05Z,2018-02-19T23:34:58Z,MEMBER,"I am not sure whether it is efficient for interacting with a cluster, but I often use a `MultiIndex` to make a Cartesian product:
```python
In [1]: import xarray as xr
...: import numpy as np
...: data = xr.DataArray(np.full((3, 4), np.nan), dims=('x', 'y'),
...: coords={'x': [0, 1, 2], 'y': ['a', 'b', 'c', 'd']})
...:
...: data
...:
Out[1]:
array([[ nan, nan, nan, nan],
[ nan, nan, nan, nan],
[ nan, nan, nan, nan]])
Coordinates:
* x (x) int64 0 1 2
* y (y)
array([ nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan])
Coordinates:
* xy (xy) MultiIndex
- x (xy) int64 0 0 0 0 1 1 1 1 2 2 2 2
- y (xy) object 'a' 'b' 'c' 'd' 'a' 'b' 'c' 'd' 'a' 'b' 'c' 'd'
```
For the above example, `data` becomes 1-dimensional with coordinate `xy`, where `xy` is a product of `x` and `y`.
Each entry of `xy` is a tuple of the `x` and `y` values:
```python
In [3]: data1[0]
Out[3]:
array(np.nan)
Coordinates:
xy object (0, 'a')
```
and we can assign a value at given coordinate values with the `loc` indexer:
```python
In [5]: # Assuming we found the result with (1, 'a') is 2.0
...: data1.loc[(1, 'a'), ] = 2.0
In [6]: data1
Out[6]:
array([ nan, nan, nan, nan, 2., nan, nan, nan, nan, nan, nan, nan])
Coordinates:
* xy (xy) MultiIndex
- x (xy) int64 0 0 0 0 1 1 1 1 2 2 2 2
- y (xy) object 'a' 'b' 'c' 'd' 'a' 'b' 'c' 'd' 'a' 'b' 'c' 'd'
```
Note that we need to index with `data1.loc[(1, 'a'), ]` rather than `data1.loc[(1, 'a')]` (the trailing comma inside the brackets is required).
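The step that creates `data1` is not visible in the snippets above. A minimal sketch of the round trip, assuming `data1` came from a call to `DataArray.stack` (an assumption, not something the snippets confirm):
```python
import numpy as np
import xarray as xr

data = xr.DataArray(np.full((3, 4), np.nan), dims=('x', 'y'),
                    coords={'x': [0, 1, 2], 'y': ['a', 'b', 'c', 'd']})
# Collapse x and y into a single dimension indexed by a MultiIndex.
data1 = data.stack(xy=('x', 'y'))
print(data1.dims, data1.sizes['xy'])
# Unstacking restores the original two dimensions.
restored = data1.unstack('xy')
print(restored.dims, restored.shape)
```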
EDIT: I modified my previous comment to take the partial assignment into account.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297560256
https://github.com/pydata/xarray/issues/1914#issuecomment-366819497,https://api.github.com/repos/pydata/xarray/issues/1914,366819497,MDEyOklzc3VlQ29tbWVudDM2NjgxOTQ5Nw==,10928117,2018-02-19T22:40:17Z,2018-02-19T22:58:02Z,NONE,"To ""get it done"" I had, for example, the following (similar to what I linked as my initial attempt):
```python
import itertools as it
import numpy as np
import xarray as xr
coordinates = {
    'x': np.linspace(-1, 1),
    'y': np.linspace(0, 10),
}
constants = {
    'a': 1,
    'b': 5
}
# one dict of keyword arguments per point of the Cartesian product
inps = [{**constants, **{k: v for k, v in zip(coordinates.keys(), x)}}
        for x in list(it.product(*coordinates.values()))]
def f(x, y, a, b):
    """"""Some dummy function.""""""
    v = a * x**2 + b * y**2
    return xr.DataArray(v, {'x': x, 'y': y, 'a': a, 'b': b})
# simulate computation on cluster
values = list(map(lambda s: f(**s), inps))
# gather and unstack the inputs
ds = xr.concat(values, dim='new', coords='all')
ds = ds.set_index(new=list(set(ds.coords) - set(ds.dims)))
ds = ds.unstack('new')
```
It is very close to what you suggest. My main question is whether this can be done better. In particular:
1. Is there a built-in iterator over the Cartesian product of coordinates? If not, would other people also find one useful?
2. Gathering and unstacking the data: my three-line combination of ``concat``, ``set_index`` and ``unstack`` seems to do the trick, but it feels overcomplicated. Ideally I'd expect some mechanism that works similarly to:
```python
inputs = cartesian_product(coordinates) # list similar to ``inps`` above
values = [function(inp) for inp in inputs] # or using ipyparallel map
xarray_data = ... # some empty xarray object
for inp, val in zip(inputs, values):
    xarray_data[inp] = val
```
I asked how to generate the product of coordinates from an xarray object because I expected to be able to create ``xarray_data`` as an empty object with all coordinates set and then fill it.
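One way this fill-as-you-go pattern might look with existing tools is sketched below. It swaps the wished-for ``xarray_data[inp] = val`` for label-based assignment through ``.loc`` with a dict; ``function`` and the small ``coordinates`` are hypothetical stand-ins:
```python
import itertools as it

import numpy as np
import xarray as xr

coordinates = {'x': [0, 1, 2], 'y': ['a', 'b', 'c', 'd']}

def function(x, y):
    # Hypothetical stand-in for the expensive computation.
    return 10.0 * x + 'abcd'.index(y)

# One dict per point of the Cartesian product, e.g. {'x': 0, 'y': 'a'}.
inputs = [dict(zip(coordinates, point)) for point in it.product(*coordinates.values())]
# NaN-filled array that already carries every coordinate.
shape = tuple(len(v) for v in coordinates.values())
xarray_data = xr.DataArray(np.full(shape, np.nan), dims=list(coordinates), coords=coordinates)
# Fill in only the first five points to mimic a partially finished run.
for inp in inputs[:5]:
    xarray_data.loc[inp] = function(**inp)
n_done = int(xarray_data.notnull().sum())
print(n_done, 'of', xarray_data.size, 'points computed')
```
Counting the non-null entries shows how much of the grid has been computed so far.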
---
### Added comment
Starting with an empty object (i.e. one filled with ``nan``s) would have the benefit that one could save partial results and see clearly what has already been computed.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297560256
https://github.com/pydata/xarray/issues/1914#issuecomment-366791162,https://api.github.com/repos/pydata/xarray/issues/1914,366791162,MDEyOklzc3VlQ29tbWVudDM2Njc5MTE2Mg==,5635139,2018-02-19T20:05:53Z,2018-02-19T20:05:53Z,MEMBER,"I _think_ that this shouldn't be too hard to 'get done' but also that xarray may not give you much help natively. (I'm not sure though, so take this as a hopefully helpful contribution rather than a definitive answer.)
Specifically, can you do (2) by generating a product of the coords? Either using numpy, stacking, or some simple Python:
```python
In [3]: list(product(*((data[x].values) for x in data.dims)))
Out[3]:
[(0.287706062977495, 0.065327131503921),
(0.287706062977495, 0.17398282388217068),
(0.287706062977495, 0.1455022501442349),
(0.42398126102299216, 0.065327131503921),
(0.42398126102299216, 0.17398282388217068),
(0.42398126102299216, 0.1455022501442349),
(0.13357153947234057, 0.065327131503921),
(0.13357153947234057, 0.17398282388217068),
(0.13357153947234057, 0.1455022501442349),
(0.42347765161572537, 0.065327131503921),
(0.42347765161572537, 0.17398282388217068),
(0.42347765161572537, 0.1455022501442349)]
```
then distribute those out to a cluster if you need, and then unstack them back into a dataset?
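A minimal sketch of that workflow, under these assumptions: a local thread pool stands in for the cluster, ``f`` and the coordinate values are hypothetical, and a plain reshape is used in place of an xarray unstack:
```python
import itertools as it
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import xarray as xr

coords = {'x': [0.0, 0.5, 1.0], 'y': [1.0, 2.0]}

def f(point):
    # Hypothetical stand-in for the expensive function.
    x, y = point
    return x + 10 * y

points = list(it.product(*coords.values()))
# A local thread pool stands in for the cluster; map() preserves input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    values = list(pool.map(f, points))
# product() varies the last coordinate fastest, which matches a C-order reshape.
shape = tuple(len(v) for v in coords.values())
result = xr.DataArray(np.reshape(values, shape), dims=list(coords), coords=coords)
print(result.dims, result.shape)
```
The reshape relies on the results coming back in the same order as ``points``, so it only applies when the full grid has been computed.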
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297560256
https://github.com/pydata/xarray/issues/1914#issuecomment-366740505,https://api.github.com/repos/pydata/xarray/issues/1914,366740505,MDEyOklzc3VlQ29tbWVudDM2Njc0MDUwNQ==,10928117,2018-02-19T16:20:15Z,2018-02-19T16:23:31Z,NONE,"Let me give a bit of background on what I would like to do:
1. Create an empty ``Dataset`` of coordinates I want to explore, i.e. two np.arrays ``x`` and ``y``, and two scalars ``a`` and ``b``.
2. Generate a list of the Cartesian product of all the coordinates, i.e. ``[ {'x': -1, 'y': 0, 'a': 1, 'b': 5}, ...]`` (the data format doesn't really matter).
3. For each item of the list, compute some function: ``f = f(x, y, a, b)``. In principle this function can be expensive to compute, so I'd compute it for each item of the list from step 2 separately on the cluster.
4. ""Merge"" it all together into a single xarray object.
In principle ``f`` should be allowed to return e.g. ``np.array``.
See the related issue in [holoviews](https://github.com/ioam/holoviews/issues/2341#issuecomment-365925725) and the notebook with my initial [attempt](https://gitlab.kwant-project.org/qt/cookbook/blob/2f03d563342be6f5b85190faac24656457f1647f/xarray_holoviews_gridded.ipynb).
In the linked notebook I managed to achieve the goal, but without starting from an ``xarray`` object containing the coordinates. Also, combining the data seems a bit inefficient, as for larger datasets it takes more time than generating it.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297560256