Comments on pydata/xarray PR #3258 (issue 484752930), most recent first.

https://github.com/pydata/xarray/pull/3258#issuecomment-529168271 | user 2448579 (MEMBER) | 2019-09-08T04:20:19Z

Closing in favour of #3276

https://github.com/pydata/xarray/pull/3258#issuecomment-527187603 | user 306380 (MEMBER) | 2019-09-02T15:37:18Z

I'm glad to see progress here. FWIW, I think that many people would be quite happy with a version that just worked for DataArrays, in case that's faster to get in than the full solution with Datasets.

https://github.com/pydata/xarray/pull/3258#issuecomment-527186872 | user 2448579 (MEMBER) | 2019-09-02T15:34:21Z

Thanks. That worked. I have a new version up in #3276 that works with both DataArrays and Datasets.

https://github.com/pydata/xarray/pull/3258#issuecomment-526756738 | user 306380 (MEMBER) | 2019-08-30T21:31:49Z

Then you can construct a tuple as a task: `(1, 2, 3)` -> `(tuple, [1, 2, 3])`
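The `(tuple, [1, 2, 3])` trick relies on dask's graph conventions: a tuple whose first element is callable is a task, and keys are substituted inside lists but not inside plain tuples. A minimal sketch, not from the thread (the graph and key names here are purely illustrative):

```python
import dask

# Writing the value as a task (tuple, [...]) lets dask see and substitute
# the keys "x", "y", "z" before calling ``tuple`` on the resulting list,
# so the computed value is the tuple (1, 2, 3).
dsk = {
    "x": 1,
    "y": 2,
    "z": 3,
    "my-tuple": (tuple, ["x", "y", "z"]),
}

print(dask.get(dsk, "my-tuple"))  # (1, 2, 3)
```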
https://github.com/pydata/xarray/pull/3258#issuecomment-526751676 | user 2448579 (MEMBER) | 2019-08-30T21:11:28Z

Thanks @mrocklin. Unfortunately that doesn't work with the Dataset constructor. With a list it treats it as array-like:

```
The following notations are accepted:
- mapping {var name: DataArray}
- mapping {var name: Variable}
- mapping {var name: (dimension name, array-like)}
- mapping {var name: (tuple of dimension names, array-like)}
- mapping {dimension name: array-like} (it will be automatically moved to coords, see below)
```

Unless @shoyer has another idea, I guess I can insert creating a DataArray into the graph and then refer to those keys in the Dataset constructor.

https://github.com/pydata/xarray/pull/3258#issuecomment-525966384 | user 306380 (MEMBER) | 2019-08-28T23:54:48Z

Dask doesn't traverse through tuples to find possible keys, so the keys here are hidden from view:

```python
{'a': (('x', 'y'), ('xarray-a-f178df193efafa67203f3862b3f9f0f4', 0, 0)),
```

I recommend replacing the wrapping tuples with lists:

```diff
- {'a': (('x', 'y'), ('xarray-a-f178df193efafa67203f3862b3f9f0f4', 0, 0)),
+ {'a': [('x', 'y'), ('xarray-a-f178df193efafa67203f3862b3f9f0f4', 0, 0)],
```
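To make the "hidden keys" point concrete, here is a small sketch (my own, not from the thread) using `dask.core.get_dependencies`: a chunk key buried inside a plain tuple is invisible to dask, while the same structure written as a list is traversed.

```python
from dask.core import get_dependencies

# Illustrative graph: "chunk-key" stands in for a key like
# ('xarray-a-...', 0, 0) from the comments above.
dsk = {
    "chunk-key": 1,
    # plain tuple: not a task (first element isn't callable), so dask treats
    # it as a literal and never finds "chunk-key" inside it
    "hidden": (("x", "y"), "chunk-key"),
    # same content as a list: dask traverses it and finds the key
    "visible": [("x", "y"), "chunk-key"],
}

print(get_dependencies(dsk, "hidden"))   # set()
print(get_dependencies(dsk, "visible"))  # {'chunk-key'}
```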
https://github.com/pydata/xarray/pull/3258#issuecomment-525965607 | user 2448579 (MEMBER) | 2019-08-28T23:51:43Z

I started prototyping a Dataset version. Here's what I have:

```python
import dask
import numpy as np
import xarray as xr

darray = xr.DataArray(np.ones((10, 20)),
                      dims=['x', 'y'],
                      coords={'x': np.arange(10), 'y': np.arange(100, 120)})
dset = darray.to_dataset(name='a')
dset['b'] = dset.a + 50
dset['c'] = (dset.x + 20)
dset = dset.chunk({'x': 4, 'y': 5})
```

The function I'm applying takes a Dataset and returns a DataArray because that's easy to test without figuring out how to assemble everything back into a Dataset.

```python
import itertools

# function takes a Dataset and returns a DataArray so that I can check that
# things work without reconstructing a Dataset
def function(ds):
    return ds.a + 10

dataset_dims = list(dset.dims)

graph = {}
gname = 'dsnew'

# map dims to list of chunk indexes
# If different variables have different chunking along the same dim,
# the call to .chunks will raise an error.
ichunk = {dim: range(len(dset.chunks[dim])) for dim in dataset_dims}

# iterate over all possible chunk combinations
for v in itertools.product(*ichunk.values()):
    chunk_index_dict = dict(zip(dataset_dims, v))
    data_vars = {}
    for name, variable in dset.data_vars.items():
        # why does dask_keys have an extra level?
        # the [0] is not required for DataArrays
        var_dask_keys = variable.__dask_keys__()[0]

        # recursively index into the nested dask_keys list
        chunk = var_dask_keys
        for dim in variable.dims:
            chunk = chunk[chunk_index_dict[dim]]

        # I now have the key corresponding to this chunk.
        # This tuple goes into a dictionary passed to xr.Dataset();
        # dask doesn't seem to replace it with a numpy array at execution time.
        data_vars[name] = (variable.dims, chunk)

    graph[(gname,) + v] = (function, (xr.Dataset, data_vars))

final_graph = dask.highlevelgraph.HighLevelGraph.from_collections(name, graph, dependencies=[dset])
```

Elements of the graph look like

```
('dsnew', 0, 0): (<function>, (xarray.core.dataset.Dataset,
    {'a': (('x', 'y'), ('xarray-a-f178df193efafa67203f3862b3f9f0f4', 0, 0)),
     'b': (('x', 'y'), ('xarray-b-e2d8d06bb9e5c1f351671a94816bd331', 0, 0)),
     'c': (('x',), ('xarray-c-d90f8b2af715b53f4c170be391239655', 0))}))
```

This doesn't work because dask doesn't replace the keys by numpy arrays when the `xr.Dataset` call is executed.

```python
result = dask.array.Array(final_graph, name=gname, chunks=dset.a.data.chunks, meta=dset.a.data._meta)
dask.compute(result)
```

```
ValueError: Could not convert tuple of form (dims, data[, attrs, encoding]): (('x', 'y'), ('xarray-a-f178df193efafa67203f3862b3f9f0f4', 0, 0)) to Variable.
```

The graph is "disconnected":

![image](https://user-images.githubusercontent.com/2448579/63900034-c0780000-c9ee-11e9-9b40-22e88f5c6208.png)

I'm not sure what I'm doing wrong here. An equivalent version for DataArrays works perfectly.
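Reading this prototype together with the two suggestions earlier in the thread (keys must sit in a list for dask to see them, but `xr.Dataset` needs a real tuple, so build the tuple as a task), the offending line would presumably become something like the line below. This is my reconstruction, not code from the PR, though it is consistent with the "Thanks. That worked." reply near the top of the thread.

```python
# Presumed fix (reconstruction): build the (dims, data) tuple as a dask task,
# so the chunk key sits in a list that dask traverses, while xr.Dataset still
# receives a real (dims, array) tuple at execution time.
data_vars[name] = (tuple, [variable.dims, chunk])
```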
https://github.com/pydata/xarray/pull/3258#issuecomment-525427300 | user 1217238 (MEMBER) | 2019-08-27T18:30:34Z

> apply_ufunc is extremely powerful, and when you need to cope with all possible shape transformations, I suspect its verbosity is quite necessary.
> It's just that, when all you need to do is apply an elementwise, embarrassingly parallel function (80% of the time in my real-life experience), apply_ufunc is overkill.

Yes, 100% agreed! There is a real need for a simpler version of `apply_ufunc`.

> The thing I have against the name map_blocks is that backends other than dask have no notion of blocks...

I think the functionality in this PR is fundamentally dask-specific. We shouldn't make a habit of adding backend-specific features, but it makes sense in limited cases.

https://github.com/pydata/xarray/pull/3258#issuecomment-525425560 | user 6213168 (MEMBER) | 2019-08-27T18:26:17Z

@shoyer let me rephrase it: apply_ufunc is extremely powerful, and when you need to cope with all possible shape transformations, I suspect its verbosity is quite necessary. It's just that, when all you need to do is apply an elementwise, embarrassingly parallel function (80% of the time in my real-life experience), apply_ufunc is overkill. The thing I have against the name map_blocks is that backends other than dask have no notion of blocks...

https://github.com/pydata/xarray/pull/3258#issuecomment-525384446 | user 1217238 (MEMBER) | 2019-08-27T16:40:32Z

> * could we call it just "map"? It makes sense as this thing would be very useful for non-dask based arrays too. Working routinely with scipy (chiefly with scipy.stats transforms), I quickly tire of writing very verbose `xarray.apply_ufunc` calls.

I agree that `apply_ufunc` is overly verbose. See https://github.com/pydata/xarray/issues/1074 and https://github.com/pydata/xarray/issues/1618 (and issues linked therein) for discussion about alternative APIs.

I still think this particular set of functionality should be called `map_blocks`, because it works by applying functions over each block, very similar to dask's `map_blocks`.

https://github.com/pydata/xarray/pull/3258#issuecomment-525298264 | user 6213168 (MEMBER) | 2019-08-27T13:21:44Z

Hi,

A few design opinions:

1. Could we call it just "map"? It makes sense, as this thing would be very useful for non-dask based arrays too. Working routinely with scipy (chiefly with scipy.stats transforms), I quickly tire of writing very verbose `xarray.apply_ufunc` calls.
2. Could we have it as a method of DataArray and Dataset, to allow for method chaining? e.g.

```python
myarray.map(func1).chunk().map(func2).sum().compute()
```
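For context on the verbosity complaint, here is an illustrative comparison (my own sketch, not from the thread): applying an elementwise scipy function to a chunked DataArray through `apply_ufunc` today, versus the kind of one-liner this PR is aiming at (the name and signature were still under discussion at this point).

```python
import numpy as np
import scipy.stats
import xarray as xr

da = xr.DataArray(np.random.rand(10, 20), dims=["x", "y"]).chunk({"x": 5})

# Today: even a purely elementwise, embarrassingly parallel transform goes
# through apply_ufunc, with the dask mode and output dtype spelled out.
cdf = xr.apply_ufunc(
    scipy.stats.norm.cdf,
    da,
    dask="parallelized",
    output_dtypes=[float],
)

# The kind of call being proposed in this thread (hypothetical spelling;
# "map" vs "map_blocks" is exactly what the comments above are debating):
# cdf = da.map_blocks(scipy.stats.norm.cdf)
```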