**Issue #4482: Allow skipna in `.dot()`** (pydata/xarray, open, 13 comments, opened 2020-10-02, updated 2020-10-20)

**Is your feature request related to a problem? Please describe.**
Right now there's no efficient way to do a dot product that skips over NaN elements.

**Describe the solution you'd like**
I want to be able to treat the summation in `dot` as a `nansum`, controlled by a `skipna` option. Either this can be implemented directly, or an additional ufunc, `xarray.ufuncs.nan_to_num`, can be added and called on the inputs to `dot`. Unfortunately, using numpy's `nan_to_num` initiates eager execution.

**Describe alternatives you've considered**
It's possible to implement this by hand, but it ends up being extremely inefficient in one of my use cases:

- `(x*y).sum('dot_prod_dim', skipna=True)` takes 30 seconds
- `x.dot(y)` takes 1 second

**Issue #4474: Implement rolling_exp for dask arrays** (pydata/xarray, open, 7 comments, 1 👍, opened 2020-09-30, updated 2020-10-15)

**Is your feature request related to a problem? Please describe.**
I use dask-based chunking on my arrays regularly and would like to leverage the efficient `numbagg` implementation of `move_exp_nanmean()` with `rolling_exp()`.

**Describe the solution you'd like**
It's possible to compute a rolling exponential mean as a function of the rolling exponential means of contiguous, non-overlapping subsets (chunks). You just need to first "un-normalize" the rolling means of each chunk in order to split them into their corresponding numerators and denominators (see the `ewm` definition [here](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.ewm.html) under `adjust=True`). The normalization factor (denominator) to multiply back into the chunk's `move_exp_nanmean()` to un-normalize it (yielding the numerator) is just the `move_exp_nanmean()` of 1's, replaced with NAs wherever the underlying data was also NA. Then, scale each chunk's numerator and denominator series down according to how many "lags ago" the chunk was, sum the rescaled numerators and denominators across chunks, and finally divide the summed numerators by the summed denominators.

**Describe alternatives you've considered**
I implemented my own inefficient weighted rolling mean using xarray's `rolling()`. This requires a bunch of duplicate computation as the window gets shifted.
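For the `.dot()` request in #4482 above, a lazy workaround exists today: zero-fill both operands before the contraction, so NaN terms drop out of the summation without forcing computation. A minimal sketch, assuming dask-backed `DataArray` inputs (the function name is illustrative, not xarray API; positions where every contributing pair is NaN come out as 0 rather than NaN, matching `nansum` semantics):

```python
import xarray as xr

def nan_skipping_dot(x, y, dims):
    # Zero out NaNs so they contribute nothing to the summation.
    # Unlike np.nan_to_num, fillna() stays lazy on dask-backed arrays.
    return x.fillna(0.0).dot(y.fillna(0.0), dims=dims)
```

For the chunked `rolling_exp` decomposition in #4474, here is a minimal NumPy sketch of the numerator/denominator idea under `adjust=True`-style weights: each chunk's exponentially weighted sums are seeded with the totals carried over from earlier chunks, and every step scales the carry down by one decay factor, so concatenating the per-chunk results reproduces the unchunked mean. (The sketch carries state sequentially for clarity, where a dask implementation would rescale and combine the per-chunk series in parallel as the issue describes; `ewm_chunk` and the `alpha` parametrization are assumptions, not numbagg API.)

```python
import numpy as np

def ewm_chunk(x, alpha, carry_num=0.0, carry_den=0.0):
    """Exponentially weighted mean of one chunk, seeded with the
    numerator/denominator totals carried over from earlier chunks."""
    w = 1.0 - alpha
    num = np.empty(len(x))
    den = np.empty(len(x))
    for i, xi in enumerate(x):
        carry_num *= w          # every existing weight decays one step
        carry_den *= w
        if not np.isnan(xi):    # NaNs contribute neither value nor weight
            carry_num += xi
            carry_den += 1.0
        num[i], den[i] = carry_num, carry_den
    return num / den, carry_num, carry_den

# Chaining the carries across chunks matches the unchunked result.
x = np.array([1.0, 2.0, np.nan, 4.0, 5.0, 6.0])
parts, cn, cd = [], 0.0, 0.0
for chunk in np.array_split(x, 3):
    mean, cn, cd = ewm_chunk(chunk, 0.5, cn, cd)
    parts.append(mean)
full, _, _ = ewm_chunk(x, 0.5)
assert np.allclose(np.concatenate(parts), full)
```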
**Issue #4475: Preprocess function for save_mfdataset** (pydata/xarray, open, 9 comments, opened 2020-09-30, updated 2020-10-15)

**Is your feature request related to a problem? Please describe.**
I would like to supply a `preprocess` argument to `save_mfdataset` that gets applied to each dataset before it is written to disk, similar to the option `open_mfdataset` already offers.

Specifically, I have a dataset that I want to split by unique values along a dimension, apply some further logic to each sub-dataset, and then save each sub-dataset to a different file. Currently I'm able to split and save using the following code from the API docs:

```
years, datasets = zip(*ds.groupby("time.year"))
paths = ["%s.nc" % y for y in years]
xr.save_mfdataset(datasets, paths)
```

What's missing is the ability to apply further logic to each of the sub-datasets given by the groupby object. If I try iterating through `datasets` and chaining further operations onto each element, the calculations begin to execute serially even though `ds` is dask-backed:

`save_mfdataset([ds.foo() for ds in datasets], paths)`

**Describe the solution you'd like**
Instead, I'd like the ability to do:

`xr.save_mfdataset(datasets, paths, preprocess=lambda ds: ds.foo())`

**Describe alternatives you've considered**
Not sure.
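A pattern that may avoid the serial execution described in #4475, assuming the per-group step itself stays lazy: apply the function to each sub-dataset, request delayed writes with `compute=False` (which `save_mfdataset` supports for dask-backed datasets), and trigger them together. This doubles as a sketch of the requested `preprocess` signature; the wrapper name is illustrative, and `ds.foo()` is the issue's own placeholder:

```python
import xarray as xr

def save_mfdataset_preprocessed(datasets, paths, preprocess=None, **kwargs):
    # Sketch of the requested API: transform each sub-dataset before writing.
    # With dask-backed data this only builds graphs; computing the delayed
    # object triggers all writes in one pass rather than one file at a time.
    if preprocess is not None:
        datasets = [preprocess(ds) for ds in datasets]
    delayed = xr.save_mfdataset(datasets, paths, compute=False, **kwargs)
    delayed.compute()

# e.g. save_mfdataset_preprocessed(datasets, paths,
#                                  preprocess=lambda ds: ds.foo())
```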