id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
102177256,MDU6SXNzdWUxMDIxNzcyNTY=,542,issue with xray.open_mfdataset and binary operations,1177508,closed,0,,,5,2015-08-20T16:28:12Z,2015-09-03T08:41:00Z,2015-09-01T22:05:02Z,NONE,,,,"example:

``` python
with xray.open_mfdataset(...) as ds:
    a = ds['x'] * ds['y']
```

gives:

`NotImplementedError: Dask.array operations only work on dask arrays, not numpy arrays.`

If I do `ds.load()` first then all is good... I guess this is an `xray` issue, not `dask`. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/542/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
91184107,MDU6SXNzdWU5MTE4NDEwNw==,444,segmentation fault with `open_mfdataset`,1177508,closed,0,,,26,2015-06-26T07:57:58Z,2015-07-16T21:40:22Z,2015-07-16T21:40:22Z,NONE,,,,"This is super strange. Does anyone have any idea why a segmentation fault might be happening here?

```
Python 3.4.3 (default, Jun 26 2015, 00:02:21)
[GCC 4.3.4 [gcc-4_3-branch revision 152973]] on linux
Type ""help"", ""copyright"", ""credits"" or ""license"" for more information.
>>> import xray
>>> xray.open_mfdataset('2*.nc', concat_dim='time')
Traceback (most recent call last):
  File ""<stdin>"", line 1, in <module>
  File ""/ichec/home/users/razvan/.local/lib/python3.4/site-packages/xray/backends/api.py"", line 205, in open_mfdataset
Segmentation fault (core dumped)
```

I say it's strange because I ended up tracking down the bug to `xray.core.ops.array_equiv`. I have no idea what's going on, but by mistake I found out that if I introduce `isnull(arr1 & arr2)` just before the `return` statement then I don't get the error any more...
So my `xray.core.ops.array_equiv` is now:

```
def array_equiv(arr1, arr2):
    """"""Like np.array_equal, but also allows values to be NaN in both arrays
    """"""
    arr1, arr2 = as_like_arrays(arr1, arr2)
    if arr1.shape != arr2.shape:
        return False
    # segmentation fault if we don't call this here...
    isnull(arr1 & arr2)
    return bool(((arr1 == arr2) | (isnull(arr1) & isnull(arr2))).all())
```

Thanks... ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/444/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
94030797,MDU6SXNzdWU5NDAzMDc5Nw==,458,groupby / apply and dask (`open_mfdataset`),1177508,closed,0,,,2,2015-07-09T12:10:04Z,2015-07-15T00:02:40Z,2015-07-15T00:02:34Z,NONE,,,,"It seems that, when working with `open_mfdataset`, things are not consistent. Trying the following:

``` python
import numpy as np
import xray

a = xray.open_mfdataset('*.nc', concat_dim='time', preprocess=lambda x: x.assign_coords(agl=('mean_height_agl', range(x.dims['mean_height_agl']))).swap_dims({'mean_height_agl': 'agl'}).squeeze('time'))
a.groupby('time').apply(np.sum)
```

gives an error with a huge traceback that ends with:

``` python
IndexError: Exception in remote process
tuple index out of range

Traceback:
  File ""/home/razvan/.local/lib/python3.4/site-packages/dask/async.py"", line 260, in execute_task
    result = _execute_task(task, data)
  File ""/home/razvan/.local/lib/python3.4/site-packages/dask/async.py"", line 243, in _execute_task
    return func(*args2)
  File ""/home/razvan/.local/lib/python3.4/site-packages/toolz/functoolz.py"", line 378, in __call__
    ret = fns[0](*args, **kwargs)
  File ""/home/razvan/.local/lib/python3.4/site-packages/dask/array/core.py"", line 377, in _concatenate2
```

but if I do:

``` python
a.load()
a.groupby('time').apply(np.sum)
```

there's no error.
The files I'm using for this are at [this dropbox place](https://www.dropbox.com/sh/wnyana6rgefevmg/AAAzmm37L34A7xKTFbyiy3Aoa?dl=0). ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/458/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
91109966,MDU6SXNzdWU5MTEwOTk2Ng==,443,multiple files - variable X not equal across datasets,1177508,closed,0,,,9,2015-06-26T00:18:21Z,2015-06-29T18:06:54Z,2015-06-29T18:06:54Z,NONE,,,,"The other day I was playing with `xray.open_mfdataset` and I noticed you can get this error when opening multiple files at the same time. I think there is a pretty easy solution to this:

``` python
import glob as g
from toolz.curried import curry, map, pipe
import xray


def get_ds(glob):
    def _get_ds(file_path):
        dim = 'mean_height_agl'
        dim_new = 'agl'
        with xray.open_dataset(file_path) as _ds:
            _ds.load()
            return (_ds.assign_coords(**{dim_new: (dim, range(_ds.coords[dim].size))})
                    .swap_dims({dim: dim_new}))
    return pipe(g.glob(glob), sorted, map(_get_ds), curry(xray.concat)(dim='time'))
```

Of course, this is for a particular variable I was having trouble with, but the idea is to swap dimensions, that is, create a dummy dimension with the same length as the troublesome variable and then swap the two. This can be done for any number of troublesome variables. I don't know how feasible this is though. Just thought to share my idea... ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/443/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue