html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2946#issuecomment-490421774,https://api.github.com/repos/pydata/xarray/issues/2946,490421774,MDEyOklzc3VlQ29tbWVudDQ5MDQyMTc3NA==,10809480,2019-05-08T09:44:25Z,2019-05-08T09:49:02Z,NONE,"Interesting fact I just learned: when you have to process a huge dataset, first export it as a single complete netCDF file, then calculate the aggregation function on that file. It's a workaround; I suppose bottleneck or dask needs the complete dataset first. Mean simply works because the calculation is easy; for std, I think dask or bottleneck treats NaN as zero during the calculation.

```python
data = xr.open_mfdataset(list_to_input_files, parallel=True, concat_dim=""time"")
(...)
data.to_netcdf(""help_netcdf_file.nc"")
data.close()
data = xr.open_dataset(""help_netcdf_file.nc"")
data.mean(...).to_netcdf(""mean_netcdf_file.nc"")
# write std to its own file so it does not overwrite the mean output
data.std(...).to_netcdf(""std_netcdf_file.nc"")
```

This could be problematic for really huge datasets in the TB range.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,441222339
https://github.com/pydata/xarray/issues/2946#issuecomment-490394601,https://api.github.com/repos/pydata/xarray/issues/2946,490394601,MDEyOklzc3VlQ29tbWVudDQ5MDM5NDYwMQ==,10809480,2019-05-08T08:18:21Z,2019-05-08T09:01:56Z,NONE,"Fixed: with a synthetic dataset of the polar region (-60 to -90), the mean calculation is correct and NaNs are ignored. std still looks suspicious. 
```python
import xarray as xr
import glob
import numpy as np

data = xr.open_dataset(r""test.nc"")
data.mean(dim=""time"", skipna=True).to_netcdf(r""mean_test.nc"")
```

```python-traceback
C:\Users\atraumue\AppData\Local\Continuum\anaconda3\lib\site-packages\dask\array\numpy_compat.py:28: RuntimeWarning: invalid value encountered in true_divide
  x = np.divide(x1, x2, out)
```

```python
data.std(dim=""time"", skipna=True, ddof=1).astype(np.float64).to_netcdf(r""std_test.nc"")
```

```python-traceback
C:\Users\atraumue\AppData\Local\Continuum\anaconda3\lib\site-packages\dask\array\reductions.py:386: RuntimeWarning: invalid value encountered in true_divide
  u = total / n
```

Dropbox link to the files: https://www.dropbox.com/sh/yuf114u143mj2l3/AABuQfC5wu4nrWDH4GsGgFyJa?dl=0
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,441222339