id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
309100522,MDU6SXNzdWUzMDkxMDA1MjI=,2018,MemoryError when using save_mfdataset(),1117224,closed,0,,,1,2018-03-27T19:22:28Z,2020-03-28T07:51:17Z,2020-03-28T07:51:17Z,NONE,,,,"#### Code Sample, a copy-pastable example if possible

```python
import xarray as xr
import dask

# Dummy data that on disk is about ~200GB
da = xr.DataArray(dask.array.random.normal(0, 1,
                                           size=(12, 408, 1367, 304, 448),
                                           chunks=(1, 1, 1, 304, 448)),
                  dims=('ensemble', 'init_time', 'fore_time', 'x', 'y'))

# Perform some calculation on the dask data
da_sum = da.sum(dim='x').sum(dim='y') * (25 * 25) / (10**6)

# Write to multiple files
c_e, datasets = zip(*da_sum.to_dataset(name='sic').groupby('ensemble'))
paths = ['file_%s.nc' % e for e in c_e]
xr.save_mfdataset(datasets, paths)
```

#### Problem description

Results in a MemoryError, when dask should be able to write this out-of-memory DataArray to multiple netCDF files that each fit within memory. [Related SO post here](https://stackoverflow.com/questions/49501206/killed-trying-to-use-save-mfdataset?noredirect=1#comment86015596_49501206)

#### Expected Output

12 netcdf files (grouped by the ensemble dim).

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.14.12
machine: x86_64
processor:
byteorder: little
LC_ALL: C
LANG: C
LOCALE: None.None

xarray: 0.10.2
pandas: 0.22.0
numpy: 1.14.1
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.1
distributed: 1.21.1
matplotlib: 2.2.2
cartopy: None
seaborn: 0.8.1
setuptools: 38.5.1
pip: 9.0.1
conda: None
pytest: None
IPython: 6.2.1
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2018/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
186326698,MDExOlB1bGxSZXF1ZXN0OTE2Mzk0OTY=,1070,Feature/rasterio,1117224,closed,0,,,11,2016-10-31T16:14:55Z,2017-05-22T08:47:40Z,2017-05-22T08:47:40Z,NONE,,0,pydata/xarray/pulls/1070,"@jhamman started a backend for RasterIO that I have been working on. There are two issues I am stuck on that I could use some help with:

1) Lat/long coords are not being decoded correctly (they are missing from the output dataset). The lat/lon projection is correctly calculated and added here (https://github.com/NicWayand/xray/blob/feature/rasterio/xarray/backends/rasterio_.py#L117). But, it appears (with my limited knowledge of xarray) that the lat/long coords contained within `obj` are lost at this line (https://github.com/NicWayand/xray/blob/feature/rasterio/xarray/conventions.py#L930).

2) Lazy-loading needs to be enabled. How can I set up/test this? Are there examples from other backends I could follow?
#790 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1070/reactions"", ""total_count"": 4, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
170688064,MDExOlB1bGxSZXF1ZXN0ODA5ODgxNzA=,961,Update time-series.rst,1117224,closed,0,,,3,2016-08-11T16:26:58Z,2017-04-03T05:31:06Z,2017-04-03T05:31:06Z,NONE,,0,pydata/xarray/pulls/961,"Thought it would be helpful for users to know that timezones are not handled here, rather than googling and finding this: https://github.com/pydata/xarray/issues/552","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/961/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
171504099,MDU6SXNzdWUxNzE1MDQwOTk=,970,Multiple preprocessing functions in open_mfdataset?,1117224,closed,0,,,3,2016-08-16T20:01:22Z,2016-08-17T07:01:02Z,2016-08-16T21:46:43Z,NONE,,,,"I would like to have multiple functions applied during an open_mfdataset call. Using one works great:

``` Python
ds = xr.open_mfdataset(files, concat_dim='time', engine='pynio',
                       preprocess=lambda x: x.load())
```

Does the current behavior support multiple calls? (Apologies if this is documented somewhere; I couldn't find any examples with multiple calls.) Something like:

``` Python
ds = xr.open_mfdataset(files, concat_dim='time', engine='pynio',
                       preprocess=[lambda x: x.load(), lambda y: y['time']=100])
```
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/970/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue