html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2662#issuecomment-454439392,https://api.github.com/repos/pydata/xarray/issues/2662,454439392,MDEyOklzc3VlQ29tbWVudDQ1NDQzOTM5Mg==,22245117,2019-01-15T15:45:03Z,2019-01-15T15:45:03Z,CONTRIBUTOR,I checked PR #2678 with the data that originated the issue and it fixes the problem! ,"{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,397063221
https://github.com/pydata/xarray/issues/2662#issuecomment-454086847,https://api.github.com/repos/pydata/xarray/issues/2662,454086847,MDEyOklzc3VlQ29tbWVudDQ1NDA4Njg0Nw==,22245117,2019-01-14T17:20:03Z,2019-01-14T17:20:03Z,CONTRIBUTOR,"I've created a little script to reproduce the problem.
@TomNicholas it looks like the datasets are opened correctly. The problem arises when `open_mfdataset` calls `_auto_combine`. Indeed, `_auto_combine` was introduced in v0.11.1.
```python
import numpy as np
import xarray as xr
import os
Tsize = 100; T = np.arange(Tsize)
Xsize = 900; X = np.arange(Xsize)
Ysize = 800; Y = np.arange(Ysize)
data = np.random.randn(Tsize, Xsize, Ysize)
for i in range(2):
    # Create 2 datasets with different variables
    dsA = xr.Dataset({'A': xr.DataArray(data, coords={'T': T+i*Tsize}, dims=('T', 'X', 'Y'))})
    dsB = xr.Dataset({'B': xr.DataArray(data, coords={'T': T+i*Tsize}, dims=('T', 'X', 'Y'))})
    # Save datasets in one folder
    dsA.to_netcdf('dsA'+str(i)+'.nc')
    dsB.to_netcdf('dsB'+str(i)+'.nc')
    # Save datasets in two folders
    dirname = 'rep'+str(i)
    os.mkdir(dirname)
    dsA.to_netcdf(dirname+'/'+'dsA'+str(i)+'.nc')
    dsB.to_netcdf(dirname+'/'+'dsB'+str(i)+'.nc')
```
### Fast if netCDFs are stored in one folder:
```python
%%time
ds_1folder = xr.open_mfdataset('*.nc', concat_dim='T')
```
CPU times: user 49.9 ms, sys: 5.06 ms, total: 55 ms
Wall time: 59.7 ms
### Slow if netCDFs are stored in several folders:
```python
%%time
ds_2folders = xr.open_mfdataset('rep*/*.nc', concat_dim='T')
```
CPU times: user 8.6 s, sys: 5.95 s, total: 14.6 s
Wall time: 10.3 s
### Fast if files containing different variables are opened separately, then merged:
```python
%%time
ds_A = xr.open_mfdataset('rep*/dsA*.nc', concat_dim='T')
ds_B = xr.open_mfdataset('rep*/dsB*.nc', concat_dim='T')
ds_merged = xr.merge([ds_A, ds_B])
```
CPU times: user 33.8 ms, sys: 3.7 ms, total: 37.5 ms
Wall time: 34.5 ms
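### Possible generalization of the last workaround:
The open-separately-then-merge pattern above could be wrapped in a small helper. This is only a sketch: `group_by_variable` and `open_by_variable` are hypothetical names, and it assumes each file name is the variable prefix followed by a number (`dsA0.nc`, `dsB1.nc`, ...), as in the script above. The `concat_dim` call mirrors the one used above and may need adjusting on other xarray versions.
```python
import os
import re
from collections import defaultdict


def group_by_variable(paths):
    # Hypothetical helper: group paths by file-name stem with trailing
    # digits removed, so 'rep0/dsA0.nc' and 'rep1/dsA1.nc' share key 'dsA'.
    groups = defaultdict(list)
    for path in paths:
        stem = os.path.splitext(os.path.basename(path))[0]
        groups[re.sub(r'\d+$', '', stem)].append(path)
    # Plain lexicographic sort; enough for the two-folder example above.
    return {key: sorted(members) for key, members in groups.items()}


def open_by_variable(paths, concat_dim='T'):
    # Imported here so the grouping helper can be used without xarray.
    import xarray as xr

    # One open_mfdataset call per variable group, then a single merge,
    # i.e. the same steps as the manual workaround above.
    parts = [xr.open_mfdataset(members, concat_dim=concat_dim)
             for members in group_by_variable(paths).values()]
    return xr.merge(parts)
```
With the files created above, `open_by_variable(glob.glob('rep*/*.nc'))` (after `import glob`) should follow the same path as the manual version; I have not timed it separately.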
","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",,397063221