html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1823#issuecomment-768600657,https://api.github.com/repos/pydata/xarray/issues/1823,768600657,MDEyOklzc3VlQ29tbWVudDc2ODYwMDY1Nw==,9200184,2021-01-27T21:51:24Z,2021-01-27T21:52:11Z,CONTRIBUTOR,"> PS @rabernat > > ``` > %%time > ds = xr.open_mfdataset(""/glade/p/cesm/community/ASD-HIGH-RES-CESM1/hybrid_v5_rel04_BC5_ne120_t12_pop62/ocn/proc/tseries/monthly/*.nc"", > parallel=True, coords=""minimal"", data_vars=""minimal"", compat='override') > ``` > > This completes in 40 seconds with 10 workers on cheyenne. @dcherian, thanks for your solution. In my experience with 34013 NetCDF files, I could open 117 GiB in 13min 14s. Can I decrease this time?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,288184220
https://github.com/pydata/xarray/issues/1823#issuecomment-531945252,https://api.github.com/repos/pydata/xarray/issues/1823,531945252,MDEyOklzc3VlQ29tbWVudDUzMTk0NTI1Mg==,14314623,2019-09-16T20:29:35Z,2019-09-16T20:29:35Z,CONTRIBUTOR,Wooooow. Thanks. I'll have to give this a whirl soon.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,288184220
https://github.com/pydata/xarray/issues/1823#issuecomment-489064553,https://api.github.com/repos/pydata/xarray/issues/1823,489064553,MDEyOklzc3VlQ29tbWVudDQ4OTA2NDU1Mw==,3404817,2019-05-03T11:26:06Z,2019-05-03T11:36:44Z,CONTRIBUTOR,"The original issue of this thread is that you sometimes might want to *disable* alignment checks for coordinates other than the `concat_dim` and only check for same dimensions and dimension shapes. 
When you `xr.merge` with `join='exact'`, it still checks for alignment (see https://github.com/pydata/xarray/pull/1330#issuecomment-302711852), but does not join the coordinates if they are not aligned. This behavior (not joining) is also included in what @rabernat envisioned here, but his suggestion goes beyond that: you don't even load coordinate values from all but the first dataset and just blindly trust that they are aligned. So `xr.open_mfdataset(join='exact', coords='minimal')` does not fix this issue here, I think.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,288184220
https://github.com/pydata/xarray/issues/1823#issuecomment-373123959,https://api.github.com/repos/pydata/xarray/issues/1823,373123959,MDEyOklzc3VlQ29tbWVudDM3MzEyMzk1OQ==,14314623,2018-03-14T18:16:38Z,2018-03-14T18:16:38Z,CONTRIBUTOR,"Awesome, thanks for the clarification. I just looked at #1981 and it seems indeed very elegant (in fact I just now used this approach to parallelize printing of movie frames!) Thanks for that! ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,288184220
https://github.com/pydata/xarray/issues/1823#issuecomment-372856076,https://api.github.com/repos/pydata/xarray/issues/1823,372856076,MDEyOklzc3VlQ29tbWVudDM3Mjg1NjA3Ng==,14314623,2018-03-13T23:40:54Z,2018-03-13T23:40:54Z,CONTRIBUTOR,"Would these two options be necessarily mutually exclusive? I think parallelizing the read in sounds amazing. But isn't there some merit in skipping some of the checks altogether, if the user is sure about the structure of the data contained in the many files? I am often working with the aforementioned type of data (many files either contain a new timestep or a different variable, but most of the dimensions/coordinates are the same). 
In some cases I am finding that reading the data ""lazily"" consumes a significant amount of the time in my workflow. I am unsure how hard this would be to achieve, and perhaps it is not worth it after all. Just putting out a few ideas, while I wait for my `xr.open_mfdataset` to finish :-)","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 1, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,288184220
https://github.com/pydata/xarray/issues/1823#issuecomment-359069753,https://api.github.com/repos/pydata/xarray/issues/1823,359069753,MDEyOklzc3VlQ29tbWVudDM1OTA2OTc1Mw==,14314623,2018-01-19T19:45:00Z,2018-01-19T19:45:00Z,CONTRIBUTOR,"I did not really find an elegant solution. What I did was just specify all dims and coords as `drop_variables` and then update those from a master file with ``` ds.update(ds_master) ``` Perhaps this could be generalized in a sense, by reading all coords and dims just from the first file. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,288184220