id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1288323549,I_kwDOAMm_X85MykHd,6736,better handling of invalid files in open_mfdataset,731499,open,0,,,2,2022-06-29T08:00:18Z,2023-07-09T23:49:36Z,,CONTRIBUTOR,,,,"### Is your feature request related to a problem? Suppose I'm trying to read a large number of netCDF files with ```open_mfdataset```. Now suppose that one of those files is for some reason incorrect -- for instance there was a problem during the creation of that particular file, and its file size is zero, or it is not valid netCDF. The file exists, but it is invalid. Currently ```open_mfdataset``` will raise an exception with the message ```ValueError: did not find a match in any of xarray's currently installed IO backends``` As far as I can tell, there is currently no way to identify which one(s) of the files being read is the source of the problem. If there are several hundreds of those, finding the problematic files is a task by itself, even though xarray probably knows them. ### Describe the solution you'd like It would be most useful to this particular user if the error message could somehow identify the file(s) responsible for the exception. Apart from better reporting, I would find it very useful if I could pass to ```open_mfdataset``` some kind of argument that would make it ignore invalid files altogether (```ignore_invalid=False``` comes to mind). 
### Describe alternatives you've considered _No response_ ### Additional context _No response_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6736/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
205414496,MDU6SXNzdWUyMDU0MTQ0OTY=,1249,confusing dataset creation process,731499,open,0,,,6,2017-02-05T09:52:44Z,2022-06-26T15:07:59Z,,CONTRIBUTOR,,,,"In another issue I create a simple dataset like so: ```python lat = np.random.rand(50000) * 180 - 90 lon = np.random.rand(50000) * 360 - 180 d = xr.Dataset({'latitude':lat, 'longitude':lon}) ``` I expected `d` to contain two variables (`latitude` and `longitude`) with no coordinates. Instead `d` appears to contain two coordinates and no variables: ``` In [5]: d Out[5]: Dimensions: (latitude: 50000, longitude: 50000) Coordinates: * latitude (latitude) float64 -76.0 -84.36 26.69 66.44 -37.85 50.13 ... * longitude (longitude) float64 -148.7 -74.82 18.37 117.7 80.63 12.25 ... Data variables: *empty* ``` Is this desired behavior?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1249/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue