id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
294380515,MDU6SXNzdWUyOTQzODA1MTU=,1888,Getting DataArrays from netCDF4 files correctly and without hassle,601177,open,0,,,4,2018-02-05T12:45:42Z,2020-12-06T18:07:47Z,,NONE,,,,"#### Context

Consider a netCDF4 file with a group structure. For example, the following toy:

```python
import netCDF4 as nc

# netCDF4 file
f = nc.Dataset('simple_hierarchy.nc', 'w')
# coordinates in root
f.createDimension('x', 3)
f.createVariable('x', 'f4', ('x',), fill_value=False)
f['x'][:] = [1.1, 2.2, 3.3]
f.createDimension('y', 2)
f.createVariable('y', 'f4', ('y',), fill_value=False)
f['y'][:] = [-0.9, -1.8]
# variables in root
f.createVariable('u', 'i1', (), fill_value=False)
f.createVariable('v', 'u1', ('x','y'), fill_value=False)
# group
f.createGroup('g')
g = f['g']
# new/modified coordinates in g
g.createDimension('y', 3)
g.createVariable('y', 'f4', ('y',), fill_value=False)
g['y'][:] = [-0.9, -1.8, -2.7]
# variable in g
g.createVariable('w', 'u1', ('x', 'y'), fill_value=False)

f.close()
```

#### Current behavior

1. It is currently a hassle to get a DataArray from a variable in a group with multiple non-coordinate variables:

```python
>>> import xarray as xr
>>> xr.open_dataarray('simple_hierarchy.nc')
…
ValueError: Given file dataset contains more than one data variable. Please read with xarray.open_dataset and then select the variable you want.
>>> xr.open_dataarray('simple_hierarchy.nc', group='v')
…
OSError: [Errno group not found: v] 'v'
>>> xr.open_dataarray('simple_hierarchy.nc', drop_variables='u')
array([[120, 219],
       [178, 172],
       [  9, 127]], dtype=uint8)
Coordinates:
  * x        (x) float32 1.1 2.2 3.3
  * y        (y) float32 -0.9 -1.8
```

2. Also, coordinates defined at a group level closer to the root are not taken into account:

```python
>>> xr.open_dataarray('simple_hierarchy.nc', group='g')
array([[216, 219, 178],
       [172,   9, 127],
       [  0,   0,  64]], dtype=uint8)
Coordinates:
  * y        (y) float32 -0.9 -1.8 -2.7
Dimensions without coordinates: x
```

So the DataArray is not loaded correctly, as some of its defining coordinates are missing.

#### Suggested behavior

1. Add a `variable` kwarg in the `open_dataarray` method:

```python
>>> xr.open_dataarray('simple_hierarchy.nc', variable='v')
array([[120, 219],
       [178, 172],
       [  9, 127]], dtype=uint8)
Coordinates:
  * x        (x) float32 1.1 2.2 3.3
  * y        (y) float32 -0.9 -1.8
```

2. Have the function that loads variables go up the group hierarchy to see if some coordinate arrays can be found for dimensions lacking them within this group:

```python
>>> xr.open_dataarray('simple_hierarchy.nc', group='g')
array([[216, 219, 178],
       [172,   9, 127],
       [  0,   0,  64]], dtype=uint8)
Coordinates:
  * x        (x) float32 1.1 2.2 3.3
  * y        (y) float32 -0.9 -1.8 -2.7
```

I guess care needs to be taken as well upon writing to netCDF, to make sure no spurious dimension/coordinate definitions are added.

#### Version

xarray 0.9.6","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1888/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue