issues: 294380515

| field | value |
| --- | --- |
| id | 294380515 |
| node_id | MDU6SXNzdWUyOTQzODA1MTU= |
| number | 1888 |
| title | Getting DataArrays from netCDF4 files correctly and without hassle |
| user | 601177 |
| state | open |
| locked | 0 |
| assignee | |
| milestone | |
| comments | 4 |
| created_at | 2018-02-05T12:45:42Z |
| updated_at | 2020-12-06T18:07:47Z |
| closed_at | |
| author_association | NONE |

Context

Consider a netCDF4 file with a group structure. For example, the following toy:

```python
import netCDF4 as nc

# netCDF4 file
f = nc.Dataset('simple_hierarchy.nc', 'w')

# coordinates in root
f.createDimension('x', 3)
f.createVariable('x', 'f4', ('x',), fill_value=False)
f['x'][:] = [1.1, 2.2, 3.3]
f.createDimension('y', 2)
f.createVariable('y', 'f4', ('y',), fill_value=False)
f['y'][:] = [-0.9, -1.8]

# variables in root
f.createVariable('u', 'i1', (), fill_value=False)
f.createVariable('v', 'u1', ('x','y'), fill_value=False)

# group
f.createGroup('g')
g = f['g']

# new/modified coordinates in g
g.createDimension('y', 3)
g.createVariable('y', 'f4', ('y',), fill_value=False)
g['y'][:] = [-0.9, -1.8, -2.7]

# variable in g
g.createVariable('w', 'u1', ('x', 'y'), fill_value=False)
f.close()
```

Current behavior

  1. It is currently a hassle to get a DataArray from a variable in a group with multiple non-coordinate variables:

     ```python
     >>> xr.open_dataarray('simple_hierarchy.nc')
     …
     ValueError: Given file dataset contains more than one data variable. Please read with xarray.open_dataset and then select the variable you want.
     >>> xr.open_dataarray('simple_hierarchy.nc', group='v')
     …
     OSError: [Errno group not found: v] 'v'
     >>> xr.open_dataarray('simple_hierarchy.nc', drop_variables='u')
     <xarray.DataArray 'v' (x: 3, y: 2)>
     array([[120, 219],
            [178, 172],
            [  9, 127]], dtype=uint8)
     Coordinates:
       * x        (x) float32 1.1 2.2 3.3
       * y        (y) float32 -0.9 -1.8
     ```
  2. Also, coordinates defined at a group level closer to the root are not taken into account:

     ```python
     >>> xr.open_dataarray('simple_hierarchy.nc', group='g')
     <xarray.DataArray 'w' (x: 3, y: 3)>
     array([[216, 219, 178],
            [172,   9, 127],
            [  0,   0,  64]], dtype=uint8)
     Coordinates:
       * y        (y) float32 -0.9 -1.8 -2.7
     Dimensions without coordinates: x
     ```

     So the DataArray is not loaded correctly, because some of its defining coordinates are missing.

Suggested behavior

  1. Add a `variable` kwarg in the `open_dataarray` method:

     ```python
     >>> xr.open_dataarray('simple_hierarchy.nc', variable='v')
     <xarray.DataArray 'v' (x: 3, y: 2)>
     array([[120, 219],
            [178, 172],
            [  9, 127]], dtype=uint8)
     Coordinates:
       * x        (x) float32 1.1 2.2 3.3
       * y        (y) float32 -0.9 -1.8
     ```
  2. Have the function that loads variables go up the group hierarchy to see if coordinate arrays can be found for dimensions that lack them within this group:

     ```python
     >>> xr.open_dataarray('simple_hierarchy.nc', group='g')
     <xarray.DataArray 'w' (x: 3, y: 3)>
     array([[216, 219, 178],
            [172,   9, 127],
            [  0,   0,  64]], dtype=uint8)
     Coordinates:
       * x        (x) float32 1.1 2.2 3.3
       * y        (y) float32 -0.9 -1.8 -2.7
     ```

     I guess care also needs to be taken when writing to netCDF, to make sure no spurious dimension/coordinate definitions are added.
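The hierarchy lookup in suggestion 2 could be sketched roughly as below, independently of xarray's actual loading code. The `Group` class and the `resolve_coords` function are hypothetical stand-ins invented for illustration; a real implementation would walk netCDF4 group objects instead. The sketch also assumes that a coordinate variable lives in the same group that defines its dimension, so the search stops at the nearest group defining the dimension instead of picking up a shadowed coordinate from further up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence


@dataclass
class Group:
    """Hypothetical stand-in for a netCDF4 group (not the netCDF4 API)."""
    dimensions: Dict[str, int] = field(default_factory=dict)    # dimension name -> size
    coords: Dict[str, List[float]] = field(default_factory=dict)  # coordinate name -> values
    parent: Optional["Group"] = None                            # None for the root group


def resolve_coords(group: Group, dims: Sequence[str]) -> Dict[str, List[float]]:
    """Collect coordinate values for `dims`, searching towards the root.

    For each dimension, walk up from `group` to the nearest group that
    defines it, then use the coordinate variable of the same name in that
    group, if there is one. Dimensions without a coordinate are omitted.
    """
    found: Dict[str, List[float]] = {}
    for dim in dims:
        node: Optional[Group] = group
        # Climb until a group defining this dimension is reached.
        while node is not None and dim not in node.dimensions:
            node = node.parent
        if node is not None and dim in node.coords:
            found[dim] = node.coords[dim]
    return found


# Mirror of the toy file: x and y in the root, y redefined in group g.
root = Group(dimensions={"x": 3, "y": 2},
             coords={"x": [1.1, 2.2, 3.3], "y": [-0.9, -1.8]})
g = Group(dimensions={"y": 3}, coords={"y": [-0.9, -1.8, -2.7]}, parent=root)

# Coordinates for variable w, which has dimensions ('x', 'y') inside g.
w_coords = resolve_coords(g, ("x", "y"))
```

For the toy hierarchy, `w_coords` holds `x` taken from the root and the three-element `y` taken from `g`, which matches the coordinates shown in the suggested output.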

Version

xarray 0.9.6

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1888/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: 13221727, type: issue
