issues

1 row where type = "issue" and user = 16700639 sorted by updated_at descending

id: 1880544087
node_id: I_kwDOAMm_X85wFtNX
number: 8144
title: Add support for netcdf4 enum
user: bzah 16700639
state: closed
locked: 0
assignee:
milestone:
comments: 10
created_at: 2023-09-04T15:51:45Z
updated_at: 2024-01-17T07:19:33Z
closed_at: 2024-01-17T07:19:33Z
author_association: CONTRIBUTOR
active_lock_reason:
draft:
pull_request:

Is your feature request related to a problem?

When a netcdf file contains netcdf4 enums , xarray ignores the underlying enum type. The association between the values of the variable and their actual meaning is then lost.

MRE:

```py
import netCDF4 as nc
import xarray as xr

# -- Create dataset with an enum using the netcdf4 lib
ds = nc.Dataset("mre.nc", "w", format="NETCDF4")
cloud_type_enum = ds.createEnumType(int, "cloud_type", {"clear": 0, "cloudy": 1})
print(ds.enumtypes)
# {'cloud_type': <class 'netCDF4._netCDF4.EnumType'>: name = 'cloud_type', numpy dtype = int64, fields/values ={'clear': 0, 'cloudy': 1}}

ds.createVariable("cloud", cloud_type_enum)
ds["cloud"][0] = 1
ds.close()

# -- Open dataset with xarray
xr_ds = xr.open_dataset("./mre.nc")
print(xr_ds.cloud)
# <xarray.DataArray 'cloud' ()> \n [1 values with dtype=int64]
# --> We get no metadata about the cloud_type enum that we created above

xr_ds.to_netcdf("mre_xr.nc")

# -- Open xarray outputted dataset with netCDF4 lib
# enumtypes is a dictionary attribute, not a method
print(nc.Dataset("mre_xr.nc", "r", format="NETCDF4").enumtypes)
# {}
# --> Empty dictionary: the enum we created is lost
```

If you know CF, enums could replace flag_meanings and flag_values (see CF). Enums are not yet part of CF, though.

Describe the solution you'd like

As far as I understand, to describe the enum we only need a dictionary that maps numbers (enum keys) to strings (enum values) and a way to reference this dictionary from the variables that are "typed" with this enum. Bear in mind that the dtype of the variable would still be a number; the enum type would be secondary metadata.
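The data model described above can be sketched in plain Python. This is only an illustration of the idea, not xarray or netCDF4 API: the names `EnumMeta`, `TypedVariable`, and `labels` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EnumMeta:
    """Hypothetical enum description; not an xarray or netCDF4 class."""
    name: str
    meanings: Dict[int, str]  # enum key (number) -> enum value (string)


@dataclass
class TypedVariable:
    """Hypothetical variable: integer data plus a reference to its enum."""
    values: List[int]  # the stored data stays numeric
    enum: EnumMeta     # secondary metadata, shared by reference

    def labels(self) -> List[str]:
        # Translate each stored number into its enum meaning.
        return [self.enum.meanings[v] for v in self.values]


cloud_type = EnumMeta("cloud_type", {0: "clear", 1: "cloudy"})
cloud = TypedVariable(values=[1, 0, 1], enum=cloud_type)

print(cloud.values)    # [1, 0, 1]
print(cloud.labels())  # ['cloudy', 'clear', 'cloudy']
```

Several variables could point at the same `EnumMeta` instance, which mirrors how a netCDF4 enum type is defined once and then referenced by the variables that use it.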

Describe alternatives you've considered

Most people who produce data could get away with using flag_meanings and flag_values to describe their data in a way that is both CF-compliant and properly handled by xarray. For me, the only workaround at the moment is to use the netCDF4 library directly.
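For reference, the flag-based alternative boils down to two attributes on the variable: flag_values (a list of numbers) and flag_meanings (a blank-separated string, as CF specifies). A minimal sketch of turning those attributes into a lookup table follows. The helper `flag_lookup` is hypothetical, and the attribute dictionary stands in for what a variable's attributes would contain.

```python
from typing import Any, Dict, Mapping


def flag_lookup(attrs: Mapping[str, Any]) -> Dict[int, str]:
    """Build a number -> meaning table from CF flag attributes (hypothetical helper)."""
    values = list(attrs["flag_values"])
    meanings = str(attrs["flag_meanings"]).split()  # blank-separated words
    if len(values) != len(meanings):
        raise ValueError("flag_values and flag_meanings must have the same length")
    return {int(value): meaning for value, meaning in zip(values, meanings)}


# Attributes as they might be attached to the "cloud" variable.
cloud_attrs = {"flag_values": [0, 1], "flag_meanings": "clear cloudy"}

lookup = flag_lookup(cloud_attrs)
decoded = [lookup[v] for v in [1, 0]]
print(lookup)   # {0: 'clear', 1: 'cloudy'}
print(decoded)  # ['cloudy', 'clear']
```

Unlike an enum type, this mapping lives on each variable separately, so two variables sharing the same meanings each carry their own copy of the attributes.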

Additional context

```py
nc.__version__
# 1.6.2

xr.__version__
# 2023.2.0
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8144/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: xarray 13221727
type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);