issues: 1642635191

  • id: 1642635191
  • node_id: I_kwDOAMm_X85h6J-3
  • number: 7686
  • title: Add reset_encoding to Dataset and DataArray objects
  • user: 2443309
  • state: closed
  • locked: 0
  • comments: 2
  • created_at: 2023-03-27T18:51:39Z
  • updated_at: 2023-03-30T21:09:17Z
  • closed_at: 2023-03-30T21:09:17Z
  • author_association: MEMBER

Is your feature request related to a problem?

Xarray maintains the encoding of datasets read from most of its supported backend formats (e.g. NetCDF, Zarr). This is very useful when you want a perfect roundtrip, but it often gets in the way, causing conflicts when writing a modified dataset or when appending to another dataset. Most of the time, the solution is to just remove the encoding from the dataset and continue on. The following code sample is found in a number of issues that reference this problem.

```python
for v in list(ds.coords.keys()):
    if ds.coords[v].dtype == object:
        ds[v].encoding.clear()

for v in list(ds.variables.keys()):
    if ds[v].dtype == object:
        ds[v].encoding.clear()
```

A sample of issues that show variants of this problem:

  • https://github.com/pydata/xarray/issues/3476
  • https://github.com/pydata/xarray/issues/3739
  • https://github.com/pydata/xarray/issues/4380
  • https://github.com/pydata/xarray/issues/5219
  • https://github.com/pydata/xarray/issues/5969
  • https://github.com/pydata/xarray/issues/6329
  • https://github.com/pydata/xarray/issues/6352

Describe the solution you'd like

In many cases, the solution to these problems is to leave the original dataset encoding behind and either use Xarray's default encoding (or the backend's default) or specify one's own encoding options. Both cases would benefit from a convenience method to reset the original encoding. Something like the following would serve this purpose:

```python
ds = xr.open_dataset(...).reset_encoding()
```
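For illustration, the proposed behavior can be approximated with a small helper. This is only a sketch, not Xarray API: `without_encoding` is a hypothetical name, and the in-memory dataset with hand-set encoding stands in for one returned by `xr.open_dataset`. It returns a shallow copy with every encoding dictionary emptied, so the original object is left as it was.

```python
import numpy as np
import xarray as xr


def without_encoding(ds: xr.Dataset) -> xr.Dataset:
    """Return a shallow copy of ``ds`` with all encoding removed.

    Hypothetical helper for illustration; not part of Xarray.
    """
    out = ds.copy(deep=False)  # new variable objects, shared data
    out.encoding = {}  # dataset-level encoding
    for var in out.variables.values():  # data variables and coordinates
        var.encoding = {}
    return out


# Stand-in for a dataset read from disk: the encoding is set by hand here,
# whereas a backend would normally populate it on open.
ds = xr.Dataset(
    {"temperature": ("time", np.array([280.5, 281.0, 279.8]))},
    coords={"time": np.arange(3)},
)
ds["temperature"].encoding.update(
    {"dtype": "int16", "scale_factor": 0.01, "_FillValue": -9999}
)
ds.encoding = {"unlimited_dims": {"time"}}

clean = without_encoding(ds)
```

Returning a copy rather than clearing in place keeps the call chainable, in the spirit of the one-liner above, and avoids surprising other code that still holds a reference to the original dataset.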

Describe alternatives you've considered

Variations on the API above could also be considered:

```python
xr.open_dataset(..., keep_encoding=False)
```

or even:

```python
with xr.set_options(keep_encoding=False):
    ds = xr.open_dataset(...)
```

We can/should also do a better job of surfacing inconsistent encoding in our backends (e.g. to_netcdf).
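As a rough idea of what surfacing encoding could look like from the user side, the sketch below lists which variables still carry encoding and which keys are set. It does not detect conflicts itself, and `encoding_report` is a hypothetical name rather than anything the backends provide; the example dataset and its encoding values are made up for the demonstration.

```python
import numpy as np
import xarray as xr


def encoding_report(ds: xr.Dataset) -> dict[str, list[str]]:
    """Map each variable with non-empty encoding to its sorted encoding keys.

    Hypothetical helper for illustration; not part of Xarray.
    """
    return {
        str(name): sorted(var.encoding)
        for name, var in ds.variables.items()
        if var.encoding  # skip variables whose encoding dict is empty
    }


# In-memory example; the encoding keys are set by hand for the demonstration.
ds = xr.Dataset({"precip": ("x", np.zeros(4))}, coords={"x": np.arange(4)})
ds["precip"].encoding.update({"dtype": "float32", "zlib": True})

report = encoding_report(ds)
```

Inspecting such a report before calling a writer makes it easier to spot leftover encoding that no longer matches the modified data.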

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7686/reactions",
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 2,
    "rocket": 0,
    "eyes": 0
}
  • state_reason: completed
  • repo: 13221727
  • type: issue
