issues: 1698656265


id: 1698656265
node_id: I_kwDOAMm_X85lP3AJ
number: 7823
title: DataArray.to_dataset(dim) silently drops variable if it is already a dim
user: 20118130
state: closed
locked: 0
assignee: (empty)
milestone: (empty)
comments: 3
created_at: 2023-05-06T14:37:59Z
updated_at: 2023-11-14T22:28:18Z
closed_at: 2023-11-14T22:28:18Z
author_association: CONTRIBUTOR
active_lock_reason: (empty)
draft: (empty)
pull_request: (empty)
body:

What happened?

If I have a DataArray da which I split into a Dataset using da.to_dataset(dim), and one of the values of da[dim] is also the name of one of the other dimensions of da, the variable for that value is silently missing from the resulting Dataset.

What did you expect to happen?

If a variable cannot be created because it is already a dimension, it should raise an exception, or possibly issue a warning and rename the variable, so that no data is lost.
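As a rough illustration of the kind of check this asks for, here is a minimal sketch in plain Python. The helper names `colliding_labels` and `check_split_is_lossless` are hypothetical and are not part of xarray; the sketch only compares names and assumes that a label collides when it equals any remaining dimension or coordinate name.

```python
def colliding_labels(labels, reserved_names):
    """Return the labels that are also present in reserved_names (hypothetical helper)."""
    reserved = set(reserved_names)
    return [label for label in labels if label in reserved]


def check_split_is_lossless(labels, existing_names, dim):
    """Raise ValueError if splitting along `dim` would produce a variable
    whose name is already taken (hypothetical helper, not xarray API)."""
    # The dimension being split disappears from the result, so it cannot collide.
    reserved = set(existing_names) - {dim}
    clashes = colliding_labels(labels, reserved)
    if clashes:
        raise ValueError(
            f"cannot split along {dim!r}: labels {clashes} are already "
            "dimension or coordinate names"
        )


# Names taken from the example in this report: the values of da["x"] and
# the dimension/coordinate names of da.
labels = ["foo", "bar", "baz"]
existing = ["x", "foo"]

try:
    check_split_is_lossless(labels, existing, "x")
except ValueError as err:
    print(err)
```

With a real DataArray, the inputs could presumably be gathered as `da[dim].values.tolist()` for the labels and `list(da.coords) + list(da.dims)` for the existing names before calling `da.to_dataset(dim)`.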

Minimal Complete Verifiable Example

```Python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.zeros((3, 3)),
    coords={
        # note how 'foo' is one of the coordinate values, and also the name of a dimension
        "x": ["foo", "bar", "baz"],
        "foo": [1, 2, 3],
    },
)

# this produces a Dataset with the variables 'bar' and 'baz'; 'foo' is missing (because it is already a coordinate)
print(da.to_dataset("x"))

# this produces a Dataset with the variables 'foo', 'bar', and 'baz', as expected
print(da.rename({"foo": "qux"}).to_dataset("x"))
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```Python
# Output of first conversion
<xarray.Dataset>
Dimensions:  (foo: 3)
Coordinates:
  * foo      (foo) int64 1 2 3
Data variables:
    bar      (foo) float64 0.0 0.0 0.0
    baz      (foo) float64 0.0 0.0 0.0

# Output of second conversion
<xarray.Dataset>
Dimensions:  (qux: 3)
Coordinates:
  * qux      (qux) int64 1 2 3
Data variables:
    foo      (qux) float64 0.0 0.0 0.0
    bar      (qux) float64 0.0 0.0 0.0
    baz      (qux) float64 0.0 0.0 0.0
```

Anything else we need to know?

This came up when I did to_dataset("param") on the fit result returned by curvefit, and one of the data dimensions happened to be the same as one of the arguments of the function which I was fitting. I was initially very confused by this.

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.10 (main, Mar 01 2023, 21:10:14) [GCC]
python-bits: 64
OS: Linux
OS-release: 6.2.12-1-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: ('en_GB', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None

xarray: 2023.4.2
pandas: 2.0.1
numpy: 1.23.5
scipy: 1.10.1
netCDF4: None
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: 2.14.2
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.4.1
distributed: None
matplotlib: 3.7.1
cartopy: None
seaborn: 0.12.2
numbagg: None
fsspec: 2023.4.0
cupy: None
pint: None
sparse: 0.14.0
flox: None
numpy_groupies: None
setuptools: 65.5.0
pip: 22.3.1
conda: None
pytest: 7.3.1
mypy: None
IPython: 8.13.2
sphinx: 6.2.1

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7823/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
performed_via_github_app: (empty)
state_reason: completed
repo: 13221727
type: issue
