issues

2 rows where repo = 13221727 and user = 16033750 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
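Issue 8628: objects remain unserializable after reset_index

  • id: 2090281639
  • node_id: I_kwDOAMm_X858lyqn
  • user: bjarketol 16033750
  • state: closed
  • locked: 0
  • assignee: benbovy 4160723
  • comments: 1
  • created_at: 2024-01-19T11:03:56Z
  • updated_at: 2024-01-31T17:42:30Z
  • closed_at: 2024-01-31T17:42:30Z
  • author_association: NONE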

What happened?

With the 2024.1 release, I am unable to write objects to netCDF after stacking dimensions with .stack() and then calling .reset_index() to get rid of the multi-index.

What did you expect to happen?

No response

Minimal Complete Verifiable Example

    import numpy as np
    import xarray as xr

    da = xr.DataArray(np.zeros([2, 3]), dims=["x", "y"])
    da = da.stack(point=("x", "y"))
    da = da.reset_index("point")
    da.to_netcdf("test.nc")

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

```Python
     86 def ensure_not_multiindex(var: Variable, name: T_Name = None) -> None:
     87     if isinstance(var._data, indexing.PandasMultiIndexingAdapter):
---> 88         raise NotImplementedError(
     89             f"variable {name!r} is a MultiIndex, which cannot yet be "
     90             "serialized. Instead, either use reset_index() "
     91             "to convert MultiIndex levels into coordinate variables instead "
     92             "or use https://cf-xarray.readthedocs.io/en/latest/coding.html."
     93         )

NotImplementedError: variable 'x' is a MultiIndex, which cannot yet be serialized. Instead, either use reset_index() to convert MultiIndex levels into coordinate variables instead or use https://cf-xarray.readthedocs.io/en/latest/coding.html.
```

Anything else we need to know?

Creating the stacked object from scratch and saving it to netCDF works fine. The difference is that type(da.x.variable._data) is xarray.core.indexing.PandasMultiIndexingAdapter if the object was stacked and reset, and numpy.ndarray if it was created from scratch.
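The comparison described above can be sketched as follows. This is an illustration, not code from the report: the `scratch` object is one assumed way of building the same layout "from scratch", and `_data` is the private attribute the report inspects, so what gets printed depends on the installed xarray version.

```python
import numpy as np
import xarray as xr

# Stacked and reset, following the steps of the example in the report.
stacked = xr.DataArray(np.zeros([2, 3]), dims=["x", "y"])
stacked = stacked.stack(point=("x", "y")).reset_index("point")

# Assumed "from scratch" equivalent: x and y are plain coordinates
# along a single "point" dimension, backed by NumPy arrays.
xs, ys = np.meshgrid(np.arange(2), np.arange(3), indexing="ij")
scratch = xr.DataArray(
    np.zeros(6),
    dims=["point"],
    coords={"x": ("point", xs.ravel()), "y": ("point", ys.ravel())},
)

# Print the type of the wrapped data for the "x" coordinate in both cases.
# `_data` is private and may change between xarray versions.
for label, obj in [("stacked + reset", stacked), ("from scratch", scratch)]:
    print(label, type(obj.x.variable._data))
```

Both objects carry the same coordinate values; only the type wrapping the "x" coordinate data is expected to differ on an affected version.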

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.11.7 | packaged by conda-forge | (main, Dec 23 2023, 14:43:09) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 5.15.133.1-microsoft-standard-WSL2
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2
xarray: 2024.1.0
pandas: 2.1.4
numpy: 1.26.3
scipy: 1.11.4
netCDF4: 1.6.5
pydap: None
h5netcdf: 1.2.0
h5py: 3.10.0
Nio: None
zarr: 2.16.1
cftime: 1.6.3
nc_time_axis: None
iris: None
bottleneck: 1.3.7
dask: 2024.1.0
distributed: None
matplotlib: 3.8.2
cartopy: 0.22.0
seaborn: 0.13.1
numbagg: None
fsspec: 2023.12.2
cupy: None
pint: 0.23
sparse: None
flox: None
numpy_groupies: None
setuptools: 69.0.3
pip: 23.3.2
conda: None
pytest: 7.4.4
mypy: None
IPython: 8.20.0
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8628/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  • state_reason: completed
  • repo: xarray 13221727
  • type: issue
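Issue 2401: DataArray.mean() produces wrong result for float32 arrays of particular shapes

  • id: 357629971
  • node_id: MDU6SXNzdWUzNTc2Mjk5NzE=
  • user: bjarketol 16033750
  • state: closed
  • locked: 0
  • comments: 1
  • created_at: 2018-09-06T12:21:57Z
  • updated_at: 2018-09-06T13:17:13Z
  • closed_at: 2018-09-06T13:15:17Z
  • author_association: NONE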

Code Sample

```python
import numpy as np
import xarray as xr

np.random.seed(42)

dims = ('a', 'b', 'c', 'd')
shape = (10, 10, 500, 500)

coords = {d: np.arange(s) for d, s in zip(dims, shape)}

# Using data with non-normal distribution
data = np.random.lognormal(size=shape)
data = data.astype(np.float32)

da = xr.DataArray(data, coords=coords, dims=dims)

# Numpy method gives the correct value
print(da.values.mean())

# Explicitly specifying all axes gives the correct value
print(da.mean(axis=(0, 1, 2, 3)))

# Default DataArray mean method gives incorrect value
print(da.mean())  # <- Problem arises here

# float64 arrays produce the correct value
print(da.astype(np.float64).mean())
```

This is the output I see:

```
1.6489075

<xarray.DataArray ()>
array(1.648908, dtype=float32)

<xarray.DataArray ()>
array(1.517693)

<xarray.DataArray ()>
array(1.648907)
```

Problem description

Wrong mean value calculated by DataArray.mean() method with default arguments. I have only observed the problem for float32 arrays. It appears to be sensitive to the shape of the array, e.g. a shape of (10, 10, 10, 10) seems to be fine.
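One plausible explanation, which the report itself does not confirm, is that the default code path accumulates the sum sequentially in float32. The NumPy-only sketch below illustrates that effect under this assumption. It does not reproduce xarray's internals, and it draws its own log-normal samples rather than the exact data above. `np.cumsum` stands in for a sequential float32 accumulator, while `ndarray.mean` uses NumPy's pairwise summation.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10 * 10 * 500 * 500  # same element count as the (10, 10, 500, 500) array

# Log-normal float32 samples: exp of standard-normal draws, kept in float32.
data = rng.standard_normal(n, dtype=np.float32)
np.exp(data, out=data)

# Reference: accumulate in float64.
reference = data.mean(dtype=np.float64)

# NumPy's float32 mean, which sums pairwise and stays close to the reference.
pairwise_mean = float(data.mean())

# Sequential float32 accumulation: once the running sum is large, small
# addends are rounded away and the result drifts from the reference.
naive_mean = float(np.cumsum(data)[-1]) / n

print("float64 reference: ", reference)
print("float32 pairwise:  ", pairwise_mean)
print("float32 sequential:", naive_mean)
```

With 25 million values averaging about 1.65, the running sum exceeds 2**24, the point beyond which float32 can no longer represent every integer. This would also be consistent with the much smaller (10, 10, 10, 10) array appearing unaffected.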

Expected Output

This is the output I expect for the sample above:

```
1.6489075

<xarray.DataArray ()>
array(1.648908, dtype=float32)

<xarray.DataArray ()>
array(1.648908, dtype=float32)

<xarray.DataArray ()>
array(1.648907)
```

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-33-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.10.8
pandas: 0.23.4
numpy: 1.15.1
scipy: 1.1.0
netCDF4: 1.4.1
h5netcdf: 0.6.2
h5py: 2.8.0
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.19.0
distributed: 1.23.0
matplotlib: 2.2.3
cartopy: 0.16.0
seaborn: 0.9.0
setuptools: 40.2.0
pip: 18.0
conda: 4.5.11
pytest: 3.7.4
IPython: 6.5.0
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2401/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  • state_reason: completed
  • repo: xarray 13221727
  • type: issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
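The filter described at the top of this page (repo = 13221727 and user = 16033750, sorted by updated_at descending) can be sketched against a trimmed copy of this schema using Python's built-in sqlite3 module. The exact SQL that Datasette generates is not shown here, so the SELECT below is an assumption. The third row is hypothetical and only demonstrates the filter. The foreign-key references are omitted because the referenced tables are not part of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed version of the [issues] table: only the columns the query needs.
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER,
       [state] TEXT,
       [updated_at] TEXT,
       [repo] INTEGER
    )
    """
)
conn.execute("CREATE INDEX [idx_issues_repo] ON [issues] ([repo])")
conn.execute("CREATE INDEX [idx_issues_user] ON [issues] ([user])")

rows = [
    (357629971, 2401,
     "DataArray.mean() produces wrong result for float32 arrays of particular shapes",
     16033750, "closed", "2018-09-06T13:17:13Z", 13221727),
    (2090281639, 8628,
     "objects remain unserializable after reset_index",
     16033750, "closed", "2024-01-31T17:42:30Z", 13221727),
    # Hypothetical row from a different user, included only to show the filter.
    (1, 1, "hypothetical issue", 999, "open", "2020-01-01T00:00:00Z", 13221727),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps stored as TEXT sort chronologically as plain strings.
result = conn.execute(
    "SELECT [number], [title] FROM [issues] "
    "WHERE [repo] = ? AND [user] = ? ORDER BY [updated_at] DESC",
    (13221727, 16033750),
).fetchall()

for number, title in result:
    print(number, title)
```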