issues: 900502141

id: 900502141
node_id: MDU6SXNzdWU5MDA1MDIxNDE=
number: 5368
title: ds.mean('dim') drops strings dataarrays, even when the 'dim' is not dimension of the string dataarray
user: 18579092
state: closed
locked: 0
comments: 2
created_at: 2021-05-25T08:41:22Z
updated_at: 2021-06-12T17:45:00Z
closed_at: 2021-06-12T17:45:00Z
author_association: CONTRIBUTOR
assignee, milestone, active_lock_reason, draft, pull_request: (empty)

The remaining columns (body, reactions, performed_via_github_app, state_reason, repo, type) follow in that order.

What happened:

I have a dataset with several dimensions, e.g. time and experiments. Some of the dataarrays contain only strings and have only experiments as a dimension. When I use ds.mean('time'), these dataarrays are lost.

What you expected to happen:

I would expect the mean on the full dataset to behave the same way it does for float dataarrays that don't contain the dimension: they are returned unchanged. As shown in the minimal example below, ds.min produces the result I would expect.

Minimal Complete Verifiable Example:

Here da1 corresponds to my dataarrays that are lost. da3 produces what I would expect (same result no matter the data type).

```python
import xarray as xr

da1 = xr.DataArray(['a','b']).rename('da1')
print(da1, '\n')

da3 = xr.DataArray([-1, -2]).rename('da3')
print(da3, '\n')

da2 = xr.DataArray([[0,1],[2,3]]).rename('da2')
print(da2, '\n')

ds = xr.merge([da1,da2,da3])
print(ds, '\n')

print('mean:', ds.mean('dim_1'))
print('min:', ds.min('dim_1'))
```

And the output is:

```
<xarray.DataArray 'da1' (dim_0: 2)>
array(['a', 'b'], dtype='<U1')
Dimensions without coordinates: dim_0

<xarray.DataArray 'da3' (dim_0: 2)>
array([-1, -2])
Dimensions without coordinates: dim_0

<xarray.DataArray 'da2' (dim_0: 2, dim_1: 2)>
array([[0, 1],
       [2, 3]])
Dimensions without coordinates: dim_0, dim_1

<xarray.Dataset>
Dimensions:  (dim_0: 2, dim_1: 2)
Dimensions without coordinates: dim_0, dim_1
Data variables:
    da1      (dim_0) <U1 'a' 'b'
    da2      (dim_0, dim_1) int64 0 1 2 3
    da3      (dim_0) int64 -1 -2

mean: <xarray.Dataset>
Dimensions:  (dim_0: 2)
Dimensions without coordinates: dim_0
Data variables:
    da2      (dim_0) float64 0.5 2.5
    da3      (dim_0) int64 -1 -2
min: <xarray.Dataset>
Dimensions:  (dim_0: 2)
Dimensions without coordinates: dim_0
Data variables:
    da1      (dim_0) <U1 'a' 'b'
    da2      (dim_0) int64 0 2
    da3      (dim_0) int64 -1 -2
```
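One possible way to get the expected result in the meantime is to average only the variables that actually have the dimension and merge the others back unchanged. This is a rough sketch, not a fix in xarray itself: the helper name `mean_keeping_untouched` is hypothetical, and it relies only on `Dataset.data_vars`, `Dataset.mean`, and `xr.merge`.

```python
import xarray as xr


def mean_keeping_untouched(ds, dim):
    """Hypothetical helper: average over `dim`, passing through variables that lack it."""
    # Variables that have `dim` are the only ones the reduction needs to touch.
    with_dim = [name for name, var in ds.data_vars.items() if dim in var.dims]
    # Everything else (including string variables) is kept as-is.
    without_dim = [name for name in ds.data_vars if name not in with_dim]
    reduced = ds[with_dim].mean(dim)
    return xr.merge([reduced, ds[without_dim]])


# Same dataset as in the example above.
da1 = xr.DataArray(['a', 'b']).rename('da1')
da2 = xr.DataArray([[0, 1], [2, 3]]).rename('da2')
da3 = xr.DataArray([-1, -2]).rename('da3')
ds = xr.merge([da1, da2, da3])

result = mean_keeping_untouched(ds, 'dim_1')
print(result)  # da1 is kept alongside the averaged da2 and the untouched da3
```

Note that a string variable that does contain `dim` would still be dropped by `mean`, which seems reasonable since a mean of strings is undefined.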

I searched the open issues but did not find a similar one. I hope I did not miss anything there or in the docs.

Environment:

Output of `xr.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.5 (default, Jan 27 2021, 15:41:15) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.8.0-50-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 0.18.0
pandas: 1.2.4
numpy: 1.20.3
scipy: 1.6.3
netCDF4: 1.5.6
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.4.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.05.0
distributed: 2021.05.0
matplotlib: 3.4.2
cartopy: 0.19.0.post1
seaborn: 0.11.1
numbagg: None
pint: None
setuptools: 44.0.0
pip: 20.0.2
conda: None
pytest: 6.2.4
IPython: 7.23.1
sphinx: None

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5368/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
