id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
2024104632,I_kwDOAMm_X854pWK4,8515,Inconsistant behaviour of groupby_bins mean when using flox and numbagg,21100296,closed,0,,,5,2023-12-04T15:17:51Z,2023-12-05T08:21:44Z,2023-12-04T18:57:30Z,NONE,,,,"### What happened?
When I group an xarray.DataArray into a single group and calculate the mean, I expect the mean of this group to be the same as the mean of the input data.
When flox and numbagg are installed alongside xarray, I get inconsistent behaviour. The behaviour is consistent again when I set the option ""use_flox"" to False.
### What did you expect to happen?
I expected xarray to return the mean of the values in the group, and I expected this mean to be the same with and without flox. More specifically, I expected it to be (almost) equal to the result of numpy.mean.
### Minimal Complete Verifiable Example
```Python
# In a clean python.org environment:
# pip install xarray numbagg flox
import numpy as np
import xarray as xr


def grouped_mean(number):
    # Generate a reproducible set of random values
    np.random.seed(0)
    values = np.random.rand(number)
    # Use numpy to calculate the expected mean
    expected = np.mean(values)
    # Create an xarray DataArray with a float coordinate
    data = xr.DataArray(values, [(""dim_0"", np.arange(number, dtype=float))])
    # Bin the coordinate so that all values fall in a single bin
    grouped = data.groupby_bins(""dim_0"", [-1.0, number + 1.0])
    # Calculate the grouped mean without flox
    xr.set_options(use_flox=False)
    result_no_flox = grouped.mean().values[0]
    # Calculate the grouped mean with flox
    xr.set_options(use_flox=True)
    result_flox = grouped.mean().values[0]
    # Print the results
    print(f""Try with number = {number}"")
    print(expected, ""using numpy.mean"")
    print(result_no_flox, ""grouped.mean no flox"")
    print(result_flox, ""grouped.mean with flox"")


for number in [127, 128, 255, 256, 1000]:
    grouped_mean(number)
```
### MVCE confirmation
- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [ ] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [x] Recent environment — the issue occurs with the latest version of xarray and its dependencies.
### Relevant log output
```Python
Run python test.py
Try with number = 127
0.5000245417623892 using numpy.mean
0.5000245417623892 grouped.mean no flox
0.5000245417623891 grouped.mean with flox
Try with number = 128
0.49847415328514055 using numpy.mean
0.49847415328514055 grouped.mean no flox
-0.49847415328514033 grouped.mean with flox
Try with number = 255
0.4973500025365464 using numpy.mean
0.4973500025365464 grouped.mean no flox
-126.82425064681932 grouped.mean with flox
Try with number = 256
0.4957330979775834 using numpy.mean
0.4957330979775834 grouped.mean no flox
nan grouped.mean with flox
Try with number = 1000
0.49592153437178277 using numpy.mean
0.49592153437178277 grouped.mean no flox
-20.663397265490953 grouped.mean with flox
```
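As a rough cross-check of the numbers above, the sketch below compares each logged flox result with the logged numpy.mean value using a relative tolerance. The values are copied from the log; the tolerance of 1e-12 and the helper name `agrees` are my own assumptions, not something xarray or flox defines.

```python
import math

# (number, numpy.mean, grouped.mean with flox), copied from the log output above
LOGGED = [
    (127, 0.5000245417623892, 0.5000245417623891),
    (128, 0.49847415328514055, -0.49847415328514033),
    (255, 0.4973500025365464, -126.82425064681932),
    (256, 0.4957330979775834, float('nan')),
    (1000, 0.49592153437178277, -20.663397265490953),
]


def agrees(expected, actual, rel_tol=1e-12):
    # Hypothetical helper: True when the two means match within rel_tol.
    # math.isclose returns False whenever either value is NaN.
    return math.isclose(expected, actual, rel_tol=rel_tol)


# Collect the group sizes for which the flox result disagrees with numpy
mismatches = [n for n, expected, actual in LOGGED if not agrees(expected, actual)]
print(mismatches)  # [128, 255, 256, 1000]
```

Only number = 127 passes. The last-digit difference there is small enough to be consistent with floating-point rounding, while the other four cases are far outside any reasonable tolerance.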
### Anything else we need to know?
This behaviour only occurs when numbagg and flox are installed alongside xarray
(pip install xarray flox numbagg).
The output above is from a GitHub Actions run, using the latest Linux and Windows runners with Python 3.11.
### Environment
Run python -c ""import xarray as xr;print(xr.show_versions())""
/opt/hostedtoolcache/Python/3.11.6/x64/lib/python3.11/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn(""Setuptools is replacing distutils."")
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.6 (main, Oct 3 2023, 04:42:57) [GCC 11.4.0]
python-bits: 64
OS: Linux
OS-release: 6.2.0-1016-azure
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2023.11.0
pandas: 2.1.3
numpy: 1.26.2
scipy: 1.11.4
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: 0.6.4
fsspec: None
cupy: None
pint: None
sparse: None
flox: 0.8.5
numpy_groupies: 0.10.2
setuptools: 65.5.0
pip: 23.3.1
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None
None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8515/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1399324758,I_kwDOAMm_X85TaABW,7136,import xarray causes fatal python crash on windows when h5netcdf and netcdf4 are installed,21100296,closed,0,,,3,2022-10-06T10:38:59Z,2023-01-30T14:41:52Z,2023-01-30T14:41:52Z,NONE,,,,"### What happened?
On Windows with Python (3.9 and 3.10), the command `import xarray` crashes Python if both the netcdf4 and h5netcdf packages are installed.
### What did you expect to happen?
I expected that xarray would import normally, without a fatal python error.
### Minimal Complete Verifiable Example
```shell
# On Windows:
pip install xarray
pip install h5netcdf
pip install netcdf4
# This results in a crash:
python -c ""import xarray""
# The crash does not occur when I import h5netcdf before xarray, so the next line runs without crashing:
python -c ""import h5netcdf;import xarray""
# The crash does not occur on Linux.
# The crash does not occur when only one of h5netcdf or netcdf4 is installed.
```
### MVCE confirmation
- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [ ] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
### Relevant log output
```Python
command: python -c ""import xarray""
C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\h5py\__init__.py:36: UserWarning: h5py is running against HDF5 1.12.1 when it was built against 1.12.2, this may cause problems
_warn((""h5py is running against HDF5 {0} when it was built against {1}, ""
Warning! ***HDF5 library version mismatched error***
The HDF5 header files used to compile this application do not match
the version used by the HDF5 library to which this application is linked.
Data corruption or segmentation faults may occur if the application continues.
This can happen when an application was compiled by one version of HDF5 but
linked with a different version of static or shared HDF5 library.
You should recompile the application or check your shared library related
settings such as 'LD_LIBRARY_PATH'.
You can, at your own risk, disable this warning by setting the environment
variable 'HDF5_DISABLE_VERSION_CHECK' to a value of '1'.
Setting it to 2 or higher will suppress the warning messages totally.
Headers are 1.12.2, library is 1.12.1
SUMMARY OF THE HDF5 CONFIGURATION
=================================
General Information:
-------------------
HDF5 Version: 1.12.1
Configured on: 2022-03-04
Configured by: Ninja
Host system: Windows-10.0.17763
Uname information: Windows
Byte sex: little-endian
Installation point: D:/bld/hdf5_split_1646412547396/_h_env/Library
Compiling Options:
------------------
Build Mode: RELEASE
Debugging Symbols: OFF
Asserts: OFF
Profiling: OFF
Optimization Level: OFF
Linking Options:
----------------
Libraries:
Statically Linked Executables: OFF
LDFLAGS: /machine:x64
H5_LDFLAGS:
AM_LDFLAGS:
Extra libraries: D:/bld/hdf5_split_1646412547396/_h_env/Library/lib/libcurl.lib;D:/bld/hdf5_split_1646412547396/_h_env/Library/lib/libssl.lib;D:/bld/hdf5_split_1646412547396/_h_env/Library/lib/libcrypto.lib
Archiver: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.16.27023/bin/HostX64/x64/lib.exe
Ranlib: :
Languages:
----------
C: YES
C Compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.16.27023/bin/HostX64/x64/cl.exe 19.16.27045.0
CPPFLAGS:
H5_CPPFLAGS:
AM_CPPFLAGS:
CFLAGS: /DWIN32 /D_WINDOWS
H5_CFLAGS: /W3;/wd4100;/wd4706;/wd4127
AM_CFLAGS:
Shared C Library: YES
Static C Library: YES
Fortran: OFF
Fortran Compiler:
Fortran Flags:
H5 Fortran Flags:
AM Fortran Flags:
Shared Fortran Library: YES
Static Fortran Library: YES
C++: ON
C++ Compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Enterprise/VC/Tools/MSVC/14.16.27023/bin/HostX64/x64/cl.exe 19.16.27045.0
C++ Flags:
H5 C++ Flags: /W3;/wd4100;/wd4706;/wd4127
AM C++ Flags:
Shared C++ Library: YES
Static C++ Library: YES
JAVA: OFF
JAVA Compiler:
Features:
---------
Parallel HDF5: OFF
Parallel Filtered Dataset Writes:
Large Parallel I/O:
High-level library: ON
Build HDF5 Tests: ON
Build HDF5 Tools: ON
Threadsafety: ON (recursive RW locks: )
Default API mapping: v112
With deprecated public symbols: ON
I/O filters (external): DEFLATE
MPE:
Direct VFD:
Mirror VFD:
(Read-Only) S3 VFD: 1
(Read-Only) HDFS VFD:
dmalloc:
Packages w/ extra debug output:
API Tracing: OFF
Using memory checker: OFF
Memory allocation sanity checks: OFF
Function Stack Tracing: OFF
Use file locking: best-effort
Strict File Format Checks: OFF
Optimization Instrumentation:
Bye...
Error: Process completed with exit code 1.
```
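The key line in the output above is the header/library version pair. A minimal sketch, assuming the message keeps the `Headers are X, library is Y` wording shown in this log (the helper name `parse_hdf5_mismatch` is hypothetical), extracts both versions so that a CI step could compare them:

```python
import re

# Line copied from the log output above
LOG_LINE = 'Headers are 1.12.2, library is 1.12.1'

# Assumed message format, based only on the wording in this log
PATTERN = re.compile(r'Headers are (\d+(?:\.\d+)*), library is (\d+(?:\.\d+)*)')


def parse_hdf5_mismatch(line):
    # Return (header_version, library_version) as tuples of ints,
    # or None when the line does not contain the mismatch message.
    match = PATTERN.search(line)
    if match is None:
        return None
    return tuple(
        tuple(int(part) for part in version.split('.'))
        for version in match.groups()
    )


headers, library = parse_hdf5_mismatch(LOG_LINE)
print(headers, library, headers == library)  # (1, 12, 2) (1, 12, 1) False
```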
### Anything else we need to know?
This bug is reproduced by a GitHub Actions runner: https://github.com/daanscheltens/test-netcdf4/actions/runs/3196339371/jobs/5218135577
The workflow lives in a dedicated, otherwise empty repository that contains only this workflow file:
https://github.com/daanscheltens/test-netcdf4/blob/main/.github/workflows/action.yml
### Environment
python -c ""import h5netcdf; import xarray as xr;xr.show_versions()""
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 106 Stepping 6, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('English_United States', '1252')
libhdf5: 1.12.2
libnetcdf: None
xarray: 2022.9.0
pandas: 1.5.0
numpy: 1.23.3
scipy: None
netCDF4: None
pydap: None
h5netcdf: 1.0.2
h5py: 3.7.0
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 58.1.0
pip: 22.2.2
conda: None
pytest: None
IPython: None
sphinx: None
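
Because importing is what crashes here, a sketch like the following might be a safer way to collect versions: it reads installed distribution metadata through the standard-library `importlib.metadata` without importing the packages themselves. The package list and the helper name `installed_versions` are assumptions for illustration.

```python
from importlib import metadata

# Distributions named in this report (assumed list, adjust as needed)
PACKAGES = ['xarray', 'h5netcdf', 'netCDF4', 'h5py']


def installed_versions(names):
    # Map each distribution name to its installed version string,
    # or to None when the distribution is not installed.
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f'{name}: {version}')
```

This only reports the Python package versions. It does not show which HDF5 library each package was built against, which is the mismatch the log output points at.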
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7136/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue