issues: 1657036222
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1657036222 | I_kwDOAMm_X85ixF2- | 7730 | flox performance regression for cftime resampling | 10194086 | closed | 0 | 8 | 2023-04-06T09:38:03Z | 2023-10-15T03:48:44Z | 2023-10-15T03:48:44Z | MEMBER |
What happened?

Running an in-memory

What did you expect to happen?

flox to be at least on par with our naive implementation

Minimal Complete Verifiable Example

```Python
import numpy as np
import xarray as xr

# 30 years of daily data on a noleap calendar (365 days per year)
arr = np.random.randn(10, 10, 365 * 30)
time = xr.date_range("2000", periods=30 * 365, calendar="noleap")
da = xr.DataArray(arr, dims=("y", "x", "time"), coords={"time": time})

# using max
print("max:")
xr.set_options(use_flox=True)
%timeit da.groupby("time.year").max("time")
%timeit da.groupby("time.year").max("time", engine="flox")

xr.set_options(use_flox=False)
%timeit da.groupby("time.year").max("time")
# as reference
%timeit [da.sel(time=str(year)).max("time") for year in range(2000, 2030)]

# using mean
print("mean:")
xr.set_options(use_flox=True)
%timeit da.groupby("time.year").mean("time")
%timeit da.groupby("time.year").mean("time", engine="flox")

xr.set_options(use_flox=False)
%timeit da.groupby("time.year").mean("time")
# as reference
%timeit [da.sel(time=str(year)).mean("time") for year in range(2000, 2030)]
```

MVCE confirmation

Relevant log output

```Python
max:
158 ms ± 4.41 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
28.1 ms ± 318 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
11.5 ms ± 52.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
mean:
95.6 ms ± 10.8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
34.8 ms ± 2.88 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
15.2 ms ± 232 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

Anything else we need to know?

No response

Environment
INSTALLED VERSIONS
------------------
commit: f8127fc9ad24fe8b41cce9f891ab2c98eb2c679a
python: 3.10.10 | packaged by conda-forge | (main, Mar 24 2023, 20:08:06) [GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-69-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.1
xarray: main
pandas: 1.5.3
numpy: 1.23.5
scipy: 1.10.1
netCDF4: 1.6.3
pydap: installed
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: 2.14.2
cftime: 1.6.2
nc_time_axis: 1.4.1
PseudoNetCDF: 3.2.2
iris: 3.4.1
bottleneck: 1.3.7
dask: 2023.3.2
distributed: 2023.3.2.1
matplotlib: 3.7.1
cartopy: 0.21.1
seaborn: 0.12.2
numbagg: 0.2.2
fsspec: 2023.3.0
cupy: None
pint: 0.20.1
sparse: 0.14.0
flox: 0.6.10
numpy_groupies: 0.9.20
setuptools: 67.6.1
pip: 23.0.1
conda: None
pytest: 7.2.2
mypy: None
IPython: 8.12.0
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7730/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |