id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
811321550,MDU6SXNzdWU4MTEzMjE1NTA=,4922,Bottleneck and dask objects ignore `min_periods` on `rolling`,8881170,open,0,,,5,2021-02-18T17:43:50Z,2021-12-19T15:18:45Z,,CONTRIBUTOR,,,,"**What happened**:

When `bottleneck` is installed in an environment, it seems to ignore the `min_periods` kwarg on `ds.rolling(...)`.

**What you expected to happen**:

When using `ds.rolling(..., min_periods=1)`, it should be able to handle an array of length 1. Without `bottleneck` installed, it returns the original value of a length 1 array. With `bottleneck` installed, the error is:

```python-traceback
ValueError: Moving window (=2) must between 1 and 1, inclusive
```

**Minimal Complete Verifiable Example**:

With `bottleneck` installed to environment:

```python
import xarray as xr

ds = xr.DataArray([1], dims='time')
ds.rolling(time=2, center=True, min_periods=1).mean()
```

```python-traceback
ValueError: Moving window (=2) must between 1 and 1, inclusive
```

Without `bottleneck` installed to environment:

```python
import xarray as xr

ds = xr.DataArray([1], dims='time')
ds.rolling(time=2, center=True, min_periods=1).mean()
>>> <xarray.DataArray (time: 1)>
>>> array([1.])
>>> Dimensions without coordinates: time
```

**Anything else we need to know?**:

In an applied case, this came up while working on `.groupby('time.dayofyear').map(_rolling)`, where we map a rolling mean function over a defined N days with `min_periods=1`. Some climatological days (like in leap years) will not meet the N-day requirement, so the `min_periods` catch handles that, but with `bottleneck` installed it breaks due to the above issue.

**Environment**:
Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.6 | packaged by conda-forge | (default, Jan 25 2021, 23:22:12) [Clang 11.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 19.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.16.2
pandas: 1.2.1
numpy: 1.19.5
scipy: 1.6.0
netCDF4: 1.5.5.1
pydap: None
h5netcdf: 0.8.1
h5py: 3.1.0
Nio: None
zarr: 2.6.1
cftime: 1.3.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: 1.1.8
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.01.1
distributed: 2021.01.1
matplotlib: 3.3.3
cartopy: 0.18.0
seaborn: 0.11.1
numbagg: None
pint: 0.16.1
setuptools: 49.6.0.post20210108
pip: 21.0
conda: None
pytest: 6.2.2
IPython: 7.18.1
sphinx: 3.4.3
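
For reference, the `min_periods` behaviour expected in the report above can be sketched in plain Python. This is an illustration only, not xarray's or bottleneck's implementation: the helper name `rolling_mean` is hypothetical, and the alignment of a centered even-sized window is an assumption (it does not affect the length-1 case).

```python
import math


def rolling_mean(values, window, min_periods=None, center=False):
    # Reference semantics only: a window may be larger than the data,
    # and a result is produced whenever at least `min_periods` valid
    # (non-NaN) values fall inside the window.
    if min_periods is None:
        min_periods = window
    n = len(values)
    out = []
    for i in range(n):
        # A trailing window ends at i; a centered window is shifted
        # forward by (window - 1) // 2 (assumed alignment).
        stop = i + 1 + ((window - 1) // 2 if center else 0)
        start = stop - window
        valid = [v for v in values[max(start, 0):min(stop, n)]
                 if not math.isnan(v)]
        if valid and len(valid) >= min_periods:
            out.append(sum(valid) / len(valid))
        else:
            out.append(math.nan)
    return out


# Length-1 input with a window of 2: min_periods=1 should return the value.
print(rolling_mean([1.0], window=2, min_periods=1, center=True))  # [1.0]
```

With the default `min_periods` (equal to the window), the same call would yield NaN instead of raising, which is the distinction the report relies on.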
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4922/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 445175953,MDU6SXNzdWU0NDUxNzU5NTM=,2969,`where` function mis-broadcasts and alters data type on dataset,8881170,closed,0,,,2,2019-05-16T21:52:58Z,2019-05-20T16:30:02Z,2019-05-20T16:30:02Z,CONTRIBUTOR,,,,"#### Code Sample, a copy-pastable example if possible ```python import numpy as np # generate data dateVar = np.arange('2005-02', '2005-06', dtype='datetime64[D]') t = len(dateVar) floatVar = np.random.rand(t, 100) indexVar = np.arange(100) intVar = np.random.randint(1, high=10, size=(t, 100)) # create dataset A = xr.DataArray(floatVar, dims=['time', 'N']) A.name = 'floatVar' B = xr.DataArray(indexVar, dims=['N']) B.name = 'indexVar' C = xr.DataArray(intVar, dims=['time', 'N']) C.name = 'intVar' D = xr.DataArray(dateVar, dims=['time']) D.name = 'dateVar' ds = xr.merge([A,B,C,D]) print(ds) Dimensions: (N: 100, time: 120) Dimensions without coordinates: N, time Data variables: floatVar (time, N) float64 0.4223 0.5019 0.8522 ... 0.9338 0.5833 0.09859 indexVar (N) int64 0 1 2 3 4 5 6 7 8 9 10 ... 90 91 92 93 94 95 96 97 98 99 intVar (time, N) int64 9 2 3 6 8 4 8 7 6 4 2 6 ... 3 1 8 3 8 3 5 3 1 6 7 dateVar (time) datetime64[ns] 2005-02-01 2005-02-02 ... 2005-05-31 # apply where function ds.where(ds.indexVar > 50, drop=True) Dimensions: (N: 49, time: 120) Dimensions without coordinates: N, time Data variables: floatVar (time, N) float64 0.3381 0.04735 0.464 ... 0.5571 0.5297 0.8106 indexVar (N) float64 51.0 52.0 53.0 54.0 55.0 ... 95.0 96.0 97.0 98.0 99.0 intVar (time, N) float64 5.0 2.0 9.0 5.0 5.0 1.0 ... 1.0 6.0 5.0 4.0 3.0 dateVar (time, N) datetime64[ns] 2005-02-01 2005-02-01 ... 2005-05-31 ``` #### Problem description This is motivated by a use-case of dimensions (Time, nParticle) for a Lagrangian particle simulation. 
In the above code snippet, I filter by some condition on `indexVar` (e.g., some type of particle). For variables that share the dimension of `indexVar` (`N`), it broadcasts fine and there are no dimension changes. In the case of `dateVar`, which only has dimension `time`, there is a dim expansion to add `N`. Further, data types are changed (`indexVar` and `intVar` from `int64` to `float64`). In my use-case, a variable of type `S64` was converted to `O`.

So the two major issues here:

1. `where` mis-broadcasts certain variables by adding the filtered dimension (in this case `N` to `dateVar`). It should instead ignore variables that don't contain the dimension being filtered.
2. `where` changes data types of variables after the filter is applied.

#### Expected Output

```python
<xarray.Dataset>
Dimensions:   (N: 49, time: 120)
Dimensions without coordinates: N, time
Data variables:
    floatVar  (time, N) float64 0.3381 0.04735 0.464 ... 0.5571 0.5297 0.8106
    indexVar  (N) int64 51 52 53 54 55 ... 95 96 97 98 99
    intVar    (time, N) int64 5 2 9 5 5 1 ... 1 6 5 4 3
    dateVar   (time) datetime64[ns] 2005-02-01 2005-02-02 ... 2005-05-31
```

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Mar 27 2019, 16:54:48) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.3
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.2.2
distributed: 1.28.0
matplotlib: 3.0.3
cartopy: 0.17.0
seaborn: None
setuptools: 41.0.1
pip: 19.1.1
conda: None
pytest: 4.5.0
IPython: 7.5.0
sphinx: None
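
The expected output above amounts to selecting along `N` rather than masking. A plain-Python sketch of that behaviour follows; the dict-of-tuples layout and the helper name `select_along` are hypothetical and chosen only for illustration, and this is not how xarray stores or filters data. The dtype change in the report is consistent with masking needing a NaN fill value, which integer arrays cannot hold; selecting elements avoids any fill.

```python
def select_along(variables, dim, keep):
    # `variables` maps name -> (dims, data). Data is a flat list for 1-D
    # variables and a list of rows for 2-D ones (hypothetical layout).
    def take(dims, data):
        if dim not in dims:
            return data  # variable lacks the dimension: left untouched
        if dims[0] == dim:
            return [x for x, k in zip(data, keep) if k]
        # Otherwise descend one level and filter each row.
        return [take(dims[1:], row) for row in data]

    return {name: (dims, take(dims, data))
            for name, (dims, data) in variables.items()}


ds = {
    'indexVar': (('N',), [0, 1, 2, 3]),
    'intVar': (('time', 'N'), [[9, 2, 3, 6], [8, 4, 8, 7]]),
    'dateVar': (('time',), ['2005-02-01', '2005-02-02']),
}
keep = [i > 1 for i in ds['indexVar'][1]]
out = select_along(ds, 'N', keep)

print(out['indexVar'])  # (('N',), [2, 3])
print(out['intVar'])    # (('time', 'N'), [[3, 6], [8, 7]])
print(out['dateVar'])   # (('time',), ['2005-02-01', '2005-02-02'])
```

Integers stay integers and `dateVar` keeps its single `time` dimension, which are the two properties requested in the list of issues.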
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2969/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue