id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
445175953,MDU6SXNzdWU0NDUxNzU5NTM=,2969,`where` function mis-broadcasts and alters data type on dataset,8881170,closed,0,,,2,2019-05-16T21:52:58Z,2019-05-20T16:30:02Z,2019-05-20T16:30:02Z,CONTRIBUTOR,,,,"#### Code Sample, a copy-pastable example if possible

```python
import numpy as np
import xarray as xr

# generate data
dateVar = np.arange('2005-02', '2005-06', dtype='datetime64[D]')
t = len(dateVar)
floatVar = np.random.rand(t, 100)
indexVar = np.arange(100)
intVar = np.random.randint(1, high=10, size=(t, 100))

# create dataset
A = xr.DataArray(floatVar, dims=['time', 'N'])
A.name = 'floatVar'
B = xr.DataArray(indexVar, dims=['N'])
B.name = 'indexVar'
C = xr.DataArray(intVar, dims=['time', 'N'])
C.name = 'intVar'
D = xr.DataArray(dateVar, dims=['time'])
D.name = 'dateVar'
ds = xr.merge([A,B,C,D])
print(ds)

Dimensions: (N: 100, time: 120)
Dimensions without coordinates: N, time
Data variables:
    floatVar (time, N) float64 0.4223 0.5019 0.8522 ... 0.9338 0.5833 0.09859
    indexVar (N) int64 0 1 2 3 4 5 6 7 8 9 10 ... 90 91 92 93 94 95 96 97 98 99
    intVar (time, N) int64 9 2 3 6 8 4 8 7 6 4 2 6 ... 3 1 8 3 8 3 5 3 1 6 7
    dateVar (time) datetime64[ns] 2005-02-01 2005-02-02 ... 2005-05-31

# apply where function
ds.where(ds.indexVar > 50, drop=True)

Dimensions: (N: 49, time: 120)
Dimensions without coordinates: N, time
Data variables:
    floatVar (time, N) float64 0.3381 0.04735 0.464 ... 0.5571 0.5297 0.8106
    indexVar (N) float64 51.0 52.0 53.0 54.0 55.0 ... 95.0 96.0 97.0 98.0 99.0
    intVar (time, N) float64 5.0 2.0 9.0 5.0 5.0 1.0 ... 1.0 6.0 5.0 4.0 3.0
    dateVar (time, N) datetime64[ns] 2005-02-01 2005-02-01 ... 2005-05-31
```

#### Problem description

This is motivated by a use-case of dimensions (Time, nParticle) for a Lagrangian particle simulation.
In the above code snippet, I filter by some condition on `indexVar` (e.g., some type of particle). Variables that share the dimension of `indexVar` (`N`) broadcast fine, and their dimensions do not change. `dateVar`, which only has the dimension `time`, is expanded to add `N`. Further, data types are changed (`indexVar` and `intVar` go from `int64` to `float64`). In my use-case, a variable of type `S64` was converted to `O`.

So the two major issues here:

1. `where` mis-broadcasts certain variables by adding the filtered dimension (in this case `N` to `dateVar`). It should instead leave variables that don't contain the dimension being filtered unchanged.
2. `where` changes the data types of variables after the filter is applied.

#### Expected Output

```python
Dimensions: (N: 49, time: 120)
Dimensions without coordinates: N, time
Data variables:
    floatVar (time, N) float64 0.3381 0.04735 0.464 ... 0.5571 0.5297 0.8106
    indexVar (N) int64 51 52 53 54 55 ... 95 96 97 98 99
    intVar (time, N) int64 5 2 9 5 5 1 ... 1 6 5 4 3
    dateVar (time) datetime64[ns] 2005-02-01 2005-02-02 ... 2005-05-31
```

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Mar 27 2019, 16:54:48) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.3
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.2.2
distributed: 1.28.0
matplotlib: 3.0.3
cartopy: 0.17.0
seaborn: None
setuptools: 41.0.1
pip: 19.1.1
conda: None
pytest: 4.5.0
IPython: 7.5.0
sphinx: None
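
#### Possible workaround (sketch)

A minimal sketch of one way to get the expected output, assuming the filter condition depends only on the `N` dimension. It selects positions along `N` with `isel` instead of masking with `where`, so variables without `N` are left alone and no NaN fill (and hence no cast to float) is involved. The helper name `filter_along_dim` is hypothetical and not part of xarray.

```python
import numpy as np
import xarray as xr


def filter_along_dim(ds, mask, dim):
    # Hypothetical helper (not part of xarray): keep the positions along
    # `dim` where the 1-D boolean `mask` is True.
    positions = np.flatnonzero(np.asarray(mask))
    # isel only indexes variables that have `dim`; nothing is filled with
    # NaN, so integer and datetime dtypes are preserved.
    return ds.isel({dim: positions})


# Dataset with the same layout as the example in this issue.
dates = np.arange('2005-02', '2005-06', dtype='datetime64[D]')
t = len(dates)
ds = xr.Dataset(
    {
        'floatVar': (('time', 'N'), np.random.rand(t, 100)),
        'indexVar': (('N',), np.arange(100)),
        'intVar': (('time', 'N'), np.random.randint(1, high=10, size=(t, 100))),
        'dateVar': (('time',), dates),
    }
)

filtered = filter_along_dim(ds, ds.indexVar > 50, 'N')
print(filtered)
```

On this dataset the result should have `N: 49`, with `indexVar` and `intVar` keeping their integer dtype and `dateVar` keeping only the `time` dimension. This does not cover conditions that vary along more than one dimension, where masking with `where` is still needed.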
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2969/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue