issues: 445175953

id: 445175953
node_id: MDU6SXNzdWU0NDUxNzU5NTM=
number: 2969
title: `where` function mis-broadcasts and alters data type on dataset
user: 8881170
state: closed
locked: 0
comments: 2
created_at: 2019-05-16T21:52:58Z
updated_at: 2019-05-20T16:30:02Z
closed_at: 2019-05-20T16:30:02Z
author_association: CONTRIBUTOR

Code Sample, a copy-pastable example if possible

```python
import numpy as np
import xarray as xr

# generate data
dateVar = np.arange('2005-02', '2005-06', dtype='datetime64[D]')
t = len(dateVar)
floatVar = np.random.rand(t, 100)
indexVar = np.arange(100)
intVar = np.random.randint(1, high=10, size=(t, 100))

# create dataset
A = xr.DataArray(floatVar, dims=['time', 'N'])
A.name = 'floatVar'
B = xr.DataArray(indexVar, dims=['N'])
B.name = 'indexVar'
C = xr.DataArray(intVar, dims=['time', 'N'])
C.name = 'intVar'
D = xr.DataArray(dateVar, dims=['time'])
D.name = 'dateVar'
ds = xr.merge([A, B, C, D])
print(ds)
# <xarray.Dataset>
# Dimensions:   (N: 100, time: 120)
# Dimensions without coordinates: N, time
# Data variables:
#     floatVar  (time, N) float64 0.4223 0.5019 0.8522 ... 0.9338 0.5833 0.09859
#     indexVar  (N) int64 0 1 2 3 4 5 6 7 8 9 10 ... 90 91 92 93 94 95 96 97 98 99
#     intVar    (time, N) int64 9 2 3 6 8 4 8 7 6 4 2 6 ... 3 1 8 3 8 3 5 3 1 6 7
#     dateVar   (time) datetime64[ns] 2005-02-01 2005-02-02 ... 2005-05-31

# apply where function
ds.where(ds.indexVar > 50, drop=True)
# <xarray.Dataset>
# Dimensions:   (N: 49, time: 120)
# Dimensions without coordinates: N, time
# Data variables:
#     floatVar  (time, N) float64 0.3381 0.04735 0.464 ... 0.5571 0.5297 0.8106
#     indexVar  (N) float64 51.0 52.0 53.0 54.0 55.0 ... 95.0 96.0 97.0 98.0 99.0
#     intVar    (time, N) float64 5.0 2.0 9.0 5.0 5.0 1.0 ... 1.0 6.0 5.0 4.0 3.0
#     dateVar   (time, N) datetime64[ns] 2005-02-01 2005-02-01 ... 2005-05-31
```

Problem description

This is motivated by a use-case of dimensions (Time, nParticle) for a Lagrangian particle simulation. In the above code snippet, I filter by some condition on indexVar (e.g., some type of particle).

For variables that share the dimension of `indexVar` (`N`), the condition broadcasts correctly and no dimensions change. `dateVar`, however, has only the `time` dimension, and it is expanded to include `N`.
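The dimension expansion can be mimicked in plain NumPy. This is only a sketch, assuming that `where` broadcasts each variable against the condition before masking; the array names (`dates`, `cond`, `masked`, `kept`) are illustrative stand-ins and do not come from xarray's internals.

```python
import numpy as np

# Illustrative stand-ins for dateVar (time,) and the condition on indexVar (N,).
dates = np.arange('2005-02', '2005-06', dtype='datetime64[D]')  # shape (120,)
cond = np.arange(100) > 50                                       # shape (100,)

# Broadcasting a (time,) array against an (N,) condition yields (time, N):
# each date is repeated once per particle, and masked entries become NaT.
masked = np.where(cond[np.newaxis, :], dates[:, np.newaxis],
                  np.datetime64('NaT', 'D'))

# Dropping the fully masked columns shrinks N to 49, but the extra axis stays.
kept = masked[:, cond]
print(dates.shape, masked.shape, kept.shape)
```

The one-dimensional input ends up two-dimensional even though none of its values depend on `N`.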

Further, data types are changed: `indexVar` and `intVar` go from `int64` to `float64`. In my use case, a variable of dtype `S64` was converted to `O` (object).
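A likely explanation for the integer case is that masked locations are filled with NaN, which has no integer representation. The NumPy sketch below shows that promotion next to boolean indexing, which drops elements instead of filling them. The names `int_var`, `masked`, and `selected` are hypothetical.

```python
import numpy as np

int_var = np.arange(100)   # integer data, similar to indexVar
cond = int_var > 50

# Masking needs a fill value; NaN exists only for floats, so the result
# is promoted to a floating-point dtype.
masked = np.where(cond, int_var, np.nan)
print(int_var.dtype, masked.dtype)

# Boolean indexing removes the unwanted elements, so no fill value is
# needed and the original dtype is kept.
selected = int_var[cond]
print(selected.dtype)
```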

So there are two major issues here:

1. `where` mis-broadcasts certain variables by adding the filtered dimension (in this case, `N` is added to `dateVar`). It should instead ignore variables that don't contain the dimension being filtered.
2. `where` changes the data types of variables after the filter is applied.
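One possible workaround, offered as a sketch and not as the maintainers' resolution, is to index along `N` by position instead of masking. `Dataset.isel` only touches variables that have the indexed dimension, and no fill value is involved. The names `keep` and `subset` are hypothetical.

```python
import numpy as np
import xarray as xr

dateVar = np.arange('2005-02', '2005-06', dtype='datetime64[D]')
t = len(dateVar)
ds = xr.Dataset(
    {
        'floatVar': (('time', 'N'), np.random.rand(t, 100)),
        'indexVar': (('N',), np.arange(100)),
        'intVar': (('time', 'N'), np.random.randint(1, high=10, size=(t, 100))),
        'dateVar': (('time',), dateVar),
    }
)

# Turn the boolean condition into integer positions along N, then index.
keep = np.flatnonzero(ds['indexVar'].values > 50)
subset = ds.isel(N=keep)

print(subset['dateVar'].dims)    # dateVar has no N dimension to index
print(subset['indexVar'].dtype)  # no NaN fill, so no promotion to float
```

This only covers conditions that depend on a single dimension; a condition that varies over both `time` and `N` still needs masking.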

Expected Output

```python
<xarray.Dataset>
Dimensions:   (N: 49, time: 120)
Dimensions without coordinates: N, time
Data variables:
    floatVar  (time, N) float64 0.3381 0.04735 0.464 ... 0.5571 0.5297 0.8106
    indexVar  (N) int64 51 52 53 54 55 ... 95 96 97 98 99
    intVar    (time, N) int64 5 2 9 5 5 1 ... 1 6 5 4 3
    dateVar   (time) datetime64[ns] 2005-02-01 2005-02-02 ... 2005-05-31
```

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Mar 27 2019, 16:54:48) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.3
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.2.2
distributed: 1.28.0
matplotlib: 3.0.3
cartopy: 0.17.0
seaborn: None
setuptools: 41.0.1
pip: 19.1.1
conda: None
pytest: 4.5.0
IPython: 7.5.0
sphinx: None

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2969/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
