xr.DataArray.where sets valid points to nan when using several dask chunks

Issue #3225 on pydata/xarray · opened 2019-08-17 by climachine · closed as completed 2022-04-18 · 3 comments

MCVE Code Sample

I am trying to randomly delete a fraction of an xr.DataArray (see the identical StackOverflow question) and subsequently access only those values of the original data that were deleted.

This works fine as long as the data is not stored in dask arrays, or is stored in a single dask array. As soon as I define chunks smaller than the total size of the data, the original values are set to nan.

```python
import numpy as np
import xarray as xr

data = xr.DataArray(np.arange(5 * 5 * 5.).reshape(5, 5, 5),
                    dims=('time', 'latitude', 'longitude'))
data.to_netcdf('/path/to/file.nc')

data = xr.open_dataarray('/path/to/file.nc', chunks={'time': 5})  # produces expected output
data = xr.open_dataarray('/path/to/file.nc', chunks={'time': 2})  # produces observed output

def set_fraction_randomly_to_nan(data, frac_missing):
    np.random.seed(0)
    data[np.random.rand(*data.shape) < frac_missing] = np.nan
    return data

data_lost = xr.apply_ufunc(set_fraction_randomly_to_nan, data.copy(deep=True),
                           input_core_dims=[['latitude', 'longitude']],
                           output_core_dims=[['latitude', 'longitude']],
                           dask='parallelized', output_dtypes=[data.dtype],
                           kwargs={'frac_missing': 0.5})

print(data[0, -4:, -4:].values)
# >>
# [[ 6.  7.  8.  9.]
#  [11. 12. 13. 14.]
#  [16. 17. 18. 19.]
#  [21. 22. 23. 24.]]

print(data.where(np.isnan(data_lost), 0)[0, -4:, -4:].values)
```

Expected Output

expected output of the last line: keep all values where np.isnan(data_lost) is True and set the rest to zero

```python
[[ 6.  0.  0.  9.]
 [ 0.  0.  0. 14.]
 [16.  0.  0.  0.]
 [ 0. 22.  0. 24.]]
```
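For reference, the expected output can be reproduced without xarray or dask at all; a minimal pure-NumPy sketch of the same seeded masking logic (shape, seed, and fraction taken from the MCVE above):

```python
import numpy as np

# Rebuild the MCVE's array and apply the same seeded deletion, dask-free.
data = np.arange(5 * 5 * 5.).reshape(5, 5, 5)

np.random.seed(0)
lost = data.copy()
lost[np.random.rand(*lost.shape) < 0.5] = np.nan  # "delete" ~50% of points

# Keep the original value where a point was deleted, zero elsewhere.
result = np.where(np.isnan(lost), data, 0)
print(result[0, -4:, -4:])
# [[ 6.  0.  0.  9.]
#  [ 0.  0.  0. 14.]
#  [16.  0.  0.  0.]
#  [ 0. 22.  0. 24.]]
```

This is exactly the result the chunks={'time': 5} path produces, which is why it is taken as the expected output.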

Problem Description

observed output of the last line: the values where np.isnan(data_lost) is True come out as nan instead of being kept, while the rest are set to zero

```python
[[nan  0.  0. nan]
 [ 0.  0.  0. nan]
 [nan  0.  0.  0.]
 [ 0. nan  0. nan]]
```
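Since the report notes that the behaviour is correct when the data is not split across several dask chunks, one possible interim workaround (a sketch only, not a fix for the underlying bug) is to materialise the values as plain NumPy before building the mask, at the cost of dask's laziness. The in-memory setup below is a hypothetical stand-in for the netCDF file from the MCVE:

```python
import numpy as np
import xarray as xr

# In-memory stand-in for the file from the MCVE, so the sketch is
# self-contained (no file I/O, no dask chunks).
data = xr.DataArray(np.arange(5 * 5 * 5.).reshape(5, 5, 5),
                    dims=('time', 'latitude', 'longitude'))

# Apply the same seeded deletion to a plain NumPy copy of the values.
vals = data.values.copy()
np.random.seed(0)
vals[np.random.rand(*vals.shape) < 0.5] = np.nan

# Wrap the mask in a DataArray so .where aligns it by dimension name.
mask = xr.DataArray(np.isnan(vals), dims=data.dims)
print(data.where(mask, 0)[0, -4:, -4:].values)
# [[ 6.  0.  0.  9.]
#  [ 0.  0.  0. 14.]
#  [16.  0.  0.  0.]
#  [ 0. 22.  0. 24.]]
```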

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.138-59-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: de_DE.UTF-8
xarray: 0.10.2
pandas: 0.22.0
numpy: 1.14.2
scipy: 1.0.1
netCDF4: 1.3.1
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: 1.0.0
dask: 0.17.2
distributed: 1.21.5
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: 0.8.1
setuptools: 39.0.1
pip: 9.0.3
conda: None
pytest: 3.5.0
IPython: 6.3.1
sphinx: 1.7.2
```

Powered by Datasette · About: xarray-datasette