issues: 462859457

id: 462859457
node_id: MDU6SXNzdWU0NjI4NTk0NTc=
number: 3068
title: Multidimensional dask coordinates unexpectedly computed
user: 1828519
state: closed
locked: 0
comments: 8
created_at: 2019-07-01T18:52:03Z
updated_at: 2019-11-05T15:41:15Z
closed_at: 2019-11-05T15:41:15Z
author_association: CONTRIBUTOR

MCVE Code Sample

```python
from dask.diagnostics import ProgressBar
import xarray as xr
import numpy as np
import dask.array as da

a = xr.DataArray(
    da.zeros((10, 10), chunks=2),
    dims=('y', 'x'),
    coords={
        'y': np.arange(10),
        'x': np.arange(10),
        'lons': (('y', 'x'), da.zeros((10, 10), chunks=2)),
    },
)
b = xr.DataArray(
    da.zeros((10, 10), chunks=2),
    dims=('y', 'x'),
    coords={
        'y': np.arange(10),
        'x': np.arange(10),
        'lons': (('y', 'x'), da.zeros((10, 10), chunks=2)),
    },
)

with ProgressBar():
    c = a + b
```

Output:

[########################################] | 100% Completed | 0.1s

Problem Description

Using DataArrays with 2D dask array coordinates causes those coordinates to be computed during any binary operation (anything that combines two or more DataArrays). The ProgressBar in the example above is there to show when the coordinates are computed.

In my own work, once I learned that 2D dask coordinates were possible, I started adding longitude and latitude coordinates. These are rather large and can take a while to load or compute, so I was surprised that simple operations (e.g., a.fillna(b)) triggered computation and took a long time.
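One possible way to sidestep the comparison is sketched below. It is not taken from this issue and is not a confirmed fix: it assumes both arrays are known to share the same 'lons' coordinate, so that coordinate can be dropped before the operation and re-attached afterwards without ever being evaluated. The helper name make_array is hypothetical.

```python
import numpy as np
import dask.array as da
import xarray as xr

# Stand-in for a large, expensive-to-compute longitude array.
lons = da.zeros((10, 10), chunks=2)


def make_array(lons):
    """Hypothetical helper: build a dask-backed DataArray with a 2D 'lons' coordinate."""
    return xr.DataArray(
        da.zeros((10, 10), chunks=2),
        dims=('y', 'x'),
        coords={
            'y': np.arange(10),
            'x': np.arange(10),
            'lons': (('y', 'x'), lons),
        },
    )


a = make_array(lons)
b = make_array(lons)

# Drop the non-index coordinate so the operation has no 2D coordinate to compare.
a_bare = a.reset_coords('lons', drop=True)
b_bare = b.reset_coords('lons', drop=True)
c = a_bare.fillna(b_bare)

# Re-attach the still-lazy coordinate. This is only valid under the assumption
# that a and b really do carry identical 'lons' values.
c = c.assign_coords(lons=(('y', 'x'), a.coords['lons'].data))
```

The trade-off is that the equality check xarray would otherwise perform on 'lons' is skipped entirely, so the caller takes responsibility for the two coordinates matching.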

Is this computation by design or a possible bug?

Expected Output

No output from the ProgressBar, hoping that no coordinates would be computed/loaded.
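The expectation above can also be checked programmatically rather than by watching for a progress bar. The following is a minimal sketch, assuming dask's default local scheduler (not a distributed client); ComputeCounter and make_array are hypothetical names introduced here for illustration. A count of zero would mean nothing was computed; the actual value depends on the installed xarray and dask versions.

```python
import numpy as np
import dask.array as da
import xarray as xr
from dask.callbacks import Callback


class ComputeCounter(Callback):
    """Hypothetical helper: counts how many times a dask graph is executed."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def _start(self, dsk):
        # Called by the local scheduler once at the start of each graph execution.
        self.count += 1


def make_array():
    """Hypothetical helper: same construction as in the MCVE above."""
    return xr.DataArray(
        da.zeros((10, 10), chunks=2),
        dims=('y', 'x'),
        coords={
            'y': np.arange(10),
            'x': np.arange(10),
            'lons': (('y', 'x'), da.zeros((10, 10), chunks=2)),
        },
    )


a, b = make_array(), make_array()

with ComputeCounter() as counter:
    c = a + b

print("dask graph executions during a + b:", counter.count)
```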

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.7 | packaged by conda-forge | (default, Feb 28 2019, 02:16:08) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2
xarray: 0.12.1
pandas: 0.24.2
numpy: 1.14.3
scipy: 1.3.0
netCDF4: 1.5.1.2
pydap: None
h5netcdf: 0.7.4
h5py: 2.9.0
Nio: None
zarr: 2.3.2
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: 1.0.22
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.0.0
distributed: 2.0.0
matplotlib: 3.1.0
cartopy: 0.17.1.dev147+HEAD.detached.at.5e624fe
seaborn: None
setuptools: 41.0.1
pip: 19.1.1
conda: None
pytest: 4.6.3
IPython: 7.5.0
sphinx: 2.1.2
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3068/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue