home / github / issues

Menu
  • GraphQL API
  • Search all tables

issues: 601224688

This data as json

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
601224688 MDU6SXNzdWU2MDEyMjQ2ODg= 3977 xr.full_like (often) fails when other is chunked and fill_value is non-scalar 13662783 closed 0     1 2020-04-16T16:23:59Z 2020-04-24T07:15:44Z 2020-04-24T07:15:44Z CONTRIBUTOR      

I've been running into some issues when using xr.full_like, when my other.data is a chunked dask array, and the fill_value is a numpy array.

Now, I just checked, full_like mentions only scalar in the signature. However, this is a very convenient way to get all the coordinates and dimensions attached to an array like this, so it feels like desirable functionality. And as I mention below, both numpy and dask function similary, taking much more than just scalars. https://xarray.pydata.org/en/stable/generated/xarray.full_like.html

MCVE Code Sample

python x = [1, 2, 3, 4] y = [1, 2, 3] da1 = xr.DataArray(dask.array.ones((3, 4), chunks=(1, 4)), {"y": y, "x": x}, ("y", "x")) da2 = xr.full_like(da1, np.ones((3, 4))) print(da2.values)

This results in an error: ValueError: could not broadcast input array from shape (1,3) into shape (1,4)

Expected Output

Expected is a DataArray with the dimensions and coords of other, and the numpy array of fill_value as its data.

Problem Description

The issue lies here: https://github.com/pydata/xarray/blob/2c77eb531b6689f9f1d2adbde0d8bf852f1f7362/xarray/core/common.py#L1420-L1436

Calling dask.array.full with the given number of chunks results in it trying to to apply the fill_value for every individual chunk.

As one would expect, if I set fill_value to the size of a single chunk it doesn't error: python da2 = xr.full_like(da1, np.ones((1, 4))) print(da2.values)

It does fail on a similarly chunked dask array (since it's applying it for every chunk): python da2 = xr.full_like(da1, dask.array.ones((3, 4))) print(da2.values)

The most obvious solution would be to force it down the np.full_like route, since all the values already exist in memory anyway. So maybe another type check does the trick. However, full() accepts quite a variety of arguments for the fill value (scalars, numpy arrays, lists, tuples, ranges). The dask docs mention only a scalar in the signature for dask.array.full: https://docs.dask.org/en/latest/array-api.html#dask.array.full As does numpy.full: https://docs.scipy.org/doc/numpy/reference/generated/numpy.full.html

However, in all cases, they still broadcast automatically... ```python a = np.full((2, 2), [1, 2]

array([[1, 2], [1, 2]]) ```

So kind of undefined behavior of a blocked full?

Versions

Output of `xr.show_versions()` INSTALLED VERSIONS ------------------ commit: None python: 3.7.6 | packaged by conda-forge | (default, Jan 7 2020, 21:48:41) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel byteorder: little LC_ALL: None LANG: en LOCALE: None.None libhdf5: 1.10.5 libnetcdf: 4.7.3 xarray: 0.15.1 pandas: 0.25.3 numpy: 1.17.5 scipy: 1.3.1 netCDF4: 1.5.3 pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: 2.4.0 cftime: 1.0.4.2 nc_time_axis: None PseudoNetCDF: None rasterio: 1.1.2 cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.9.2 distributed: 2.10.0 matplotlib: 3.1.2 cartopy: None seaborn: 0.10.0 numbagg: None setuptools: 46.1.3.post20200325 pip: 20.0.2 conda: None pytest: 5.3.4 IPython: 7.13.0 sphinx: 2.3.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3977/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed 13221727 issue

Links from other tables

  • 1 row from issues_id in issues_labels
  • 1 row from issue in issue_comments
Powered by Datasette · Queries took 0.565ms · About: xarray-datasette