issues: 433916353
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
433916353 | MDU6SXNzdWU0MzM5MTYzNTM= | 2902 | DataArray sum().values depends on chunk size | 15570875 | closed | 0 | 1 | 2019-04-16T18:09:33Z | 2019-04-17T02:01:55Z | 2019-04-17T02:01:55Z | NONE | Hi,

The code below creates a Dataset with an

While I'm not surprised at these round-off differences, I could not find mention of such behavior in the xarray documentation. Is this feature known to xarray developers? Do xarray developers consider it a feature or a bug? Either way, I think it would be useful if the xarray documentation would mention that the results of some operations depend on chunk size.

code:

```
import numpy as np
import xarray as xr

N = 128
val = 1.9
val_array = np.full((N, N, N), val)
exact_sum = N * N * N * val

ds = xr.DataArray(val_array, name='val_array', dims=['x', 'y', 'z']).to_dataset()
rel_diff = (ds['val_array'].sum().values - exact_sum) / exact_sum
print('no chunking, rel_diff = %e' % rel_diff)

for chunk_x in [N//16, N//4, N]:
    for chunk_y in [N//16, N//4, N]:
        for chunk_z in [N//16, N//4, N]:
            ds2 = ds.chunk({'x': chunk_x, 'y': chunk_y, 'z': chunk_z})
            rel_diff = (ds2['val_array'].sum().values - exact_sum) / exact_sum
            print('chunk_x = %3d, chunk_y = %3d, chunk_z = %3d, rel_diff = %e' \
                  % (chunk_x, chunk_y, chunk_z, rel_diff))
```

results:
Output of
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-693.21.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2
xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 1.1.5
distributed: 1.26.1
matplotlib: 3.0.3
cartopy: None
seaborn: None
setuptools: 40.8.0
pip: 19.0.3
conda: None
pytest: 4.3.1
IPython: 7.4.0
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2902/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
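
The round-off differences described in the issue body can be illustrated without xarray or dask. The sketch below is not how either library performs its reduction; it only assumes that a chunked sum adds up each chunk separately and then combines the partial sums, which changes the order of the floating-point additions. The helper names `naive_sum` and `chunked_sum` are hypothetical and exist only for this illustration.

```python
import math


def naive_sum(values):
    """Add the values strictly left to right in double precision."""
    total = 0.0
    for value in values:
        total += value
    return total


def chunked_sum(values, chunk_size):
    """Sum each chunk on its own, then add up the per-chunk partial sums."""
    partials = [
        naive_sum(values[start:start + chunk_size])
        for start in range(0, len(values), chunk_size)
    ]
    return naive_sum(partials)


# Floating-point addition is not associative, so the grouping matters.
left_to_right = (0.1 + 0.2) + 0.3
regrouped = 0.1 + (0.2 + 0.3)
print(left_to_right == regrouped)  # False

values = [1.9] * 4096
reference = math.fsum(values)  # correctly rounded sum, used as the baseline

# Different chunk sizes group the additions differently, so the relative
# differences may vary from one chunk size to the next.
for chunk_size in (1, 16, 256, 4096):
    rel_diff = (chunked_sum(values, chunk_size) - reference) / reference
    print('chunk_size = %4d, rel_diff = %e' % (chunk_size, rel_diff))
```

Any differences between chunk sizes here are on the order of machine precision, which is the kind of effect the issue reports for `sum()` on chunked arrays.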