issues: 732910109

  • id: 732910109
  • node_id: MDU6SXNzdWU3MzI5MTAxMDk=
  • number: 4554
  • title: Unexpected chunking of 3d DataArray in `polyfit()`
  • user: 26591824
  • state: open
  • locked: 0
  • comments: 3
  • created_at: 2020-10-30T06:07:34Z
  • updated_at: 2021-04-19T15:44:07Z
  • author_association: CONTRIBUTOR

What happened: When running polyfit() on a 3d chunked xarray DataArray, the output is chunked differently than the input array.

What you expected to happen: I expect the output to have the same chunking as the input.

Minimal Complete Verifiable Example: (from @rabernat in https://github.com/xgcm/xrft/issues/116)

Example: number of chunks decreases

```python
import dask.array as dsa
import xarray as xr

nz, ny, nx = (10, 20, 30)
data = dsa.ones((nz, ny, nx), chunks=(1, 5, nx))
da = xr.DataArray(data, dims=['z', 'y', 'x'])
da.chunks
# -> ((1, 1, 1, 1, 1, 1, 1, 1, 1, 1), (5, 5, 5, 5), (30,))

pf = da.polyfit('x', 1)
pf.polyfit_coefficients.chunks
# -> ((1, 1, 1, 1, 1, 1, 1, 1, 1, 1), (20,), (30,))
# chunks on the y dimension have been consolidated!

pv = xr.polyval(da.x, pf.polyfit_coefficients).transpose('z', 'y', 'x')
pv.chunks
# -> ((1, 1, 1, 1, 1, 1, 1, 1, 1, 1), (20,), (30,))
# and this propagates to polyval

# align back against the original data
(da - pv).chunks
# -> ((1, 1, 1, 1, 1, 1, 1, 1, 1, 1), (5, 5, 5, 5), (30,))
# hides the fact that we have chunk consolidation happening upstream
```

Example: number of chunks increases

```python
nz, ny, nx = (6, 10, 4)
data = dsa.ones((nz, ny, nx), chunks=(2, 10, 2))
da = xr.DataArray(data, dims=['z', 'y', 'x'])
da.chunks
# -> ((2, 2, 2), (10,), (2, 2))

pf = da.polyfit('y', 1)
pf.polyfit_coefficients.chunks
# -> ((2,), (1, 1, 1, 1, 1, 1), (4,))

pv = xr.polyval(da.y, pf.polyfit_coefficients).transpose('z', 'y', 'x')
pv.chunks
# -> ((1, 1, 1, 1, 1, 1), (10,), (4,))

(da - pv).chunks
# -> ((1, 1, 1, 1, 1, 1), (10,), (2, 2))
```

(This discussion started in https://github.com/xgcm/xrft/issues/116 with @rabernat and @navidcy.)
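
To make the mismatch easier to see at a glance, here is a minimal sketch that compares the chunk layouts reported above dimension by dimension and counts the blocks each one implies. It uses only the `.chunks` tuples quoted in the two examples, so it runs without xarray or dask. `n_chunks` and `chunk_mismatches` are hypothetical helpers written for this illustration, not part of either library.

```python
from math import prod


def n_chunks(chunks):
    """Total number of blocks implied by a dask-style ``.chunks`` tuple."""
    return prod(len(sizes) for sizes in chunks)


def chunk_mismatches(dims, chunks_a, chunks_b):
    """Names of the dimensions whose chunk sizes differ between two layouts."""
    return [d for d, a, b in zip(dims, chunks_a, chunks_b) if a != b]


dims = ("z", "y", "x")

# First example: layouts of `da` and `pv` as reported in the issue.
da1 = ((1,) * 10, (5, 5, 5, 5), (30,))
pv1 = ((1,) * 10, (20,), (30,))

# Second example: layouts of `da`, `pv`, and `da - pv` as reported in the issue.
da2 = ((2, 2, 2), (10,), (2, 2))
pv2 = ((1,) * 6, (10,), (4,))
diff2 = ((1,) * 6, (10,), (2, 2))

print(chunk_mismatches(dims, da1, pv1), n_chunks(da1), n_chunks(pv1))
# ['y'] 40 10
print(chunk_mismatches(dims, da2, pv2), n_chunks(da2), n_chunks(pv2), n_chunks(diff2))
# ['z', 'x'] 6 6 12
```

In the first example only `y` is rechunked and the block count drops from 40 to 10. In the second, `z` is split and `x` is merged, so `pv` still has 6 blocks, but the final difference `da - pv` ends up with 12.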

Environment:

Running on Pangeo Cloud

Output of `xr.show_versions()`:

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.6 | packaged by conda-forge | (default, Oct 7 2020, 19:08:05) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 4.19.112+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C.UTF-8
LANG: C.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.16.1
pandas: 1.1.3
numpy: 1.19.2
scipy: 1.5.2
netCDF4: 1.5.4
pydap: installed
h5netcdf: 0.8.1
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: 1.2.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: 1.1.7
cfgrib: 0.9.8.4
iris: None
bottleneck: 1.3.2
dask: 2.30.0
distributed: 2.30.0
matplotlib: 3.3.2
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: 0.16.1
setuptools: 49.6.0.post20201009
pip: 20.2.3
conda: None
pytest: 6.1.1
IPython: 7.18.1
sphinx: 3.2.1

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4554/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

  • repo: 13221727
  • type: issue
