issues: 1037894157
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1037894157 | I_kwDOAMm_X8493QIN | 5902 | Slow performance of `DataArray.unstack()` from checking `variable.data` | 1312546 | closed | 0 | 4 | 2021-10-27T21:54:48Z | 2021-10-29T15:21:24Z | 2021-10-29T15:21:24Z | MEMBER | What happened: Calling What you expected to happen: Faster unstack. Minimal Complete Verifiable Example: ```python import pandas as pd import numpy as np import xarray as xr t = pd.date_range("2000", periods=2) x = np.arange(1000) y = np.arange(1000) component = np.arange(4) idx = pd.MultiIndex.from_product([t, y, x], names=["time", "y", "x"]) data = np.random.uniform(size=(len(idx), len(component))) arr = xr.DataArray( data, coords={"pixel": xr.DataArray(idx, name="pixel", dims="pixel"), "component": xr.DataArray(component, name="component", dims="component")}, dims=("pixel", "component") ) %time _ = arr.unstack() CPU times: user 6.33 s, sys: 295 ms, total: 6.62 s Wall time: 6.62 s ``` Anything else we need to know?: For this example, >99% of the time is spent on this line: https://github.com/pydata/xarray/blob/df7646182b17d829fe9b2199aebf649ddb2ed480/xarray/core/dataset.py#L4162, specifically on the call to Just going by the comments, it does seem like accessing Alternatively, if that's too difficult, perhaps we could add a flag to Environment: Output of <tt>xr.show_versions()</tt> ``` INSTALLED VERSIONS ------------------ commit: None python: 3.8.12 | packaged by conda-forge | (default, Sep 29 2021, 19:52:28) [GCC 9.4.0] python-bits: 64 OS: Linux OS-release: 5.4.0-1040-azure machine: x86_64 processor: x86_64 byteorder: little LC_ALL: C.UTF-8 LANG: C.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.1 libnetcdf: 4.8.1 xarray: 0.19.0 pandas: 1.3.3 numpy: 1.20.0 scipy: 1.7.1 netCDF4: 1.5.7 pydap: installed h5netcdf: 0.11.0 h5py: 3.4.0 Nio: None zarr: 2.10.1 cftime: 1.5.1 nc_time_axis: 1.3.1 PseudoNetCDF: None rasterio: 1.2.9 cfgrib: 0.9.9.0 iris: None bottleneck: 1.3.2 dask: 2021.08.1 distributed: 2021.08.1 matplotlib: 3.4.3 cartopy: 0.20.0 seaborn: 0.11.2 numbagg: None pint: 0.17 setuptools: 58.0.4 pip: 20.3.4 conda: None pytest: None IPython: 7.28.0 sphinx: None ``` |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5902/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |