issues: 782943813
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
782943813 | MDU6SXNzdWU3ODI5NDM4MTM= | 4789 | Poor performance of repr of large arrays, particularly jupyter repr | 5635139 | closed | 0 | | | 5 | 2021-01-11T00:28:24Z | 2021-01-29T23:05:58Z | 2021-01-29T23:05:58Z | MEMBER | | | | What happened: The What you expected to happen: We should really focus on having good repr performance, given how essential it is to any REPL workflow. Minimal Complete Verifiable Example: ```python In [10]: import xarray as xr ...: import numpy as np ...: import pandas as pd In [11]: idx = pd.MultiIndex.from_product([range(10_000), range(10_000)]) In [12]: df = pd.DataFrame(range(100_000_000), index=idx) In [13]: da = xr.DataArray(df) In [14]: da Out[14]: <xarray.DataArray (dim_0: 100000000, dim_1: 1)> array([[ 0], [ 1], [ 2], ..., [99999997], [99999998], [99999999]]) Coordinates: * dim_0 (dim_0) MultiIndex - dim_0_level_0 (dim_0) int64 0 0 0 0 0 0 0 ... 9999 9999 9999 9999 9999 9999 - dim_0_level_1 (dim_0) int64 0 1 2 3 4 5 6 ... 9994 9995 9996 9997 9998 9999 * dim_1 (dim_1) int64 0 In [26]: %timeit repr(da) 1.87 s ± 7.33 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) In [27]: %timeit da._repr_html_() 4.78 s ± 1.8 s per loop (mean ± std. dev. of 7 runs, 1 loop each) ``` Environment: Output of <tt>xr.show_versions()</tt> INSTALLED VERSIONS ------------------ commit: None python: 3.8.7 (default, Dec 30 2020, 10:13:08) [Clang 12.0.0 (clang-1200.0.32.28)] python-bits: 64 OS: Darwin OS-release: 19.6.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: None libnetcdf: None xarray: 0.16.3.dev48+gbf0fe2ca pandas: 1.1.3 numpy: 1.19.2 scipy: 1.5.3 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.5.0 cftime: 1.2.1 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.30.0 distributed: None matplotlib: 3.3.2 cartopy: None seaborn: 0.11.0 numbagg: installed pint: 0.16.1 setuptools: 51.1.1 pip: 20.3.3 conda: None pytest: 6.1.1 IPython: 7.19.0 sphinx: None | { "url": "https://api.github.com/repos/pydata/xarray/issues/4789/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | completed | 13221727 | issue |
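
The timings quoted in the issue body come from IPython's `%timeit`. As a rough sketch for reproducing that kind of measurement outside IPython, the standard-library `timeit` module can be wrapped as below. The helper name `time_call` and the list used as a stand-in object are assumptions for illustration and are not part of the issue; for the reported case one would pass `lambda: repr(da)` or `lambda: da._repr_html_()` instead.

```python
import timeit
from statistics import mean, stdev


def time_call(func, repeats=7, number=1):
    """Return (mean, stdev) in seconds per call.

    Hypothetical helper, loosely mirroring the summary %timeit prints.
    `repeats` must be at least 2 so a standard deviation can be computed.
    """
    totals = timeit.repeat(func, repeat=repeats, number=number)
    per_call = [t / number for t in totals]
    return mean(per_call), stdev(per_call)


# Stand-in object (assumption); swap in the DataArray from the issue if available.
data = list(range(100_000))
avg, sd = time_call(lambda: repr(data))
print(f"{avg:.6f} s ± {sd:.6f} s per loop (mean ± std. dev. of 7 runs, 1 loop each)")
```

The numbers produced this way are not directly comparable to `%timeit`, which chooses the loop count automatically, but they are enough to check whether a change to the repr code moves the timing in the right direction.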