issues: 775322346
This data as json
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
775322346 | MDU6SXNzdWU3NzUzMjIzNDY= | 4736 | Limit number of data variables shown in repr | 14371165 | closed | 0 | 2 | 2020-12-28T10:15:26Z | 2021-01-04T02:13:52Z | 2021-01-04T02:13:52Z | MEMBER | What happened: xarray feels very unresponsive when using datasets with >2000 data variables because it has to print all the 2000 variables everytime you print something to console. What you expected to happen: xarray should limit the number of variables printed to console. Maximum maybe 25? Same idea probably apply to dimensions, coordinates and attributes as well, pandas only shows 2 for reference, the first and last variables. Minimal Complete Verifiable Example: ```python import numpy as np import xarray as xr a = np.arange(0, 2000) b = np.core.defchararray.add("long_variable_name", a.astype(str)) data_vars = dict() for v in b: data_vars[v] = xr.DataArray( name=v, data=[3, 4], dims=["time"], coords=dict(time=[0, 1]) ) ds = xr.Dataset(data_vars) Everything above feels fast. Printing to console however takes about 13 seconds for me:print(ds) ``` Anything else we need to know?: Out of scope brainstorming: Though printing 2000 variables is probably madness for most people it is kind of nice to show all variables because you sometimes want to know what happened to a few other variables as well. Is there already an easy and fast way to create subgroup of the dataset, so we don' have to rely on the dataset printing everything to the console everytime? Environment: Output of <tt>xr.show_versions()</tt>xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 libhdf5: 1.10.4 libnetcdf: None xarray: 0.16.2 pandas: 1.1.5 numpy: 1.17.5 scipy: 1.4.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2020.12.0 distributed: 2020.12.0 matplotlib: 3.3.2 cartopy: None seaborn: 0.11.1 numbagg: None pint: None setuptools: 51.0.0.post20201207 pip: 20.3.3 conda: 4.9.2 pytest: 6.2.1 IPython: 7.19.0 sphinx: 3.4.0 |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4736/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |