home / github / issues

Menu
  • GraphQL API
  • Search all tables

issues: 775322346

This data as json

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
775322346 MDU6SXNzdWU3NzUzMjIzNDY= 4736 Limit number of data variables shown in repr 14371165 closed 0     2 2020-12-28T10:15:26Z 2021-01-04T02:13:52Z 2021-01-04T02:13:52Z MEMBER      

What happened: xarray feels very unresponsive when using datasets with >2000 data variables because it has to print all the 2000 variables everytime you print something to console.

What you expected to happen: xarray should limit the number of variables printed to console. Maximum maybe 25? Same idea probably apply to dimensions, coordinates and attributes as well,

pandas only shows 2 for reference, the first and last variables.

Minimal Complete Verifiable Example:

```python import numpy as np import xarray as xr

a = np.arange(0, 2000) b = np.core.defchararray.add("long_variable_name", a.astype(str)) data_vars = dict() for v in b: data_vars[v] = xr.DataArray( name=v, data=[3, 4], dims=["time"], coords=dict(time=[0, 1]) ) ds = xr.Dataset(data_vars)

Everything above feels fast. Printing to console however takes about 13 seconds for me:

print(ds) ```

Anything else we need to know?: Out of scope brainstorming: Though printing 2000 variables is probably madness for most people it is kind of nice to show all variables because you sometimes want to know what happened to a few other variables as well. Is there already an easy and fast way to create subgroup of the dataset, so we don' have to rely on the dataset printing everything to the console everytime?

Environment:

Output of <tt>xr.show_versions()</tt> xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 libhdf5: 1.10.4 libnetcdf: None xarray: 0.16.2 pandas: 1.1.5 numpy: 1.17.5 scipy: 1.4.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2020.12.0 distributed: 2020.12.0 matplotlib: 3.3.2 cartopy: None seaborn: 0.11.1 numbagg: None pint: None setuptools: 51.0.0.post20201207 pip: 20.3.3 conda: 4.9.2 pytest: 6.2.1 IPython: 7.19.0 sphinx: 3.4.0
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4736/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed 13221727 issue

Links from other tables

  • 0 rows from issues_id in issues_labels
  • 2 rows from issue in issue_comments
Powered by Datasette · Queries took 78.281ms · About: xarray-datasette