issues

4 rows where "created_at" is on date 2022-06-24 and user = 2448579 sorted by updated_at descending

type 2

  • pull 3
  • issue 1

state 1

  • closed 4

repo 1

  • xarray 4
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1284094480 I_kwDOAMm_X85MiboQ 6722 Avoid loading any data for reprs dcherian 2448579 closed 0     5 2022-06-24T19:04:30Z 2022-10-28T16:23:20Z 2022-10-28T16:23:20Z MEMBER      

What happened?

For "small" datasets, we load the data into memory when displaying the repr. For cloud-backed datasets with a large number of "small" variables, this can take a long time, because O(100) variables are loaded sequentially just for a repr.

https://github.com/pydata/xarray/blob/6c8db5ed005e000b35ad8b6ea9080105e608e976/xarray/core/formatting.py#L548-L549

What did you expect to happen?

Fast reprs!

Minimal Complete Verifiable Example

This dataset has 48 "small" variables:

```Python
import xarray as xr

dc1 = xr.open_dataset(
    "s3://its-live-data/datacubes/v02/N40E080/ITS_LIVE_vel_EPSG32645_G0120_X250000_Y4750000.zarr",
    engine="zarr",
    storage_options={"anon": True},
)
dc1._repr_html_()
```

On 2022.03.0 this repr takes 36.4s. If I comment out the array.size condition, I get 6μs.
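
To illustrate why a size-based check can be costly here, the following is a minimal sketch, not xarray's actual code. `LazyArray`, `SIZE_THRESHOLD`, and both repr functions are hypothetical names invented for this example, and the threshold value is an assumption. It counts the simulated remote reads triggered by a repr over 48 small lazy variables, with and without the size condition.

```python
class LazyArray:
    """Hypothetical stand-in for a remote, lazily loaded variable (not an xarray class)."""

    def __init__(self, name, size, dtype="float32"):
        self.name = name
        self.size = size
        self.dtype = dtype
        self._values = None
        self.loads = 0  # number of simulated remote reads

    @property
    def in_memory(self):
        return self._values is not None

    def load(self):
        # In the real scenario this would be a slow network round trip.
        if self._values is None:
            self.loads += 1
            self._values = [0.0] * self.size
        return self._values


SIZE_THRESHOLD = 100_000  # assumed cut-off for "small"; illustrative value only


def summary(arr):
    """One-line description that needs only metadata, never the values."""
    return f"{arr.name}: [{arr.size} values with dtype={arr.dtype}]"


def repr_with_size_check(arr):
    """Show values for anything in memory OR below the threshold (forces a load)."""
    if arr.in_memory or arr.size < SIZE_THRESHOLD:
        return f"{arr.name}: {arr.load()[:3]} ..."
    return summary(arr)


def repr_without_loading(arr):
    """Show values only if they are already in memory; never trigger a load."""
    if arr.in_memory:
        return f"{arr.name}: {arr.load()[:3]} ..."
    return summary(arr)


def total_loads(repr_func, n_vars=48, size=10):
    """Build n_vars small lazy variables, repr each one, and count the loads."""
    variables = [LazyArray(f"var{i}", size) for i in range(n_vars)]
    for var in variables:
        repr_func(var)
    return sum(var.loads for var in variables)


loads_with_check = total_loads(repr_with_size_check)
loads_without = total_loads(repr_without_loading)
print(loads_with_check, loads_without)
```

In this sketch the size check causes one sequential load per variable, while the metadata-only repr causes none. When each load is a remote read, that difference dominates the time taken by the repr.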

MVCE confirmation

  • [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [x] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [x] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [x] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.4 | packaged by conda-forge | (main, Mar 24 2022, 17:43:32) [Clang 12.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 21.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2022.3.0
pandas: 1.4.2
numpy: 1.22.4
scipy: 1.8.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.11.3
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: None
iris: None
bottleneck: None
dask: 2022.05.2
distributed: None
matplotlib: 3.5.2
cartopy: 0.20.2
seaborn: 0.11.2
numbagg: None
fsspec: 2022.5.0
cupy: None
pint: None
sparse: None
setuptools: 62.3.2
pip: 22.1.2
conda: None
pytest: None
IPython: 8.4.0
sphinx: 4.5.0
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6722/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1284071791 PR_kwDOAMm_X846VIdv 6721 Fix .chunks loading lazy backed array data dcherian 2448579 closed 0     5 2022-06-24T18:45:45Z 2022-06-29T20:15:16Z 2022-06-29T20:06:36Z MEMBER   0 pydata/xarray/pulls/6721
  • [x] Closes #6538
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

@shoyer is there a way to test this?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6721/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1284257698 PR_kwDOAMm_X846VxFI 6724 [skip-ci] Add sphinx module directive dcherian 2448579 closed 0     0 2022-06-24T22:46:10Z 2022-06-26T02:00:51Z 2022-06-25T23:39:34Z MEMBER   0 pydata/xarray/pulls/6724

xref https://github.com/xarray-contrib/xarray-tutorial/pull/85#issuecomment-1159518073

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6724/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1284252630 PR_kwDOAMm_X846VwFj 6723 Better documentation of options dcherian 2448579 closed 0     0 2022-06-24T22:40:24Z 2022-06-25T20:01:18Z 2022-06-25T20:01:07Z MEMBER   0 pydata/xarray/pulls/6723
  • [x] Closes #1624, Closes #5699
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6723/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
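
The filter stated at the top of this page (created on 2022-06-24, user 2448579, sorted by updated_at descending) can be reproduced against this schema roughly as follows. This is a sketch under assumptions: it uses an in-memory SQLite database and only a subset of the columns, and it is assumed that the "on date" filter corresponds to `date(created_at) = ...`. The last inserted row is a hypothetical control row that does not come from the page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced version of the [issues] table: only the columns this query touches.
conn.execute(
    """
    CREATE TABLE [issues] (
        [id] INTEGER PRIMARY KEY,
        [number] INTEGER,
        [title] TEXT,
        [user] INTEGER,
        [created_at] TEXT,
        [updated_at] TEXT,
        [type] TEXT
    )
    """
)
conn.execute("CREATE INDEX [idx_issues_user] ON [issues] ([user])")

rows = [
    (1284094480, 6722, "Avoid loading any data for reprs", 2448579,
     "2022-06-24T19:04:30Z", "2022-10-28T16:23:20Z", "issue"),
    (1284071791, 6721, "Fix .chunks loading lazy backed array data", 2448579,
     "2022-06-24T18:45:45Z", "2022-06-29T20:15:16Z", "pull"),
    (1284257698, 6724, "[skip-ci] Add sphinx module directive", 2448579,
     "2022-06-24T22:46:10Z", "2022-06-26T02:00:51Z", "pull"),
    (1284252630, 6723, "Better documentation of options", 2448579,
     "2022-06-24T22:40:24Z", "2022-06-25T20:01:18Z", "pull"),
    # Hypothetical control row (not from the page); the filter should exclude it.
    (1, 1, "hypothetical unrelated row", 42,
     "2022-06-23T00:00:00Z", "2022-06-23T00:00:00Z", "issue"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps sort correctly as text, so ORDER BY on the TEXT column works.
query = """
    SELECT [number], [title], [type]
    FROM [issues]
    WHERE date([created_at]) = :day AND [user] = :user
    ORDER BY [updated_at] DESC
"""
matches = conn.execute(query, {"day": "2022-06-24", "user": 2448579}).fetchall()
numbers = [number for number, _title, _type in matches]
print(numbers)
```

The result order should match the row order shown on this page.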
Powered by Datasette · Queries took 4959.165ms · About: xarray-datasette