1 row where state = "open", type = "issue" and user = 58827984 sorted by updated_at descending

id: 928533488
node_id: MDU6SXNzdWU5Mjg1MzM0ODg=
number: 5521
title: Memory inefficiency when using sortby
user: ericgyounkin (58827984)
state: open
locked: 0
assignee:
milestone:
comments: 0
created_at: 2021-06-23T18:31:18Z
updated_at: 2021-06-23T18:31:18Z
closed_at:
author_association: NONE
active_lock_reason:
draft:
pull_request:
body:

What happened:

Memory usage is high when sorting after loading from disk. Loading from disk took about 150 MB, whereas after the sort I saw usage of about 1.5 GB. I believe this is because sortby reindexes the data, which requires loading it into memory. So I am not surprised, but I wanted to submit this as a possible issue to check that my reasoning is sound. For my use case, I will have to abandon sortby and ensure the data is sorted before writing it to disk.

I am afraid my MVCE relies on data that I have on disk. If this is an actual issue that needs more investigation, I can provide an example that anyone can run. Otherwise I can close this.
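The workaround mentioned above (sorting before writing, so that no sortby is needed after loading) might look like the following minimal sketch. The variable name `heading`, the synthetic data, and the output path in the final comment are assumptions made for illustration; they are not taken from the real attitude dataset.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the real data: a small Dataset whose
# 'time' coordinate is deliberately out of order.
rng = np.random.default_rng(0)
time = rng.permutation(np.arange(1000, dtype="float64"))
ds = xr.Dataset(
    {"heading": ("time", rng.random(time.size))},
    coords={"time": time},
)

# Sort once, while the data is already in memory, before writing.
ds_sorted = ds.sortby("time")

# Confirm the time index is monotonic, so a later load needs no sortby.
is_sorted = bool(ds_sorted.indexes["time"].is_monotonic_increasing)
print("sorted before write:", is_sorted)

# The sorted Dataset could then be written out, for example:
# ds_sorted.to_zarr("attitude_sorted.zarr")  # hypothetical path, needs zarr
```

Checking `is_monotonic_increasing` on load is a cheap way to decide whether a sort is needed at all, since it only inspects the index and not the data variables.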

Minimal Complete Verifiable Example:

```python
import xarray as xr
from psutil import virtual_memory

# Baseline system memory in use before opening the store
startmem = virtual_memory().used
data = xr.open_zarr(r"D:\falkor\FK181005_processed\em302_105_10_06_2018\attitude.zarr",
                    synchronizer=None, mask_and_scale=False, decode_coords=False,
                    decode_times=False, decode_cf=False, concat_characters=False)
afterload_mem = virtual_memory().used - startmem

# Sort along time and measure memory in use again
ans = data.sortby('time')
aftersort_mem = virtual_memory().used - startmem

print('Without sort: {}'.format(afterload_mem))
print('With sort: {}'.format(aftersort_mem))

# Out: Without sort: 149241856
# Out: With sort: 1657593856
```
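For reference, the two byte counts printed above can be converted to more readable units with a short sketch like the one below. The helper names are illustrative; only the two numbers come from the output above.

```python
# Byte counts copied from the output above
afterload_bytes = 149241856
aftersort_bytes = 1657593856

# Convert using binary units (1 MiB = 2**20 bytes, 1 GiB = 2**30 bytes)
afterload_mib = afterload_bytes / 2**20
aftersort_gib = aftersort_bytes / 2**30

# How many times larger the footprint is after the sort
ratio = aftersort_bytes / afterload_bytes

print("After load: {:.1f} MiB".format(afterload_mib))
print("After sort: {:.2f} GiB".format(aftersort_gib))
print("Increase:   {:.1f}x".format(ratio))
```

In binary units this works out to roughly 142 MiB after loading and about 1.54 GiB after sorting, an increase of about 11x, which is consistent with the "about 150 MB" and "about 1.5 GB" figures quoted in the description.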

Environment:

Output of `xr.show_versions()`:

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 15:50:08) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: AMD64 Family 23 Model 113 Stepping 0, AuthenticAMD
byteorder: little
LC_ALL: None
LANG: None
LOCALE: English_United States.1252
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.17.0
pandas: 1.2.3
numpy: 1.20.3
scipy: 1.6.0
netCDF4: 1.5.6
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: 2.6.1
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.1
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.03.0
distributed: 2021.03.0
matplotlib: 3.3.4
cartopy: 0.18.0
seaborn: 0.11.1
numbagg: None
pint: None
setuptools: 49.6.0.post20210108
pip: 21.0.1
conda: None
pytest: 6.2.2
IPython: 7.21.0
sphinx: 3.5.2

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5521/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727)
type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);