home / github

Menu
  • GraphQL API
  • Search all tables

issues

Table actions
  • GraphQL API for issues

2 rows where "created_at" is on date 2021-09-01 and user = 2448579 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: closed_at (date)

type 1

  • issue 2

state 1

  • closed 2

repo 1

  • xarray 2
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
984555353 MDU6SXNzdWU5ODQ1NTUzNTM= 5754 Variable.stack constructs extremely large chunks dcherian 2448579 closed 0     6 2021-09-01T03:08:02Z 2023-03-22T14:51:44Z 2021-12-14T17:31:45Z MEMBER      

Minimal Complete Verifiable Example:

Here's a small array with too-small chunk sizes just as an example ```python

Put your MCVE code here

import dask.array import xarray as xr

var = xr.Variable(("x", "y", "z"), dask.array.random.random((4, 18483, 1000), chunks=(1, 183, -1))) ```

Now stack two dimensions, this is a 100x increase in chunk size (in my actual code, 85MB chunks become 8.5GB chunks =) )

var.stack(new=("x", "y"))

But calling reshape on the dask array preserves the original chunk size var.data.reshape((4*18483, -1))

Solution

Ah, found it , we transpose then reshape in Variable_stack_once. https://github.com/pydata/xarray/blob/f915515d610b4471888fa44dfb00dbae3fd22349/xarray/core/variable.py#L1521-L1527

Writing those steps with pure dask yields the same 100x increase in chunksize

python var.data.transpose([2, 0, 1]).reshape((-1, 4*18483))

Anything else we need to know?:

Environment:

Output of <tt>xr.show_versions()</tt> INSTALLED VERSIONS ------------------ commit: None python: 3.8.6 | packaged by conda-forge | (default, Jan 25 2021, 23:21:18) [GCC 9.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-1127.18.2.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_US.UTF-8 LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.10.6 libnetcdf: 4.7.4 xarray: 0.19.0 pandas: 1.3.1 numpy: 1.21.1 scipy: 1.5.3 netCDF4: 1.5.6 pydap: installed h5netcdf: 0.11.0 h5py: 3.3.0 Nio: None zarr: 2.8.3 cftime: 1.5.0 nc_time_axis: 1.3.1 PseudoNetCDF: None rasterio: None cfgrib: None iris: 3.0.4 bottleneck: 1.3.2 dask: 2021.07.2 distributed: 2021.07.2 matplotlib: 3.4.2 cartopy: 0.19.0.post1 seaborn: 0.11.1 numbagg: None pint: 0.17 setuptools: 49.6.0.post20210108 pip: 21.2.2 conda: 4.10.3 pytest: 6.2.4 IPython: 7.26.0 sphinx: 4.1.2
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5754/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
985376146 MDU6SXNzdWU5ODUzNzYxNDY= 5758 RTD build failing dcherian 2448579 closed 0     6 2021-09-01T16:50:58Z 2021-09-08T09:47:17Z 2021-09-08T09:47:16Z MEMBER      

The current RTD build is failing in plotting.rst

`` sphinx.errors.SphinxParallelError: RuntimeError: Non Expected exception in/home/docs/checkouts/readthedocs.org/user_builds/xray/checkouts/latest/doc/user-guide/plotting.rst` line None

Sphinx parallel build error: RuntimeError: Non Expected exception in /home/docs/checkouts/readthedocs.org/user_builds/xray/checkouts/latest/doc/user-guide/plotting.rst line None [IPKernelApp] WARNING | Parent appears to have exited, shutting down. ```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5758/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 5839.235ms · About: xarray-datasette