home / github

Menu
  • Search all tables
  • GraphQL API

issues

Table actions
  • GraphQL API for issues

1 row where "closed_at" is on date 2021-10-29, state_reason = "completed" and user = 35968931 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date), closed_at (date)

type 1

  • issue 1

state 1

  • closed 1

repo 1

  • xarray 1
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1020282789 I_kwDOAMm_X8480Eel 5843 Why are `da.chunks` and `ds.chunks` properties inconsistent? TomNicholas 35968931 closed 0     6 2021-10-07T17:21:01Z 2021-10-29T18:12:22Z 2021-10-29T18:12:22Z MEMBER      

Basically the title, but what I'm referring to is this:

```python In [2]: da = xr.DataArray([[0, 1], [2, 3]], name='foo').chunk(1)

In [3]: ds = da.to_dataset()

In [4]: da.chunks Out[4]: ((1, 1), (1, 1))

In [5]: ds.chunks Out[5]: Frozen({'dim_0': (1, 1), 'dim_1': (1, 1)}) ```

Why does DataArray.chunks return a tuple and Dataset.chunks return a frozen dictionary?

This seems a bit silly, for a few reasons:

1) it means that some perfectly reasonable code might fail unnecessarily if passed a DataArray instead of a Dataset or vice versa, such as

```python
def is_core_dim_chunked(obj, core_dim):
    return len(obj.chunks[core_dim]) > 1
```
which will work as intended for a dataset but raises a `TypeError` for a dataarray.

2) it breaks the pattern we use for .sizes, where

```python
In [14]: da.sizes
Out[14]: Frozen({'dim_0': 2, 'dim_1': 2})

In [15]: ds.sizes
Out[15]: Frozen({'dim_0': 2, 'dim_1': 2})
```

3) if you want the chunks as a tuple they are always accessible via da.data.chunks, which is a more sensible place to look to find the chunks without dimension names.

4) It's an undocumented difference, as the docstrings for ds.chunks and da.chunks both only say

`"""Block dimensions for this dataset’s data or None if it’s not a dask array."""`

which doesn't tell me anything about the return type, or warn me that the return types are different.

EDIT: In fact `DataArray.chunk` doesn't even appear to be listed on the API docs page at all.

In our codebase this difference is mostly washed out by us using ._to_temp_dataset() all the time, and also by the way that the .chunk() method accepts both the tuple and dict form, so both of these invariants hold (but in different ways):

ds == ds.chunk(ds.chunks) da == da.chunk(da.chunks)

I'm not sure whether making this consistent is worth the effort of a significant breaking change though :confused:

(Sort of related to https://github.com/pydata/xarray/issues/2103)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5843/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 163.458ms · About: xarray-datasette