issues

2 rows where type = "issue" and user = 38358698 sorted by updated_at descending


id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
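Issue #6333: Expressing dimension's preferred chunks as tuple of integers causes TypeError

  • id: 1160062673
  • node_id: I_kwDOAMm_X85FJSbR
  • user: stanwest 38358698
  • state: closed
  • locked: 0
  • comments: 0
  • created_at: 2022-03-04T21:23:21Z
  • updated_at: 2022-04-08T17:18:50Z
  • closed_at: 2022-04-08T17:18:50Z
  • author_association: CONTRIBUTOR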

What happened?

When opening a dataset containing a variable that has preferred chunks expressed along some dimension as a tuple of integers, xarray raises a TypeError.

What did you expect to happen?

I expected to open the dataset with its preferred chunks, as described in the documentation on preferred chunks within "How to add a new backend".

Minimal Complete Verifiable Example

```Python
import xarray as xr


class PassThroughBackendEntrypoint(xr.backends.BackendEntrypoint):
    def open_dataset(self, dataset, *, drop_variables=None):
        return dataset


initial = xr.Dataset(
    {
        "data": xr.Variable(
            ("dim",), [0, 0], encoding={"preferred_chunks": {"dim": (1, 1)}}
        )
    }
)
final = xr.open_dataset(initial, engine=PassThroughBackendEntrypoint, chunks={})
```

Relevant log output

```Python
[Paths simplified.]

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...\xarray\backends\api.py", line 501, in open_dataset
    ds = _dataset_from_backend_dataset(
  File "...\xarray\backends\api.py", line 317, in _dataset_from_backend_dataset
    ds = _chunk_ds(
  File "...\xarray\backends\api.py", line 287, in _chunk_ds
    var_chunks = _get_chunk(var, chunks)
  File "...\xarray\core\dataset.py", line 409, in _get_chunk
    _check_chunks_compatibility(var, output_chunks, preferred_chunks)
  File "...\xarray\core\dataset.py", line 371, in _check_chunks_compatibility
    if any(s % preferred_chunks_dim for s in chunks_dim):
  File "...\xarray\core\dataset.py", line 371, in <genexpr>
    if any(s % preferred_chunks_dim for s in chunks_dim):
TypeError: unsupported operand type(s) for %: 'int' and 'tuple'
```

Anything else we need to know?

The behavior exhibited above touches on the following related issues:

  • The _check_chunks_compatibility function assumes that a dimension expresses its preferred chunks only as an integer, not a sequence of integers. In contrast, Dask will handle either within the previous_chunks argument to its normalize_chunks function.

  • The examples in the documentation of "preferred_chunks" mappings, namely {“dim1”: 1000, “dim2”: 2000} and {“dim1”: [1000, 100], “dim2”: [2000, 2000, 2000]]}, have syntax errors: The quotation marks are curly instead of straight, and the second example has an extra closing bracket.

  • After correcting the syntax errors, the lists in the second example lead to TypeError: unhashable type: 'list'. Dask raises the exception when it tries to test a mutable list for set membership, as in the following (with simplified paths):

    ```python
    dask.array.core.normalize_chunks([[1000, 100], [2000, 2000, 2000]], (1100, 6000))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "...\dask\array\core.py", line 2900, in normalize_chunks
        chunks = tuple(c if c not in {None, -1} else s for c, s in zip(chunks, shape))
      File "...\dask\array\core.py", line 2900, in <genexpr>
        chunks = tuple(c if c not in {None, -1} else s for c, s in zip(chunks, shape))
    TypeError: unhashable type: 'list'
    ```

    If one omits the second argument (the shape) to that call, it succeeds. This may be a bug in Dask.

  • The tests in xarray don't exercise behaviors related to preferred chunks.
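The first point above can be illustrated with a small sketch in plain Python. It is not xarray's implementation or its fix: the helper names `interior_boundaries` and `chunks_are_compatible` are hypothetical, and the sketch assumes that "compatible" means every requested chunk edge coincides with a preferred chunk edge, in the spirit of the existing `s % preferred_chunks_dim` test but accepting either an integer or a sequence of integers.

```python
from itertools import accumulate


def interior_boundaries(chunks, size):
    """Return the chunk edges that lie strictly inside a dimension of length `size`.

    `chunks` is either one integer (a uniform chunk length) or a sequence of
    integers (explicit chunk lengths along the dimension).
    """
    if isinstance(chunks, int):
        # Uniform chunks: edges at every multiple of the chunk length.
        return set(range(chunks, size, chunks))
    # Explicit chunks: edges at the running totals of the chunk lengths.
    return {edge for edge in accumulate(chunks) if 0 < edge < size}


def chunks_are_compatible(requested, preferred, size):
    """Return True if no requested chunk edge falls inside a preferred chunk."""
    return interior_boundaries(requested, size) <= interior_boundaries(preferred, size)


# Hypothetical usage with the tuple from the example above, {"dim": (1, 1)}.
print(chunks_are_compatible((2,), (1, 1), 2))  # True
# A requested edge at 500 would split the preferred chunk of length 1000.
print(chunks_are_compatible((500, 600), (1000, 100), 1100))  # False
```

Comparing sets of edges avoids the modulo operation entirely, so the same code path handles both forms of preferred chunks.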

[Edited for grammar.]

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:22:46) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('English_United States', '1252')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 0.20.3.dev52+gd3b6aa6d
pandas: 1.4.1
numpy: 1.21.5
scipy: 1.8.0
netCDF4: 1.5.8
pydap: installed
h5netcdf: 0.13.1
h5py: 3.6.0
Nio: None
zarr: 2.11.0
cftime: 1.5.2
nc_time_axis: 1.4.0
PseudoNetCDF: installed
rasterio: 1.2.10
cfgrib: None
iris: 3.2.0.post0
bottleneck: 1.3.2
dask: 2022.02.0
distributed: 2022.02.0
matplotlib: 3.5.1
cartopy: 0.20.2
seaborn: 0.11.2
numbagg: 0.2.1
fsspec: 2022.01.0
cupy: None
pint: 0.18
sparse: 0.13.0
setuptools: 59.8.0
pip: 22.0.3
conda: None
pytest: 7.0.1
IPython: None
sphinx: None
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6333/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray 13221727 · type: issue
Issue #4702: Plotting in 2D with one non-dimension coordinate given behaves contrary to documentation and can cause cryptic ValueError

  • id: 769348008
  • node_id: MDU6SXNzdWU3NjkzNDgwMDg=
  • user: stanwest 38358698
  • state: open
  • locked: 0
  • comments: 2
  • created_at: 2020-12-16T23:33:25Z
  • updated_at: 2020-12-17T18:35:13Z
  • author_association: CONTRIBUTOR

What happened:

I called a 2-D plot method with one coordinate specified as a non-dimension coordinate through the x or y argument. The method raised a ValueError that complained about a tuple of duplicate coordinates that I had not provided.

Also, when I specified x and omitted y, the method took y to be the second dimension of the DataArray.

What you expected to happen:

I expected the unspecified x or y argument to behave as when I specified neither argument, so that, for a suitable DataArray da, calling da.plot(x="lon") would have the same Y axis as da.plot(). That expectation is consistent with the docstring common to 2-D plotting methods, where the default values for the x and y arguments are da.dims[1] and da.dims[0], respectively. However, as discussed below, the docstring does not mention that xarray tries to guess the omitted coordinate.

Minimal Complete Verifiable Example:

Executing

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(20).reshape(4, 5),
    dims=("v", "u"),
    coords={"lat": ("v", np.linspace(0, 30, 4)), "lon": ("u", np.linspace(-20, 20, 5))},
)
da.plot(x="lon")
```

causes (with file paths abbreviated)

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "...\xarray\plot\plot.py", line 446, in __call__
        return plot(self._da, **kwargs)
      File "...\xarray\plot\plot.py", line 200, in plot
        return plotfunc(darray, **kwargs)
      File "...\xarray\plot\plot.py", line 707, in newplotfunc
        darray = darray.transpose(*dims, transpose_coords=True)
      File "...\xarray\core\dataarray.py", line 2036, in transpose
        dims = tuple(utils.infix_dims(dims, self.dims))
      File "...\xarray\core\utils.py", line 725, in infix_dims
        raise ValueError(
    ValueError: ('u', 'u') must be a permuted list of ('v', 'u'), unless `...` is included

Executing da.plot(y="lon") causes the same exception.

Anything else we need to know?:

Two aspects of this issue that I see are the behavior and the documentation.

The behavior in the above example involves xarray trying to guess the other dimension [#1291 in v. 0.9.2]. The guessing logic tests whether the given coordinate equals one of the array's dimension names; here, the given coordinate is a non-dimension coordinate and fails to match. When the match fails, the other coordinate is set to the second dimension, whether the given coordinate was x or y. One possible improvement would be to set the y coordinate to the first dimension instead, as in the following replacement for the last line of the guessing logic:

    y = darray.dims[1] if x == darray.dims[0] else darray.dims[0]

With that change, da.plot(x="lon") and da.plot(y="lat") would succeed and produce the plots I expect, while da.plot(x="lat") and da.plot(y="lon") would raise ValueError. Also, I would find it more helpful if the exception message were more clearly related to the arguments of the call. The _infer_xy_labels function might need to notice that it was about to set x and y to the same dimension name and raise an exception at that point.
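The proposal above can be sketched without xarray. The function name `guess_xy` and the `coord_dims` mapping (coordinate name to the dimension it lies along) are hypothetical stand-ins for `_infer_xy_labels` and the DataArray's coordinates. Only the one-line rule for `y` comes from the suggestion above; the rule for `x` is an assumption based on the behavior described in this report, and the early check with its message is just one possible wording.

```python
def guess_xy(dims, coord_dims, x=None, y=None):
    """Guess the omitted axis label for a 2-D plot (illustrative sketch only).

    dims       -- the array's two dimension names, e.g. ("v", "u")
    coord_dims -- hypothetical mapping from each non-dimension coordinate
                  to the dimension it lies along
    """
    if x is None and y is None:
        # Documented defaults: x is dims[1] and y is dims[0].
        y, x = dims
    elif x is None:
        # Assumed existing rule: fall back to the second dimension.
        x = dims[0] if y == dims[1] else dims[1]
    elif y is None:
        # Proposed rule: fall back to the first dimension instead.
        y = dims[1] if x == dims[0] else dims[0]

    # Resolve each label to its underlying dimension and fail early with a
    # message that refers to the caller's arguments.
    x_dim = coord_dims.get(x, x)
    y_dim = coord_dims.get(y, y)
    if x_dim == y_dim:
        raise ValueError(
            f"x={x!r} and y={y!r} both lie along dimension {x_dim!r}"
        )
    return x, y


dims = ("v", "u")
coord_dims = {"lat": "v", "lon": "u"}
print(guess_xy(dims, coord_dims, x="lon"))  # ('lon', 'v')
print(guess_xy(dims, coord_dims, y="lat"))  # ('u', 'lat')
```

Under these assumptions, `guess_xy(dims, coord_dims, x="lat")` and `guess_xy(dims, coord_dims, y="lon")` raise ValueError before any transpose is attempted, so the message names the arguments that the caller actually passed.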

The documentation seems not to have been updated for #1291 and does not mention the guessing behavior.

Environment:

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: English_United States.1252
libhdf5: 1.10.4
libnetcdf: 4.7.3

xarray: 0.16.1
pandas: 1.1.3
numpy: 1.19.2
scipy: None
netCDF4: 1.5.3
pydap: None
h5netcdf: 0.8.1
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.3.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.30.0
distributed: 2.30.1
matplotlib: 3.3.2
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 50.3.0.post20201006
pip: 20.2.4
conda: None
pytest: None
IPython: 7.18.1
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4702/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray 13221727 · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 21.573ms · About: xarray-datasette