issues: 2 rows where state = "closed" and user = 29104956, sorted by updated_at descending
Issue #6576: Basic examples for creating data structures fail type-checking
Opened by rsokl (29104956) · state: closed (completed) · 2 comments · created 2022-05-05 · updated 2022-05-27 · closed 2022-05-27 · repo: xarray

What happened?

The examples provided by this documentation reveal issues with the type annotations for DataArray and Dataset. Running mypy and pyright on these basic use cases, only slightly modified, produces type-checking errors.

What did you expect to happen?

The annotations for these classes should accommodate these common use-cases.

Minimal Complete Verifiable Example

```python
# run mypy or pyright on this file to reproduce the errors
import numpy as np
import pandas as pd
import xarray as xr

data = np.random.rand(4, 3)
locs = ["IA", "IL", "IN"]
times = pd.date_range("2000-01-01", periods=4)

foo = xr.DataArray(
    data,
    # error: List item 1 has incompatible type "List[str]"; expected "Tuple[Any, ...]"
    coords=[times, locs],
    dims=["time", "space"],
)

temp = 15 + 8 * np.random.randn(2, 2, 3)
precip = 10 * np.random.rand(2, 2, 3)
lon = [[-99.83, -99.32], [-99.79, -99.23]]
lat = [[42.25, 42.21], [42.63, 42.59]]

A = {
    "temperature": (["x", "y", "time"], temp),
    "precipitation": (["x", "y", "time"], precip),
}

C = {
    "lon": (["x", "y"], lon),
    "lat": (["x", "y"], lat),
    "time": pd.date_range("2014-09-06", periods=3),
    "reference_time": pd.Timestamp("2014-09-05"),
}

ds = xr.Dataset(
    # error: Argument 1 to "Dataset" has incompatible type
    # "Dict[str, Tuple[List[str], Any]]"; expected "Optional[Mapping[Hashable, Any]]"
    A,
    # error: Argument "coords" to "Dataset" has incompatible type
    # "Dict[str, Any]"; expected "Optional[Mapping[Hashable, Any]]"
    coords=C,
)
```

MVCE confirmation

  • [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [x] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [x] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [x] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

Some of these errors are circumvented when one provides a literal inline, which exploits bidirectional inference; this may be why the mypy tests currently run in your CI miss them.

E.g.

```python
from typing import Any, Dict, Hashable

def f(x: Dict[Hashable, Any]): ...

f({"hi": 1})  # this is ok -- bidirectional inference infers Dict[Hashable, Any]

x = {"hi": 1}  # inferred as Dict[str, int]
f(x)  # error: Dict is invariant in its key type, and str is incompatible with Hashable
```

This is a sticky situation, as the key type is invariant even in Mapping: https://github.com/python/typing/issues/445. IMHO it would be great to tweak these annotations, e.g. Hashable -> Hashable | str | <other common coord types>, to ensure that users don't face such false positives.
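On the user side, one workaround is to annotate the variable with the parameter's exact type, which sidesteps the invariance error at the cost of a less precise inferred type. The sketch below uses a toy function `f` as a stand-in for an xarray-style signature, not xarray's actual API:

```python
from typing import Any, Dict, Hashable

# toy stand-in for an xarray-style signature, not xarray's actual API
def f(x: Dict[Hashable, Any]) -> int:
    return len(x)

# Annotating x explicitly makes its declared type match the parameter
# exactly, so the invariance check passes under mypy/pyright.
x: Dict[Hashable, Any] = {"hi": 1}
print(f(x))  # runs fine and type-checks cleanly
```

This is a per-call-site fix, though; widening the annotations in xarray itself would spare users from having to know about key invariance at all.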

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:18) [GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 4.15.0-153-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 0.19.0
pandas: 1.3.3
numpy: 1.20.3
scipy: 1.7.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: None
distributed: None
matplotlib: 3.5.2
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 59.5.0
pip: 21.3
conda: None
pytest: 6.2.5
IPython: 7.28.0
sphinx: 4.5.0
Issue #4131: Why am I able to load data from a closed dataset?
Opened by rsokl (29104956) · state: closed (completed) · 10 comments · created 2020-06-08 · updated 2022-04-05 · closed 2022-04-05 · repo: xarray

I don't understand why I am able to open and close a dataset, but then proceed to read data from said dataset.

I can open a 4 GB dataset, promptly close it, and then still access the data within, which still appears to load lazily. Does querying a closed dataset automatically reopen it?

MCVE Code Sample

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"foo": (("x",), np.random.rand(4,))}, coords={"x": [10, 20, 30, 40]})
ds.to_netcdf("tmp_example.nc")
```

```python
>>> data = xr.open_dataset("tmp_example.nc")
>>> data.close()
>>> data.foo
<xarray.DataArray 'foo' (x: 4)>
array([0.894788, 0.017935, 0.696086, 0.827004])
Coordinates:
  * x        (x) int64 10 20 30 40
```

Expected Output

Because netCDF datasets are loaded lazily, I would imagine that, the data having not been touched since being opened, closing the dataset would render it inaccessible.
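The observed behavior is consistent with a file manager that transparently reopens a closed handle on the next access, so that close() only releases the handle eagerly rather than invalidating the dataset. The sketch below is a minimal pure-Python illustration of that pattern; it is an assumption about the mechanism, not xarray's actual code, and the class name `LazyFileManager` is hypothetical:

```python
import os
import tempfile

class LazyFileManager:
    """Toy sketch of a handle manager that reopens a closed file on demand."""

    def __init__(self, path):
        self.path = path
        self._handle = None

    def acquire(self):
        # reopen transparently if the handle was closed (or never opened)
        if self._handle is None:
            self._handle = open(self.path, "rb")
        return self._handle

    def close(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None

# demo: data remains readable after close(), because the next access reopens
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "data.bin")
    with open(path, "wb") as fh:
        fh.write(b"payload")

    mgr = LazyFileManager(path)
    assert mgr.acquire().read() == b"payload"
    mgr.close()                                 # releases the OS handle...
    assert mgr.acquire().read() == b"payload"   # ...but access silently reopens
    mgr.close()
```

Under this pattern, "closed" describes the underlying file handle, not the dataset object, which explains why querying the dataset after close() still works.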

Versions

Output of <tt>xr.show_versions()</tt>

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Mar 27 2019, 22:11:17) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.4.0-166-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.2
libnetcdf: 4.6.1
xarray: 0.15.0
pandas: 1.0.3
numpy: 1.16.3
scipy: 1.4.1
netCDF4: 1.4.1
pydap: None
h5netcdf: None
h5py: 2.8.0
Nio: None
zarr: None
cftime: 1.1.3
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.14.0
distributed: None
matplotlib: 3.0.3
cartopy: None
seaborn: None
numbagg: None
setuptools: 46.1.3.post20200330
pip: 20.0.2
conda: None
pytest: 5.4.1
IPython: 7.5.0
sphinx: 2.4.4
