issues
4 rows where repo = 13221727 and user = 9010180 sorted by updated_at descending


id: 894125618 · node_id: MDU6SXNzdWU4OTQxMjU2MTg= · number: 5329
title: xarray 0.18.0 raises ValueError, not FileNotFoundError, when opening a non-existent file
user: pont-us (9010180) · state: open · locked: 0 · comments: 8
created_at: 2021-05-18T08:35:20Z · updated_at: 2022-09-21T18:19:57Z · author_association: NONE

What happened:

In a Python environment with xarray 0.18.0 and python-netcdf4 installed, I called `xarray.open_dataset("nonexistent")`. (The file "nonexistent" does not exist.) xarray raised `ValueError: cannot guess the engine, try passing one explicitly`.

What you expected to happen:

I expected a FileNotFoundError to be raised, as in xarray 0.17.0.

Minimal Complete Verifiable Example:

```python
import xarray as xr

xr.open_dataset("nonexistent")
```

Anything else we need to know?:

This is presumably related to issue #5295, but is not fixed by PR #5296: the ValueError is also raised with the latest commit on master at the time of writing (9165c266).

This change in behaviour produced a hard-to-diagnose bug deep in xcube, where we were catching FileNotFoundError to deal gracefully with a missing file; the new ValueError was of course not caught, and its message did not make the cause obvious. Catching ValueError is a workaround, but not a great solution, since it may also be raised for files which do exist but don't have a recognizable data format. I suspect that other codebases may be similarly affected.
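One way affected code can avoid this class of failure is to normalize the exception itself before handing the path to the backend. A minimal stdlib-only sketch (the wrapper name and `opener` parameter are inventions for illustration, not xarray API):

```python
import os

def open_dataset_checked(path, opener):
    """Normalize missing-file errors before handing `path` to an opener
    such as xarray.open_dataset (which, in xarray 0.18.0, raises
    ValueError rather than FileNotFoundError for a missing path)."""
    if not os.path.exists(path):
        raise FileNotFoundError(path)
    return opener(path)
```

Callers then see a consistent FileNotFoundError across xarray versions, while genuine format-detection failures still surface as ValueError.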

xarray 0.17.0 was capable of throwing a ValueError for a non-existent file, but only in the (rare?) case that neither netCDF4-python nor scipy was installed.

Environment:

Output of `xr.show_versions()`:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.4 | packaged by conda-forge | (default, May 10 2021, 22:13:33) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.8.0-53-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.8.0
xarray: 0.18.0
pandas: 1.2.4
numpy: 1.20.2
scipy: None
netCDF4: 1.5.6
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20210108
pip: 21.1.1
conda: None
pytest: None
IPython: None
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5329/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue
id: 894497993 · node_id: MDU6SXNzdWU4OTQ0OTc5OTM= · number: 5331
title: AttributeError using map_blocks with dask 2021.05.0
user: pont-us (9010180) · state: closed · locked: 0 · comments: 3
created_at: 2021-05-18T15:18:53Z · updated_at: 2021-05-19T08:01:07Z · closed_at: 2021-05-19T08:01:07Z · author_association: NONE

What happened:

In an environment with xarray 0.18.0 and dask 2021.05.0 installed, I saved a dataset using to_zarr, opened it again using open_zarr, and called map_blocks on one of its variables. I got the following traceback:

```
Traceback (most recent call last):
  File "/home/pont/./dasktest2.py", line 12, in <module>
    ds2.myvar.map_blocks(lambda block: block)
  File "/home/pont/loc/envs/xcube-repos/lib/python3.9/site-packages/xarray/core/dataarray.py", line 3770, in map_blocks
    return map_blocks(func, self, args, kwargs, template)
  File "/home/pont/loc/envs/xcube-repos/lib/python3.9/site-packages/xarray/core/parallel.py", line 565, in map_blocks
    data = dask.array.Array(
  File "/home/pont/loc/envs/xcube-repos/lib/python3.9/site-packages/dask/array/core.py", line 1159, in __new__
    if layer.collection_annotations is None:
AttributeError: 'dict' object has no attribute 'collection_annotations'
```

What you expected to happen:

I expected map_blocks to complete successfully.

Minimal Complete Verifiable Example:

```python
import xarray as xr
import numpy as np

ds1 = xr.Dataset({
    "myvar": (("x"), np.zeros(10)),
    "x": ("x", np.arange(10)),
})
ds1.to_zarr("test.zarr", mode="w")
ds2 = xr.open_zarr("test.zarr")
ds2.myvar.map_blocks(lambda block: block)
```

Anything else we need to know?:

I wasn't sure whether to report this issue with dask or with xarray. With dask 2021.04.1 the example runs without error, and it seems that dask PR 7309 introduced the breaking change. But my understanding of xarray's map_blocks implementation isn't sufficient to figure out where exactly the bug lies.
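Until the responsible layer is identified, affected code can pin or guard on the installed dask version. A hedged sketch of such a guard (the threshold assumes, per the report above, that 2021.04.1 works and 2021.05.0 does not):

```python
def version_tuple(version):
    # "2021.05.0" -> (2021, 5, 0); assumes purely numeric dotted
    # components, which holds for dask's CalVer release strings.
    return tuple(int(part) for part in version.split("."))

# Hypothetical guard in application code (dask import not shown):
# if version_tuple(dask.__version__) >= (2021, 5, 0):
#     warnings.warn("map_blocks may fail with this dask version")
```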

Environment:

Output of `xr.show_versions()`:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.4 | packaged by conda-forge | (default, May 10 2021, 22:13:33) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.8.0-53-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: None
libnetcdf: None
xarray: 0.18.0
pandas: 1.2.4
numpy: 1.20.2
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.8.1
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.05.0
distributed: 2021.05.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20210108
pip: 21.1.1
conda: None
pytest: None
IPython: None
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5331/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
id: 868976909 · node_id: MDU6SXNzdWU4Njg5NzY5MDk= · number: 5224
title: xarray can't append to Zarrs with byte-string variables
user: pont-us (9010180) · state: open · locked: 0 · comments: 3
created_at: 2021-04-27T15:30:44Z · updated_at: 2021-04-28T17:15:43Z · author_association: NONE

What happened:

I tried to use xarray to append a Dataset to a Zarr containing a |S1 (char string) datatype, and received this error:

```
ValueError: Invalid dtype for data variable: <xarray.DataArray 'x' ()>
array(b'', dtype='|S1')
dtype must be a subtype of number, datetime, bool, a fixed sized string,
a fixed size unicode string or an object
```

What you expected to happen:

I expected the Dataset to be appended to the Zarr.

Minimal Complete Verifiable Example:

Note: this is not quite "minimal", since it also performs the append using the Zarr library directly and using a |U1 (Unicode) datatype in order to demonstrate that these variations work.

```python
import numpy as np
import xarray as xr
import zarr

def test_append(data_type, zarr_path):
    print(f"Creating {data_type} Zarr...")
    ds = xr.Dataset({"x": np.array("", dtype=data_type)})
    ds.to_zarr(zarr_path, mode="w")

    print(f"Appending to {data_type} Zarr with Zarr library...")
    zarr_to_append = zarr.open(zarr_path, mode="a")
    zarr_to_append.x.append(np.array("", dtype=data_type))

    print(f"Appending to {data_type} Zarr with xarray...")
    ds_to_append = xr.Dataset({"x": np.array("", dtype=data_type)})
    ds_to_append.to_zarr(zarr_path, mode="a")

test_append("|U1", "test-u.zarr")
test_append("|S1", "test-s.zarr")
```

Anything else we need to know?:

I came across this problem when converting some NetCDFs from this dataset to a Zarr, appending them along the time axis. The latest data format version (1.4) includes a dimensionless variable crs with type char, which xarray reads as an |S1, causing the error described above when I attempt to append. Replacing crs with a |U1-typed variable works around the problem, but is undesirable since we need to reproduce the NetCDFs as closely as possible. The example above shows that the Zarr format and library themselves don't seem to have a problem with appending byte-string variables.

The obvious fix would be to loosen the type check in xarray.backends.api._validate_datatypes_for_zarr_append:

```python
if (
    not np.issubdtype(var.dtype, np.number)
    and not np.issubdtype(var.dtype, np.datetime64)
    and not np.issubdtype(var.dtype, np.bool_)
    and not coding.strings.is_unicode_dtype(var.dtype)
    and not coding.strings.is_bytes_dtype(var.dtype)  # <- this line added to avoid "Invalid dtype" error
    and not var.dtype == object
):
```

This change makes the example above work, but I don't know if it would result in any unintended side-effects.
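The shape of that check can be sketched independently of xarray using NumPy's one-character `dtype.kind` codes; the set and function below are illustrative stand-ins, not the actual `_validate_datatypes_for_zarr_append` code:

```python
# NumPy dtype.kind codes (real NumPy convention): 'i'/'u'/'f'/'c' are
# numeric kinds, 'M' datetime64, 'b' boolean, 'U' fixed-size unicode,
# 'S' fixed-size bytes, 'O' object.
_APPENDABLE_KINDS = {"i", "u", "f", "c", "M", "b", "U", "O"}

def appendable_kind(kind, allow_bytes=False):
    # allow_bytes=True corresponds to the proposed extra
    # is_bytes_dtype(...) clause: 'S' dtypes become acceptable.
    return kind in _APPENDABLE_KINDS or (allow_bytes and kind == "S")
```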

Environment:

Output of `xr.show_versions()`:

```
INSTALLED VERSIONS
------------------
commit: 0021cdab91f7466f4be0fb32dae92bf3f8290e19
python: 3.8.5 (default, Jan 27 2021, 15:41:15) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.8.0-50-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.7.3
xarray: 0.15.0
pandas: 0.25.3
numpy: 1.17.4
scipy: 1.3.3
netCDF4: 1.5.3
pydap: None
h5netcdf: 0.7.1
h5py: 2.10.0
Nio: None
zarr: 2.4.0+ds
cftime: 1.1.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.3
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.8.1+dfsg
distributed: None
matplotlib: 3.1.2
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 45.2.0
pip: 20.0.2
conda: None
pytest: 4.6.9
IPython: 7.13.0
sphinx: 1.8.5
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5224/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue
id: 844712857 · node_id: MDU6SXNzdWU4NDQ3MTI4NTc= · number: 5093
title: open_dataset uses cftime, not datetime64, when calendar attribute is "Gregorian"
user: pont-us (9010180) · state: closed · locked: 0 · comments: 2
created_at: 2021-03-30T15:12:09Z · updated_at: 2021-04-20T14:17:42Z · closed_at: 2021-04-18T10:17:08Z · author_association: NONE

What happened:

I used xarray.open_dataset to open a NetCDF file whose time coordinate had the calendar attribute set to Gregorian. All dates were within the Timestamp-valid range.

The resulting dataset represented the time co-ordinate as a cftime._cftime.DatetimeGregorian.

What you expected to happen:

I expected the dataset to represent the time co-ordinate as a datetime64[ns], as documented here and here.

Minimal Complete Verifiable Example:

```python
import xarray as xr
import numpy as np
import pandas as pd

def print_time_type(dataset):
    print(dataset.time.dtype, type(dataset.time[0].item()))

da = xr.DataArray(
    data=[32, 16, 8],
    dims=["time"],
    coords=dict(
        time=pd.date_range("2014-09-06", periods=3),
        reference_time=pd.Timestamp("2014-09-05"),
    ),
)

# Create dataset and confirm type of time
ds1 = xr.Dataset({"myvar": da})
print_time_type(ds1)  # prints "datetime64[ns]" <class 'int'>

# Manually set time attributes to "Gregorian" rather
# than default "proleptic_gregorian".
ds1.time.encoding["calendar"] = "Gregorian"
ds1.reference_time.encoding["calendar"] = "Gregorian"
ds1.to_netcdf("test-capitalized.nc")

ds2 = xr.open_dataset("test-capitalized.nc")
print_time_type(ds2)
# prints "object <class 'cftime._cftime.DatetimeGregorian'>"

# Workaround: add "Gregorian" to list of standard calendars.
xr.coding.times._STANDARD_CALENDARS.add("Gregorian")
ds3 = xr.open_dataset("test-capitalized.nc")
print_time_type(ds3)  # prints "datetime64[ns]" <class 'int'>
```

Anything else we need to know?:

The documentation for the use_cftime parameter of open_dataset says:

> If None (default), attempt to decode times to np.datetime64[ns] objects; if this is not possible, decode times to cftime.datetime objects.

In practice, we are getting some cftime.datetimes even for times which are interpretable and representable as np.datetime64[ns]s. In particular, we have some NetCDF files in which the time variable has a calendar attribute with a value of Gregorian (with a capital ‘G’). CF conventions allow this:

> When this standard defines string attributes that may take various prescribed values, the possible values are generally given in lower case. However, applications programs should not be sensitive to case in these attributes.

However, xarray regards Gregorian as a non-standard calendar and falls back to cftime.datetime. If (as in the example) Gregorian is added to xr.coding.times._STANDARD_CALENDARS, the times are read as np.datetime64[ns]s.

Suggested fix: in `xarray.coding.times._decode_datetime_with_pandas`, change `if calendar not in _STANDARD_CALENDARS:` to `if calendar.lower() not in _STANDARD_CALENDARS:`.
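The effect of the suggested one-line change can be illustrated in isolation (the calendar set below mirrors, but is not, xarray's internal `_STANDARD_CALENDARS`):

```python
# Illustrative stand-in for xarray's set of standard calendar names.
_STANDARD_CALENDARS = {"standard", "gregorian", "proleptic_gregorian"}

def is_standard_calendar(calendar):
    # Lower-case before the membership test, per the CF requirement
    # that applications not be case-sensitive about attribute values.
    return calendar.lower() in _STANDARD_CALENDARS
```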

Environment:

Output of `xr.show_versions()`:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:02:46) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.8.0-48-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.17.1.dev39+g45b4436b
pandas: 1.2.3
numpy: 1.20.2
scipy: None
netCDF4: 1.5.6
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20210108
pip: 21.0.1
conda: None
pytest: None
IPython: None
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5093/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 1833.31ms · About: xarray-datasette