issues

6 rows where repo = 13221727, state = "closed" and user = 367900 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1216178209 PR_kwDOAMm_X8420Do_ 6516 Use new importlib.metadata.entry_points interface where available bcbnz 367900 closed 0     1 2022-04-26T16:06:35Z 2022-04-27T06:01:08Z 2022-04-27T01:07:51Z CONTRIBUTOR   0 pydata/xarray/pulls/6516

With Python 3.10, the dict-style SelectableGroups interface returned by the entry_points() function was deprecated. The preferred way is now to filter by group through a keyword argument.

  • [x] Closes #6514.
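
As a rough illustration of the change described above, the following sketch picks the selectable interface when it exists and falls back to the older dict interface otherwise. The helper name `entry_points_for_group` is hypothetical, and this is not necessarily the code merged in the pull request.

```python
from importlib.metadata import entry_points


def entry_points_for_group(group):
    """Return the entry points registered under ``group`` as a list.

    Hypothetical helper for illustration only.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer interface (Python 3.10+): filter by group via a keyword.
        return list(eps.select(group=group))
    # Older interface (Python 3.8/3.9): a plain dict keyed by group name.
    return list(eps.get(group, ()))
```

Checking for the `select` attribute rather than the Python version keeps the sketch working with either return type.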
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6516/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
856728083 MDU6SXNzdWU4NTY3MjgwODM= 5146 HTML formatting of non-string attribute names fails bcbnz 367900 closed 0     3 2021-04-13T08:36:31Z 2021-11-11T18:21:31Z 2021-11-11T18:21:31Z CONTRIBUTOR      

Working in a notebook (and presumably, anywhere else that uses the HTML formatter to show an array), non-string attribute keys cause an exception. The output then falls back to the repr value.

```console
In [1]: import xarray as xr

In [2]: data = xr.DataArray([1, 2, 3], attrs={1: 3.14})

In [3]: data.attrs
Out[3]: {1: 3.14}

In [4]: data
AttributeError                            Traceback (most recent call last)
/usr/lib/python3.9/site-packages/IPython/core/formatters.py in __call__(self, obj)
    343             method = get_real_method(obj, self.print_method)
    344             if method is not None:
--> 345                 return method()
    346             return None
    347         else:

~/software/external/xarray/xarray/core/common.py in _repr_html_(self)
    148         if OPTIONS["display_style"] == "text":
    149             return f"<pre>{escape(repr(self))}</pre>"
--> 150         return formatting_html.array_repr(self)
    151
    152     def _iter(self: Any) -> Iterator[Any]:

~/software/external/xarray/xarray/core/formatting_html.py in array_repr(arr)
    269         sections.append(coord_section(arr.coords))
    270
--> 271     sections.append(attr_section(arr.attrs))
    272
    273     return _obj_repr(arr, header_components, sections)

~/software/external/xarray/xarray/core/formatting_html.py in _mapping_section(mapping, name, details_func, max_items_collapse, enabled)
    171     return collapsible_section(
    172         name,
--> 173         details=details_func(mapping),
    174         n_items=n_items,
    175         enabled=enabled,

~/software/external/xarray/xarray/core/formatting_html.py in summarize_attrs(attrs)
     47
     48 def summarize_attrs(attrs):
---> 49     attrs_dl = "".join(
     50         f"<dt>{escape(k)} :</dt>" f"<dd>{escape(str(v))}</dd>"
     51         for k, v in attrs.items()

~/software/external/xarray/xarray/core/formatting_html.py in <genexpr>(.0)
     48 def summarize_attrs(attrs):
     49     attrs_dl = "".join(
---> 50         f"<dt>{escape(k)} :</dt>" f"<dd>{escape(str(v))}</dd>"
     51         for k, v in attrs.items()
     52     )

/usr/lib/python3.9/html/__init__.py in escape(s, quote)
     17     translated.
     18     """
---> 19     s = s.replace("&", "&amp;") # Must be done first!
     20     s = s.replace("<", "&lt;")
     21     s = s.replace(">", "&gt;")

AttributeError: 'int' object has no attribute 'replace'

Out[4]:
<xarray.DataArray (dim_0: 3)>
array([1, 2, 3])
Dimensions without coordinates: dim_0
Attributes:
    1: 3.14
```

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: c91983d4765b23e0474231c85057d31f9b6b2f33
python: 3.9.2 (default, Feb 20 2021, 18:40:11) [GCC 10.2.0]
python-bits: 64
OS: Linux
OS-release: 5.11.13-arch1-1
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_NZ.UTF-8
LOCALE: en_NZ.UTF-8
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 0.17.0
pandas: 1.2.3
numpy: 1.20.1
scipy: 1.6.2
netCDF4: 1.5.6
pydap: None
h5netcdf: 0.10.0
h5py: 3.2.1
Nio: None
zarr: None
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.2
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.03.0
distributed: 2021.03.0
matplotlib: 3.4.1
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: None
setuptools: 54.2.0
pip: 20.3.1
conda: None
pytest: 6.2.3
IPython: 7.22.0
sphinx: 3.5.4
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5146/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
856901056 MDExOlB1bGxSZXF1ZXN0NjE0NDA4MzQz 5149 Convert attribute and dimension names to strings when generating HTML repr bcbnz 367900 closed 0     5 2021-04-13T12:14:03Z 2021-05-04T03:39:00Z 2021-05-04T03:38:53Z CONTRIBUTOR   0 pydata/xarray/pulls/5149

The standard repr() already handled non-string attribute names, but the HTML formatter failed when trying to escape HTML entities in non-string names. This just calls str() before escape(). It also includes tests for Dataset, DataArray and Variable.

Reported in #5146. ~~Note that there may be a need to do the same for dimension names if they are allowed to be strings. Currently dimensions must be created as strings but can later be renamed to non-strings, see #5148.~~ Dimensions can be non-str, updated.

  • [x] Tests added
  • [x] Passes pre-commit run --all-files
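
The fix described above (calling str() before escape()) can be sketched with the standard library alone. The function below is a simplified, hypothetical stand-in for the formatter's attribute summary and is not the code from the pull request.

```python
from html import escape


def summarize_attrs_sketch(attrs):
    """Render a mapping as an HTML definition list.

    Keys and values are converted with str() first, because html.escape()
    calls s.replace() and therefore only accepts strings.
    """
    items = "".join(
        f"<dt>{escape(str(k))} :</dt><dd>{escape(str(v))}</dd>"
        for k, v in attrs.items()
    )
    return f"<dl>{items}</dl>"


# An integer key no longer raises AttributeError.
print(summarize_attrs_sketch({1: 3.14}))
```

Converting with str() mirrors what the text repr already does for such keys, so both representations show the same name.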
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5149/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
853473276 MDU6SXNzdWU4NTM0NzMyNzY= 5132 Backend caching should not use a relative path bcbnz 367900 closed 0     4 2021-04-08T13:27:03Z 2021-04-15T12:12:26Z 2021-04-15T12:12:26Z CONTRIBUTOR      

Datasets opened from disk are cached with a key based (amongst other things) on their filename. If you have the same filename in different directories, and open them after changing directory, a cache collision occurs as the filename is the same and so the first opened dataset is always returned.

Minimal Complete Verifiable Example:

```python
import os
from pathlib import Path
import tempfile

import numpy as np
import xarray as xr

with tempfile.TemporaryDirectory() as d:
    base = Path(d).resolve()

    # Create some data in separate directories but with same filename.
    (base / "zeros").mkdir()
    z_fn = base / "zeros" / "data.nc"
    xr.DataArray(np.zeros((5, 5), dtype=int)).to_netcdf(z_fn)
    (base / "ones").mkdir()
    o_fn = base / "ones" / "data.nc"
    xr.DataArray(np.ones((5, 5), dtype=int)).to_netcdf(o_fn)

    # Open with the absolute path and check we get what we expect.
    z_abs = xr.open_dataarray(z_fn)
    o_abs = xr.open_dataarray(o_fn)
    assert (z_abs == 0).all(), "zeros with absolute path incorrect"
    assert (o_abs == 1).all(), "ones with absolute path incorrect"

    # Open with relative path from base directory.
    os.chdir(base)
    z_base = xr.open_dataarray("zeros/data.nc")
    o_base = xr.open_dataarray("ones/data.nc")
    assert (z_base == 0).all(), "zeros with relative path from base incorrect"
    assert (o_base == 1).all(), "ones with relative path from base incorrect"

    # Open from containing directory.
    os.chdir(base / "zeros")
    z_local = xr.open_dataarray("data.nc")
    os.chdir(base / "ones")
    o_local = xr.open_dataarray("data.nc")
    assert (z_local == 0).all(), "zeros opened from containing dir incorrect"
    assert (o_local == 1).all(), "ones opened from containing dir incorrect"
```

What happened: On master, the final assertion is triggered as the cache returns the zeros array instead of the ones.

What you expected to happen: No assertion.

Anything else we need to know?: This was introduced in 50d97e9d. I found this with the above test script (named cache_bug.py) with a Git bisect session:

    $ git bisect start master v0.16.2
    Bisecting: 88 revisions left to test after this (roughly 7 steps)
    [d555172c7d069ca9cf7a9a32bfd5f422be133861] Allow swap_dims to take kwargs (#4841)
    $ git bisect run python cache_bug.py
    ...
    50d97e9d35bac783850827fa66ff5eb768e62905 is the first bad commit
    ...

I then manually confirmed this by running the script on 50d97e9d and its parent.

The caching is performed by xarray.backends.file_manager.CachingFileManager. The obvious solution would be to use pathlib / os.path (whichever is preferred in xarray) to convert the paths to absolute before caching. For example, changing the default netCDF4 backend from

https://github.com/pydata/xarray/blob/e56905889c836c736152b11a7e6117a229715975/xarray/backends/netCDF4_.py#L375-L377

to

    manager = CachingFileManager(
        netCDF4.Dataset, os.path.abspath(filename), mode=mode, kwargs=kwargs
    )

fixes this for me. I guess this should be done (if needed) by each backend to keep CachingFileManager as general as possible.
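
The collision and the effect of making paths absolute can be reproduced without xarray. The sketch below uses a toy cache keyed on the filename; `ToyFileCache` and `read_both` are hypothetical names, and the class is only a stand-in for the idea, not CachingFileManager itself.

```python
import os
import tempfile
from pathlib import Path


class ToyFileCache:
    """Hypothetical file cache keyed on the filename (illustration only)."""

    def __init__(self, use_absolute_paths):
        self.use_absolute_paths = use_absolute_paths
        self._contents = {}

    def read(self, filename):
        # The key is either the name as given or its absolute form.
        if self.use_absolute_paths:
            key = os.path.abspath(filename)
        else:
            key = os.fspath(filename)
        if key not in self._contents:
            self._contents[key] = Path(filename).read_text()
        return self._contents[key]


def read_both(cache):
    """Read data.txt by relative name from two directories in turn."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as d:
        base = Path(d).resolve()
        for name in ("zeros", "ones"):
            (base / name).mkdir()
            (base / name / "data.txt").write_text(name)
        try:
            os.chdir(base / "zeros")
            first = cache.read("data.txt")
            os.chdir(base / "ones")
            second = cache.read("data.txt")
        finally:
            # Leave the temporary directory before it is removed.
            os.chdir(original_cwd)
    return first, second


print(read_both(ToyFileCache(use_absolute_paths=False)))  # stale second read
print(read_both(ToyFileCache(use_absolute_paths=True)))   # distinct keys
```

With relative keys the second read returns the first file's contents, which is the same symptom as the final assertion in the example script.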

If my analysis and proposed solution seems correct, I'm happy to work up a pull request with these fixes and some regression tests.

If you're wondering about the use case where I bumped into this problem: we're using Click for a CLI, and using its test helpers. One of these (isolated_filesystem) creates and changes into an empty temporary directory before running the CLI function under test, so we can use open_dataset("output.nc") to load the CLI output for checking. Since it does this in the same process, using a parametrized test function means the first created file is always loaded for checking. Took a while to track down what was happening!

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: ec4e8b5f279e28588eee8ff43a328ca6c2f89f01
python: 3.9.2 (default, Feb 20 2021, 18:40:11) [GCC 10.2.0]
python-bits: 64
OS: Linux
OS-release: 5.11.11-arch1-1
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_NZ.UTF-8
LOCALE: en_NZ.UTF-8
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 0.17.0
pandas: 1.2.3
numpy: 1.20.1
scipy: 1.6.2
netCDF4: 1.5.6
pydap: None
h5netcdf: 0.9.0
h5py: 3.1.0
Nio: None
zarr: None
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.1
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.03.0
distributed: 2021.03.0
matplotlib: 3.4.1
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: None
setuptools: 54.2.0
pip: 20.3.1
conda: None
pytest: 6.2.3
IPython: 7.22.0
sphinx: 3.5.2
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5132/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
808558647 MDExOlB1bGxSZXF1ZXN0NTczNTc5NzUx 4911 Fix behaviour of min_count in reducing functions bcbnz 367900 closed 0     6 2021-02-15T13:53:34Z 2021-02-19T08:12:39Z 2021-02-19T08:12:02Z CONTRIBUTOR   0 pydata/xarray/pulls/4911

The first commit modifies existing tests to check Dask-backed arrays are not computed. It also adds some specific checks that the correct result (NaN or a number as appropriate) is returned and some tests for checking membership of xarray.core.dtypes.NAT_TYPES. After this commit I get 89 test failures, and they seem to cover the cases reported in #4898.

The second commit fixes these failures:

  • The checks of the nan mask in xarray.core.nanops._maybe_null_out are changed to use np.where which allows lazy evaluation.

  • Previously, xarray.core.dtypes.NAT_TYPES was a tuple of datetime64 and timedelta64 instances; I've changed it to a set of the dtypes of these instances. It is only used for the membership check in _maybe_null_out so a set seems appropriate. The previous use of instances rather than dtypes caused a bug -- np.float64 in NAT_TYPES returned true even though it only contained datetime64/timedelta64. This meant that reducing operations over all axes (axis=None or ...) with float64 arrays ignored min_count as the membership check in _maybe_null_out caused it to be skipped.

  • [x] Closes #4898

  • [x] Tests added
  • [x] Passes pre-commit run --all-files
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
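
Why a Python-level check forces evaluation while a where-style selection does not can be shown with a toy model. Everything below is hypothetical and uses no Dask or xarray code; `Lazy` merely records when a deferred value is computed.

```python
evaluated = []  # names of lazy values that have been computed, in order


class Lazy:
    """Hypothetical deferred value, a toy stand-in for a Dask-backed array."""

    def __init__(self, name, func):
        self.name = name
        self._func = func

    def compute(self):
        evaluated.append(self.name)
        return self._func()

    def __bool__(self):
        # A Python-level branch needs a concrete answer, so it computes now.
        return bool(self.compute())


def null_out_with_branch(total, too_few_valid):
    # `if` calls too_few_valid.__bool__(), forcing evaluation immediately.
    if too_few_valid:
        return Lazy("nan", lambda: float("nan"))
    return total


def null_out_with_where(total, too_few_valid):
    # where-style selection: the condition becomes part of the deferred
    # computation and is only consulted when the result is computed.
    return Lazy(
        "where",
        lambda: float("nan") if too_few_valid.compute() else total.compute(),
    )
```

Calling `null_out_with_branch` appends to `evaluated` straight away, whereas `null_out_with_where` leaves it empty until `compute()` is called on the result.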
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4911/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
807089005 MDU6SXNzdWU4MDcwODkwMDU= 4898 Sum and prod with min_count forces evaluation bcbnz 367900 closed 0     5 2021-02-12T09:42:06Z 2021-02-19T08:12:02Z 2021-02-19T08:12:01Z CONTRIBUTOR      

If I use the sum method on a lazy array with min_count != None then evaluation is forced. If there is some limitation of the implementation which means it cannot be added to the computation graph for lazy evaluation then this should be mentioned in the docs.
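
For readers unfamiliar with min_count, here is a minimal eager sketch of its meaning for a sum that skips NaN values. The function name is hypothetical and the code works on a plain list, so it says nothing about how xarray implements the reduction.

```python
import math


def sum_with_min_count(values, min_count=None):
    """Sum the non-NaN values.

    If min_count is given and fewer than min_count values are valid,
    the result is NaN instead of a partial sum.
    """
    valid = [v for v in values if not math.isnan(v)]
    if min_count is not None and len(valid) < min_count:
        return math.nan
    return sum(valid, 0.0)


data = [1.0, math.nan, 2.0]
print(sum_with_min_count(data))               # two valid values are summed
print(sum_with_min_count(data, min_count=3))  # too few valid values
```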

Minimal Complete Verifiable Example:

```python
import numpy as np
import xarray as xr


def worker(da):
    if da.shape == (0, 0):
        return da

    raise RuntimeError("I was evaluated")


da = xr.DataArray(
    np.random.normal(size=(20, 500)),
    dims=("x", "y"),
    coords=(np.arange(20), np.arange(500)),
)

da = da.chunk(dict(x=5))
lazy = da.map_blocks(worker)
result1 = lazy.sum("x", skipna=True)
result2 = lazy.sum("x", skipna=True, min_count=5)
```

What happened: RuntimeError: I was evaluated

What you expected to happen: No output or exceptions, as the result1 and result2 arrays are not printed or saved.

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.1 (default, Feb 6 2021, 06:49:13) [GCC 10.2.0]
python-bits: 64
OS: Linux
OS-release: 5.10.15-arch1-1
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_NZ.UTF-8
LOCALE: en_NZ.UTF-8
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 0.16.2
pandas: 1.2.1
numpy: 1.20.0
scipy: 1.6.0
netCDF4: 1.5.5.1
pydap: None
h5netcdf: 0.9.0
h5py: 3.1.0
Nio: None
zarr: None
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.0
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2020.12.0
distributed: 2020.12.0
matplotlib: 3.3.4
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: None
setuptools: 53.0.0
pip: 20.3.1
conda: None
pytest: 6.2.1
IPython: 7.19.0
sphinx: 3.4.3
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4898/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
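
The filter behind this page (repo = 13221727, state = "closed", user = 367900, sorted by updated_at descending) can be reproduced against a trimmed copy of the schema above. The sketch assumes an in-memory SQLite database, keeps only a few columns, and omits the foreign keys; the row values are taken from the table on this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed, assumed subset of the issues table; foreign keys are omitted.
conn.execute(
    "CREATE TABLE [issues] ("
    " [id] INTEGER PRIMARY KEY, [number] INTEGER, [user] INTEGER,"
    " [state] TEXT, [updated_at] TEXT, [repo] INTEGER, [type] TEXT)"
)
rows = [
    # Deliberately inserted out of order so that the ORDER BY does the work.
    (807089005, 4898, 367900, "closed", "2021-02-19T08:12:02Z", 13221727, "issue"),
    (1216178209, 6516, 367900, "closed", "2022-04-27T06:01:08Z", 13221727, "pull"),
    (853473276, 5132, 367900, "closed", "2021-04-15T12:12:26Z", 13221727, "issue"),
    (856901056, 5149, 367900, "closed", "2021-05-04T03:39:00Z", 13221727, "pull"),
    (808558647, 4911, 367900, "closed", "2021-02-19T08:12:39Z", 13221727, "pull"),
    (856728083, 5146, 367900, "closed", "2021-11-11T18:21:31Z", 13221727, "issue"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

query = """
    SELECT [number], [type] FROM [issues]
    WHERE [repo] = :repo AND [state] = :state AND [user] = :user
    ORDER BY [updated_at] DESC
"""
params = {"repo": 13221727, "state": "closed", "user": 367900}
result = conn.execute(query, params).fetchall()
for number, kind in result:
    print(number, kind)
```

Because the timestamps are stored as ISO 8601 text in a single format, sorting them as strings gives chronological order.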