
issues


2 rows where "created_at" is on date 2021-07-03, repo = 13221727 and user = 35968931 sorted by updated_at descending


type 2

  • issue 1
  • pull 1

state 2

  • closed 1
  • open 1

repo 1

  • xarray 2
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
936313924 MDExOlB1bGxSZXF1ZXN0NjgzMDY3OTU5 5571 Rely on NEP-18 to dispatch to dask in duck_array_ops TomNicholas 35968931 closed 0     20 2021-07-03T19:24:33Z 2022-07-09T18:12:05Z 2021-09-29T17:48:40Z MEMBER   0 pydata/xarray/pulls/5571

Removes special-casing for dask in duck_array_ops.py, instead relying on NEP-18 to call it when the input is a dask array.

The _dask_or_eager_func() helper (now _module_func()) is probably not needed at all, because every remaining use looks like pandas_isnull = _module_func("isnull", module=pd), which could simply be pandas_isnull = pd.isnull.
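To make the dispatch mechanism concrete, here is a minimal sketch of NEP-18 using only NumPy. `LoggingArray` is a hypothetical stand-in for a duck array such as a dask array, not anything from xarray or dask. It records whether NumPy reached it through `__array_function__` (the NEP-18 hook) or through `__array__` (plain coercion, which for a dask array is where a compute gets triggered).

```python
import numpy as np


class LoggingArray:
    """Hypothetical minimal duck array that records how NumPy reaches it."""

    def __init__(self, data):
        self.data = np.asarray(data)
        self.calls = []       # names of NumPy functions dispatched via NEP-18
        self.coerced = False  # becomes True once something forces an ndarray

    def __array_function__(self, func, types, args, kwargs):
        # NEP-18 hook: NumPy hands the call to us instead of coercing first.
        self.calls.append(func.__name__)
        unwrapped = [a.data if isinstance(a, LoggingArray) else a for a in args]
        return LoggingArray(func(*unwrapped, **kwargs))

    def __array__(self, dtype=None, copy=None):
        # Coercion path: the wrapper is discarded and a plain ndarray returned.
        self.coerced = True
        return self.data if dtype is None else self.data.astype(dtype)


arr = LoggingArray([1.0, 2.0, 3.0])
total = np.sum(arr)      # dispatched through __array_function__, stays wrapped
plain = np.asarray(arr)  # not dispatched: falls back to __array__
```

Functions that call np.asarray() on their input before doing anything else bypass the NEP-18 hook entirely, so relying on dispatch only helps for calls that go through NumPy's public, dispatch-aware functions.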

Only problem is that I seem to have broken one (parameterized) test: test_duck_array_ops.py::test_min_count[True-True-None-sum-True-bool_-1] fails with

```python
@pytest.mark.parametrize("dim_num", [1, 2])
@pytest.mark.parametrize("dtype", [float, int, np.float32, np.bool_])
@pytest.mark.parametrize("dask", [False, True])
@pytest.mark.parametrize("func", ["sum", "prod"])
@pytest.mark.parametrize("aggdim", [None, "x"])
@pytest.mark.parametrize("contains_nan", [True, False])
@pytest.mark.parametrize("skipna", [True, False, None])
def test_min_count(dim_num, dtype, dask, func, aggdim, contains_nan, skipna):
    if dask and not has_dask:
        pytest.skip("requires dask")

    da = construct_dataarray(dim_num, dtype, contains_nan=contains_nan, dask=dask)
    min_count = 3

    # If using Dask, the function call should be lazy.
    with raise_if_dask_computes():
        actual = getattr(da, func)(dim=aggdim, skipna=skipna, min_count=min_count)

/home/tegn500/Documents/Work/Code/xarray/xarray/tests/test_duck_array_ops.py:578:


/home/tegn500/Documents/Work/Code/xarray/xarray/core/common.py:56: in wrapped_func
    return self.reduce(func, dim, axis, skipna=skipna, **kwargs)
/home/tegn500/Documents/Work/Code/xarray/xarray/core/dataarray.py:2638: in reduce
    var = self.variable.reduce(func, dim, axis, keep_attrs, keepdims, **kwargs)
/home/tegn500/Documents/Work/Code/xarray/xarray/core/variable.py:1725: in reduce
    data = func(self.data, **kwargs)
/home/tegn500/Documents/Work/Code/xarray/xarray/core/duck_array_ops.py:328: in f
    return func(values, axis=axis, **kwargs)
/home/tegn500/Documents/Work/Code/xarray/xarray/core/nanops.py:106: in nansum
    a, mask = _replace_nan(a, 0)
/home/tegn500/Documents/Work/Code/xarray/xarray/core/nanops.py:23: in _replace_nan
    mask = isnull(a)
/home/tegn500/Documents/Work/Code/xarray/xarray/core/duck_array_ops.py:83: in isnull
    return pandas_isnull(data)
/home/tegn500/Documents/Work/Code/xarray/xarray/core/duck_array_ops.py:40: in f
    return getattr(module, name)(*args, **kwargs)
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/pandas/core/dtypes/missing.py:127: in isna
    return _isna(obj)
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/pandas/core/dtypes/missing.py:166: in _isna
    return _isna_ndarraylike(np.asarray(obj), inf_as_na=inf_as_na)
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/numpy/core/_asarray.py:102: in asarray
    return array(a, dtype, copy=False, order=order)
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/dask/array/core.py:1502: in __array__
    x = self.compute()
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/dask/base.py:285: in compute
    (result,) = compute(self, traverse=False, **kwargs)
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/dask/base.py:567: in compute
    results = schedule(dsk, keys, **kwargs)


self = <xarray.tests.CountingScheduler object at 0x7f0804db2310>
dsk = {('xarray-<this-array>-29953318277423606f95b509ad1a9aa7', 0): array([False, False, False, False], dtype=object), ('xar...pe=object), ('xarray-<this-array>-29953318277423606f95b509ad1a9aa7', 3): array([nan, False, False, nan], dtype=object)}
keys = [[('xarray-<this-array>-29953318277423606f95b509ad1a9aa7', 0), ('xarray-<this-array>-29953318277423606f95b509ad1a9aa7'...array-<this-array>-29953318277423606f95b509ad1a9aa7', 2), ('xarray-<this-array>-29953318277423606f95b509ad1a9aa7', 3)]]
kwargs = {}

def __call__(self, dsk, keys, **kwargs):
    self.total_computes += 1
    if self.total_computes > self.max_computes:
        raise RuntimeError(
            "Too many computes. Total: %d > max: %d."
            % (self.total_computes, self.max_computes)
        )

E RuntimeError: Too many computes. Total: 1 > max: 0.

/home/tegn500/Documents/Work/Code/xarray/xarray/tests/__init__.py:118: RuntimeError
```
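
The error at the end of the traceback comes from a scheduler that counts how often it is asked to compute. The sketch below rebuilds that idea in plain Python so the failure mode is easy to see. The `__call__` body follows the snippet in the traceback; the constructor and the dictionary lookup that stands in for a real dask scheduler are assumptions made for illustration.

```python
class CountingScheduler:
    """Sketch of a scheduler wrapper that fails once too many computes happen."""

    def __init__(self, max_computes=0):
        self.total_computes = 0
        self.max_computes = max_computes

    def __call__(self, dsk, keys, **kwargs):
        self.total_computes += 1
        if self.total_computes > self.max_computes:
            raise RuntimeError(
                "Too many computes. Total: %d > max: %d."
                % (self.total_computes, self.max_computes)
            )
        # Hypothetical stand-in for a real scheduler: look the keys up directly.
        return [dsk[key] for key in keys]


graph = {"x": 1, "y": 2}  # toy "graph" whose values are already computed

lenient = CountingScheduler(max_computes=1)
print(lenient(graph, ["x", "y"]))  # prints [1, 2]

strict = CountingScheduler(max_computes=0)
try:
    strict(graph, ["x"])
except RuntimeError as err:
    print(err)  # prints Too many computes. Total: 1 > max: 0.
```

With max_computes=0, any code path that forces evaluation fails the test on the first compute. In the traceback above that path is np.asarray(obj) inside pandas' isna.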

  • [x] Closes #5559
  • [x] Tests added
  • [x] Passes pre-commit run --all-files
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5571/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
936305081 MDU6SXNzdWU5MzYzMDUwODE= 5570 assert_equal does not handle wrapped duck arrays well TomNicholas 35968931 open 0     0 2021-07-03T18:27:11Z 2021-07-03T18:49:57Z   MEMBER      

Whilst trying to fix #5559 I noticed that xarray.testing.assert_equal doesn't behave well with wrapped duck-typed arrays.

Firstly, they can give unhelpful AssertionError messages:

```python
In [5]: a = np.array([1,2,3])

In [6]: q = pint.Quantity([1,2,3], units='m')

In [7]: da_np = xr.DataArray(a, dims='x')

In [8]: da_p = xr.DataArray(q, dims='x')

In [9]: da_np
Out[9]:
<xarray.DataArray (x: 3)>
array([1, 2, 3])
Dimensions without coordinates: x

In [10]: da_p
Out[10]:
<xarray.DataArray (x: 3)>
<Quantity([1 2 3], 'meter')>
Dimensions without coordinates: x

In [11]: from xarray.testing import assert_equal

In [12]: assert_equal(da_np, da_p)
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/xarray/core/duck_array_ops.py:265: UnitStrippedWarning: The unit of the quantity is stripped when downcasting to ndarray.
  flag_array = (arr1 == arr2) | (isnull(arr1) & isnull(arr2))
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/xarray/core/duck_array_ops.py:265: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
  flag_array = (arr1 == arr2) | (isnull(arr1) & isnull(arr2))
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/xarray/core/duck_array_ops.py:265: UnitStrippedWarning: The unit of the quantity is stripped when downcasting to ndarray.
  flag_array = (arr1 == arr2) | (isnull(arr1) & isnull(arr2))
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/xarray/core/duck_array_ops.py:265: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
  flag_array = (arr1 == arr2) | (isnull(arr1) & isnull(arr2))
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/numpy/core/_asarray.py:102: UnitStrippedWarning: The unit of the quantity is stripped when downcasting to ndarray.
  return array(a, dtype, copy=False, order=order)


AssertionError                            Traceback (most recent call last)
<ipython-input-12-33b16d6b79ed> in <module>
----> 1 assert_equal(da_np, da_p)

[... skipping hidden 1 frame]

~/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/xarray/testing.py in assert_equal(a, b)
     79     assert type(a) == type(b)
     80     if isinstance(a, (Variable, DataArray)):
---> 81         assert a.equals(b), formatting.diff_array_repr(a, b, "equals")
     82     elif isinstance(a, Dataset):
     83         assert a.equals(b), formatting.diff_dataset_repr(a, b, "equals")

AssertionError: Left and right DataArray objects are not equal

Differing values:
L
    array([1, 2, 3])
R
    array([1, 2, 3])
```

These are different, but not because the array values are different. At the moment .values is converting the wrapped array type by stripping the units too - it might be better to check the type of the wrapped array first, then use .values to compare. Or could we even do duck-typed testing by delegating via expected.data.equals(actual.data)? (EDIT: I don't think a .equals() method exists in the numpy API, but you could do the equivalent of assert all(expected.data == actual.data).)

Secondly, given that we coerce before comparison, I think it's possible that assert_equal could say two different wrapped duck-type arrays are equal when they are not, just because np.asarray() coerces them to the same values.

EDIT2: Looks like there is some discussion here
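
Both concerns above can be reproduced without pint. In the sketch below, `UnitArray` is a hypothetical stand-in for a unit-aware wrapper, and `type_checked_equal` is one possible way to check the wrapper type first. It is an illustration of the idea, not xarray's implementation or a proposed API.

```python
import numpy as np


class UnitArray:
    """Hypothetical stand-in for a unit-aware wrapper such as pint.Quantity."""

    def __init__(self, magnitude, units):
        self.magnitude = np.asarray(magnitude)
        self.units = units

    def __array__(self, dtype=None, copy=None):
        # Coercion silently drops the units and returns the bare values.
        return self.magnitude if dtype is None else self.magnitude.astype(dtype)


def coerced_equal(a, b):
    """Compare after np.asarray() coercion, which ignores the wrapper entirely."""
    return bool(np.array_equal(np.asarray(a), np.asarray(b)))


def type_checked_equal(a, b):
    """Hypothetical stricter check: compare wrapper type and units before values."""
    if type(a) is not type(b):
        return False
    if getattr(a, "units", None) != getattr(b, "units", None):
        return False
    return coerced_equal(a, b)


metres = UnitArray([1, 2, 3], "m")
seconds = UnitArray([1, 2, 3], "s")

print(coerced_equal(metres, seconds))       # prints True: units were lost
print(type_checked_equal(metres, seconds))  # prints False: units differ
```

Checking the wrapper before the values also gives a natural place to produce a clearer failure message than two identical-looking array reprs.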

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5570/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
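
As a rough illustration of how the filter described at the top of this page maps onto the schema, the following sketch uses Python's built-in sqlite3 module with an in-memory database. The table is a trimmed subset of the columns above, and the SQL is an assumption about what such a filter could look like, not the exact query Datasette generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [user] INTEGER,
       [state] TEXT,
       [created_at] TEXT,
       [updated_at] TEXT,
       [repo] INTEGER,
       [type] TEXT
    )
    """
)
# The two rows shown on this page, reduced to the columns used by the filter.
rows = [
    (936313924, 5571, 35968931, "closed",
     "2021-07-03T19:24:33Z", "2022-07-09T18:12:05Z", 13221727, "pull"),
    (936305081, 5570, 35968931, "open",
     "2021-07-03T18:27:11Z", "2021-07-03T18:49:57Z", 13221727, "issue"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

query = """
    SELECT [number], [type], [state]
    FROM [issues]
    WHERE date([created_at]) = :day AND [repo] = :repo AND [user] = :user
    ORDER BY [updated_at] DESC
"""
params = {"day": "2021-07-03", "repo": 13221727, "user": 35968931}
result = conn.execute(query, params).fetchall()
print(result)  # prints [(5571, 'pull', 'closed'), (5570, 'issue', 'open')]
```

Because the timestamps are stored as ISO 8601 text, sorting on updated_at as a string gives chronological order, and date() extracts the calendar day for the equality test.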
Powered by Datasette · Queries took 244.483ms · About: xarray-datasette