
issues


2 rows where milestone = 3801867 sorted by updated_at descending

issue 1764: .groupby_bins fails when data is not contained in bins
id: 279883145 · node_id: MDU6SXNzdWUyNzk4ODMxNDU= · user: jbusecke (14314623) · state: closed · locked: 0 · milestone: 0.11.1 (3801867) · comments: 2
created_at: 2017-12-06T19:48:30Z · updated_at: 2019-10-22T14:53:31Z · closed_at: 2019-10-22T14:53:30Z · author_association: CONTRIBUTOR

Consider the following example.

```python
import xarray as xr
import numpy as np
import dask.array as dsa
from dask.diagnostics import ProgressBar

# Groupby bins problem with small bins?
x_raw = np.arange(20)
y_raw = np.arange(10)
z_raw = np.arange(15)

x = xr.DataArray(dsa.from_array(x_raw, chunks=(-1)), dims=['x'], coords={'x': ('x', x_raw)})
y = xr.DataArray(dsa.from_array(y_raw, chunks=(-1)), dims=['y'], coords={'y': ('y', y_raw)})
z = xr.DataArray(dsa.from_array(z_raw, chunks=(-1)), dims=['z'], coords={'z': ('z', z_raw)})

data = xr.DataArray(dsa.ones([20, 10, 15], chunks=[-1, -1, -1]), dims=['x', 'y', 'z'],
                    coords={'x': x, 'y': y, 'z': z})
data
```
```
<xarray.DataArray 'wrapped-bb05d395159047b749ca855110244cb7' (x: 20, y: 10, z: 15)>
dask.array<shape=(20, 10, 15), dtype=float64, chunksize=(20, 10, 15)>
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
  * y        (y) int64 0 1 2 3 4 5 6 7 8 9
  * z        (z) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
```

This dask array only contains ones. If I now try to apply groupby_bins with a specified array of bins (which are all below 1), it fails with a rather cryptic error.

```python
# This doesn't work
bins = np.array([0, 20, 40, 60, 80, 100]) * 1e-6

binned = data.groupby_bins(data, bins).sum()
binned
```
```
StopIteration                             Traceback (most recent call last)
<ipython-input-7-dc9283bee4ea> in <module>()
      2 bins = np.array([0, 20, 40, 60 , 80, 100])*1e-6
      3
----> 4 binned = data.groupby_bins(data, bins).sum()
      5 binned

~/Work/CODE/PYTHON/xarray/xarray/core/common.py in wrapped_func(self, dim, axis, skipna, keep_attrs, **kwargs)
     20                          keep_attrs=False, **kwargs):
     21             return self.reduce(func, dim, axis, keep_attrs=keep_attrs,
---> 22                                skipna=skipna, allow_lazy=True, **kwargs)
     23         else:
     24             def wrapped_func(self, dim=None, axis=None, keep_attrs=False,

~/Work/CODE/PYTHON/xarray/xarray/core/groupby.py in reduce(self, func, dim, axis, keep_attrs, shortcut, **kwargs)
    572         def reduce_array(ar):
    573             return ar.reduce(func, dim, axis, keep_attrs=keep_attrs, **kwargs)
--> 574         return self.apply(reduce_array, shortcut=shortcut)
    575
    576 ops.inject_reduce_methods(DataArrayGroupBy)

~/Work/CODE/PYTHON/xarray/xarray/core/groupby.py in apply(self, func, shortcut, **kwargs)
    516         applied = (maybe_wrap_array(arr, func(arr, **kwargs))
    517                    for arr in grouped)
--> 518         return self._combine(applied, shortcut=shortcut)
    519
    520     def _combine(self, applied, shortcut=False):

~/Work/CODE/PYTHON/xarray/xarray/core/groupby.py in _combine(self, applied, shortcut)
    520     def _combine(self, applied, shortcut=False):
    521         """Recombine the applied objects like the original."""
--> 522         applied_example, applied = peek_at(applied)
    523         coord, dim, positions = self._infer_concat_args(applied_example)
    524         if shortcut:

~/Work/CODE/PYTHON/xarray/xarray/core/utils.py in peek_at(iterable)
    114     """
    115     gen = iter(iterable)
--> 116     peek = next(gen)
    117     return peek, itertools.chain([peek], gen)
    118

StopIteration:
```

If, however, the last bin includes the value 1, it runs as expected:

```python
# If I include a larger value at the end it works
bins = np.array([0, 20, 40, 60, 80, 100, 1e7]) * 1e-6

binned = data.groupby_bins(data, bins).sum()
binned
```
```
<xarray.DataArray 'wrapped-bb05d395159047b749ca855110244cb7' (wrapped-bb05d395159047b749ca855110244cb7_bins: 6)>
dask.array<shape=(6,), dtype=float64, chunksize=(5,)>
Coordinates:
  * wrapped-bb05d395159047b749ca855110244cb7_bins  (wrapped-bb05d395159047b749ca855110244cb7_bins) object (0.0, 2e-05] ...
```

Problem description

Is this expected behaviour? I would prefer it to return NaN values for the bins that capture no values. It took me a while to work out why my script was failing; if this is the expected behaviour, could a more helpful error message be considered?
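In the meantime, a minimal workaround sketch (my own addition, not part of the original report): guard the groupby_bins call and return an all-NaN result per bin when no value falls inside the requested bin edges. The helper name groupby_bins_sum_or_nan and the 'bins' dimension label are made up for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

def groupby_bins_sum_or_nan(da, bins):
    """Sum per bin, returning all-NaN instead of raising when no value falls in any bin."""
    values = np.asarray(da)  # materialises dask-backed data
    # groupby_bins uses pandas.cut semantics: right-closed bins (left, right]
    in_range = (values > bins[0]) & (values <= bins[-1])
    if not in_range.any():
        intervals = pd.IntervalIndex.from_breaks(bins)
        return xr.DataArray(np.full(len(intervals), np.nan),
                            dims=['bins'],                      # illustrative dimension name
                            coords={'bins': np.asarray(intervals)})
    return da.groupby_bins(da, bins).sum()

# Usage with the example above: returns five NaNs instead of StopIteration.
# groupby_bins_sum_or_nan(data, np.array([0, 20, 40, 60, 80, 100]) * 1e-6)
```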

Expected Output

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.0rc1-9-gdbf7b01
pandas: 0.20.3
numpy: 1.13.1
scipy: 0.19.1
netCDF4: 1.2.9
h5netcdf: 0.4.1
Nio: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.15.4
matplotlib: 2.0.2
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 36.3.0
pip: 9.0.1
conda: None
pytest: 3.2.2
IPython: 6.1.0
sphinx: 1.6.5
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1764/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
issue 1778: ValueError on empty selection with dask based DataArrays
id: 281897468 · node_id: MDU6SXNzdWUyODE4OTc0Njg= · user: duncanwp (3169620) · state: closed · locked: 0 · milestone: 0.11.1 (3801867) · comments: 2
created_at: 2017-12-13T21:09:42Z · updated_at: 2019-07-12T13:41:08Z · closed_at: 2019-07-12T13:41:08Z · author_association: CONTRIBUTOR

Code Sample, a copy-pastable example if possible

```python
import xarray as xr
import numpy as np

da = xr.DataArray(np.random.rand(15), dims=['latitude'],
                  coords={'latitude': np.linspace(90, -90, 15)})

# This gives an empty latitude slice
print(da.sel(latitude=slice(20, 60)))

# After converting the DataArray to dask...
da = da.chunk()

# ...this throws a ValueError due to 'conflicting sizes'
print(da.sel(latitude=slice(20, 60)))
```

Problem description

I would expect the dask-based DataArray to return an empty slice, just as the NumPy-backed one does.

Arguably, it would be nicer still if both returned the latitude values between 20 and 60 regardless of the direction of the coordinate. Perhaps the sel method could check whether the coordinate is increasing or decreasing?
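A small helper along those lines is sketched below. It is my own illustration (the name sel_range is hypothetical, not an existing xarray API): it orders the slice bounds to match the coordinate's direction before calling .sel().

```python
import numpy as np
import xarray as xr

def sel_range(da, dim, lo, hi):
    """Select the lo..hi range along dim, whether the coordinate is ascending or descending."""
    coord = da[dim].values
    descending = coord[0] > coord[-1]
    bounds = slice(hi, lo) if descending else slice(lo, hi)
    return da.sel({dim: bounds})

da = xr.DataArray(np.random.rand(15), dims=['latitude'],
                  coords={'latitude': np.linspace(90, -90, 15)})
print(sel_range(da, 'latitude', 20, 60))  # non-empty even though latitude is descending
```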

Output of xr.show_versions()

```
xarray version: 0.9.6
numpy version: 1.13.3
dask version: 0.15.4
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1778/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
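For reference, the query behind this page ("2 rows where milestone = 3801867 sorted by updated_at descending") can be reproduced against the schema above with Python's sqlite3. This is a sketch only; the database filename github.db is an assumption, not something stated on this page.

```python
import sqlite3

conn = sqlite3.connect("github.db")  # assumed filename for the Datasette database
rows = conn.execute(
    """
    SELECT id, number, title, state, updated_at
    FROM issues
    WHERE milestone = ?
    ORDER BY updated_at DESC
    """,
    (3801867,),
).fetchall()
for row in rows:
    print(row)
```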