
issues


2 rows where comments = 17, repo = 13221727 and user = 1197350 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
180516114 MDU6SXNzdWUxODA1MTYxMTQ= 1026 multidim groupby on dask arrays: dask.array.reshape error rabernat 1197350 closed 0     17 2016-10-02T14:55:25Z 2018-05-24T17:59:31Z 2018-05-24T17:59:31Z MEMBER      

If I try to run a groupby operation using a multidimensional group, I get an error from dask about "dask.array.reshape requires that reshaped dimensions after the first contain at most one chunk".

This error arises with dask 0.11.0 but NOT with dask 0.8.0.

Consider the following test example:

```python
import dask.array as da
import xarray as xr

nz, ny, nx = (10, 20, 30)
data = da.ones((nz, ny, nx), chunks=(5, ny, nx))
coord_2d = da.random.random((ny, nx), chunks=(ny, nx)) > 0.5
ds = xr.Dataset({'thedata': (('z', 'y', 'x'), data)},
                coords={'thegroup': (('y', 'x'), coord_2d)})

# this works fine
ds.thedata.groupby('thegroup')
```

Now I rechunk one of the later dimensions and group again:

```python
ds.chunk({'x': 5}).thedata.groupby('thegroup')
```

This raises the following error and stack trace

```
ValueError                                Traceback (most recent call last)
<ipython-input-16-1b0095ee24a0> in <module>()
----> 1 ds.chunk({'x': 5}).thedata.groupby('thegroup')

/Users/rpa/RND/open_source/xray/xarray/core/common.pyc in groupby(self, group, squeeze)
    343         if isinstance(group, basestring):
    344             group = self[group]
--> 345         return self.groupby_cls(self, group, squeeze=squeeze)
    346
    347     def groupby_bins(self, group, bins, right=True, labels=None, precision=3,

/Users/rpa/RND/open_source/xray/xarray/core/groupby.pyc in __init__(self, obj, group, squeeze, grouper, bins, cut_kwargs)
    170             # the copy is necessary here, otherwise read only array raises error
    171             # in pandas: https://github.com/pydata/pandas/issues/12813
--> 172             group = group.stack(**{stacked_dim_name: orig_dims}).copy()
    173             obj = obj.stack(**{stacked_dim_name: orig_dims})
    174             self._stacked_dim = stacked_dim_name

/Users/rpa/RND/open_source/xray/xarray/core/dataarray.pyc in stack(self, **dimensions)
    857         DataArray.unstack
    858         """
--> 859         ds = self._to_temp_dataset().stack(**dimensions)
    860         return self._from_temp_dataset(ds)
    861

/Users/rpa/RND/open_source/xray/xarray/core/dataset.pyc in stack(self, **dimensions)
   1359         result = self
   1360         for new_dim, dims in dimensions.items():
-> 1361             result = result._stack_once(dims, new_dim)
   1362         return result
   1363

/Users/rpa/RND/open_source/xray/xarray/core/dataset.pyc in _stack_once(self, dims, new_dim)
   1322                 shape = [self.dims[d] for d in vdims]
   1323                 exp_var = var.expand_dims(vdims, shape)
-> 1324                 stacked_var = exp_var.stack(**{new_dim: dims})
   1325                 variables[name] = stacked_var
   1326             else:

/Users/rpa/RND/open_source/xray/xarray/core/variable.pyc in stack(self, **dimensions)
    801         result = self
    802         for new_dim, dims in dimensions.items():
--> 803             result = result._stack_once(dims, new_dim)
    804         return result
    805

/Users/rpa/RND/open_source/xray/xarray/core/variable.pyc in _stack_once(self, dims, new_dim)
    771
    772         new_shape = reordered.shape[:len(other_dims)] + (-1,)
--> 773         new_data = reordered.data.reshape(new_shape)
    774         new_dims = reordered.dims[:len(other_dims)] + (new_dim,)
    775

/Users/rpa/anaconda/lib/python2.7/site-packages/dask/array/core.pyc in reshape(self, *shape)
   1101         if len(shape) == 1 and not isinstance(shape[0], Number):
   1102             shape = shape[0]
-> 1103         return reshape(self, shape)
   1104
   1105     @wraps(topk)

/Users/rpa/anaconda/lib/python2.7/site-packages/dask/array/core.pyc in reshape(array, shape)
   2585
   2586     if any(len(c) != 1 for c in array.chunks[ndim_same+1:]):
-> 2587         raise ValueError('dask.array.reshape requires that reshaped '
   2588                          'dimensions after the first contain at most one chunk')
   2589

ValueError: dask.array.reshape requires that reshaped dimensions after the first contain at most one chunk
```

I am using the latest xarray master and dask version 0.11.0. Note that the example works fine if I use an earlier version of dask (e.g. 0.8.0, the only other one I tested.) This suggests an upstream issue with dask, but I wanted to bring it up here first.
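
To make the failing condition concrete, here is a minimal pure-Python sketch of the check shown in the last traceback frame (`any(len(c) != 1 for c in array.chunks[ndim_same+1:])`). It does not import dask. The helper names `leading_dims_unchanged` and `reshape_would_fail` are hypothetical, and the way `ndim_same` is computed here (the number of leading dimensions that are the same in the old and new shapes) is an assumption, since the traceback does not show that part of the dask source.

```python
def leading_dims_unchanged(old_shape, new_shape):
    """Count leading dimensions that are identical in both shapes.

    Assumption: this is what `ndim_same` means in the dask frame above.
    """
    n = 0
    for old, new in zip(old_shape, new_shape):
        if old != new:
            break
        n += 1
    return n


def reshape_would_fail(chunks, old_shape, new_shape):
    """Mirror the condition from the traceback on plain chunk tuples."""
    ndim_same = leading_dims_unchanged(old_shape, new_shape)
    return any(len(c) != 1 for c in chunks[ndim_same + 1:])


# 2-D group coordinate ('y', 'x') of shape (20, 30), stacked to (600,)
single_chunk = ((20,), (30,))          # chunks=(ny, nx), as in the first example
rechunked = ((20,), (5, 5, 5, 5, 5, 5))  # after .chunk({'x': 5})

ok_before = reshape_would_fail(single_chunk, (20, 30), (600,))
fails_after = reshape_would_fail(rechunked, (20, 30), (600,))
print(ok_before, fails_after)
```

Under these assumptions, splitting `x` into several chunks is what trips the check, which is consistent with the first `groupby` call succeeding and the rechunked one raising.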

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1026/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
293913247 MDU6SXNzdWUyOTM5MTMyNDc= 1882 xarray tutorial at SciPy 2018? rabernat 1197350 closed 0     17 2018-02-02T14:52:11Z 2018-04-09T20:30:13Z 2018-04-09T20:30:13Z MEMBER      

It would be great to hold an xarray tutorial at SciPy 2018. Xarray has matured a lot recently, and it would be great to raise awareness of what it can do among the broader scipy community.

From the conference website:

> Tutorials should be focused on covering a well-defined topic in a hands-on manner. We want to see attendees coding! We encourage submissions to be designed to allow at least 50% of the time for hands-on exercises even if this means the subject matter needs to be limited. Tutorials will be 4 hours in duration. In your tutorial application, you can indicate what prerequisite skills and knowledge will be needed for your tutorial, and the approximate expected level of knowledge of your students (i.e., beginner, intermediate, advanced).

I'm curious if anyone was already planning on submitting a tutorial. If not, let's put together a team. @jhamman has indicated interest in participating in, but not leading, the tutorial. Anyone else interested?

xref pangeo-data/pangeo#97

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1882/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
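
The filter described at the top of this page (`comments = 17`, `repo = 13221727`, `user = 1197350`, sorted by `updated_at` descending) can be sketched against this schema with Python's standard `sqlite3` module. This is an illustration only: it uses an in-memory database with a reduced set of columns, loads the two rows shown above, and adds one hypothetical non-matching row (id 1, number 9999) to show the filter at work. It is not the exact SQL that Datasette generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced version of the [issues] table: only the columns the query needs.
conn.execute("""
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER,
       [comments] INTEGER,
       [updated_at] TEXT,
       [repo] INTEGER
    )
""")
conn.execute("CREATE INDEX [idx_issues_repo] ON [issues] ([repo])")

conn.executemany(
    "INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (180516114, 1026,
         "multidim groupby on dask arrays: dask.array.reshape error",
         1197350, 17, "2018-05-24T17:59:31Z", 13221727),
        (293913247, 1882, "xarray tutorial at SciPy 2018?",
         1197350, 17, "2018-04-09T20:30:13Z", 13221727),
        # Hypothetical row, added only so the WHERE clause excludes something.
        (1, 9999, "hypothetical non-matching row",
         1197350, 3, "2019-01-01T00:00:00Z", 13221727),
    ],
)

query = """
    SELECT [number], [title]
    FROM [issues]
    WHERE [comments] = ? AND [repo] = ? AND [user] = ?
    ORDER BY [updated_at] DESC
"""
rows = conn.execute(query, (17, 13221727, 1197350)).fetchall()
numbers = [number for number, _title in rows]
print(numbers)
```

Because the timestamps are stored as ISO 8601 text in a single format, sorting the `updated_at` column as text also sorts it chronologically.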