issue_comments: 223817102

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/pull/818#issuecomment-223817102 https://api.github.com/repos/pydata/xarray/issues/818 223817102 MDEyOklzc3VlQ29tbWVudDIyMzgxNzEwMg== 1197350 2016-06-05T14:47:12Z 2016-06-05T14:47:12Z MEMBER

@shoyer, @jhamman, could you give me some feedback on one outstanding issue with this PR? I am stuck on a kind of obscure edge case, but I really want to get this finished.

Consider the following groupby operation, which creates bins that are finer than the original coordinate. In other words, there are more bins than data points, so some bins are empty.

    import numpy as np
    import xarray as xr

    dat = xr.DataArray(np.arange(4))
    dim_0_bins = np.arange(0, 4.5, 0.5)
    gb = dat.groupby_bins('dim_0', dim_0_bins)
    print(gb.groups)

gives

{'(0.5, 1]': [1], '(2.5, 3]': [3], '(1.5, 2]': [2]}
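To see why only three of the eight bins show up, here is a minimal pure-Python sketch of right-closed binning. It assumes the bins are right-closed, as the labels such as `(0.5, 1]` suggest; `assign_right_closed_bins` is a hypothetical helper written for illustration, not part of xarray.

```python
from bisect import bisect_left


def assign_right_closed_bins(values, edges):
    """Map each bin label '(lo, hi]' to the positions of the values it contains.

    Hypothetical helper: a value v belongs to bin i when
    edges[i] < v <= edges[i + 1]. Values outside every bin are
    returned separately instead of being grouped.
    """
    groups = {}
    unassigned = []
    for pos, v in enumerate(values):
        # bisect_left finds the first edge >= v; the bin starts one edge earlier.
        i = bisect_left(edges, v) - 1
        if 0 <= i < len(edges) - 1:
            label = f"({edges[i]:g}, {edges[i + 1]:g}]"
            groups.setdefault(label, []).append(pos)
        else:
            unassigned.append(pos)
    return groups, unassigned


values = [0, 1, 2, 3]
edges = [0.5 * k for k in range(9)]  # 0.0, 0.5, ..., 4.0 (eight bins)
groups, unassigned = assign_right_closed_bins(values, edges)
print(groups)      # three populated bins
print(unassigned)  # position 0: the value 0 is not inside (0, 0.5]
```

Under this assumption, five bins are empty and the first element falls outside every bin, so the groups cover only three of the four positions.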

If I try a reducing apply operation, e.g. gb.mean(), it works fine. However, if I do

    gb.apply(lambda x: x - x.mean())

I get an error on the concat step

    --> 433 combined = self._concat(applied, shortcut=shortcut)
    ... [long stack trace]
    IndexError: index 3 is out of bounds for axis 1 with size 3

I'm really not sure what the "correct behavior" should even be in this case. It is not even possible to reconstitute the original data array by doing gb.apply(lambda x: x). The same problem arises when the groups do not span the entire coordinate (e.g. dim_0_bins = [1,2,3]).
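One plausible reading of the failure, sketched in plain Python: if the recombination step concatenates the per-group results and then moves each element back to its original position, it breaks as soon as some positions belong to no group. `concat_and_restore_order` is a hypothetical stand-in for illustration and is not claimed to match xarray's actual `_concat`.

```python
def concat_and_restore_order(pieces, positions):
    """Concatenate per-group results, then put each element back at the
    position it had in the original array (hypothetical helper)."""
    flat_values = [v for piece in pieces for v in piece]
    flat_positions = [p for group in positions for p in group]
    # The output is only as long as the number of grouped elements.
    restored = [None] * len(flat_values)
    for value, pos in zip(flat_values, flat_positions):
        restored[pos] = value  # IndexError if pos >= len(flat_values)
    return restored


data = [0, 1, 2, 3]
positions = [[1], [2], [3]]  # position 0 is in no group
pieces = [[data[p] for p in group] for group in positions]  # identity apply

try:
    concat_and_restore_order(pieces, positions)
except IndexError as err:
    print("recombination failed:", err)
```

Three grouped elements give an output of length 3, but the last element wants to go back to position 3, which is consistent with the "index 3 is out of bounds ... with size 3" message above.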

Do you have any thoughts / suggestions? I'm not sure I can solve this issue right now, but I would at least like to have a more useful error message.
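As a rough idea of what a more useful error could look like, the sketch below checks up front whether the groups cover every position and raises a descriptive `ValueError` otherwise. The function name, its arguments, and the message wording are all assumptions made for illustration; nothing here is existing xarray API.

```python
def check_groups_cover(positions, size):
    """Raise a descriptive error if some positions belong to no group.

    Hypothetical pre-check: `positions` is a list of per-group position
    lists and `size` is the length of the grouped dimension.
    """
    grouped = {p for group in positions for p in group}
    missing = sorted(set(range(size)) - grouped)
    if missing:
        raise ValueError(
            f"{len(missing)} of {size} elements fall outside every bin "
            f"(positions {missing}); the applied results cannot be "
            "recombined into the original shape"
        )


# Groups from the example above: position 0 is not covered.
try:
    check_groups_cover([[1], [2], [3]], size=4)
except ValueError as err:
    print(err)
```

Running such a check before the concat step would turn the opaque `IndexError` into a message that points at the bins rather than at the internals.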

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  146182176