Comment by user 1197350 (MEMBER) on pydata/xarray issue #2237, posted 2018-06-19T00:57:44Z:
https://github.com/pydata/xarray/issues/2237#issuecomment-398240724

With groupby in xarray, we have two main cases:

  1. groupby with reduction -- e.g. ds.groupby('baz').mean(dim='x'). There is currently no problem here. The new dimension becomes baz and the array is chunked as {'baz': 1}.
  2. groupby with no reduction -- e.g. ds.groupby('baz').apply(lambda x: x - x.mean()). In this case, the point of the out-of-order indexing is actually to put the array back together in its original order. In my last example above, according to the dot graph, it looks like there are four chunks right up until the end. They just have to be re-ordered. I imagine this should be cheap and simple, but I am probably overlooking something.
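The "put the array back together" step in case 2 can be sketched in plain numpy (hypothetical data, not the actual example from this thread): grouping gathers elements by group via one permutation, and the non-reducing apply scatters the results back via the inverse permutation, so the out-of-order indexing exists only to restore the original order.

```python
import numpy as np

# Hypothetical group labels and values along dimension 'x'.
baz = np.array(["b", "a", "b", "a"])
x = np.array([10.0, 20.0, 30.0, 40.0])

order = np.argsort(baz, kind="stable")      # gather: sort elements by group
inverse = np.argsort(order, kind="stable")  # scatter: invert the permutation

# Round-tripping through the two permutations restores the original order.
assert (x[order][inverse] == x).all()
```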

Case 2 seems similar to @shoyer's example: x[np.arange(4)[::-1]]. Here we would just want to reorder the existing chunks.
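A quick numpy check (a sketch, not dask's actual graph) confirms that this particular indexer is just a reversal, which is why, for size-1 chunks, it amounts to reordering existing chunks rather than building new ones:

```python
import numpy as np

x = np.arange(4)

# The fancy indexer from the example is equivalent to a plain reversal,
# i.e. a pure permutation of positions (and hence of size-1 chunks).
assert (x[np.arange(4)[::-1]] == x[::-1]).all()
```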

If the chunk size before reindexing is not 1, then yes, one needs to do something more sophisticated. But I would argue that, if the array is being re-indexed along a dimension in which the chunk size is 1, a sensible default behavior would be to avoid aggregating into a big chunk and instead just pass the original chunks through in a new order.
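The proposed default can be sketched in pure Python (a hypothetical helper, not dask's implementation): when every chunk along the indexed axis has size 1, the fancy index is a permutation of the chunk list, so the original chunks can be passed through in a new order instead of being gathered into one big chunk.

```python
import numpy as np

def reorder_unit_chunks(chunks, indexer):
    """Hypothetical sketch: permute a list of size-1 chunks according to
    a fancy indexer, preserving the original chunk structure."""
    return [chunks[i] for i in indexer]

chunks = [np.array([i]) for i in range(4)]  # four size-1 chunks
indexer = np.arange(4)[::-1]                # the reversal from the example

reordered = reorder_unit_chunks(chunks, indexer)

# The chunk count is unchanged; only the order differs.
assert len(reordered) == len(chunks)
assert np.concatenate(reordered).tolist() == [3, 2, 1, 0]
```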
