
issue_comments: 233797167


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/906#issuecomment-233797167 https://api.github.com/repos/pydata/xarray/issues/906 233797167 MDEyOklzc3VlQ29tbWVudDIzMzc5NzE2Nw== 1217238 2016-07-19T23:29:57Z 2016-07-19T23:29:57Z MEMBER

> You're basically doing a pick-by-index rebuild of the array, which does potentially random access to the whole input array - thus nullifying the benefits of the CPU cache. This is compared to a numpy.ndarray.reshape(), which has the cost of a memcpy().

This is true, but in the worst case (e.g., random order for the MultiIndex) we'll have this issue no matter what rule we pick for assigning unstacked coordinates.
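
To make the comparison concrete, here is a minimal NumPy sketch of the two strategies. It is not xarray's implementation; `unstack_by_index` and the integer `codes_a`/`codes_b` arrays are hypothetical stand-ins for the level codes of a two-level MultiIndex. When the index is in sorted product order a plain reshape gives the right answer, but once the order is arbitrary only the pick-by-index path does.

```python
import numpy as np


def unstack_by_index(values, codes_a, codes_b, shape):
    """Scatter a flat array into a 2-D grid using integer level codes.

    Hypothetical helper for illustration only (not part of xarray).
    """
    out = np.full(shape, np.nan)
    # Fancy-index assignment: the write pattern follows the index order,
    # so a shuffled index means scattered memory access.
    out[codes_a, codes_b] = values
    return out


n_a, n_b = 3, 4
grid = np.arange(n_a * n_b, dtype=float).reshape(n_a, n_b)

# Case 1: index in sorted product order, so reshape is sufficient.
codes_a, codes_b = np.divmod(np.arange(n_a * n_b), n_b)
flat_sorted = grid[codes_a, codes_b]
reshaped = flat_sorted.reshape(n_a, n_b)
scattered = unstack_by_index(flat_sorted, codes_a, codes_b, (n_a, n_b))

# Case 2: index in random order, so reshape alone would misplace values
# and the pick-by-index path is needed.
rng = np.random.default_rng(0)
perm = rng.permutation(n_a * n_b)
flat_shuffled = flat_sorted[perm]
scattered_shuffled = unstack_by_index(
    flat_shuffled, codes_a[perm], codes_b[perm], (n_a, n_b)
)
```

One possible design, under these assumptions, would be to detect the sorted case and take the reshape path there, falling back to the scatter otherwise.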

> I was going to add something about doing pick-by-index with a dask array will be even worse, when I realised that multiindex does not work at all when you chunk()... :(

MultiIndex should work with dask -- we have a few tests for this. If not, a bug report would be appreciated!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  166439490