issue_comments: 233794061

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/906#issuecomment-233794061 https://api.github.com/repos/pydata/xarray/issues/906 233794061 MDEyOklzc3VlQ29tbWVudDIzMzc5NDA2MQ== 6213168 2016-07-19T23:11:57Z 2016-07-19T23:11:57Z MEMBER

this workaround works:

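    import pandas
    import xarray

    # Build the index by hand so that the levels keep the intended order.
    # Note: newer pandas versions call the `labels` argument `codes`.
    index2 = pandas.MultiIndex(
        levels=[['x0', 'x1'], ['first', 'second', 'third', 'fourth']],
        labels=[[0, 0, 0, 0, 1, 1, 1, 1], [0, 1, 2, 3, 0, 1, 2, 3]],
        names=['x', 'count'])
    xarray.DataArray(pandas.Series(list(range(8)), index2)).unstack('dim_0')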

    <xarray.DataArray (x: 2, count: 4)>
    array([[0, 1, 2, 3],
           [4, 5, 6, 7]], dtype=int64)
    Coordinates:
      * x        (x) object 'x0' 'x1'
      * count    (count) object 'first' 'second' 'third' 'fourth'

However, I think the whole thing is incredibly convoluted, namely because everything looks right both when you view the original pandas Series/DataFrame and when you view the stacked DataArray. It is unstack() that lets an internal technicality of pandas produce a real change in the data.

I came across this issue because I am using pandas to load a multi-index CSV from disk and then convert it to an n-dimensional xarray. In this situation I have no control over the MultiIndex, short of manually rebuilding it after the CSV load. The pandas DataFrame looks right, the stacked xarray looks right, and the unstacked xarray gets magically sorted :$
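To make "manually rebuilding it after the CSV load" concrete, here is a minimal pandas-only sketch. The CSV contents, the column name `value`, and the helper `rebuild_in_order_of_appearance` are all made up for illustration, and it assumes a pandas version where the `MultiIndex` constructor accepts `codes=`. It rebuilds every level so that its labels follow the order in which they first appear in the rows, rather than whatever order pandas chose to store them in (typically sorted).

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for the file loaded from disk.
CSV_TEXT = """x,count,value
x0,first,0
x0,second,1
x0,third,2
x0,fourth,3
x1,first,4
x1,second,5
x1,third,6
x1,fourth,7
"""


def rebuild_in_order_of_appearance(index):
    """Return an equivalent MultiIndex whose levels follow first appearance."""
    levels, codes = [], []
    for position in range(index.nlevels):
        # factorize() numbers the labels in the order they first occur.
        level_codes, level_values = pd.factorize(index.get_level_values(position))
        codes.append(level_codes)
        levels.append(level_values)
    return pd.MultiIndex(levels=levels, codes=codes, names=index.names)


series = pd.read_csv(io.StringIO(CSV_TEXT), index_col=["x", "count"])["value"]
rebuilt = series.copy()
rebuilt.index = rebuild_in_order_of_appearance(series.index)

print(list(series.index.levels[1]))   # level order as stored by pandas
print(list(rebuilt.index.levels[1]))  # level order after the rebuild
```

The rows and their values are untouched; only the internal level order changes, which is the part that unstacking appears to follow.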

Also, I don't understand why you say there are no performance implications. You're basically doing a pick-by-index rebuild of the array, which potentially means random access across the whole input array, thus nullifying the benefits of the CPU cache. Compare this with numpy.ndarray.reshape(), which costs no more than a memcpy().
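A small NumPy sketch of the two access patterns I mean. This is not xarray's actual code path, only an illustration; the `positions` array is an assumption, chosen as the order the `count` labels would take if sorted alphabetically (first, fourth, second, third).

```python
import numpy as np

flat = np.arange(8)  # the stacked values 0..7, contiguous in memory

# Reshape: elements stay where they are; for a contiguous array
# NumPy returns a view, so no data is moved.
reshaped = flat.reshape(2, 4)

# Pick-by-index: every output element is fetched through an index array,
# so reads from the input can land anywhere and a new array is allocated.
positions = np.array([[0, 3, 1, 2],
                      [4, 7, 5, 6]])
gathered = flat[positions]

print(np.shares_memory(flat, reshaped))
print(np.shares_memory(flat, gathered))
print(gathered)
```

On eight elements the difference is invisible, of course; the concern is large arrays, where the gather touches memory in an order that the cache cannot predict.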

I was going to add that doing pick-by-index with a dask array would be even worse, when I realised that a MultiIndex does not work at all when you chunk()... :(

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  166439490