
issue_comments: 191545231


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/780#issuecomment-191545231 https://api.github.com/repos/pydata/xarray/issues/780 191545231 MDEyOklzc3VlQ29tbWVudDE5MTU0NTIzMQ== 1217238 2016-03-03T02:17:36Z 2016-03-03T02:17:36Z MEMBER

So I'm actually not sure whether to call this a bug or a feature. But I can explain why it works this way and maybe we can come up with something better.

With DataArray.to_series(), we are indeed careful to output the hierarchical index in the same order as the array dimensions. So it works there.

But on a Dataset, we don't necessarily have a unique ordering for the dimensions, because in general (though somewhat rarely in practice) the ordering of dimensions can differ between variables. This is why Dataset.dims returns a SortedKeysDict -- to avoid any implicit state derived from the order in which dimensions were added.

When converting to a DataFrame, we currently build the MultiIndex independently of the data variables, so somewhat logically we simply take dimensions in sorted order. It might make more sense, though, to instead order the levels by order of appearance on the Dataset's (non-index?) variables. I do try to avoid making heuristic choices like this, though, which is why it didn't make it into xarray already.
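
To make the two orderings concrete, here is a minimal pure-Python sketch. It does not use xarray: the `variables` mapping is a hypothetical stand-in for a Dataset's data variables, and `dims_sorted` and `dims_by_appearance` are illustrative names, not xarray functions.

```python
from collections import OrderedDict

# Hypothetical stand-in for a Dataset: variable name -> (dimension names, shape).
# Note that the two variables do not agree on the relative order of lat and lon.
variables = OrderedDict([
    ("temperature", (("time", "lat", "lon"), (2, 3, 4))),
    ("elevation", (("lon", "lat"), (4, 3))),
])


def dims_sorted(variables):
    """Dimension sizes keyed in sorted name order (the current behavior described above)."""
    sizes = {}
    for dims, shape in variables.values():
        sizes.update(zip(dims, shape))
    return OrderedDict((name, sizes[name]) for name in sorted(sizes))


def dims_by_appearance(variables):
    """Dimension sizes keyed by first appearance across variables (the heuristic)."""
    ordered = OrderedDict()
    for dims, shape in variables.values():
        for name, size in zip(dims, shape):
            ordered.setdefault(name, size)  # keep the first position seen
    return ordered


print(list(dims_sorted(variables)))         # ['lat', 'lon', 'time']
print(list(dims_by_appearance(variables)))  # ['time', 'lat', 'lon']
```

The second function shows why this is a heuristic: its result depends on which variable happens to come first, and `elevation` alone would have suggested `lon` before `lat`.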

This code is pretty self-contained if you want to experiment and/or put together a PR: https://github.com/pydata/xarray/blob/v0.7.1/xarray/core/dataset.py#L1858-L1872

Basically, you need to ensure that ordered_dims is an OrderedDict with keys in the order you want for the resulting DataFrame.
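
As a rough illustration of why the key order of ordered_dims matters, the sketch below builds the row labels of a hierarchical index as a plain Cartesian product. It is not the linked xarray code: `index_rows` and the `coords` mapping are hypothetical, and the standard library's `itertools.product` stands in for the MultiIndex construction.

```python
from collections import OrderedDict
from itertools import product


def index_rows(ordered_dims, coords):
    """Return (level names, row labels), with levels following the key order of ordered_dims."""
    names = list(ordered_dims)
    # The last level varies fastest, as in a row-major flattening of the data.
    rows = list(product(*(coords[name] for name in names)))
    return names, rows


# Hypothetical coordinate values for two dimensions.
coords = {"x": [10, 20], "y": ["a", "b", "c"]}

# Keys in sorted order: x is the outer level.
names_sorted, rows_sorted = index_rows(OrderedDict([("x", 2), ("y", 3)]), coords)

# Keys in a custom order: y is the outer level.
names_custom, rows_custom = index_rows(OrderedDict([("y", 3), ("x", 2)]), coords)

print(names_sorted, rows_sorted[:3])  # ['x', 'y'] [(10, 'a'), (10, 'b'), (10, 'c')]
print(names_custom, rows_custom[:3])  # ['y', 'x'] [('a', 10), ('a', 20), ('b', 10)]
```

Both orderings produce the same six label pairs; only the level order and the row order change, which is the difference a user sees in the resulting DataFrame.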

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  137920337