html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1603#issuecomment-557579503,https://api.github.com/repos/pydata/xarray/issues/1603,557579503,MDEyOklzc3VlQ29tbWVudDU1NzU3OTUwMw==,2067093,2019-11-22T15:34:57Z,2019-11-22T15:34:57Z,NONE,"> Thanks @NowanIlfideme for your feedback.
>
> Could you perhaps share a gist of code related to your use case?
The first example in this comment is similar to my use case: https://github.com/pydata/xarray/issues/3213#issuecomment-520741706 . There are several ""core"" dimensions, but some part of the coordinates may be hierarchical or cross-defined (e.g. country > province > city > building, but also country > province > voting district > building). We might have a full or nearly-full panel in the MultiIndex representation, but have a huge cross product (even if we keep strictly hierarchical dimensions out).
Meanwhile using a true COO sparse representation (as I understand it) will likely end up with slower operations overall, since nearly all machine learning models (think: linear regression) require a dense array input anyways.
I'll make an example of this when I find some free time, along with a contrasting one in Pandas. :)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,262642978
https://github.com/pydata/xarray/issues/1603#issuecomment-557563566,https://api.github.com/repos/pydata/xarray/issues/1603,557563566,MDEyOklzc3VlQ29tbWVudDU1NzU2MzU2Ng==,2067093,2019-11-22T14:59:29Z,2019-11-22T14:59:29Z,NONE,"I've noticed that basically all my current troubles with xarray lead to this issue (lack of MultiIndex support). I use xarray for machine learning/data science/econometrics. My current problem requires a semi-hierarchical indexing on one of the dimensions, and slicing/aggregation along some levels of those dimensions.
My first attempt was to just assume each dimension was orthogonal, which resulted in out-of-memory errors. I ended up using a MultiIndex for the hierarchy dimension to have a ""dense"" representation of a sparse subspace. Unfortunately, currently `.sel()` and such will cut out MultiIndex dimensions, and I've had to do boolean masking to keep all the dimensions I need.
Multidimensional groupby, especially within the MultiIndex, is a headache as it currently stands. I had to resort to making auxilliary dimensions with one-hot encoded levels (dummy variables) and doing multiply-aggregate operations by hand.
`xarray` is really beautiful and should be used more by data scientists, but it's really difficult to recommend it to colleagues when not all the familiar `pandas`-style operations are supported.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,262642978