issue_comments: 1013887301
This data as json
| html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
|---|---|---|---|---|---|---|---|---|---|---|---|
| https://github.com/pydata/xarray/issues/3213#issuecomment-1013887301 | https://api.github.com/repos/pydata/xarray/issues/3213 | 1013887301 | IC_kwDOAMm_X848brFF | 40465719 | 2022-01-16T14:35:29Z | 2022-01-16T14:40:13Z | NONE | I would prefer to retain the dense representation, but with tricks to keep the data of sparse type in memory. Look at the following example with pandas multiindex & sparse dtype:
The dense data uses ~40 MB of memory, while the dense representation with sparse dtypes uses only ~0.5 kB of memory! And while you can import dataframes with the sparse=True keyword, the size seems to be displayed inaccurately (both are the same size?), and we cannot examine the data like we can with pandas multiindex + sparse dtype:
Besides, a lot of operations are not available on sparse xarray data variables (i.e. if I wanted to group by price level for ffill & downsampling):
So, it would be nice if xarray adopted pandas’ approach of unstacking sparse data. In the end, you could extract all the non-NaN values and write them to a sparse storage format, such as TileDB sparse arrays. cc: @stavrospapadopoulos |
{
"total_count": 2,
"+1": 2,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
479942077 |


