issue_comments: 324586771
| html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
|---|---|---|---|---|---|---|---|---|---|---|---|
| https://github.com/pydata/xarray/issues/1521#issuecomment-324586771 | https://api.github.com/repos/pydata/xarray/issues/1521 | 324586771 | MDEyOklzc3VlQ29tbWVudDMyNDU4Njc3MQ== | 6213168 | 2017-08-24T09:41:22Z | 2017-08-24T09:42:16Z | MEMBER | This also leads to another inefficiency of open_dataset(chunks=...): you may have data with, e.g., shape=(50000, 2**30) and chunks=(1, 2**30). If you pass those chunks to open_dataset, it will break the coords on the first dim into dask arrays of one element each, which hardly benefits anybody. Things get worse if the dataset is compressed with zlib (or similar) but only the data vars were chunked at write time. Am I correct in understanding that the whole coord var will be read from disk 50000 times over? | {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0} |  | 252541496 |
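The inefficiency described in the comment body can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not xarray's actual internals: the file name `data.nc` and its dimension names are hypothetical, and the exact treatment of coordinate variables under `chunks=` has varied across xarray versions.

```python
import xarray as xr

# Hypothetical file: a data variable of shape (50000, 2**30) over dims
# ("t", "x"), plus one or more coordinate variables along "t".
ds = xr.open_dataset("data.nc", chunks={"t": 1})

# The requested chunking is applied to every variable spanning "t",
# including coords, so a coord along "t" can end up as a dask array of
# 50000 one-element chunks -- one tiny task per chunk.
print(ds.chunks)

# A possible workaround: load the (small) coords into memory up front,
# so downstream operations don't schedule a separate read per chunk.
ds = ds.assign_coords({name: coord.compute() for name, coord in ds.coords.items()})
```

Whether the 50000 separate disk reads actually occur depends on the storage backend and any caching it does; the comment is asking precisely that question.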