issue_comments: 324586771
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/pydata/xarray/issues/1521#issuecomment-324586771 | https://api.github.com/repos/pydata/xarray/issues/1521 | 324586771 | MDEyOklzc3VlQ29tbWVudDMyNDU4Njc3MQ== | 6213168 | 2017-08-24T09:41:22Z | 2017-08-24T09:42:16Z | MEMBER |
This also leads to another inefficiency of open_dataset(chunks=...). Suppose your data has shape=(50000, 2**30) and chunks=(1, 2**30). If you pass those chunks to open_dataset, it will break down the coords on the first dim into dask arrays of 1 element each, which hardly benefits anybody. Things get worse if the dataset is compressed with zlib or similar, but only the data vars were chunked at the moment of writing. Am I correct in understanding that the whole coord var will then be read from disk 50000 times over? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
252541496 |
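
The chunk arithmetic behind the comment can be sketched in plain Python. This is only an illustration of the block counts implied by the shape and chunks quoted above, not xarray's or dask's actual code; the helper `n_chunks` is a hypothetical name introduced here.

```python
import math


def n_chunks(shape, chunks):
    """Number of blocks per dimension for a given shape and chunk size.

    Hypothetical helper for illustration; not part of xarray or dask.
    """
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunks))


# Shape and chunks taken from the comment.
data_shape = (50000, 2**30)
data_chunks = (1, 2**30)

# The data variable is split into 50000 blocks along the first dimension
# and a single block along the second.
data_blocks = n_chunks(data_shape, data_chunks)

# A 1-D coordinate on the first dimension, if it is given the same chunk
# size along that dimension, ends up as one block per element.
coord_blocks = n_chunks((data_shape[0],), (data_chunks[0],))

print(data_blocks, coord_blocks)
```

Under these assumptions the coordinate has as many blocks as it has elements, which is the situation the comment describes as benefiting nobody.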