
issue_comments: 324586771


html_url: https://github.com/pydata/xarray/issues/1521#issuecomment-324586771
issue_url: https://api.github.com/repos/pydata/xarray/issues/1521
id: 324586771
node_id: MDEyOklzc3VlQ29tbWVudDMyNDU4Njc3MQ==
user: 6213168
created_at: 2017-08-24T09:41:22Z
updated_at: 2017-08-24T09:42:16Z
author_association: MEMBER
body:

Change open_dataset() to always load the coords into memory eagerly, regardless of the chunks parameter. Is there any valid use case where lazy coords are actually desirable?

This also leads to another inefficiency of open_dataset(chunks=...). Suppose your data has shape=(50000, 2**30) and chunks=(1, 2**30). If you pass those chunks to open_dataset, it will break the coords along the first dim into dask arrays of 1 element each, which hardly benefits anybody. Things get worse if the dataset is compressed with zlib or similar, but only the data vars were chunked at the time of writing. Am I correct in understanding that the whole coord var will then be read from disk 50000 times over?
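To make the numbers above concrete, here is a minimal arithmetic sketch. It does not call xarray or dask. It only assumes, as described above, that the per-dimension chunk size passed to open_dataset is applied to every variable sharing that dimension. The dimension names `x` and `y` and the helper `chunks_per_variable` are hypothetical, introduced only for illustration.

```python
def chunks_per_variable(var_dims, dim_sizes, chunks):
    """Count the chunks a variable is split into, given per-dimension chunk sizes."""
    n = 1
    for dim in var_dims:
        size = dim_sizes[dim]
        chunk = chunks.get(dim, size)  # a dimension with no chunk size stays whole
        n *= -(-size // chunk)  # integer ceiling division
    return n


# Hypothetical dimension names for the shape=(50000, 2**30) example
dim_sizes = {"x": 50000, "y": 2**30}
chunks = {"x": 1, "y": 2**30}

data_chunks = chunks_per_variable(("x", "y"), dim_sizes, chunks)
coord_chunks = chunks_per_variable(("x",), dim_sizes, chunks)

print("data var chunks:", data_chunks)
print("coord chunks along x:", coord_chunks)
```

Under that assumption, a 1-D coord along the first dim ends up with as many chunks as the data variable, each holding a single element. Whether each of those chunks triggers a full re-read of a compressed, unchunked coord variable is the open question asked above; the sketch does not answer it.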

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: 252541496