issue_comments: 806651823


Comment by user 2418513 on [pydata/xarray#2857](https://github.com/pydata/xarray/issues/2857#issuecomment-806651823) · created 2021-03-25T12:30:39Z · updated 2021-03-25T12:46:26Z · author association: NONE

@shoyer This problem has persisted all this time, but since I ran into it again, I did some digging. (It's strange that no one else has noticed it so far, because it's pretty bad.)

I line-profiled this snippet (`xarray.backends.api.to_netcdf`) for various numbers of datasets already written to the file:

https://github.com/pydata/xarray/blob/8452120e52862df564a6e629d1ab5a7d392853b0/xarray/backends/api.py#L1075-L1094

| Number of datasets in file | dump_to_store() | store_open() | store.close() |
| --- | --- | --- | --- |
| 0 | 88% | 1% | 10% |
| 50 | 18% | 2% | 80% |
| 200 | 4% | 2% | 94% |

The above can be measured simply in a notebook via `%lprun -f xarray.backends.api.to_netcdf test_func()`. The writing was done in `mode='a'` with `blosc:zstd` compression, and all datasets were written into different groups (i.e. by passing `group=...`).
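
For reference, here is a minimal sketch of a setup that reproduces this kind of measurement. The file path, dataset contents, and `test_func` are illustrative, not from the original; the original run also used `blosc:zstd` compression, which needs an HDF5 filter plugin (e.g. `hdf5plugin`) and the `h5netcdf` engine, omitted here to keep the sketch self-contained:

```python
import numpy as np
import xarray as xr

PATH = "many_groups.nc"  # hypothetical output file
N = 50                   # datasets already in the file (0, 50, 200 in the table)

def make_dataset():
    # Small synthetic dataset; its contents don't matter for the timing pattern.
    return xr.Dataset({"x": ("t", np.random.rand(1000))})

# Pre-populate the file: the first write creates it, the rest append new groups.
make_dataset().to_netcdf(PATH, mode="w", group="group_0")
for i in range(1, N):
    make_dataset().to_netcdf(PATH, mode="a", group=f"group_{i}")

def test_func():
    # The call being profiled: append one more dataset into a fresh group.
    make_dataset().to_netcdf(PATH, mode="a", group=f"group_{N}")
```

With `line_profiler` loaded in the notebook (`%load_ext line_profiler`), running `%lprun -f xarray.backends.api.to_netcdf test_func()` gives the per-line breakdown summarized in the table above: as `N` grows, the `store.close()` line comes to dominate the runtime.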
