html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2242#issuecomment-399495668,https://api.github.com/repos/pydata/xarray/issues/2242,399495668,MDEyOklzc3VlQ29tbWVudDM5OTQ5NTY2OA==,1554921,2018-06-22T16:10:45Z,2018-06-22T16:10:45Z,CONTRIBUTOR,"True, I would expect _some_ performance hit from writing chunk-by-chunk; however, that same hit is present in both of the test cases.
In addition to the snippet @shoyer mentioned, I found that xarray also intentionally uses `autoclose=True` when writing chunks to netCDF:
https://github.com/pydata/xarray/blob/73b476e4db6631b2203954dd5b138cb650e4fb8c/xarray/backends/netCDF4_.py#L45-L48
However, `ensure_open` only uses `autoclose` if the file isn't already open:
https://github.com/pydata/xarray/blob/73b476e4db6631b2203954dd5b138cb650e4fb8c/xarray/backends/common.py#L496-L503
So if the file is already open before getting to `BaseNetCDF4Array.__setitem__`, it will remain open. If the file isn't yet opened, it will be opened, but then immediately closed after writing the chunk. I suspect this is what's happening in the delayed version - the starting state of `NetCDF4DataStore._isopen` is `False` for some reason, and so it is doomed to re-close itself for each chunk processed.
If I remove the `autoclose=True` from `BaseNetCDF4Array.__setitem__`, the file remains open and performance is comparable between the two tests.
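To make the open/close pattern concrete, here is a minimal sketch of the behaviour I'm describing. `ToyStore`, `open_calls` and `write_chunks` are hypothetical names invented for illustration; this is not xarray's actual code, only a toy that assumes `ensure_open` applies `autoclose` solely when the store was closed on entry.

```python
import contextlib


class ToyStore:
    # Hypothetical stand-in for a data store; NOT the real NetCDF4DataStore.
    def __init__(self, is_open=False):
        self._isopen = is_open
        self.open_calls = 0  # how many times the file had to be (re)opened

    def open(self):
        self._isopen = True
        self.open_calls += 1

    def close(self):
        self._isopen = False

    @contextlib.contextmanager
    def ensure_open(self, autoclose):
        if not self._isopen:
            # Closed on entry: open it, and close again afterwards if asked to.
            try:
                self.open()
                yield
            finally:
                if autoclose:
                    self.close()
        else:
            # Already open on entry: autoclose is ignored, the file stays open.
            yield


def write_chunks(store, n_chunks):
    # Mimics one chunk write per iteration, each wrapped in ensure_open.
    for _ in range(n_chunks):
        with store.ensure_open(autoclose=True):
            pass  # a real writer would assign the chunk to the variable here
    return store.open_calls


print(write_chunks(ToyStore(is_open=True), 5))   # 0
print(write_chunks(ToyStore(is_open=False), 5))  # 5
```

With a store that starts open, no chunk write triggers a reopen; with a store that starts closed, every chunk pays for its own open/close cycle, which would match the slowdown seen in the delayed case.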
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,334633212