issue_comments: 379294800
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/pydata/xarray/issues/2040#issuecomment-379294800 | https://api.github.com/repos/pydata/xarray/issues/2040 | 379294800 | MDEyOklzc3VlQ29tbWVudDM3OTI5NDgwMA== | 1217238 | 2018-04-06T15:47:24Z | 2018-04-06T15:47:24Z | MEMBER | The main reason for preferring variable-length strings was that netCDF4-python always properly decoded them as unicode strings, even on Python 3. Basically, it was required to properly round-trip strings to a netCDF file on Python 3. However, this is no longer the case, now that we specify an encoding when writing fixed-length strings (https://github.com/pydata/xarray/pull/1648). So we could potentially revisit the default behavior. I'll admit I'm also a little surprised by how large the storage overhead turns out to be for variable-length datatypes. The HDF5 docs claim it's 32 bytes per element, which would be about 10 MB or so for your dataset. And apparently it interacts poorly with compression, too. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | 311578894 |
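
For context, here is a minimal sketch of the two write paths the comment contrasts, assuming xarray's per-variable `encoding` dict as the way to request fixed-length (character-array) storage; the variable name `names` and the file names are illustrative:

```python
import numpy as np
import xarray as xr

# A small dataset with a unicode string variable.
ds = xr.Dataset({"names": ("x", np.array(["foo", "barbaz"], dtype=object))})

# Default path: object-dtype strings are written as HDF5 variable-length
# strings, which carry per-element overhead (~32 bytes each, per the HDF5
# docs cited above) and interact poorly with compression.
ds.to_netcdf("vlen_strings.nc")

# Fixed-length path: the "S1" dtype encoding asks xarray to encode each
# string as a fixed-width char array (decoded back to unicode on read
# since pydata/xarray#1648), avoiding the variable-length overhead.
ds.to_netcdf("fixed_strings.nc", encoding={"names": {"dtype": "S1"}})
```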