issue_comments: 400905262
| field | value |
|---|---|
| html_url | https://github.com/pydata/xarray/issues/2256#issuecomment-400905262 |
| issue_url | https://api.github.com/repos/pydata/xarray/issues/2256 |
| id | 400905262 |
| node_id | MDEyOklzc3VlQ29tbWVudDQwMDkwNTI2Mg== |
| user | 4338975 |
| created_at | 2018-06-28T04:12:47Z |
| updated_at | 2018-06-28T04:18:07Z |
| author_association | NONE |
| reactions | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
| performed_via_github_app | |
| issue | 336458472 |

body:

Yes, I agree with you. I started out with `ds.to_zarr` for each file, but the problem was that each property of the cycle (e.g. lat and long) ended up in its own file. One float with 250 cycles produced over 70,000 small files on my file system, and because of the cluster size they occupied over 100 MB of hard disk. As there are over 4,000 floats, that many small files is not going to be viable.
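Roughly what I mean, as a minimal sketch (the paths, float ID, and chunk size here are made up): combine the per-cycle datasets into one dataset and write a single zarr store, so the store holds a few large chunk files instead of one tiny file per property per cycle.

```python
import glob
import xarray as xr

# Hypothetical per-cycle profile files for one float (path is made up).
paths = sorted(glob.glob("float_5901234/profiles/*.nc"))

# One dataset per cycle, concatenated along a new "cycle" dimension,
# so all cycles live in one dataset instead of one store each.
datasets = [xr.open_dataset(p) for p in paths]
combined = xr.concat(datasets, dim="cycle")

# Coarse chunks: one chunk spans many cycles, so zarr writes a few
# large files instead of one tiny file per variable per cycle.
combined = combined.chunk({"cycle": 250})

# A single consolidated store for the whole float.
combined.to_zarr("float_5901234.zarr", mode="w")
```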
Yep, this line is funny. CYCLE_NUMBER starts at 1 and increments with each cycle. Sometimes a cycle is delayed and added at a later date, so I did not want to assume that the list of files was already sorted into float-cycle order; instead I want to build an array of cycles in order. Also, if a file is replaced by a newer version, I want it to overwrite that profile in the array.
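A sketch of that ordering/overwrite logic (the file layout is made up, and it assumes one profile per file and that files are seen oldest-to-newest, so a reprocessed cycle arrives last): key a dict by CYCLE_NUMBER, so arrival order does not matter and a replacement file simply overwrites the old profile.

```python
import glob
import xarray as xr

# Collect profiles keyed by cycle number; a dict means input order
# is irrelevant and a newer file for the same cycle replaces the old.
profiles = {}
for path in glob.glob("float_5901234/profiles/*.nc"):
    ds = xr.open_dataset(path)
    cycle = int(ds["CYCLE_NUMBER"].values.item())  # assumes one cycle per file
    profiles[cycle] = ds

# Concatenate in CYCLE_NUMBER order, whatever order the files came in.
ordered = xr.concat([profiles[c] for c in sorted(profiles)], dim="cycle")
```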
A single float file ends up as 194 small files in 68 directories: total size 30.4 KB (31,223 bytes), but size on disk 776 KB (794,624 bytes). I have tried
but it fails with: