issue_comments: 1211197176
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/pydata/xarray/issues/4285#issuecomment-1211197176 | https://api.github.com/repos/pydata/xarray/issues/4285 | 1211197176 | IC_kwDOAMm_X85IMWb4 | 35968931 | 2022-08-10T19:51:43Z | 2022-08-10T19:56:02Z | MEMBER |
Very interesting @jpivarski - that would make a good blog post / think piece if you ever felt like it.
I'm biased in thinking that (1) is true, but then I'm not a particle physicist - the closest I came was using ROOT in undergrad extremely briefly :smile: .
Now seems like a good time to list some potential use cases for a
1) Oceanography observation data
NOAA's Global Drifter Program tracks the movement of floating buoys, each of which takes measurements at specified time intervals as it moves along. As each drifter may take a completely different path across the ocean, the length of their trajectories is variable.
@dhruvbalwada pointed me to this notebook which compares analyzing drifter data using
1) xarray wrapping rectilinear arrays
2) pandas
3)
Reading the notebook it seems that a new option (4) of ragged data within xarray might well be the best of both worlds for this particular use case.
@selipot @philippemiron is creating a
2) Alleles in Genomics
Allele data can have a wide variation in the number of alt alleles (most variants will have one, but a few could have thousands), as mentioned by @tomwhite in https://github.com/pystatgen/sgkit/issues/634.
I'm not sure whether the
I'm also unclear if this would be useful for ANNData https://github.com/scverse/anndata/issues/744 (cc @ivirshup)
3) Neutron scattering data
Scipp is an xarray-like labelled data structure for neutron scattering experiment data. On their FAQ Q titled "Why is xarray not enough", one of the things they quote is
Would a
4) Other "Record"-like data
A "Record" is for when you want to store multiple pieces of information (of possibly different types) about an "event". In
Whilst I don't think we can store awkward arrays containing Records directly in xarray (though after @shoyer's comment I'm not so sure...), what we could do is have multiple named data variables, each of which contains a
As an example of a quirky use case for record-like data, a biologist friend recently showed me a dataset of hummingbird feeding patterns. He had strapped RFID tags to hundreds of hummingbirds, then set up feeder stations equipped with radio antennae. When the birds came to feed an event would be recorded. As the resulting data varied with bird ID, date, and feeder, but each individual bird could visit any particular feeder any number of times on a given day, I thought he could store this data in a Ragged array within xarray with the dimension representing number of visits having variable length.
There are probably a lot more possible use cases for a |
{ "total_count": 3, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 3, "rocket": 0, "eyes": 0 } |
667864088 |
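As a rough illustration of the variable-length trajectories described in the drifter use case above, here is a minimal sketch of one common way to hold ragged data: a flat array of observations plus a per-trajectory row count. It uses only NumPy. The variable names (`rowsize`, `sst`) and the numbers are hypothetical and are not taken from the notebook or the Global Drifter Program, and this is not how any particular library in the thread is implemented.

```python
import numpy as np

# Hypothetical data: three drifters with 3, 1, and 4 observations each.
# All observations are stored back to back in one flat array.
rowsize = np.array([3, 1, 4])
sst = np.array([20.1, 20.3, 20.2, 15.0, 18.0, 18.5, 19.0, 18.5])

# Start offset of each trajectory within the flat array.
offsets = np.concatenate(([0], np.cumsum(rowsize)[:-1]))

# Per-trajectory mean without padding to a rectangular array.
# Assumes every trajectory has at least one observation, since
# np.add.reduceat does not handle empty segments the way one might expect.
traj_mean = np.add.reduceat(sst, offsets) / rowsize

# Recover each trajectory as its own variable-length array.
trajectories = np.split(sst, offsets[1:])

for i, traj in enumerate(trajectories):
    print(f"drifter {i}: {len(traj)} observations, mean {traj_mean[i]:.2f}")
```

The point of the layout is that no padding values are needed: a rectangular array would have to be 3 x 4 here, with five of its twelve slots filled by NaN.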