html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/4118#issuecomment-1042660100,https://api.github.com/repos/pydata/xarray/issues/4118,1042660100,IC_kwDOAMm_X84-JbsE,1217238,2022-02-17T07:45:24Z,2022-02-17T07:45:24Z,MEMBER,"One thing that came up in our discussion about this in the developer
meeting today is that we could also pretty easily expose a ""low level"" API
for IO using dictionaries of xarray.Variable objects. This intermediate
representation could be useful for cleaning up data into a form suitable
for conversion into Dataset objects.
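A minimal sketch of what that clean-up step might look like, assuming a low-level reader that returns a plain dict of xarray.Variable objects. The `raw` dict and the `rename_dims` helper are hypothetical and only for illustration; they are not an existing xarray API.

```python
import numpy as np
import xarray as xr

# Hypothetical output of a low-level reader: name -> xarray.Variable.
# Both variables use a dimension called 'x' but with different sizes,
# so they cannot be combined into a single Dataset as-is.
raw = {
    'coarse': xr.Variable(('time', 'x'), np.zeros((2, 3))),
    'fine': xr.Variable(('time', 'x'), np.zeros((2, 6))),
}


def rename_dims(var, mapping):
    # Rebuild the Variable with renamed dimensions, keeping data and attrs.
    new_dims = tuple(mapping.get(dim, dim) for dim in var.dims)
    return xr.Variable(new_dims, var.data, var.attrs)


# Clean-up step: give the conflicting dimension a distinct name.
cleaned = dict(raw)
cleaned['fine'] = rename_dims(raw['fine'], {'x': 'x_fine'})

# After cleaning, the dict converts into a regular Dataset.
ds = xr.Dataset(cleaned)
```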
On Wed, Feb 16, 2022 at 11:39 PM Alessandro Amici wrote:
> @TomNicholas (cc @mraspaud)
>
> Do you have use cases which one of these designs could handle but the
> other couldn't?
>
> The two main classes of on-disk formats that I know of which cannot
> always be represented in the ""group is a Dataset"" approach are:
>
> - in netCDF following the CF conventions for groups, it is legal for an
> array to refer to a dimension or a coordinate in a different group, so
> arrays in the same group may have dimensions with the same name but
> different size / coordinate values,
> - the current spec for the Next-generation file formats (NGFF)
> for bio-imaging has all scales of
> the same 5D data in the same group.
>
> I don't have an example at hand, but my impression is that satellite
> products that use HDF5 file format also place arrays with inconsistent
> dimensions / coordinates in the same group.
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,628719058
https://github.com/pydata/xarray/issues/4118#issuecomment-901598698,https://api.github.com/repos/pydata/xarray/issues/4118,901598698,IC_kwDOAMm_X841vU3q,1217238,2021-08-19T04:23:15Z,2021-08-19T04:23:15Z,MEMBER,"> However, if one of the variables has the same name as one of the groups (which I think is permitted in the netCDF format), then there is no easy way to access all the elements whilst retaining the nice syntax.
NetCDF does *not* allow variables and groups with the same name, e.g.:
```python
import netCDF4
nc = netCDF4.Dataset('testing.nc', 'w')
nc.createVariable('foo', float)
nc.createGroup('foo')
# RuntimeError: NetCDF: String match to name in use
```
I'm pretty sure this is also prohibited for all HDF5 files, just like how you can't have a directory and file with the same name on most filesystems.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,628719058
https://github.com/pydata/xarray/issues/4118#issuecomment-873316602,https://api.github.com/repos/pydata/xarray/issues/4118,873316602,MDEyOklzc3VlQ29tbWVudDg3MzMxNjYwMg==,1217238,2021-07-03T00:40:55Z,2021-07-03T00:40:55Z,MEMBER,"> if you used tags wouldn't you lose the ability to round-trip a netCDF file with groups?
That sounds right to me -- a downside of tags is that they can't be (uniquely) expressed in a hierarchical arrangement like those found in HDF5/netCDF4 files.
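To make the non-uniqueness concrete, here is a small sketch. The `candidate_paths` helper is hypothetical and not part of xarray or netCDF4; it only shows that one set of tags corresponds to several equally valid group paths.

```python
from itertools import permutations


def candidate_paths(tags):
    # Every ordering of the tags is an equally plausible group path,
    # so a tag set does not map to one unique position in a hierarchy.
    return ['/' + '/'.join(order) for order in permutations(sorted(tags))]


paths = candidate_paths({'insurance', 'health', '2021'})
for path in paths:
    print(path)
```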
But if this is a better way to organize data in memory, we could consider how to make an adapter layer for on disk storage.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,628719058
https://github.com/pydata/xarray/issues/4118#issuecomment-873227326,https://api.github.com/repos/pydata/xarray/issues/4118,873227326,MDEyOklzc3VlQ29tbWVudDg3MzIyNzMyNg==,1217238,2021-07-02T19:55:31Z,2021-07-02T19:55:31Z,MEMBER,"@martinitus raises a really interesting point about tags vs hierarchical structures over in https://github.com/pydata/xarray/issues/1092#issuecomment-868324949
> However, one point I didn't see in the discussion is the following:
>
> Hierarchical structures often force a user to come up with some arbitrary order of hierarchy levels. The classical example is document filing: do you put your health insurance documents under /insurance/health/2021, 2021/health/insurance,....?
>
> One solution to that is a tagging of documents instead of putting them into a hierarchy. This would give the full flexibility to retrieve any flat DataSet out of a TaggedDataSet by specifying the set of tags that the individual DataArrays must be listed under.
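
As a rough illustration of the tagging idea quoted above, retrieval by a set of tags might look like the sketch below. `TaggedCollection` and its methods are hypothetical names, not an xarray API, and plain lists stand in for DataArrays.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedCollection:
    # Hypothetical container: maps a name to (tags, payload).
    items: dict = field(default_factory=dict)

    def add(self, name, payload, tags):
        self.items[name] = (frozenset(tags), payload)

    def select(self, *tags):
        # Return a flat dict of every item listed under all requested tags.
        wanted = frozenset(tags)
        return {
            name: payload
            for name, (item_tags, payload) in self.items.items()
            if wanted <= item_tags
        }


docs = TaggedCollection()
docs.add('policy', [1, 2, 3], {'insurance', 'health', '2021'})
docs.add('invoice', [4, 5], {'insurance', 'car', '2021'})
docs.add('checkup', [6], {'health', '2020'})

# No hierarchy order is imposed; any combination of tags can be queried.
flat = docs.select('insurance', '2021')
```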
I think using tags is a really interesting alternative to hierarchies. I don't have a clear sense of the overall tradeoffs, though.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,628719058
https://github.com/pydata/xarray/issues/4118#issuecomment-807908489,https://api.github.com/repos/pydata/xarray/issues/4118,807908489,MDEyOklzc3VlQ29tbWVudDgwNzkwODQ4OQ==,1217238,2021-03-26T03:24:48Z,2021-03-26T03:24:48Z,MEMBER,"I'm excited to see this coming together! I would be happy to advise as well...
Side note: at some point, this would probably be worth adding to Xarray's official roadmap.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,628719058
https://github.com/pydata/xarray/issues/4118#issuecomment-638481215,https://api.github.com/repos/pydata/xarray/issues/4118,638481215,MDEyOklzc3VlQ29tbWVudDYzODQ4MTIxNQ==,1217238,2020-06-03T21:52:53Z,2020-06-03T23:08:47Z,MEMBER,"The data model you sketch out here looks very similar to what we discussed in #1092. I agree that the semantics are well defined.
The main question in my mind is whether it would make more sense to make an entirely new data structure (e.g., `xarray.TreeDataset`) or add in a new feature like `groups` to the existing `xarray.Dataset`.
Probably a new data structure would be easier at this point, because it would keep `Dataset` simpler and wouldn't break existing code that works on `xarray.Dataset`.","{""total_count"": 5, ""+1"": 5, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,628719058
https://github.com/pydata/xarray/issues/4118#issuecomment-638478790,https://api.github.com/repos/pydata/xarray/issues/4118,638478790,MDEyOklzc3VlQ29tbWVudDYzODQ3ODc5MA==,1217238,2020-06-03T21:46:48Z,2020-06-03T21:46:48Z,MEMBER,"I would be open to exploring adding a hierarchical data structure into xarray (on an experimental basis, to start), but it would need someone with serious interest and time to make it happen. Certainly there are plenty of use cases across various fields.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,628719058