html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/66#issuecomment-485291578,https://api.github.com/repos/pydata/xarray/issues/66,485291578,MDEyOklzc3VlQ29tbWVudDQ4NTI5MTU3OA==,1217238,2019-04-21T23:55:02Z,2019-04-21T23:55:02Z,MEMBER,"Xarray will never be able to read arbitrary HDF5 files. The full HDF5 data model is far more complicated than any data structure xarray supports.
Using h5py directly is your best bet for HDF5 files that aren’t also netcdf files.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,29453809
https://github.com/pydata/xarray/issues/66#issuecomment-338782661,https://api.github.com/repos/pydata/xarray/issues/66,338782661,MDEyOklzc3VlQ29tbWVudDMzODc4MjY2MQ==,1217238,2017-10-23T20:14:36Z,2017-10-23T20:14:36Z,MEMBER,"> I've been looking at the h5netcdf code recently to understand better how dimensions are plumbed in netcdf4.
It's pretty messy, to be honest :). The HDF5 [dimension scale API](https://support.hdfgroup.org/HDF5/doc/HL/H5DS_Spec.pdf) is highly flexible, and netCDF4 only uses a small part of it.
> I'm exploring refactoring all my data model classes in scikit-allel to build on xarray, I think the time is right, especially if xarray gets a Zarr backend too.
Interesting -- I'd love to hear how this goes! Please don't hesitate to file issues when problems come up (though you're already off to a good start).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,29453809
https://github.com/pydata/xarray/issues/66#issuecomment-90828627,https://api.github.com/repos/pydata/xarray/issues/66,90828627,MDEyOklzc3VlQ29tbWVudDkwODI4NjI3,1217238,2015-04-08T07:20:32Z,2015-04-08T07:20:32Z,MEMBER,"Note that h5netcdf won't (yet) let you read any HDF5 files you couldn't already read with netCDF4-python -- it just gives us an alternative backend to use. One thing we could do that's not supported by netCDF is potentially read HDF5 [dimension labels](http://h5py.readthedocs.org/en/latest/high/dims.html). The original netCDF4 library only understands dimension scales -- which, to be honest, seems like a less natural fit to me than reading dimension labels.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,29453809
https://github.com/pydata/xarray/issues/66#issuecomment-90798866,https://api.github.com/repos/pydata/xarray/issues/66,90798866,MDEyOklzc3VlQ29tbWVudDkwNzk4ODY2,1217238,2015-04-08T04:21:15Z,2015-04-08T04:21:15Z,MEMBER,"I wrote a little library to read and write netCDF4 files via h5py the other day: https://github.com/shoyer/h5netcdf
I also merged a preliminary backend for it into xray that should work if you use `engine='h5netcdf'`. So I think we can consider this issue resolved!
I've also been looking into the [netCDF4 data model](https://www.unidata.ucar.edu/software/netcdf/docs/netcdf/Data-Model.html) in a bit more detail, and the good news is that it looks like it does, at least theoretically, support hierarchical dimension scales. This doesn't work in h5netcdf yet, but would be easy to add. Adding read support to xray would also be straightforward.
Figuring out how to write a hierarchy of xray datasets into the format is less obvious, however. We might need something like a HierarchicalDataset object. I guess using `/` with variable names in normal Dataset objects would work, though it would help to have something like a pandas MultiIndex to make it easier to actually work with all those variable names.
","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,29453809
https://github.com/pydata/xarray/issues/66#issuecomment-42872192,https://api.github.com/repos/pydata/xarray/issues/66,42872192,MDEyOklzc3VlQ29tbWVudDQyODcyMTky,1217238,2014-05-12T18:51:21Z,2014-05-12T18:51:21Z,MEMBER,"In principle, I think dimension scales are all we need to interpret HDF5 files as xray Datasets. That's also _most_ of what you need to make a netCDF4 file, but I would not be surprised if NetCDF libraries have issues with HDF5 files that don't conform to every last NetCDF convention. For reference, here is the full NetCDF4 spec (pretty short!):
https://www.unidata.ucar.edu/software/netcdf/docs/netcdf/NetCDF_002d4-Format.html
We don't yet support reading from groups or subgroups (other than the root group `'/'`), but I agree this would be a nice feature. It would seem straightforward enough to add some option to read variables from subgroups recursively, although I'm sure there are some subtleties to get the API right. Yours is an interesting use of dimension scales (and it makes complete sense), but I'm not sure if the NetCDF4 model supports that sort of thing.
To support HDF5 properly, including interesting use cases like yours, I think we should probably write our own interface to h5py, instead of reading everything through the NetCDF libraries. Ideally, we could set this up to write HDF5 as (mostly) valid NetCDF4, at least in the simpler cases where that makes sense.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,29453809
https://github.com/pydata/xarray/issues/66#issuecomment-40737375,https://api.github.com/repos/pydata/xarray/issues/66,40737375,MDEyOklzc3VlQ29tbWVudDQwNzM3Mzc1,1217238,2014-04-17T17:03:36Z,2014-04-17T17:03:36Z,MEMBER,"I did a little bit of research last night into the HDF5 file format and how it maps onto the NetCDF data model: https://www.unidata.ucar.edu/software/netcdf/docs/netcdf/NetCDF_002d4-Format.html
HDF5 has a notion of ""dimension scales"" which implement shared dimensions. The bad news is that [pytables does not support them](https://www.mail-archive.com/pytables-users@lists.sourceforge.net/msg01374.html), although [h5py does](http://docs.h5py.org/en/latest/high/dims.html). As @ToddSmall shows in his example above, pytables supports getting file images for HDF5 files, but unfortunately [h5py does not implement file image operations](https://groups.google.com/forum/#!topic/h5py/W25rGn4msmM). So it looks like there are not currently any existing solutions that will let us implement our data model in HDF5 with file images :(.
On the plus side, it does look like it would be pretty simple to implement the NetCDF4 file format directly via h5py. This is something worth considering, because [the codebase for the h5py project](https://github.com/h5py/h5py) looks much cleaner than [netCDF4-python](https://github.com/Unidata/netcdf4-python/) and has better test coverage. I can also verify that it is straightforward to open and interpret NetCDF4 files via pytables or h5py.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,29453809
https://github.com/pydata/xarray/issues/66#issuecomment-38005951,https://api.github.com/repos/pydata/xarray/issues/66,38005951,MDEyOklzc3VlQ29tbWVudDM4MDA1OTUx,1217238,2014-03-19T00:32:38Z,2014-03-19T00:32:38Z,MEMBER,"Thanks @ToddSmall!
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,29453809