issue_comments: 370921380

https://github.com/pydata/xarray/issues/1970#issuecomment-370921380
pydata/xarray · MEMBER · created 2018-03-06T20:43:29Z · updated 2018-03-06T20:44:18Z

> What is the role of the netCDF API in the backend API?

A netCDF-like API is a good starting place for xarray backends, since our data model is closely modeled on netCDF. But that alone is not specific enough for us: there are plenty of details like indexing, dtypes, and locking that require awareness of both how xarray works and the particulars of the specific backend. So I think we are unlikely to be able to eliminate the need for adapter classes.
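
As a rough illustration (this is not xarray's actual backend interface; the class and method names are hypothetical), an adapter class has to translate the store's native objects into the variables, attributes, and dimensions xarray expects, while also dealing with things like dtype normalization and locking:

```python
import threading

import numpy as np


class ExampleStoreAdapter:
    """Hypothetical adapter wrapping a netCDF-like store for xarray-style access.

    Illustrative only: real backends also handle lazy indexing and
    encoding/decoding conventions, which are omitted here.
    """

    def __init__(self, store):
        self._store = store            # any object exposing .dimensions/.variables/.attrs
        self._lock = threading.Lock()  # serialize access for non-thread-safe libraries

    def get_dimensions(self):
        return dict(self._store.dimensions)

    def get_attrs(self):
        return dict(self._store.attrs)

    def get_variables(self):
        # Translate each native variable into a (dims, data, attrs) triple,
        # coercing data to arrays the rest of the stack can handle.
        out = {}
        for name, var in self._store.variables.items():
            with self._lock:
                data = np.asarray(var[...])
            out[name] = (tuple(var.dimensions), data, dict(var.attrs))
        return out
```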

> My understanding of the point of h5netcdf was to provide a netCDF-like interface for HDF5, thereby making it easier to interface with xarray.

Yes, this was a large part of the motivation for h5netcdf, although there are also users of h5netcdf who don't use xarray. The main reason it's a separate project is to facilitate separation of concerns: xarray backends should be about how to adapt storage systems to work with xarray, not about the details of another file format.
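
For example, h5netcdf ships a "legacy" API that mirrors netCDF4-python, so code written against the netCDF4 interface can often target HDF5 files with little more than an import change. A minimal sketch (exact method coverage depends on the h5netcdf version):

```python
# Write an HDF5 file through h5netcdf's netCDF4-compatible legacy API.
import h5netcdf.legacyapi as netCDF4
import numpy as np

with netCDF4.Dataset("example.nc", "w") as ds:
    ds.createDimension("x", 5)
    var = ds.createVariable("temperature", "f8", ("x",))
    var[:] = np.arange(5.0)
```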

h5netcdf is now up to about 1500 lines of code (including tests), and that's definitely big enough that I'm happy I wrote it as a separate project. The full netCDF4 data model turns out to involve a fair amount of nuance.

Alternatively, if adaptation to the netCDF data model is easy (e.g., <100 lines of code), then it may not be worth a separate package. This is currently the case for zarr.
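
To give a sense of how little glue that mapping needs, here is a rough sketch assuming zarr's v2-style Python API: groups map to netCDF groups, arrays to variables, and dimension names can ride along in a per-array attribute (xarray uses `_ARRAY_DIMENSIONS` for this). The `as_netcdf_like` helper below is purely illustrative.

```python
import numpy as np
import zarr

# Build a tiny in-memory zarr group for illustration.
store = zarr.group()
arr = store.create_dataset("temperature", data=np.arange(5.0))
arr.attrs["_ARRAY_DIMENSIONS"] = ["x"]
arr.attrs["units"] = "K"


def as_netcdf_like(group):
    """Return {name: (dims, data, attrs)} for every array in a zarr group."""
    variables = {}
    for name, array in group.arrays():
        attrs = dict(array.attrs)
        dims = tuple(attrs.pop("_ARRAY_DIMENSIONS", ()))
        variables[name] = (dims, np.asarray(array[...]), attrs)
    return variables


print(as_netcdf_like(store))
```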
