issue_comments: 1016705107

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/6174#issuecomment-1016705107 https://api.github.com/repos/pydata/xarray/issues/6174 1016705107 IC_kwDOAMm_X848mbBT 35968931 2022-01-19T17:37:12Z 2022-01-19T18:05:07Z MEMBER

> I would like to have a function xr.to_netcdf that writes a list (or a dictionary) of datasets to a single NetCDF4 file.

If you've read through all of #4118 you will have seen that there is a prototype package providing a nested data structure which can handle groups. Using DataTree we can easily write a dictionary of datasets to a single netCDF file as groups:

```python
from datatree import DataTree

dt = DataTree.from_dict(ds_dict)
dt.to_netcdf('filepath.nc')
```

(Here if you want groups within groups then the keys in the dictionary should be specified like filepaths, e.g. /group1/group2/ds_name.)
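To make the path-like key convention concrete, here is a small pure-Python sketch of how such keys imply a nesting of groups. It does not use DataTree at all: `nest_by_path` is a hypothetical helper written only for illustration, and it simplifies things by treating every final path component as a leaf.

```python
def nest_by_path(flat):
    """Turn {'/a/b/name': value} into {'a': {'b': {'name': value}}}.

    Illustration only: this mimics how path-like keys describe nested
    groups; it is not how DataTree builds its tree internally.
    """
    tree = {}
    for path, value in flat.items():
        # Drop empty components so '/a/b' and 'a/b' are treated alike.
        parts = [p for p in path.split("/") if p]
        if not parts:
            raise ValueError(f"empty path: {path!r}")
        node = tree
        for part in parts[:-1]:
            # Create intermediate groups on demand.
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return tree
```

For example, the keys `/group1/group2/ds_name` and `/group1/other` would both end up under a single `group1` entry, with `group2` nested inside it.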

> Ideally there should also be a way to read many datasets at once from a single NetCDF4 file using xr.open_dataset.

Again DataTree allows you to open all the groups at once, returning a tree-like structure which contains all the groups:

```python
dt = open_datatree('filepath.nc')
```

To extract all the groups as individual datasets you can do this to recreate the dictionary of datasets:

```python
ds_dict = {node.pathstr: node.ds for node in dt.subtree}
```

> However, this is really slow when you have many (hundreds or thousands of) small datasets because the file is opened and closed in every iteration.
>
> Currently, I'm using the following read/write functions to achieve the same:

Is your solution noticeably faster? I don't think we (@jhamman and I) have really thought about the speed of DataTree I/O yet; we preferred to just make something simple that works for now. The current I/O code for DataTree is here.

Despite that project only being a prototype, it is still probably the best solution to your problem that we currently have (at least the neatest). If you are interested in trying it out and reporting any problems then that would be greatly appreciated!

EDIT: The idea discussed here might also be of interest to you.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1108138101