issue_comments: 511174605


html_url: https://github.com/pydata/xarray/issues/3096#issuecomment-511174605
issue_url: https://api.github.com/repos/pydata/xarray/issues/3096
id: 511174605
node_id: MDEyOklzc3VlQ29tbWVudDUxMTE3NDYwNQ==
user: 1217238
created_at: 2019-07-14T05:28:22Z
updated_at: 2019-07-14T05:28:43Z
author_association: MEMBER

> Regarding open_mfdataset(), I checked the code and realized that under the hood it just calls open_dataset() multiple times. I was worried it would load the values (and not only the metadata) into memory, but I checked it on one file and it apparently does not. Can you confirm this? In that case I could probably open my whole dataset at once, which would be very convenient.

Yes, this is the suggested workflow! open_mfdataset opens a collection of files lazily (with dask) into a single xarray dataset, suitable for converting into zarr all at once with to_zarr().

It is definitely possible to create a zarr dataset and then write to it in parallel from multiple processes, but not via xarray's to_zarr() method -- which can only parallelize with dask. You would have to create the dataset and write to it with the zarr Python API directly.

reactions: none (total_count 0)