issue_comments: 490421774


html_url: https://github.com/pydata/xarray/issues/2946#issuecomment-490421774
issue_url: https://api.github.com/repos/pydata/xarray/issues/2946
id: 490421774
node_id: MDEyOklzc3VlQ29tbWVudDQ5MDQyMTc3NA==
user: 10809480
created_at: 2019-05-08T09:44:25Z
updated_at: 2019-05-08T09:49:02Z
author_association: NONE

body:

Interesting fact I just learned: when you have to process a huge dataset, first export it as a single complete netCDF file, then calculate the aggregation function.

It's a workaround; I suppose bottleneck or dask needs to have the complete dataset first. For mean it simply works because the calculation is straightforward, but for std I think dask or bottleneck treats a NaN as a zero for calculation purposes.

```python
import xarray as xr

# Build the lazy, multi-file dataset and write it out as one netCDF file first.
data = xr.open_mfdataset(list_to_input_files, parallel=True, concat_dim="time")
(...)
data.to_netcdf("help_netcdf_file.nc")
data.close()

# Re-open the single file and compute the aggregations from it.
data = xr.open_dataset("help_netcdf_file.nc")
data.mean(...).to_netcdf("mean_netcdf_file.nc")
data.std(...).to_netcdf("std_netcdf_file.nc")
```
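To make the suspected difference concrete, here is a minimal illustration (not from the original comment) of how a standard deviation changes when a NaN is skipped versus treated as a zero:

```python
import numpy as np

values = np.array([1.0, 2.0, np.nan, 4.0])

# NaN skipped entirely, as a NaN-aware reduction would do.
std_skipping_nan = np.nanstd(values)             # std of [1, 2, 4]    ~= 1.25

# NaN replaced by zero before the reduction, the behaviour suspected above.
std_nan_as_zero = np.std(np.nan_to_num(values))  # std of [1, 2, 0, 4] ~= 1.48

print(std_skipping_nan, std_nan_as_zero)
```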

This could be problematic for huge datasets in the TB range.
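For datasets that large, a hedged way to check whether the round trip is needed at all is to compare the two code paths on a small subset first. This is only a sketch: the subset file names and the output file name below are illustrative assumptions, not part of the original comment.

```python
import xarray as xr

subset_files = ["input_0.nc", "input_1.nc"]  # hypothetical: a couple of the input files

# std computed directly from the lazily concatenated, dask-backed dataset.
# combine="nested" is required alongside concat_dim on newer xarray releases.
lazy = xr.open_mfdataset(subset_files, parallel=True,
                         combine="nested", concat_dim="time")
lazy_std = lazy.std("time").compute()

# std computed after the single-file round trip described above.
lazy.to_netcdf("subset_check.nc")
lazy.close()
roundtrip_std = xr.open_dataset("subset_check.nc").std("time")

# If the suspected NaN-as-zero behaviour is real, these will not match.
try:
    xr.testing.assert_allclose(lazy_std, roundtrip_std)
    print("std matches; the round trip may not be needed")
except AssertionError:
    print("std differs; the round trip changes the result")
```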

reactions:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

issue: 441222339