issue_comments: 308879158


html_url: https://github.com/pydata/xarray/issues/1440#issuecomment-308879158
issue_url: https://api.github.com/repos/pydata/xarray/issues/1440
id: 308879158
node_id: MDEyOklzc3VlQ29tbWVudDMwODg3OTE1OA==
user: 2443309
created_at: 2017-06-15T22:07:33Z
updated_at: 2017-06-16T00:12:43Z
author_association: MEMBER
issue: 233350060

@Zac-HD - I'm about to put up a PR with some initial benchmarking functionality (#1457). Are you open to putting together a PR for the features you've described above? Hopefully these two efforts can work together.

As for the API changes related to this issue, I'd propose the following:

Use the ``chunks`` keyword to support three additional options (see the usage sketch after the list below)

```python
def open_dataset(filename_or_obj, ..., chunks=None, ...):
    """Load and decode a dataset from a file or file-like object.

    Parameters
    ----------
    ...
    chunks : int or dict or set or 'auto' or 'disk', optional
        If chunks is provided, it is used to load the new dataset into dask
        arrays. ``chunks={}`` loads the dataset with dask using a single
        chunk for all arrays.
    ...
    """
```

  • int: chunk every dimension in blocks of this size
  • dict: dictionary with keys given by dimension names and values given by chunk sizes. In general, these should evenly divide the size of each dimension
  • set (or list or tuple) of str: chunk only the named dimension(s), using a heuristic that keeps the chunk shape/size compatible with how the data is stored on disk and suitable for use with dask
  • 'auto' (str): chunk the array(s) using an auto-magical heuristic that is compatible with the storage of the data on disk and semi-optimized (in size) for use with dask
  • 'disk' (str): use the on-disk chunk sizes of the netCDF variables directly.
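For concreteness, a minimal sketch of what calls under this proposal might look like. The file name and dimension names are made up for illustration, and the set / 'auto' / 'disk' values are the proposed additions rather than behaviour xarray supports today; only the int and dict forms work as shown.

```python
import xarray as xr

# Hypothetical file, assumed to contain dimensions 'time', 'lat', 'lon'.
path = "example.nc"

# Already supported today: int and dict chunks.
ds = xr.open_dataset(path, chunks=100)                       # every dimension chunked in blocks of 100
ds = xr.open_dataset(path, chunks={"time": 365, "lat": 50})  # per-dimension chunk sizes

# Proposed additions (not implemented at the time of this comment):
ds = xr.open_dataset(path, chunks={"time"})  # set of dims: heuristic chunking along 'time' only
ds = xr.open_dataset(path, chunks="auto")    # heuristic, disk-aware chunking for all arrays
ds = xr.open_dataset(path, chunks="disk")    # reuse the on-disk netCDF chunk sizes
```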