issues: 1188965542

id: 1188965542
node_id: I_kwDOAMm_X85G3iym
number: 6433
title: Rename/reword `parallel=True` option to `open_mfdataset`
user: 3309802
state: open
locked: 0
comments: 2
created_at: 2022-03-31T22:52:09Z
updated_at: 2022-04-01T11:15:39Z
author_association: NONE

What is your issue?

Based on its name, I was surprised to find that `open_mfdataset(..., parallel=True)` computed the whole dataset eagerly, whereas `parallel=False` just returned it in dask form. (I generally think of "dask" as related to "parallel".)

I guess the docs do technically say this, but it's a bit hard to parse:

> If True, the open and preprocess steps of this function will be performed in parallel using `dask.delayed`. Default is False.

The docstring could maybe instead say: "If False (default), the data is returned in dask form. If True, it will be computed immediately (using dask), then returned in NumPy form."
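For anyone who wants to check which form a returned dataset is actually in, here is a minimal sketch. The helper name `dask_backed` is made up for illustration, and it relies only on the `chunks` attribute of each data variable, which is `None` when the variable is not a dask array. It reports what it finds and makes no claim about what either `parallel` setting returns.

```python
def dask_backed(ds):
    """Report, per data variable, whether it is backed by a dask array.

    `ds` is expected to look like an xarray.Dataset: a `data_vars` mapping
    whose values expose a `chunks` attribute (None for non-dask data).
    """
    return {name: var.chunks is not None for name, var in ds.data_vars.items()}


# Usage sketch (the glob below is a placeholder, not a real path):
# import xarray as xr
# for flag in (False, True):
#     ds = xr.open_mfdataset("data/*.nc", parallel=flag)
#     print(flag, dask_backed(ds))
```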

More intuitive to me would be renaming the argument to `compute=False`, or even deprecating the argument entirely and having a `load_mfdataset` function, in the same way that `load_dataset` is the eager version of `open_dataset`.
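To make the `load_mfdataset` idea concrete, here is a rough sketch of what such a function could look like. It is hypothetical: xarray does not provide `load_mfdataset`, and the `opener` parameter is an extra hook added here only so the sketch can be exercised without real files. The body follows the same open, load, close pattern that `load_dataset` uses on top of `open_dataset`.

```python
def load_mfdataset(paths, opener=None, **kwargs):
    """Hypothetical eager counterpart to xarray.open_mfdataset.

    Opens the files, loads all data into memory, closes the underlying
    file handles, and returns the loaded dataset.
    """
    if opener is None:
        # Imported lazily so the sketch can be tested with a stand-in opener.
        import xarray as xr

        opener = xr.open_mfdataset

    # Dataset objects are context managers; leaving the block closes the files.
    with opener(paths, **kwargs) as ds:
        return ds.load()
```

With something like this in place, `open_mfdataset` would always return a lazy dataset and the eager case would be an explicit, separately named call.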

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6433/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: 13221727
type: issue
