home / github / issues

Menu
  • Search all tables
  • GraphQL API

issues: 731226031

This data as json

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
731226031 MDExOlB1bGxSZXF1ZXN0NTExMzczNjI5 4547 Update signature open_dataset for API v2 35919497 closed 0     2 2020-10-28T08:35:54Z 2021-02-11T01:50:09Z 2020-11-06T14:43:10Z COLLABORATOR   0 pydata/xarray/pulls/4547

Proposal for the new API of open_dataset(). It is implemented in apiv2.py and it doesn't modify the current behavior of api.open_dataset(). It is something in between the first and second alternative suggested at https://github.com/pydata/xarray/issues/4490#issue-715374721, see the related quoted text:

Describe alternatives you've considered

For the overall approach:

  1. We could keep the current design, with separate keyword arguments for decoding options, and just be very careful about passing around these arguments. This seems pretty painful for the backend refactor, though.
  2. We could keep the current design only for the user facing open_dataset() interface, and then internally convert into the DecodingOptions() struct for passing to backend constructors. This would provide much needed flexibility for backend authors, but most users wouldn't benefit from the new interface. Perhaps this would make sense as an intermediate step?

Instead of a class for the decoders, I have added a function: resolve_decoders_kwargs.
resolve_decoders_kwargs performs two tasks: - If decode_cf is False, it sets to False all the decoders supported by the backend (using inspect). - It filters out the None decoder keywords.

So xarray manages the keyword decode_cf and passes on only the non-default decoders to the backend. If the user sets to a non-None value a decoder not supported by the backend, the backend will raise an error.

With this implementation drop_variable should be always supported by the backend. But I think this could be implemented easely by all the backends. I wouldn't group it with the decoders: to me, it seems to be more a filter rather than a decoder.

The behavior decode_cf is unchanged.

PRO: - the user doesn't need to import and instantiate a class. - users get the argument completion on open_dataset. - the backend defines directly in open_backend_dataset_${engine} API the accepted decoders. - xarray manages decode_cf, not the backends.

Missing points: - deccode_cf should be renamed decode. Probably, the behavior of decode should be modified for two reason: - currently If decode_cf is False, it sets the decoders to False, but there is no check on the other values. The accepted values should be: None (it keeps decoders default values), True (it sets all the decoders to True), False (it sets all the decoders to False). - currently we can set both a decoder and decode_cf without any warning. , but the - Deprecate backend_kwargs (or kwargs). - Separate mask_and_scale?

I think that we need a different PR for the three of them.

  • [x] related to https://github.com/pydata/xarray/issues/4490#
  • [x] Passes isort . && black . && mypy . && flake8
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4547/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    13221727 pull

Links from other tables

  • 2 rows from issues_id in issues_labels
  • 2 rows from issue in issue_comments
Powered by Datasette · Queries took 0.859ms · About: xarray-datasette