issues: 1989356758

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1989356758 I_kwDOAMm_X852kyzW 8447 Improve discoverability of backend engine options 4160723 open 0     5 2023-11-12T11:14:56Z 2023-12-12T20:30:28Z   MEMBER      

Is your feature request related to a problem?

Backend engine options are not easily discoverable: we need to know them, or figure them out, before passing them as kwargs to xr.open_dataset().

Describe the solution you'd like

The solution is similar to the one proposed in #8002 for setting a new index.

The API could look like this:

```python
import xarray as xr

ds = xr.open_dataset(
    file_or_obj,
    engine=xr.backends.engine("myengine").with_options(
        option1=True,
        option2=100,
    ),
)
```

where xr.backends.engine("myengine") returns the MyEngineBackendEntrypoint subclass.

We would need to extend the API for BackendEntrypoint with a .with_options() factory method:

```python
from typing import Any


class BackendEntrypoint:
    _options: dict[str, Any]

    @classmethod
    def with_options(cls):
        """This backend does not implement `with_options`."""
        raise NotImplementedError()
```

Such that

```python
from __future__ import annotations

import os
from collections.abc import Iterable
from io import BufferedIOBase
from typing import Any

from xarray.backends import AbstractDataStore, BackendEntrypoint


class MyEngineBackendEntrypoint(BackendEntrypoint):
    open_dataset_parameters = ("option1", "option2")

    @classmethod
    def with_options(
        cls,
        option1: bool = False,
        option2: int | None = None,
    ):
        """Get the backend with user-defined options.

        Parameters
        ----------
        option1 : bool, optional
            This is option1.
        option2 : int, optional
            This is option2.

        """
        obj = cls()

        # maybe validate the given input options
        if option2 is None:
            option2 = 1

        obj._options = {"option1": option1, "option2": option2}

        return obj

    def open_dataset(
        self,
        filename_or_obj: str | os.PathLike[Any] | BufferedIOBase | AbstractDataStore,
        *,
        drop_variables: str | Iterable[str] | None = None,
        **kwargs,    # no static checker error (liskov substitution principle)
    ):
        # kwargs passed directly to open_dataset take precedence over options
        # or alternatively raise an error?
        option1 = kwargs.get("option1", self._options.get("option1", False))

        ...
```
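To make the precedence rule in the comment above concrete, here is a minimal runnable sketch. `ToyBackend` is a hypothetical stand-in that does not depend on xarray and does not read any file; it only returns the options that would be in effect, assuming that keyword arguments passed to `open_dataset` override the stored ones.

```python
from __future__ import annotations

from typing import Any


class ToyBackend:
    """Hypothetical stand-in for a backend entrypoint (not xarray API)."""

    def __init__(self) -> None:
        self._options: dict[str, Any] = {}

    @classmethod
    def with_options(cls, option1: bool = False, option2: int | None = None):
        obj = cls()
        # Same defaulting rule as in the proposal: a missing option2 becomes 1.
        obj._options = {
            "option1": option1,
            "option2": 1 if option2 is None else option2,
        }
        return obj

    def open_dataset(self, filename_or_obj: str, **kwargs: Any) -> dict[str, Any]:
        # Explicit keyword arguments override the options stored on the instance.
        return {**self._options, **kwargs}


backend = ToyBackend.with_options(option1=True)
stored = backend.open_dataset("data.nc")
overridden = backend.open_dataset("data.nc", option1=False)
```

Raising an error on conflicting values instead, as the comment suggests, would only change the body of `open_dataset`.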

Pros:

  • Using .with_options(...) would seamlessly work with IDE auto-completion, static type checkers (I guess? I'm not sure how static checkers support entry-points), documentation, etc.
  • There is no breaking change (xr.open_dataset(obj, engine=...) accepts either a string or a BackendEntrypoint subtype but not yet a BackendEntrypoint object), and this feature could be adopted progressively by existing 3rd-party backends.
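
As a rough illustration of the second point, the sketch below shows how an `engine` argument could accept all three kinds of input. Everything here is an assumption for illustration: `resolve_engine` and `_ENGINES` are hypothetical names, the classes are stand-ins, and this is not how xarray currently resolves engines.

```python
class BackendEntrypoint:
    """Stand-in for xarray's base class."""


class MyEngineBackendEntrypoint(BackendEntrypoint):
    """Stand-in for a third-party backend."""


# Hypothetical registry mapping engine names to entrypoint classes.
_ENGINES = {"myengine": MyEngineBackendEntrypoint}


def resolve_engine(engine):
    """Return a backend instance for a name, a subclass, or an instance."""
    if isinstance(engine, BackendEntrypoint):
        # New case: an already-configured object, e.g. from with_options().
        return engine
    if isinstance(engine, type) and issubclass(engine, BackendEntrypoint):
        # Existing case: a subclass, instantiated with default options.
        return engine()
    if isinstance(engine, str):
        # Existing case: a registered engine name.
        try:
            return _ENGINES[engine]()
        except KeyError:
            raise ValueError(f"unrecognized engine {engine!r}") from None
    raise TypeError(f"unsupported engine argument: {engine!r}")
```

Strings and subclasses keep their current meaning, so only the instance branch is new.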

Cons:

  • The possible duplicated declaration of options among open_dataset_parameters, .with_options() and .open_dataset() does not look super nice but I don't really know how to avoid that.
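
One possible way to reduce part of this duplication, offered only as a sketch, is to derive the parameter names from the signature of `with_options` with `inspect.signature`. The helper `parameters_from_with_options` is a hypothetical name, and the class is a stand-in that does not inherit from xarray.

```python
from __future__ import annotations

import inspect


def parameters_from_with_options(backend_cls) -> tuple[str, ...]:
    """List the option names accepted by ``backend_cls.with_options``."""
    # Accessing a classmethod through the class yields a bound method,
    # so ``cls`` is not part of the reported signature.
    return tuple(inspect.signature(backend_cls.with_options).parameters)


class MyEngineBackendEntrypoint:
    """Stand-in backend used only to demonstrate the helper."""

    @classmethod
    def with_options(cls, option1: bool = False, option2: int | None = None):
        obj = cls()
        obj._options = {"option1": option1, "option2": option2}
        return obj


# Declared once in with_options(), reused for open_dataset_parameters.
MyEngineBackendEntrypoint.open_dataset_parameters = parameters_from_with_options(
    MyEngineBackendEntrypoint
)
```

This leaves the repetition inside `.open_dataset()` itself untouched.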

Describe alternatives you've considered

A BackendEntrypoint.with_options() factory is not strictly needed: we could just use BackendEntrypoint.__init__() instead. Perhaps with_options looks a bit clearer and leaves room for more flexibility in __init__, though?
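
For comparison, a minimal sketch of the `__init__` variant could look like the following. The class is a stand-in that does not inherit from xarray, and the sketch assumes every option needs a default so that the backend can still be created without arguments.

```python
from __future__ import annotations

from typing import Any


class MyEngineBackendEntrypoint:
    """Stand-in backend that takes its options in __init__."""

    def __init__(self, option1: bool = False, option2: int | None = None) -> None:
        # Same defaulting rule as with_options(): a missing option2 becomes 1.
        self._options: dict[str, Any] = {
            "option1": option1,
            "option2": 1 if option2 is None else option2,
        }


default_backend = MyEngineBackendEntrypoint()
custom_backend = MyEngineBackendEntrypoint(option1=True, option2=100)
```

The call site would then read `engine=MyEngineBackendEntrypoint(option1=True, option2=100)`, at the cost of reserving the `__init__` signature for options.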

Additional context

cc @jsignell https://github.com/stac-utils/pystac/issues/846#issuecomment-1405758442

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8447/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    13221727 issue
