issues: 602256880

id: 602256880
node_id: MDU6SXNzdWU2MDIyNTY4ODA=
number: 3981
title: [Proposal] Expose Variable without Pandas dependency
user: 2443309
state: open
locked: 0
comments: 23
created_at: 2020-04-17T22:00:10Z
updated_at: 2024-04-24T17:19:55Z
author_association: MEMBER

This issue proposes exposing Xarray's Variable class as a stand-alone array class with named axes (dims) and arbitrary metadata (attrs) but without coordinates (indexes). Yes, this already exists, but the Variable class is currently inseparable from our Pandas dependency, despite not utilizing any of its functionality. What would this entail?

The biggest change would be in making Pandas an optional dependency and isolating any imports. This change could be confined to the Variable object or could be propagated further as the Explicit Indexes work proceeds (#1603).
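As a rough sketch of what isolating the Pandas import could look like, here is a generic optional-dependency pattern. The helper names `optional_import` and `to_index` are hypothetical and are not existing Xarray functions; the point is only that Pandas gets imported at the place it is needed rather than at module import time.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the named module if it is installed, otherwise None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def to_index(values):
    """Hypothetical helper: build a pandas.Index only when one is requested."""
    pd = optional_import("pandas")
    if pd is None:
        # Code paths that never call to_index() work without pandas installed.
        raise ImportError("pandas is required for index-based operations")
    return pd.Index(values)
```

With this layout, code that only touches dims, data and attrs never triggers the Pandas import, and the failure for index-based features is deferred to the call that actually needs them.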

Why?

Within Xarray, the Variable class is a vital building block for many of our internal data structures. Recently, the utility of a simple array with named dimensions has been highlighted by a few potential user communities:

  • Scikit-learn: https://github.com/scikit-learn/enhancement_proposals/pull/18
  • PyTorch: (https://pytorch.org/tutorials/intermediate/named_tensor_tutorial.html, http://nlp.seas.harvard.edu/NamedTensor)

An example from the above-linked SLEP of why users may not want Pandas as a dependency in Xarray:

@amueller: ...If we go this route, I think we need to make xarray, and therefore pandas, a mandatory dependency...

...

@adrinjalali: ...And we still do have the option of making a NamedArray. xarray uses the pandas' index classes for the indexing and stuff, which is something we really don't need...

Since we already have a class developed that meets these applications' use cases, it seems only prudent to evaluate the feasibility of exposing the Variable as a low-level API object.
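To make the kind of object under discussion concrete, here is a toy, standard-library-only sketch of an array with named dimensions and attrs but no coordinates or indexes. `NamedArray`, `sizes` and `isel` here are illustrative assumptions written for this example; this is not Xarray's Variable implementation and it stores data as nested lists rather than a real array type.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


def _shape(data) -> Tuple[int, ...]:
    """Infer the shape of a regular nested list (a scalar has shape ())."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        data = data[0] if data else None
    return tuple(shape)


@dataclass
class NamedArray:
    dims: Tuple[str, ...]            # one name per axis
    data: Any                        # nested lists, or a scalar for 0-d
    attrs: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self):
        if len(self.dims) != len(_shape(self.data)):
            raise ValueError("need exactly one dimension name per axis")

    @property
    def sizes(self) -> Dict[str, int]:
        """Map each dimension name to its length."""
        return dict(zip(self.dims, _shape(self.data)))

    def isel(self, **indexers: int) -> "NamedArray":
        """Select one integer position along each named dimension."""
        def take(block, axis):
            if axis == len(self.dims):
                return block
            name = self.dims[axis]
            if name in indexers:
                return take(block[indexers[name]], axis + 1)
            return [take(item, axis + 1) for item in block]

        kept = tuple(d for d in self.dims if d not in indexers)
        return NamedArray(kept, take(self.data, 0), dict(self.attrs))


temps = NamedArray(("time", "station"), [[1, 2, 3], [4, 5, 6]], {"units": "K"})
second_station = temps.isel(station=1)   # dims become ("time",)
```

Selection is purely positional and by dimension name; nothing here needs label-based indexes, which is the part Pandas provides in Xarray.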

In conclusion, I'm not sure this is currently worth the effort, but it's probably worth exploring at this point.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3981/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: 13221727
type: issue
