issues: 1977485456
| id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1977485456 | I_kwDOAMm_X8513giQ | 8413 | Add a perception of a __xarray__ magic method | 6273919 | open | 0 | 4 | 2023-11-04T19:55:14Z | 2023-11-05T18:50:14Z | NONE | Is your feature request related to a problem?
I am often moving data from external objects (of all sorts!) into xarray. This is a common use case. Much of this code would be greatly simplified if there were a way for non-xarray classes to declare to xarray how these objects can be marshaled into
Describe the solution you'd like
So here is an initial proposal for comment. Much of this could be implemented in a third-party library, but doing this in xarray itself would likely be best.
Magic Methods
It would be great to see these magic method signatures become integrated throughout the library:
Conversion Registry
And these extension functions to register converters:
Ideally, "deregister" versions (e.g. deregister would also be available, so that context managers that change marshaling behavior could easily be constructed.
User API
Along with the following new user API functions:
"as_xarray" returns (in order of precedence):
- x unaltered if it is an xarray object
- registered_xarray_converter(x, *args, **kwargs) if it is callable and does not throw an exception
- registered_dataarray_converter(x, *args, **kwargs) if it is callable and does not throw an exception
- registered_dataarray_converter(x, *args, **kwargs) if it is callable and does not throw an exception
- x.xarray(*args, **kwargs), if it exists, is callable, and does not throw an exception
- x.xarray_dataset(*args, **kwargs), if it exists, is callable, and does not throw an exception
- x.xarray_dataarray(*args, **kwargs), if it exists, is callable, and does not throw an exception
- well-known aliases of xarray_dataarray, such as x.to_xarray(*args, **kwargs) (see pandas)
- [DESIGN DECISION] convert and return tuple[dims, data, [attr, encoding]] to DataArray?
- [DESIGN DECISION] convert and return tuple encoding of Dataset?
- [DESIGN DECISION] wrap a duck-typed array in a DataArray and return it?
The rationale for putting the registered functions first is that this would enable
"as_dataarray" would be similar, but it would only call x.xarray_dataarray and well-known aliases.
"as_dataset" would be similar, but it would only call x.xarray_dataset and well-known aliases, perhaps falling back to calling x.xarray_dataarray and converting the result to a dataset if it has a name attribute.
"as_datatree" would be similar, but it would only call x.xarray_datatree, perhaps falling back to calling x.xarray_dataarray and wrapping the result in a single-node datatree. (Though of course at this point this method would probably be implemented by the DataTree package, not xarray.)
The design decisions are flexible from my point of view, and might be decided in whatever way makes the code base simplest or most usable. There is also a question of whether or not this method should default the backup methods. These decisions can also be deferred entirely by delegating to the converter registry.
Across the Xarray Library
Finally, across the xarray library, there may be places where passing input arguments through as_xarray, as_dataarray, or as_dataset would make a lot of sense. This could be the final thing to do, but it cannot be handled by a third-party library. Doing this would give third-party libraries another pathway to integrate with xarray, one far easier than the converter registry or explicit calls to as_* functions.
Describe alternatives you've considered
This can be done with a private library, but it seems to be a lot of code that is pretty useful to other use cases. Most of this (but not all) can be accomplished in a third-party library, but it wouldn't allow the seamless sort of integration with (for example) xarray use of repr_html to integrate with pandas. The existing backend hooks work great when we are marshaling from file-based sources. See, for example, tiffslide-xarray (https://github.com/swamidasslab/tiffslide-xarray). This approach is seamless for reading files, but cannot marshal objects. For example, this is possible:
But this doesn't work.
This is an important use case because there are cases where we want to create an xarray object like this from objects that are never stored on the filesystem.
Additional context
No response |
{
"url": "https://api.github.com/repos/pydata/xarray/issues/8413/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
13221727 | issue |