issues: 2128501296

id	node_id	number	title	user	state	locked	assignee	milestone	comments	created_at	updated_at	closed_at	author_association	active_lock_reason	draft	pull_request	body	reactions	performed_via_github_app	state_reason	repo	type
2128501296	I_kwDOAMm_X85-3low	8733	A basic default ChunkManager for arrays that report their own chunks	90008	open	0			21	2024-02-10T14:36:55Z	2024-03-10T17:26:13Z		CONTRIBUTOR				Is your feature request related to a problem? I'm creating duckarrays for various file-backed data structures of mine that are naturally "chunked", i.e. different parts of the array may live in completely different files. Using these "chunks" and the "strides", algorithms can better decide how to iterate in a convenient manner. For example, an MP4 file's chunks may be defined as being delimited by I-frames, while images stored in a TIFF may be delimited by a page. So for me, chunks are not so useful for parallel computing, but more for computing locally and choosing the appropriate way to iterate through large arrays (TB of uncompressed data). Describe the solution you'd like I think a default chunk manager could simply implement `compute` as `np.asarray` as a default instance, and act as a catch-all for all other instances. Advanced users could then go in and reimplement their own chunk manager, but I was unable to use my duckarrays that included a `chunks` property because they weren't associated with any chunk manager. 
Something as simple as:
```patch
diff --git a/xarray/core/parallelcompat.py b/xarray/core/parallelcompat.py
index c009ef48..bf500abb 100644
--- a/xarray/core/parallelcompat.py
+++ b/xarray/core/parallelcompat.py
@@ -681,3 +681,26 @@ class ChunkManagerEntrypoint(ABC, Generic[T_ChunkedArray]):
         cubed.store
         """
         raise NotImplementedError()
+
+
+class DefaultChunkManager(ChunkManagerEntrypoint):
+    def __init__(self) -> None:
+        self.array_cls = None
+
+    def is_chunked_array(self, data: Any) -> bool:
+        return is_duck_array(data) and hasattr(data, "chunks")
+
+    def chunks(self, data: T_ChunkedArray) -> T_NormalizedChunks:
+        return data.chunks
+
+    def compute(self, *data: T_ChunkedArray | Any, **kwargs) -> tuple[np.ndarray, ...]:
+        return tuple(np.asarray(d) for d in data)
+
+    def normalize_chunks(self, *args, **kwargs):
+        raise NotImplementedError()
+
+    def from_array(self, *args, **kwargs):
+        raise NotImplementedError()
+
+    def apply_gufunc(self, *args, **kwargs):
+        raise NotImplementedError()
```
Describe alternatives you've considered I created my own chunk manager, with my own chunk manager entry point. Kinda tedious... Additional context It seems that this is related to: https://github.com/pydata/xarray/pull/7019	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8733/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }			13221727	issue
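
The issue body above motivates chunks as a guide for local iteration rather than for parallelism. As a rough illustration of that idea only, the sketch below turns a reported `chunks` value into one tuple of slices per chunk. It assumes `chunks` uses the tuple-of-tuples layout (one tuple of chunk lengths per dimension); the helper name `chunk_slices` and the example layout are hypothetical and are not part of xarray or of the proposed patch.

```python
from itertools import accumulate, product


def chunk_slices(chunks):
    """Yield one tuple of slices per chunk.

    `chunks` is assumed to hold, for each dimension, the lengths of the
    consecutive chunks along that dimension, e.g. ((2, 2, 1), (3, 3))
    for a 5 x 6 array.
    """
    per_dim = []
    for sizes in chunks:
        stops = list(accumulate(sizes))  # cumulative end offsets
        starts = [stop - size for stop, size in zip(stops, sizes)]
        per_dim.append([slice(a, b) for a, b in zip(starts, stops)])
    # The Cartesian product over dimensions visits every chunk exactly once.
    yield from product(*per_dim)


# Hypothetical layout: 5 frames grouped as 2, 2 and 1, each frame 6 values wide.
example_chunks = ((2, 2, 1), (6,))
slices = list(chunk_slices(example_chunks))
```

Each yielded tuple can be used to index the duck array directly (for example `array[slices[0]]`), so a consumer could load one file-backed chunk at a time instead of materialising the whole array.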

Links from other tables

3 rows from issues_id in issues_labels
0 rows from issue in issue_comments