issue_comments
23 rows where author_association = "MEMBER" and issue = 1368740629 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1553453937 | https://github.com/pydata/xarray/pull/7019#issuecomment-1553453937 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85cl9Nx | jhamman 2443309 | 2023-05-18T18:27:52Z | 2023-05-18T18:27:52Z | MEMBER | 👏 Congrats @TomNicholas on getting this in! Such an important contribution. 👏 |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1553395594 | https://github.com/pydata/xarray/pull/7019#issuecomment-1553395594 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85clu-K | TomNicholas 35968931 | 2023-05-18T17:37:22Z | 2023-05-18T17:37:22Z | MEMBER | Woooo thanks @dcherian ! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1553390072 | https://github.com/pydata/xarray/pull/7019#issuecomment-1553390072 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85cltn4 | dcherian 2448579 | 2023-05-18T17:34:01Z | 2023-05-18T17:34:01Z | MEMBER | Thanks @TomNicholas Big change! |
{ "total_count": 3, "+1": 0, "-1": 0, "laugh": 0, "hooray": 3, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1550564976 | https://github.com/pydata/xarray/pull/7019#issuecomment-1550564976 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85ca75w | TomNicholas 35968931 | 2023-05-17T01:39:08Z | 2023-05-17T01:39:08Z | MEMBER | @Illviljan thanks for all your comments! Would you (or @keewis?) be willing to approve this PR now? I would really like to merge this so that I can release a version of xarray that I can use as a dependency for cubed-xarray. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1534087275 | https://github.com/pydata/xarray/pull/7019#issuecomment-1534087275 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85bcFBr | TomNicholas 35968931 | 2023-05-04T04:41:22Z | 2023-05-04T04:41:22Z | MEMBER | (Okay now the failures are from https://github.com/pydata/xarray/pull/7815 which I've separated out, and from https://github.com/pydata/xarray/pull/7561 being recently merged into main which is definitely not my fault :sweat_smile: https://github.com/pydata/xarray/pull/7019/commits/316c63d55f4e2c317b028842f752a40596f16c6d shows that this PR passes the tests by itself.) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1531992793 | https://github.com/pydata/xarray/pull/7019#issuecomment-1531992793 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85bUFrZ | TomNicholas 35968931 | 2023-05-02T18:58:23Z | 2023-05-02T19:01:20Z | MEMBER | I would like to merge this now please! It works, it passes the tests, including mypy. The main feature not in this PR is using If we merge this I can start properly testing cubed with xarray (in cubed-xarray). @shoyer @dcherian if one of you could merge this or otherwise tell me anything else you think is still required! |
{ "total_count": 4, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 2, "rocket": 2, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1502378655 | https://github.com/pydata/xarray/pull/7019#issuecomment-1502378655 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85ZjHqf | dcherian 2448579 | 2023-04-10T21:57:04Z | 2023-04-10T21:57:04Z | MEMBER |
Seems OK to me. The other option is to xfail the broken tests on old dask versions |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1499791533 | https://github.com/pydata/xarray/pull/7019#issuecomment-1499791533 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85ZZQCt | TomNicholas 35968931 | 2023-04-07T00:32:47Z | 2023-04-07T00:59:03Z | MEMBER |
Update on this rabbit hole: This commit to dask changed the behaviour of dask's auto-chunking logic, such that if I run my little test script
```python
from xarray.core.variable import IndexVariable
from dask.array.core import normalize_chunks  # import the
import itertools
import warnings
from numbers import Number

import dask
import dask.array as da
import xarray as xr
import numpy as np


# This function is copied from xarray, but calls dask.array.core.normalize_chunks
# It is used in open_dataset, but not in Dataset.chunk
def _get_chunk(var, chunks):
    """
    Return map from each dim to chunk sizes, accounting for backend's preferred chunks.
    """
    if isinstance(var, IndexVariable):
        return {}
    dims = var.dims
    shape = var.shape

    # Determine the explicit requested chunks.
    preferred_chunks = var.encoding.get("preferred_chunks", {})
    preferred_chunk_shape = tuple(
        preferred_chunks.get(dim, size) for dim, size in zip(dims, shape)
    )
    if isinstance(chunks, Number) or (chunks == "auto"):
        chunks = dict.fromkeys(dims, chunks)
    chunk_shape = tuple(
        chunks.get(dim, None) or preferred_chunk_sizes
        for dim, preferred_chunk_sizes in zip(dims, preferred_chunk_shape)
    )
    chunk_shape = normalize_chunks(
        chunk_shape, shape=shape, dtype=var.dtype, previous_chunks=preferred_chunk_shape
    )

    # Warn where requested chunks break preferred chunks, provided that the variable
    # contains data.
    if var.size:
        for dim, size, chunk_sizes in zip(dims, shape, chunk_shape):
            try:
                preferred_chunk_sizes = preferred_chunks[dim]
            except KeyError:
                continue
            # Determine the stop indices of the preferred chunks, but omit the last stop
            # (equal to the dim size). In particular, assume that when a sequence
            # expresses the preferred chunks, the sequence sums to the size.
            preferred_stops = (
                range(preferred_chunk_sizes, size, preferred_chunk_sizes)
                if isinstance(preferred_chunk_sizes, Number)
                else itertools.accumulate(preferred_chunk_sizes[:-1])
            )
            # Gather any stop indices of the specified chunks that are not a stop index
            # of a preferred chunk. Again, omit the last stop, assuming that it equals
            # the dim size.
            breaks = set(itertools.accumulate(chunk_sizes[:-1])).difference(
                preferred_stops
            )
            if breaks:
                warnings.warn(
                    "The specified Dask chunks separate the stored chunks along "
                    f'dimension "{dim}" starting at index {min(breaks)}. This could '
                    "degrade performance. Instead, consider rechunking after loading."
                )
    return dict(zip(dims, chunk_shape))


chunks = 'auto'
encoded_chunks = 100
dask_arr = da.from_array(
    np.ones((500, 500), dtype="float64"), chunks=encoded_chunks
)
var = xr.core.variable.Variable(data=dask_arr, dims=['x', 'y'])

with dask.config.set({"array.chunk-size": "1MiB"}):
    chunks_suggested = _get_chunk(var, chunks)

print(chunks_suggested)
```
Anyway what this means is as this PR vendors I think one simple way to fix this failure without should be to upgrade the minimum version of dask to >=2022.9.2 (from 2022.1.1 where it currently is). EDIT: I tried changing the minimum version of dask-core in EDIT2: Another way to fix this should be to un-vendor |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
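To make the warning logic at the end of `_get_chunk` in the comment above more concrete, here is a minimal, dask-free sketch of just the "breaks" computation. The helper name `find_breaks` is hypothetical (it is not part of xarray or dask), and it assumes an integer preferred chunk size and an explicit tuple of requested chunk sizes along a single dimension.

```python
import itertools


def find_breaks(preferred_chunk_size, requested_chunks, dim_size):
    """Return stop indices of requested chunks that split a preferred chunk.

    Hypothetical helper mirroring the set-difference step in `_get_chunk`.
    """
    # Stop indices of the preferred (stored) chunks, omitting the final stop.
    preferred_stops = range(preferred_chunk_size, dim_size, preferred_chunk_size)
    # Stop indices of the requested chunks, again omitting the final stop.
    requested_stops = itertools.accumulate(requested_chunks[:-1])
    return set(requested_stops).difference(preferred_stops)


aligned = find_breaks(100, (200, 300), dim_size=500)
misaligned = find_breaks(100, (250, 250), dim_size=500)
print(aligned)     # set(): every requested boundary is also a stored boundary
print(misaligned)  # {250}: the boundary at index 250 falls inside a stored chunk
```

A non-empty result is the condition under which the script would emit its "separate the stored chunks" warning.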
1499432372 | https://github.com/pydata/xarray/pull/7019#issuecomment-1499432372 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85ZX4W0 | TomNicholas 35968931 | 2023-04-06T18:03:48Z | 2023-04-06T18:07:24Z | MEMBER | I'm having problems with ensuring the behaviour of the What's weird is that all tests pass for me locally but these failures occur on just some of the CI jobs (and which CI jobs is not even consistent apparently???). I have no idea why this would behave differently on only some of the CI jobs, especially after double-checking that |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1483155153 | https://github.com/pydata/xarray/pull/7019#issuecomment-1483155153 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85YZybR | TomNicholas 35968931 | 2023-03-24T17:19:44Z | 2023-03-24T17:21:32Z | MEMBER | I've made a bare-bones cubed-xarray package to store the |
{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 1, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1481626515 | https://github.com/pydata/xarray/pull/7019#issuecomment-1481626515 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85YT9OT | TomNicholas 35968931 | 2023-03-23T17:47:35Z | 2023-03-23T17:47:51Z | MEMBER | Thanks for the review @dcherian! I agree with basically everything you wrote. The main difficulty I have at this point is non-reproducible failures as described here |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1481504146 | https://github.com/pydata/xarray/pull/7019#issuecomment-1481504146 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85YTfWS | TomNicholas 35968931 | 2023-03-23T16:26:53Z | 2023-03-23T17:36:41Z | MEMBER |
Actually testing cubed with xarray in an environment without dask is currently blocked by rechunker's explicit dependency on dask, see https://github.com/pangeo-data/rechunker/issues/139 EDIT: We can hack around this by pip installing cubed, then pip uninstalling dask as mentioned here |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1478328995 | https://github.com/pydata/xarray/pull/7019#issuecomment-1478328995 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85YHYKj | TomNicholas 35968931 | 2023-03-21T17:40:36Z | 2023-03-21T17:40:36Z | MEMBER |
Yes I think it does @headtr1ck - thanks for the reminder about that. I now want to finish this PR by exposing the "chunk manager" interface as a new entrypoint, copying the pattern used for xarray's backends. That would allow me to move the cubed-specific |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
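The entrypoint idea mentioned in the comment above can be sketched with the standard library alone. This is only an illustration of the general plugin-discovery pattern, not the code from this PR: the function name `load_chunk_managers` and the group name `"example.chunkmanagers"` are made up for the example.

```python
from importlib.metadata import entry_points


def load_chunk_managers(group):
    """Load every plugin registered under the entry-point group `group`.

    Returns a dict mapping each entry-point name to the loaded object.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions return an object with a `select` method.
        selected = eps.select(group=group)
    else:
        # Older Python versions return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.load() for ep in selected}


# "example.chunkmanagers" is a made-up group name used purely for illustration,
# so on a typical installation nothing is registered under it.
managers = load_chunk_managers("example.chunkmanagers")
print(sorted(managers))
```

With this pattern, a third-party package registers its manager class under the agreed group in its packaging metadata, and the host library discovers it at runtime without importing the package directly.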
1472766481 | https://github.com/pydata/xarray/pull/7019#issuecomment-1472766481 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85XyKIR | TomNicholas 35968931 | 2023-03-16T21:26:36Z | 2023-03-16T21:26:36Z | MEMBER | Thanks @dcherian ! Once I copied that explicit indexer business I was able to get serialization to and from zarr working with cubed!
```python
In [1]: import xarray as xr

In [2]: from cubed import Spec

In [3]: ds = xr.open_dataset(
   ...:     'airtemps.zarr',
   ...:     chunks={},
   ...:     from_array_kwargs={
   ...:         'manager': 'cubed',
   ...:         'spec': Spec(work_dir="tmp", max_mem=20e6),
   ...:     }
   ...: )
/home/tom/Documents/Work/Code/xarray/xarray/backends/plugins.py:139: RuntimeWarning: 'netcdf4' fails while guessing
  warnings.warn(f"{engine!r} fails while guessing", RuntimeWarning)
/home/tom/Documents/Work/Code/xarray/xarray/backends/plugins.py:139: RuntimeWarning: 'scipy' fails while guessing
  warnings.warn(f"{engine!r} fails while guessing", RuntimeWarning)

In [4]: ds['air']
Out[4]:
<xarray.DataArray 'air' (time: 2920, lat: 25, lon: 53)>
cubed.Array<array-004, shape=(2920, 25, 53), dtype=float32, chunks=((730, 730, 730, 730), (13, 12), (27, 26))>
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Attributes:
    GRIB_id:    11
    ...

In [5]: ds.isel(time=slice(100, 300)).to_zarr("cubed_subset.zarr")
/home/tom/Documents/Work/Code/xarray/xarray/core/dataset.py:2118: SerializationWarning: saving variable None with floating point data as an integer dtype without any _FillValue to use for NaNs
  return to_zarr(  # type: ignore
Out[5]: <xarray.backends.zarr.ZarrStore at 0x7f34953033c0>
```
|
{ "total_count": 8, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 6, "rocket": 0, "eyes": 2 } |
Generalize handling of chunked array types 1368740629 | |
1469955680 | https://github.com/pydata/xarray/pull/7019#issuecomment-1469955680 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85Xnb5g | dcherian 2448579 | 2023-03-15T12:54:09Z | 2023-03-15T12:54:09Z | MEMBER |
This is unused, we use |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1469050607 | https://github.com/pydata/xarray/pull/7019#issuecomment-1469050607 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85Xj-7v | TomNicholas 35968931 | 2023-03-15T00:30:08Z | 2023-03-15T10:03:11Z | MEMBER | I tried opening a zarr store into xarray with chunking via cubed, but I got an error inside the indexing adapter classes. Somehow the type is completely wrong - would be good to type hint this part of the code, because this happens despite mypy passing now.
```python
# create example zarr store
orig = xr.tutorial.open_dataset("air_temperature")
orig.to_zarr('air2.zarr')

# open it as a cubed array
ds = xr.open_dataset('air2.zarr', engine='zarr', chunks={}, from_array_kwargs={'manager': 'cubed'})

# fails at this point
ds.load()
```
```python
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/tenacity/__init__.py:382, in Retrying.__call__(self, fn, *args, **kwargs)
381 try:
--> 382 result = fn(*args, **kwargs)
383 except BaseException: # noqa: B902
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/runtime/executors/python.py:10, in exec_stage_func(func, *args, **kwargs)
8 @retry(stop=stop_after_attempt(3))
9 def exec_stage_func(func, *args, **kwargs):
---> 10 return func(*args, **kwargs)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/primitive/blockwise.py:66, in apply_blockwise(out_key, config)
64 args.append(arg)
---> 66 result = config.function(*args)
67 if isinstance(result, dict): # structured array with named fields
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/core/ops.py:439, in map_blocks.<locals>.func_with_block_id.<locals>.wrap(*a, **kw)
438 block_id = offset_to_block_id(a[-1].item())
--> 439 return func(*a[:-1], block_id=block_id, **kw)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/core/ops.py:572, in map_direct.<locals>.new_func.<locals>.wrap(block_id, *a, **kw)
571 args = a + arrays
--> 572 return func(*args, block_id=block_id, **kw)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/core/ops.py:76, in _from_array(e, x, outchunks, asarray, block_id)
75 def _from_array(e, x, outchunks=None, asarray=None, block_id=None):
---> 76 out = x[get_item(outchunks, block_id)]
77 if asarray:
File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:627, in CopyOnWriteArray.__getitem__(self, key)
626 def __getitem__(self, key):
--> 627 return type(self)(_wrap_numpy_scalars(self.array[key]))
File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:534, in LazilyIndexedArray.__getitem__(self, indexer)
533 return array[indexer]
--> 534 return type(self)(self.array, self._updated_key(indexer))
File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:500, in LazilyIndexedArray._updated_key(self, new_key)
499 def _updated_key(self, new_key):
--> 500 iter_new_key = iter(expanded_indexer(new_key.tuple, self.ndim))
501 full_key = []
AttributeError: 'tuple' object has no attribute 'tuple'
The above exception was the direct cause of the following exception:
RetryError Traceback (most recent call last)
Cell In[69], line 1
----> 1 ds.load()
File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:761, in Dataset.load(self, **kwargs)
758 chunkmanager = get_chunked_array_type(*lazy_data.values())
760 # evaluate all the chunked arrays simultaneously
--> 761 evaluated_data = chunkmanager.compute(*lazy_data.values(), **kwargs)
763 for k, data in zip(lazy_data, evaluated_data):
764 self.variables[k].data = data
File ~/Documents/Work/Code/xarray/xarray/core/parallelcompat.py:451, in CubedManager.compute(self, *data, **kwargs)
448 def compute(self, *data: "CubedArray", **kwargs) -> np.ndarray:
449 from cubed import compute
--> 451 return compute(*data, **kwargs)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/core/array.py:300, in compute(executor, callbacks, optimize_graph, *arrays, **kwargs)
297 executor = PythonDagExecutor()
299 _return_in_memory_array = kwargs.pop("_return_in_memory_array", True)
--> 300 plan.execute(
301 executor=executor,
302 callbacks=callbacks,
303 optimize_graph=optimize_graph,
304 array_names=[a.name for a in arrays],
305 **kwargs,
306 )
308 if _return_in_memory_array:
309 return tuple(a._read_stored() for a in arrays)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/core/plan.py:154, in Plan.execute(self, executor, callbacks, optimize_graph, array_names, **kwargs)
152 if callbacks is not None:
153 [callback.on_compute_start(dag) for callback in callbacks]
--> 154 executor.execute_dag(
155 dag, callbacks=callbacks, array_names=array_names, **kwargs
156 )
157 if callbacks is not None:
158 [callback.on_compute_end(dag) for callback in callbacks]
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/cubed/runtime/executors/python.py:22, in PythonDagExecutor.execute_dag(self, dag, callbacks, array_names, **kwargs)
20 if stage.mappable is not None:
21 for m in stage.mappable:
---> 22 exec_stage_func(stage.function, m, config=pipeline.config)
23 if callbacks is not None:
24 event = TaskEndEvent(array_name=name)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/tenacity/__init__.py:289, in BaseRetrying.wraps.<locals>.wrapped_f(*args, **kw)
287 @functools.wraps(f)
288 def wrapped_f(*args: t.Any, **kw: t.Any) -> t.Any:
--> 289 return self(f, *args, **kw)
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/tenacity/__init__.py:379, in Retrying.__call__(self, fn, *args, **kwargs)
377 retry_state = RetryCallState(retry_object=self, fn=fn, args=args, kwargs=kwargs)
378 while True:
--> 379 do = self.iter(retry_state=retry_state)
380 if isinstance(do, DoAttempt):
381 try:
File ~/miniconda3/envs/cubed/lib/python3.9/site-packages/tenacity/__init__.py:326, in BaseRetrying.iter(self, retry_state)
324 if self.reraise:
325 raise retry_exc.reraise()
--> 326 raise retry_exc from fut.exception()
328 if self.wait:
329 sleep = self.wait(retry_state)
RetryError: RetryError[<Future at 0x7fc0c69be4f0 state=finished raised AttributeError>
```
This still works fine for dask. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1469232494 | https://github.com/pydata/xarray/pull/7019#issuecomment-1469232494 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85XkrVu | dcherian 2448579 | 2023-03-15T02:54:59Z | 2023-03-15T02:54:59Z | MEMBER |
This means you're indexing a |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
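The `AttributeError` in the traceback above is raised where a `.tuple` attribute is looked up on a plain Python tuple. The following toy reproduction shows that failure mode and one adapter-style way around it. All class names here (`ExplicitKey`, `LazyArray`, `ImplicitToExplicit`) are hypothetical stand-ins written for this sketch, not xarray's actual indexing classes.

```python
class ExplicitKey:
    """Hypothetical explicit indexer object that wraps a tuple of indices."""

    def __init__(self, key):
        self.tuple = tuple(key)


class LazyArray:
    """Toy one-dimensional array that only understands ExplicitKey objects."""

    def __init__(self, data):
        self.data = data

    def __getitem__(self, indexer):
        # A raw tuple has no `.tuple` attribute, so this raises AttributeError.
        (index,) = indexer.tuple
        return self.data[index]


class ImplicitToExplicit:
    """Adapter that accepts NumPy-style keys and wraps them before delegating."""

    def __init__(self, array):
        self.array = array

    def __getitem__(self, key):
        if not isinstance(key, tuple):
            key = (key,)
        return self.array[ExplicitKey(key)]


lazy = LazyArray([10, 20, 30, 40])

# Indexing the lazy array directly with a raw tuple reproduces the error.
try:
    lazy[(slice(1, 3),)]
    error_message = None
except AttributeError as err:
    error_message = str(err)

# Going through the adapter converts the key first, so indexing succeeds.
wrapped = ImplicitToExplicit(lazy)
result = wrapped[slice(1, 3)]
print(error_message)
print(result)
```

The point of the adapter is that a chunked-array library slicing the object with ordinary keys never reaches the lazy array with an unwrapped tuple.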
1468732253 | https://github.com/pydata/xarray/pull/7019#issuecomment-1468732253 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85XixNd | TomNicholas 35968931 | 2023-03-14T19:52:38Z | 2023-03-14T19:52:38Z | MEMBER | Thanks @tomwhite - I think it might make sense for me to remove the Places There are a few remaining places where I haven't generalised to remove specific
I would like to get to the point where you can use xarray with a chunked array without ever importing dask. I think this PR gets very close, but that would be tricky to test because cubed depends on dask (so I can't just run the test suite without dask in the environment), and there are not yet any other parallel chunk-aware frameworks I know of (ramba and arkouda don't have a chunks attribute so wouldn't require this PR). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
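One way to picture a dask-free dispatch based on the `chunks` attribute discussed above is the sketch below. It is an assumption-laden toy, not the implementation in this PR: `ToyChunkedArray`, `ToyManager`, and `pick_manager` are hypothetical names, and the real selection logic may differ.

```python
class ToyChunkedArray:
    """Hypothetical array type that exposes a `chunks` attribute."""

    def __init__(self, shape, chunks):
        self.shape = shape
        self.chunks = chunks


class ToyManager:
    """Hypothetical manager that recognises exactly one chunked array type."""

    array_cls = ToyChunkedArray

    def is_chunked_array(self, data):
        return isinstance(data, self.array_cls)


def pick_manager(arrays, managers):
    """Return the name of the single manager recognising all chunked inputs.

    Inputs without a `chunks` attribute are ignored; None is returned when
    no input is chunked at all.
    """
    chunked = [a for a in arrays if hasattr(a, "chunks")]
    if not chunked:
        return None
    matches = [
        name
        for name, manager in managers.items()
        if all(manager.is_chunked_array(a) for a in chunked)
    ]
    if len(matches) != 1:
        raise TypeError(f"expected exactly one matching manager, got {matches}")
    return matches[0]


managers = {"toy": ToyManager()}
mixed_inputs = [ToyChunkedArray(shape=(4,), chunks=((2, 2),)), [1, 2, 3]]
print(pick_manager(mixed_inputs, managers))   # toy
print(pick_manager([[1, 2, 3]], managers))    # None
```

Because the check relies only on `hasattr(..., "chunks")` and each manager's own type test, nothing in this sketch needs dask to be importable.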
1460980509 | https://github.com/pydata/xarray/pull/7019#issuecomment-1460980509 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85XFMsd | TomNicholas 35968931 | 2023-03-08T22:37:21Z | 2023-03-08T22:39:46Z | MEMBER | I'm making progress with this PR, and now that @tomwhite implemented
```python
In [1]: import xarray as xr

In [2]: da = xr.DataArray([1, 2, 3], dims='x')

In [3]: da_chunked = da.chunk(from_array_kwargs={'manager': 'cubed'})

In [4]: da_chunked
Out[4]:
<xarray.DataArray (x: 3)>
cubed.Array<array-003, shape=(3,), dtype=int64, chunks=((3,),)>
Dimensions without coordinates: x

In [5]: da_chunked.mean()
Out[5]:
<xarray.DataArray ()>
cubed.Array<array-006, shape=(), dtype=int64, chunks=()>

In [6]: da_chunked.mean().compute()
[cubed.Array<array-009, shape=(), dtype=int64, chunks=()>]
Out[6]:
<xarray.DataArray ()>
array(2)
```
(You need to install both I still have a fair bit more to do on this PR (see checklist at top), but for testing should I:
I would prefer not to have this PR grow to be thousands of lines by including tests in it, but also waiting for #6908 might take a while because that's also a fairly ambitious PR. The fact that the tests are currently green for this PR (ignoring some mypy stuff) is evidence that the decoupling of dask from xarray is working so far. (I have already added some tests for the ability to register custom |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1400867666 | https://github.com/pydata/xarray/pull/7019#issuecomment-1400867666 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85Tf4tS | dcherian 2448579 | 2023-01-23T19:29:06Z | 2023-01-23T19:29:06Z | MEMBER |
I suspect Arkouda is similar in that this is not a detail the user is expected to worry about. (cc @sdbachman) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1400848135 | https://github.com/pydata/xarray/pull/7019#issuecomment-1400848135 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85Tfz8H | TomNicholas 35968931 | 2023-01-23T19:14:26Z | 2023-01-23T19:14:26Z | MEMBER | @drtodd13 mentioned today that ramba doesn't actually require explicit chunks to work, which I hadn't realised. So forcing wrapped libraries to implement an explicit chunks method might be too restrictive. Ramba could possibly work entirely through the numpy array API standard. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1400846131 | https://github.com/pydata/xarray/pull/7019#issuecomment-1400846131 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85Tfzcz | TomNicholas 35968931 | 2023-01-23T19:12:57Z | 2023-01-23T19:12:57Z | MEMBER | @drtodd13 tagging you here and linking my notes from today's distributed arrays working group meeting for the links and references to this PR. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 | |
1255310245 | https://github.com/pydata/xarray/pull/7019#issuecomment-1255310245 | https://api.github.com/repos/pydata/xarray/issues/7019 | IC_kwDOAMm_X85K0oOl | TomNicholas 35968931 | 2022-09-22T17:06:42Z | 2022-09-22T17:06:42Z | MEMBER |
Yeah I was kind of asking whether this was unnecessarily abstract, and if there was a simpler design that achieved the same flexibility. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Generalize handling of chunked array types 1368740629 |
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);