id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 2196272235,PR_kwDOAMm_X85qKODl,8856,Migrate indexing and broadcasting logic to `xarray.namedarray` (Part 1),13301940,open,0,,,0,2024-03-19T23:51:46Z,2024-05-03T17:08:11Z,,MEMBER,,1,pydata/xarray/pulls/8856," This pull request is the first part of migrating the indexing and broadcasting logic from `xarray.core.variable` to `xarray.namedarray`. I intend to open follow-up pull requests to address additional changes related to this refactoring, as outlined in the [proposal for decoupling lazy indexing functionality from NamedArray](https://github.com/pydata/xarray/blob/main/design_notes/named_array_design_doc.md#plan-for-decoupling-lazy-indexing-functionality-from-namedarray). - [ ] Closes #xxxx - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8856/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 2231711080,PR_kwDOAMm_X85sCbN-,8921,"Revert `.oindex` and `.vindex` additions in `_ElementwiseFunctionArray`, `NativeEndiannessArray`, and `BoolTypeArray` classes",13301940,open,0,,,9,2024-04-08T17:11:08Z,2024-04-30T06:49:46Z,,MEMBER,,0,pydata/xarray/pulls/8921," As noted in https://github.com/pydata/xarray/issues/8909, the use of `.oindex` and `.vindex` properties in coding/* appears to have broken some backends (e.g. scipy). This PR reverts those changes. 
We plan to bundle these changes into a separate backends feature branch (see [this comment](https://github.com/pydata/xarray/pull/8885#issuecomment-2036001828)), which will be merged once we are confident about its impact on downstream dependencies. - [ ] Closes #8909 - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8921/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 2115621781,I_kwDOAMm_X85-GdOV,8696,🐛 compatibility issues with ArrayAPI and SparseAPI Protocols in `namedarray`,13301940,open,0,,,2,2024-02-02T19:27:07Z,2024-02-03T10:55:04Z,,MEMBER,,,,"### What happened? I'm experiencing compatibility issues when using `_arrayfunction_or_api` and `_sparsearrayfunction_or_api` with sparse arrays of `dtype=object`. Specifically, runtime checks using `isinstance` with these protocols are failing, despite the sparse array object appearing to meet the necessary criteria (attributes and methods). ### What did you expect to happen? I expected that since COO arrays from the sparse library provide the necessary attributes and methods, they would pass the `isinstance` checks with the defined protocols. 
```python In [56]: from xarray.namedarray._typing import _arrayfunction_or_api, _sparsearrayfunc ...: tion_or_api In [57]: import xarray as xr, sparse, numpy as np, sparse, pandas as pd ``` - numeric dtypes work ```python In [58]: x = np.random.random((10)) In [59]: x[x < 0.9] = 0 In [60]: s = sparse.COO(x) In [61]: isinstance(s, _arrayfunction_or_api) Out[61]: True In [62]: s Out[62]: ``` - string dtypes work ```python In [63]: p = sparse.COO(np.array(['a', 'b'])) In [64]: p Out[64]: In [65]: isinstance(s, _arrayfunction_or_api) Out[65]: True ``` - object dtype doesn't work ```python In [66]: q = sparse.COO(np.array(['a', 'b']).astype(object)) In [67]: isinstance(s, _arrayfunction_or_api) Out[67]: True In [68]: isinstance(q, _arrayfunction_or_api) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_umath.py:606, in _Elemwise._get_func_coords_data(self, mask) 605 try: --> 606 func_data = self.func(*func_args, dtype=self.dtype, **self.kwargs) 607 except TypeError: TypeError: real() got an unexpected keyword argument 'dtype' During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_umath.py:611, in _Elemwise._get_func_coords_data(self, mask) 610 out = np.empty(func_args[0].shape, dtype=self.dtype) --> 611 func_data = self.func(*func_args, out=out, **self.kwargs) 612 except TypeError: TypeError: real() got an unexpected keyword argument 'out' During handling of the above exception, another exception occurred: ValueError Traceback (most recent call last) Cell In[68], line 1 ----> 1 isinstance(q, _arrayfunction_or_api) File ~/mambaforge/envs/xarray-tests/lib/python3.9/typing.py:1149, in _ProtocolMeta.__instancecheck__(cls, instance) 1147 return True 1148 if cls._is_protocol: -> 1149 if all(hasattr(instance, 
attr) and 1150 # All *methods* can be blocked by setting them to None. 1151 (not callable(getattr(cls, attr, None)) or 1152 getattr(instance, attr) is not None) 1153 for attr in _get_protocol_attrs(cls)): 1154 return True 1155 return super().__instancecheck__(instance) File ~/mambaforge/envs/xarray-tests/lib/python3.9/typing.py:1149, in (.0) 1147 return True 1148 if cls._is_protocol: -> 1149 if all(hasattr(instance, attr) and 1150 # All *methods* can be blocked by setting them to None. 1151 (not callable(getattr(cls, attr, None)) or 1152 getattr(instance, attr) is not None) 1153 for attr in _get_protocol_attrs(cls)): 1154 return True 1155 return super().__instancecheck__(instance) File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_sparse_array.py:900, in SparseArray.real(self) 875 @property 876 def real(self): 877 """"""The real part of the array. 878 879 Examples (...) 898 numpy.real : NumPy equivalent function. 899 """""" --> 900 return self.__array_ufunc__(np.real, ""__call__"", self) File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_sparse_array.py:340, in SparseArray.__array_ufunc__(self, ufunc, method, *inputs, **kwargs) 337 inputs = tuple(reversed(inputs_transformed)) 339 if method == ""__call__"": --> 340 result = elemwise(ufunc, *inputs, **kwargs) 341 elif method == ""reduce"": 342 result = SparseArray._reduce(ufunc, *inputs, **kwargs) File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_umath.py:49, in elemwise(func, *args, **kwargs) 12 def elemwise(func, *args, **kwargs): 13 """""" 14 Apply a function to any number of arguments. 15 (...) 46 it is necessary to convert Numpy arrays to :obj:`COO` objects. 
47 """""" ---> 49 return _Elemwise(func, *args, **kwargs).get_result() File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_umath.py:480, in _Elemwise.get_result(self) 477 if not any(mask): 478 continue --> 480 r = self._get_func_coords_data(mask) 482 if r is not None: 483 coords_list.append(r[0]) File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_umath.py:613, in _Elemwise._get_func_coords_data(self, mask) 611 func_data = self.func(*func_args, out=out, **self.kwargs) 612 except TypeError: --> 613 func_data = self.func(*func_args, **self.kwargs).astype(self.dtype) 615 unmatched_mask = ~equivalent(func_data, self.fill_value) 617 if not unmatched_mask.any(): ValueError: invalid literal for int() with base 10: 'a' In [69]: q Out[69]: ``` The failing case appears to be a well-known issue - https://github.com/pydata/sparse/issues/104 ### Minimal Complete Verifiable Example ```Python In [69]: q Out[69]: In [70]: n = xr.NamedArray(data=q, dims=['x']) ``` ### MVCE confirmation - [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [ ] Complete example — the example is self-contained, including all data and the text of any traceback. - [ ] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [ ] New issue — a search of GitHub Issues suggests this is not a duplicate. - [ ] Recent environment — the issue occurs with the latest version of xarray and its dependencies. 
### Relevant log output ```Python In [71]: n.data Out[71]: In [72]: n Out[72]: --------------------------------------------------------------------------- TypeError Traceback (most recent call last) File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_umath.py:606, in _Elemwise._get_func_coords_data(self, mask) 605 try: --> 606 func_data = self.func(*func_args, dtype=self.dtype, **self.kwargs) 607 except TypeError: TypeError: real() got an unexpected keyword argument 'dtype' During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_umath.py:611, in _Elemwise._get_func_coords_data(self, mask) 610 out = np.empty(func_args[0].shape, dtype=self.dtype) --> 611 func_data = self.func(*func_args, out=out, **self.kwargs) 612 except TypeError: TypeError: real() got an unexpected keyword argument 'out' During handling of the above exception, another exception occurred: ValueError Traceback (most recent call last) File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/IPython/core/formatters.py:708, in PlainTextFormatter.__call__(self, obj) 701 stream = StringIO() 702 printer = pretty.RepresentationPrinter(stream, self.verbose, 703 self.max_width, self.newline, 704 max_seq_length=self.max_seq_length, 705 singleton_pprinters=self.singleton_printers, 706 type_pprinters=self.type_printers, 707 deferred_pprinters=self.deferred_printers) --> 708 printer.pretty(obj) 709 printer.flush() 710 return stream.getvalue() File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/IPython/lib/pretty.py:410, in RepresentationPrinter.pretty(self, obj) 407 return meth(obj, self, cycle) 408 if cls is not object \ 409 and callable(cls.__dict__.get('__repr__')): --> 410 return _repr_pprint(obj, self, cycle) 412 return _default_pprint(obj, self, cycle) 413 finally: File 
~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/IPython/lib/pretty.py:778, in _repr_pprint(obj, p, cycle) 776 """"""A pprint that just redirects to the normal repr function."""""" 777 # Find newlines and replace them with p.break_() --> 778 output = repr(obj) 779 lines = output.splitlines() 780 with p.group(): File ~/devel/pydata/xarray/xarray/namedarray/core.py:987, in NamedArray.__repr__(self) 986 def __repr__(self) -> str: --> 987 return formatting.array_repr(self) File ~/mambaforge/envs/xarray-tests/lib/python3.9/reprlib.py:21, in recursive_repr..decorating_function..wrapper(self) 19 repr_running.add(key) 20 try: ---> 21 result = user_function(self) 22 finally: 23 repr_running.discard(key) File ~/devel/pydata/xarray/xarray/core/formatting.py:665, in array_repr(arr) 658 name_str = """" 660 if ( 661 isinstance(arr, Variable) 662 or _get_boolean_with_default(""display_expand_data"", default=True) 663 or isinstance(arr.variable._data, MemoryCachedArray) 664 ): --> 665 data_repr = short_data_repr(arr) 666 else: 667 data_repr = inline_variable_array_repr(arr.variable, OPTIONS[""display_width""]) File ~/devel/pydata/xarray/xarray/core/formatting.py:633, in short_data_repr(array) 631 if isinstance(array, np.ndarray): 632 return short_array_repr(array) --> 633 elif isinstance(internal_data, _arrayfunction_or_api): 634 return limit_lines(repr(array.data), limit=40) 635 elif getattr(array, ""_in_memory"", None): File ~/mambaforge/envs/xarray-tests/lib/python3.9/typing.py:1149, in _ProtocolMeta.__instancecheck__(cls, instance) 1147 return True 1148 if cls._is_protocol: -> 1149 if all(hasattr(instance, attr) and 1150 # All *methods* can be blocked by setting them to None. 
1151 (not callable(getattr(cls, attr, None)) or 1152 getattr(instance, attr) is not None) 1153 for attr in _get_protocol_attrs(cls)): 1154 return True 1155 return super().__instancecheck__(instance) File ~/mambaforge/envs/xarray-tests/lib/python3.9/typing.py:1149, in (.0) 1147 return True 1148 if cls._is_protocol: -> 1149 if all(hasattr(instance, attr) and 1150 # All *methods* can be blocked by setting them to None. 1151 (not callable(getattr(cls, attr, None)) or 1152 getattr(instance, attr) is not None) 1153 for attr in _get_protocol_attrs(cls)): 1154 return True 1155 return super().__instancecheck__(instance) File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_sparse_array.py:900, in SparseArray.real(self) 875 @property 876 def real(self): 877 """"""The real part of the array. 878 879 Examples (...) 898 numpy.real : NumPy equivalent function. 899 """""" --> 900 return self.__array_ufunc__(np.real, ""__call__"", self) File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_sparse_array.py:340, in SparseArray.__array_ufunc__(self, ufunc, method, *inputs, **kwargs) 337 inputs = tuple(reversed(inputs_transformed)) 339 if method == ""__call__"": --> 340 result = elemwise(ufunc, *inputs, **kwargs) 341 elif method == ""reduce"": 342 result = SparseArray._reduce(ufunc, *inputs, **kwargs) File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_umath.py:49, in elemwise(func, *args, **kwargs) 12 def elemwise(func, *args, **kwargs): 13 """""" 14 Apply a function to any number of arguments. 15 (...) 46 it is necessary to convert Numpy arrays to :obj:`COO` objects. 
47 """""" ---> 49 return _Elemwise(func, *args, **kwargs).get_result() File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_umath.py:480, in _Elemwise.get_result(self) 477 if not any(mask): 478 continue --> 480 r = self._get_func_coords_data(mask) 482 if r is not None: 483 coords_list.append(r[0]) File ~/mambaforge/envs/xarray-tests/lib/python3.9/site-packages/sparse/_umath.py:613, in _Elemwise._get_func_coords_data(self, mask) 611 func_data = self.func(*func_args, out=out, **self.kwargs) 612 except TypeError: --> 613 func_data = self.func(*func_args, **self.kwargs).astype(self.dtype) 615 unmatched_mask = ~equivalent(func_data, self.fill_value) 617 if not unmatched_mask.any(): ValueError: invalid literal for int() with base 10: 'a' ``` ### Anything else we need to know? I was trying to replace instances of `is_duck_array` with the protocol runtime checks (as part of https://github.com/pydata/xarray/pull/8319), and I've come to the realization that these runtime checks are too rigid to accommodate the diverse behaviors of different array types; the function-based approach, `is_duck_array()`, might be more manageable. @Illviljan, are there any changes that could be made to both protocols without making them too complex? ### Environment
```python INSTALLED VERSIONS ------------------ commit: 541049f45edeb518a767cb3b23fa53f6045aa508 python: 3.9.18 | packaged by conda-forge | (main, Dec 23 2023, 16:35:41) [Clang 16.0.6 ] python-bits: 64 OS: Darwin OS-release: 23.2.0 machine: arm64 processor: arm byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.14.3 libnetcdf: 4.9.2 xarray: 2024.1.2.dev50+g78dec61f pandas: 2.2.0 numpy: 1.26.3 scipy: 1.12.0 netCDF4: 1.6.5 pydap: installed h5netcdf: 1.3.0 h5py: 3.10.0 Nio: None zarr: 2.16.1 cftime: 1.6.3 nc_time_axis: 1.4.1 iris: 3.7.0 bottleneck: 1.3.7 dask: 2024.1.1 distributed: 2024.1.1 matplotlib: 3.8.2 cartopy: 0.22.0 seaborn: 0.13.2 numbagg: 0.7.1 fsspec: 2023.12.2 cupy: None pint: 0.23 sparse: 0.15.1 flox: 0.9.0 numpy_groupies: 0.9.22 setuptools: 67.7.2 pip: 23.3.2 conda: None pytest: 8.0.0 mypy: 1.8.0 IPython: 8.14.0 sphinx: None ```
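For reference, the failure mode can be reproduced without sparse at all. On Python 3.11 and earlier, the protocol `__instancecheck__` in `typing` probes every protocol member with `hasattr()`/`getattr()`, which executes property bodies; only `AttributeError` is suppressed, so any other exception raised by a property (here, `SparseArray.real` dispatching `np.real` through `elemwise`) propagates out of the `isinstance` call. A minimal stdlib-only sketch (the `Exploding` class is hypothetical):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class DuckArray(Protocol):
    @property
    def real(self): ...


class Exploding:
    # Stand-in for sparse.COO with dtype=object: accessing ``.real``
    # dispatches real work that raises something other than AttributeError.
    @property
    def real(self):
        raise ValueError('attribute access triggered computation')


obj = Exploding()
try:
    # The protocol machinery on Python <= 3.11 performs exactly this probe,
    # so the ValueError escapes the isinstance() check.
    getattr(obj, 'real')
except ValueError as err:
    print('probe raised:', err)
```

Note that Python 3.12 switched runtime-checkable protocol instance checks to `inspect.getattr_static`, which avoids triggering properties, but the environment in this report (Python 3.9) uses the eager probe.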
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8696/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 775502974,MDU6SXNzdWU3NzU1MDI5NzQ=,4738,ENH: Compute hash of xarray objects,13301940,open,0,,,11,2020-12-28T17:18:57Z,2023-12-06T18:24:59Z,,MEMBER,,,," **Is your feature request related to a problem? Please describe.** I'm working on some caching/data-provenance functionality for xarray objects, and I realized that there's no standard/efficient way of computing hashes for xarray objects. **Describe the solution you'd like** It would be useful to have a configurable, reliable/standard `.hexdigest()` method on xarray objects. For example, zarr provides a digest method that returns you a digest/hash of the data. ```python In [16]: import zarr In [17]: z = zarr.zeros(shape=(10000, 10000), chunks=(1000, 1000)) In [18]: z.hexdigest() # uses sha1 by default for speed Out[18]: '7162d416d26a68063b66ed1f30e0a866e4abed60' In [20]: z.hexdigest(hashname='sha256') Out[20]: '46fc6e52fc1384e37cead747075f55201667dd539e4e72d0f372eb45abdcb2aa' ``` I'm thinking that an xarray's built-in hashing mechanism would provide a more reliable way to treat metadata such as global attributes, encoding, etc... during the hash computation... **Describe alternatives you've considered** So far, I am using joblib's default hasher: `joblib.hash()` function. However, I am in favor of having a configurable/built-in hasher that is aware of xarray's data model and quirks :) ```python In [1]: import joblib In [2]: import xarray as xr In [3]: ds = xr.tutorial.open_dataset('rasm') In [5]: joblib.hash(ds, hash_name='sha1') Out[5]: '3e5e3f56daf81e9e04a94a3dff9fdca9638c36cf' In [8]: ds.attrs = {} In [9]: joblib.hash(ds, hash_name='sha1') Out[9]: 'daab25fe735657e76514040608fadc67067d90a0' ``` **Additional context** Add any other context about the feature request here. 
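To make the proposal concrete, here is a hypothetical sketch of such a digest (all names invented; it assumes the data is a NumPy array and the attrs are JSON-serializable) that folds the buffer, its shape/dtype, and a canonical encoding of the attributes into one hash, so clearing attrs changes the digest just as in the joblib example above:

```python
import hashlib
import json

import numpy as np


def hexdigest(data, attrs, hashname='sha1'):
    # Hypothetical helper: digest an array together with its metadata.
    h = hashlib.new(hashname)
    # Shape and dtype disambiguate arrays whose raw buffers coincide.
    h.update(repr((data.shape, str(data.dtype))).encode())
    h.update(np.ascontiguousarray(data).tobytes())
    # sort_keys gives a canonical attrs encoding, so key order is irrelevant.
    h.update(json.dumps(attrs, sort_keys=True).encode())
    return h.hexdigest()
```

An xarray-native version would walk coords, data variables, and encoding the same way, with a documented canonicalization so digests stay stable across sessions.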
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4738/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1916603412,PR_kwDOAMm_X85bZS7E,8244,Migrate VariableArithmetic to NamedArrayArithmetic,13301940,open,0,,,6,2023-09-28T02:29:15Z,2023-10-11T17:03:02Z,,MEMBER,,1,pydata/xarray/pulls/8244," - [ ] towards #8238 - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8244/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 1903416932,I_kwDOAMm_X85xc9Zk,8210,Inconsistent Type Hinting for dims Parameter in xarray Methods,13301940,open,0,,,7,2023-09-19T17:15:43Z,2023-09-20T15:03:45Z,,MEMBER,,,,"`None` is not really practical in current xarray so not allowing it as a dimension is probably the easiest path, but type hinting will not be correct. I want `dims` to have a type hint that is consistent, easy to read and understand. In a dream world it would look something like this: ```python InputDim = Hashable # Single dimension InputDims = Iterable[InputDim , ...] # multiple dimensions InputDimOrDims = Union[InputDim, InputDims] # Single or multiple dimensions ``` Then we can easily go through our xarray methods and easily replace `dim` and `dims` arguments. `Hashable` could be fine in `NamedArray`, we haven't introduced `None` as a typical default value there yet. But it wouldn't be easy in xarray because we use `None` as default value a lot, which will (I suspect) lead to a bunch of refactoring and deprecations. I haven't tried it maybe it's doable? Another idea is to try and make a HashableExcludingNone: ```python HashableExcludingNone = Union[int, str, tuple, ...] 
# How many more Hashables are there? InputDim = HashableExcludingNone # Single dimension InputDims = Iterable[InputDim , ...] # multiple dimensions InputDimOrDims = Union[InputDim, InputDims] # Single or multiple dimensions ``` I suspect this is harder than it seems. Another idea is drop the idea of Hashable and just allow a few common ones that are used: ```python InputDim = str # Single dimension InputDims = tuple[InputDim , ...] # multiple dimensions InputDimOrDims = Union[InputDim, InputDims] # Single or multiple dimensions ``` Very clear! I think a few users (and maintainers) will be sad because of the lack of flexibility though. No easy paths, and trying to be backwards compatible is very demotivating. _Originally posted by @Illviljan in https://github.com/pydata/xarray/pull/8075#discussion_r1330437962_ ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8210/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 576502871,MDU6SXNzdWU1NzY1MDI4NzE=,3834,encode_cf_datetime() casts dask arrays to NumPy arrays,13301940,open,0,,,2,2020-03-05T20:11:37Z,2022-04-09T03:10:49Z,,MEMBER,,,," Currently, when `xarray.coding.times.encode_cf_datetime()` is called, it always casts the input to a NumPy array. This is not what I would expect when the input is a dask array. I am wondering if we could make this operation lazy when the input is a dask array? 
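One possible shape for that (a sketch only: the helper names are invented, and it duck-types anything exposing `map_blocks` as a dask array) is to dispatch on the input and push the eager kernel through `map_blocks`, so encoding stays lazy chunk by chunk:

```python
import numpy as np


def _encode_eager(values, epoch=np.datetime64('2000-01-01'), unit='m'):
    # Eager kernel: integer offsets since the epoch, here in minutes.
    return ((values - epoch) / np.timedelta64(1, unit)).astype('int64')


def encode_cf_datetime_lazy(values):
    # Hypothetical dispatcher: defer for dask arrays, compute for NumPy.
    if hasattr(values, 'map_blocks'):  # duck-typed dask array
        # Each chunk is encoded independently; nothing runs until the
        # caller computes the result.
        return values.map_blocks(_encode_eager, dtype='int64')
    return _encode_eager(np.asarray(values))
```

The wrinkle is that unit and calendar inference currently inspect the values, so a fully lazy path would presumably also need the caller to supply (or default) the encoding units.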
https://github.com/pydata/xarray/blob/01462d65c7213e5e1cddf36492c6a34a7e53ce55/xarray/coding/times.py#L352-L354 ```python In [46]: import numpy as np In [47]: import xarray as xr In [48]: import pandas as pd In [49]: times = pd.date_range(""2000-01-01"", ""2001-01-01"", periods=11) In [50]: time_bounds = np.vstack((times[:-1], times[1:])).T In [51]: arr = xr.DataArray(time_bounds).chunk() In [52]: arr Out[52]: dask.array, shape=(10, 2), dtype=datetime64[ns], chunksize=(10, 2), chunktype=numpy.ndarray> Dimensions without coordinates: dim_0, dim_1 In [53]: xr.coding.times.encode_cf_datetime(arr) Out[53]: (array([[ 0, 52704], [ 52704, 105408], [105408, 158112], [158112, 210816], [210816, 263520], [263520, 316224], [316224, 368928], [368928, 421632], [421632, 474336], [474336, 527040]]), 'minutes since 2000-01-01 00:00:00', 'proleptic_gregorian') ``` Cc @jhamman ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3834/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 653442225,MDU6SXNzdWU2NTM0NDIyMjU=,4209,`xr.save_mfdataset()` doesn't honor `compute=False` argument,13301940,open,0,,,4,2020-07-08T16:40:11Z,2022-04-09T01:25:56Z,,MEMBER,,,," **What happened**: While using `xr.save_mfdataset()` function with `compute=False` I noticed that the function returns a `dask.delayed` object, but **it doesn't actually defer the computation** i.e. it actually writes datasets right away. **What you expected to happen**: I expect the datasets to be written when I explicitly call `.compute()` on the returned delayed object. **Minimal Complete Verifiable Example**: ```python In [2]: import xarray as xr In [3]: ds = xr.tutorial.open_dataset('rasm', chunks={}) In [4]: ds Out[4]: Dimensions: (time: 36, x: 275, y: 205) Coordinates: * time (time) object 1980-09-16 12:00:00 ... 
1983-08-17 00:00:00 xc (y, x) float64 dask.array yc (y, x) float64 dask.array Dimensions without coordinates: x, y Data variables: Tair (time, y, x) float64 dask.array Attributes: title: /workspace/jhamman/processed/R1002RBRxaaa01a/l... institution: U.W. source: RACM R1002RBRxaaa01a output_frequency: daily output_mode: averaged convention: CF-1.4 references: Based on the initial model of Liang et al., 19... comment: Output from the Variable Infiltration Capacity... nco_openmp_thread_number: 1 NCO: ""4.6.0"" history: Tue Dec 27 14:15:22 2016: ncatted -a dimension... In [5]: path = ""test.nc"" In [7]: ls -ltrh test.nc ls: cannot access test.nc: No such file or directory In [8]: tasks = xr.save_mfdataset(datasets=[ds], paths=[path], compute=False) In [9]: tasks Out[9]: Delayed('list-aa0b52e0-e909-4e65-849f-74526d137542') In [10]: ls -ltrh test.nc -rw-r--r-- 1 abanihi ncar 14K Jul 8 10:29 test.nc ``` **Anything else we need to know?**: **Environment**:
Output of xr.show_versions() ```python INSTALLED VERSIONS ------------------ commit: None python: 3.7.6 | packaged by conda-forge | (default, Jun 1 2020, 18:57:50) [GCC 7.5.0] python-bits: 64 OS: Linux OS-release: 3.10.0-693.21.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_US.UTF-8 LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.4 xarray: 0.15.1 pandas: 0.25.3 numpy: 1.18.5 scipy: 1.5.0 netCDF4: 1.5.3 pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: 1.2.0 nc_time_axis: 1.2.0 PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.20.0 distributed: 2.20.0 matplotlib: 3.2.1 cartopy: None seaborn: None numbagg: None setuptools: 49.1.0.post20200704 pip: 20.1.1 conda: None pytest: None IPython: 7.16.1 sphinx: None ```
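The contract that `compute=False` is expected to honor can be stated with a stdlib-only sketch (invented names; this illustrates the expected semantics, not xarray's or dask's implementation): the call should hand back deferred work and perform no writes until that work is explicitly executed:

```python
def save_all(writers, compute=True):
    # Hypothetical stand-in for save_mfdataset: each writer performs one
    # dataset-to-file write when called.
    def run_all():
        return [write() for write in writers]

    if compute:
        return run_all()
    # Return the deferred work instead of executing it, mirroring what a
    # dask.delayed object exposes via .compute().
    return run_all


written = []
delayed = save_all([lambda: written.append('test.nc')], compute=False)
assert written == []   # nothing on disk yet
delayed()
assert written == ['test.nc']
```

In the report above, by contrast, `test.nc` exists immediately after the `compute=False` call, i.e. the writes ran eagerly and only the finalization step was deferred.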
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4209/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 726020233,MDU6SXNzdWU3MjYwMjAyMzM=,4527,Refactor `xr.save_mfdataset()` to automatically save an xarray object backed by dask arrays to multiple files,13301940,open,0,,,2,2020-10-20T23:48:21Z,2020-10-22T17:06:46Z,,MEMBER,,,," **Is your feature request related to a problem? Please describe.** Currently, when a user wants to write multiple netCDF files in parallel with xarray and dask, they can take full advantage of `xr.save_mfdataset()` function. This function in its current state works fine, but the existing API requires that - the user generates file paths themselves - the user maps each chunk or dataset to a corresponding output file A few months ago, I wrote a [blog post](https://ncar.github.io/xdev/posts/writing-multiple-netcdf-files-in-parallel-with-xarray-and-dask/) showing how to save an xarray dataset backed by dask into multiple netCDF files, and since then I've been meaning to request a new feature to make this process convenient for users. **Describe the solution you'd like** **Would it be useful to actually refactor the existing `xr.save_mfdataset()` to automatically save an xarray object backed by dask arrays to multiple files without needing to create paths ourselves?** Today, this can be achieved via `xr.map_blocks`. In other words, is it possible to have something analogous to `to_zarr(....)` but for netCDF: ```python ds.save_mfdataset(prefix=""directory/my-dataset"") # or xr.save_mfdataset(ds, prefix=""directoy/my-dataset"") ``` ----> ```bash directory/my-dataset-chunk-1.nc directory/my-dataset-chunk-2.nc directory/my-dataset-chunk-3.nc .... 
``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4527/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue