id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 2152779535,PR_kwDOAMm_X85n15Fl,8784,Do not attempt to broadcast when global option ``arithmetic_broadcast=False``,45271239,closed,0,,,1,2024-02-25T14:00:57Z,2024-03-13T15:36:34Z,2024-03-13T15:36:34Z,CONTRIBUTOR,,0,pydata/xarray/pulls/8784,"**Follow-up PR after #8698** - [x] Closes #6806 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [x] What's new entry - [x] Refer to PR ID (cannot be done before the PR has been created) - [x] New functions/methods are listed in `api.rst` - No new functions/methods. ## Motive Refer to #8698 for history In this PR more specifically: - Added a global option, `arithmetic_broadcast`, `=True` by default (current state) - If `arithmetic_broadcast=False` , [`_binary_op`](https://github.com/pydata/xarray/pull/8784/files#diff-43c76e9be8425b5b6897dcecab4b240c32580447455c0c8c0b9b7fd84ce8a15dR2270) raises an error with message: ``` arithmetic broadcast is disabled via global option ``` ## Unrelated Also adds a decorator to handle the optional dependency `dask_expr`","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8784/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 2140225209,PR_kwDOAMm_X85nLLgJ,8761,Use ruff for formatting,45271239,open,0,,,10,2024-02-17T16:04:18Z,2024-02-27T20:11:57Z,,CONTRIBUTOR,,1,pydata/xarray/pulls/8761," - [ ] Closes #8760 ~~- [ ] Tests added~~ ~~- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`~~ ~~- [ ] New functions/methods are listed in `api.rst`~~ Note: many inline `...` obtain their own line. Running `black .` would have produced the same result ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8761/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 2140173727,I_kwDOAMm_X85_kHWf,8760,Use `ruff` for formatting ,45271239,open,0,,,0,2024-02-17T15:07:17Z,2024-02-26T05:58:53Z,,CONTRIBUTOR,,,,"### What is your issue? ## Use `ruff` for formatting ### Context Ruff was introduced in https://github.com/pydata/xarray/issues/7458. Arguments in favor were that it is faster, and combines multiple tools in a single tool (eg `flake8`, `pyflakes`, `isort`, `pyupgrade` ). > This switches our primary linter to Ruff. As adervertised, Ruff is very fast. Plust we get the benefit of using a single tool that combines the previous functionality of pyflakes, isort, and pyupgrade. ### Suggestion Suggestion: To move on with ruff replacement of tools, introduce ruff-format to replace `black` (See [ruff Usage](https://github.com/astral-sh/ruff?tab=readme-ov-file#usage) for integration with pre-commit). 
See, for example, how Pandas uses ruff and ruff-format: https://github.com/pandas-dev/pandas/blob/63dc0f76faa208450b8aaa57246029fcf94d015b/.pre-commit-config.yaml#L24 Ruff is capable of docstring formatting: https://docs.astral.sh/ruff/formatter/#docstring-formatting Ruff can format Jupyter Notebooks: https://docs.astral.sh/ruff/faq/#does-ruff-support-jupyter-notebooks So, introducing the ruff formatter might remove the need for `black` and `blackdoc`","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8760/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 2127814221,PR_kwDOAMm_X85mhHB1,8729,Reinforce alignment checks when `join='exact'`,45271239,closed,0,,,0,2024-02-09T20:36:46Z,2024-02-25T12:51:54Z,2024-02-25T12:51:54Z,CONTRIBUTOR,,1,pydata/xarray/pulls/8729," :information_source: Companion PR of #8698 Aims to check the consequences of transforming `join='exact'` alignments into `join='strict'` ones.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8729/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 2116618415,PR_kwDOAMm_X85l7Cdb,8698,New alignment option: `join='strict'`,45271239,closed,0,,,5,2024-02-03T17:58:43Z,2024-02-25T09:09:37Z,2024-02-25T09:09:37Z,CONTRIBUTOR,,0,pydata/xarray/pulls/8698," Title: New alignment option: `join='strict'` - [ ] Closes #8231 - [x] Closes #6806 - [x] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [x] What's new entry - [x] Refer to PR ID (cannot be done before the PR has been created) - [x] New functions/methods are listed in `api.rst` - No new functions/methods. ## Motive This PR is motivated by solving the following issues: - xr.concat concatenates along dimensions that it wasn't asked to #8231 - New alignment option: ""exact"" without broadcasting OR Turn off automatic broadcasting #6806 **The current PR does not solve the unexpected issue described in #8231 without a change in user-code**. Indeed, in the tests written, it is shown that to get the expected behavior, the user would have to use the new `join='strict'` mode suggested in #6806 for the concatenation operation. Only in that case will the uniqueness of the indexed dimensions' names be checked, re-using the same logic that was already applied for `join='override'` in `Aligner.find_matching_indexes`. This may not be enough to fix #8231. If that isn't enough, I can split the PR into two: a first one adding `join='strict'` for #6806 and a later one for #8231. ## Technical Details I try to detail my thought process here. Please correct me if there is anything wrong. This is my first time digging into this core logic! 
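As an aside, here is a rough sketch of the intended user-facing behavior (an editorial illustration only: `join='strict'` is the option proposed in this PR, so the exact error type and message are assumptions, not a documented API):

```python
import xarray as xr

ds1 = xr.Dataset(coords={'x_center': [1, 2, 3], 'x_outer': [0.5, 1.5, 2.5, 3.5]})
ds2 = xr.Dataset(coords={'x_center': [4, 5, 6], 'x_outer': [4.5, 5.5, 6.5]})

# Today, the non-asked 'x_outer' index is also aligned/concatenated (see #8231).
xr.concat([ds1, ds2], dim='x_center')

# With the proposed option, the mismatching 'x_outer' index would instead be rejected:
xr.concat([ds1, ds2], dim='x_center', join='strict')  # expected to raise an alignment error
```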
Here is my understanding of the terms: - An **indexed dimension** is attached to a coordinate variable - An **unindexed dimension** is not attached to a coordinate variable (_""Dimensions without coordinates""_) Input data for Scenario 1, tested in `test_concat_join_coordinate_variables_non_asked_dims` ```python ds1 = Dataset( coords={ ""x_center"": (""x_center"", [1, 2, 3]), ""x_outer"": (""x_outer"", [0.5, 1.5, 2.5, 3.5]), }, ) ds2 = Dataset( coords={ ""x_center"": (""x_center"", [4, 5, 6]), ""x_outer"": (""x_outer"", [4.5, 5.5, 6.5]), }, ) ``` Input data for Scenario 2, tested in `test_concat_join_non_coordinate_variables` ```python ds1 = Dataset( data_vars={ ""a"": (""x_center"", [1, 2, 3]), ""b"": (""x_outer"", [0.5, 1.5, 2.5, 3.5]), }, ) ds2 = Dataset( data_vars={ ""a"": (""x_center"", [4, 5, 6]), ""b"": (""x_outer"", [4.5, 5.5, 6.5]), }, ) ``` The logic for non-indexed dimensions was working ""as expected"", as it relies on `Aligner.assert_unindexed_dim_sizes_equal`, which, as its name suggests, checks that unindexed dimension sizes are equal. (Scenario 1) However, the logic for indexed dimensions was surprising, as no such expected check on dimensions' sizes was performed. A check exists in `Aligner.find_matching_indexes` but was only applied to `join='override'`. Applying it to `join='strict'` too is what this Pull Request suggests. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8698/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 2117187646,PR_kwDOAMm_X85l85Qf,8702,Add a simple `nbytes` representation in DataArrays and Dataset `repr`,45271239,closed,0,,,23,2024-02-04T16:37:41Z,2024-02-20T11:15:51Z,2024-02-07T20:47:37Z,CONTRIBUTOR,,0,pydata/xarray/pulls/8702," Edit: contrary to what the title suggests, this is not an opt-in feature; it is enabled by default - [x] Closes #8690 - (or at least is a proposal) - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - ~~[ ] New functions/methods are listed in `api.rst`~~ ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8702/reactions"", ""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 2140968762,I_kwDOAMm_X85_nJc6,8763,"Documentation 404 not found for ""Suggest Edit"" link in ""API Reference"" pages",45271239,open,0,,,0,2024-02-18T12:39:25Z,2024-02-18T12:39:25Z,,CONTRIBUTOR,,,,"### What happened? Concrete example: let's say I am currently reading the documentation of [DataArray.resample](https://docs.xarray.dev/en/stable/generated/xarray.DataArray.resample.html). I would like to have a look at the internals and see the code directly on GitHub. ![Screenshot from 2024-02-18 13-32-03](https://github.com/pydata/xarray/assets/45271239/841b7c81-a0e1-4a1d-b5d7-fc9f3a286219) We can see a GitHub icon, with 3 links: - Repository: leads to the home page of the repo: https://github.com/pydata/xarray - Suggest edit: leads to a [404 not found](https://github.com/pydata/xarray/edit/main/doc/generated/xarray.DataArray.resample.rst) as it points to the generated documentation - Open issue (generic link to open an issue) The `[source]` link does what is expected: it leads to the source code https://github.com/pydata/xarray/blob/main/xarray/core/dataset.py#L10471-L10565 ### What did you expect to happen? 
The second link ""Suggest edit"" should actually lead to the source code, as the documentation is auto-generated from the docstrings themselves. Maybe it could be renamed like ""View source"" Example of other repos having this feature: ### Minimal Complete Verifiable Example ```Python N/A ``` ### MVCE confirmation - [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [ ] Complete example — the example is self-contained, including all data and the text of any traceback. - [ ] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [ ] New issue — a search of GitHub Issues suggests this is not a duplicate. - [ ] Recent environment — the issue occurs with the latest version of xarray and its dependencies. ### Relevant log output ```Python N/A ``` ### Anything else we need to know? _No response_ ### Environment N/A","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8763/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 2135262747,I_kwDOAMm_X85_RYYb,8749,Lack of resilience towards missing `_ARRAY_DIMENSIONS` xarray's special zarr attribute #280 ,45271239,open,0,,,2,2024-02-14T21:52:34Z,2024-02-15T19:15:59Z,,CONTRIBUTOR,,,,"### What is your issue? _Original issue: https://github.com/xarray-contrib/datatree/issues/280_ _Note: this issue description was generated from a [notebook](https://github.com/etienneschalk/datatree-experimentation/blob/main/notebooks/datatree-zarr.ipynb). You can use it to reproduce locally the bug._ # Lack of resilience towards missing `_ARRAY_DIMENSIONS` xarray's special Zarr attribute ```python from pathlib import Path import json from typing import Any import numpy as np import xarray as xr ``` ## \_Utilities _This section only declares utilities functions and do not contain any additional value for the reader_ ```python # Set to True to get rich HTML representations in an interactive Notebook session # Set to False to get textual representations ready to be converted to markdown for issue report INTERACTIVE = False # Convert to markdown with # jupyter nbconvert --to markdown notebooks/datatree-zarr.ipynb ``` ```python def show(obj: Any) -> Any: if isinstance(obj, Path): if INTERACTIVE: return obj.resolve() else: print(obj) else: if INTERACTIVE: return obj else: print(obj) def load_json(path: Path) -> dict: with open(path, encoding=""utf-8"") as fp: return json.load(fp) ``` ## Data Creation I create a dummy Dataset containing a single `(label, z)`-dimensional DataArray named `my_xda`. 
```python xda = xr.DataArray( np.arange(3 * 18).reshape(3, 18), coords={""label"": list(""abc""), ""z"": list(range(18))}, ) xda = xda.chunk({""label"": 2, ""z"": 4}) show(xda) ``` dask.array, shape=(3, 18), dtype=int64, chunksize=(2, 4), chunktype=numpy.ndarray> Coordinates: * label (label) Dimensions: (label: 3, z: 18) Coordinates: * label (label) ## Data Writing I persist the Dataset to Zarr ```python zarr_path = Path() / ""../generated/zarrounet.zarr"" xds.to_zarr(zarr_path, mode=""w"") show(zarr_path) ``` ../generated/zarrounet.zarr ## Data Initial Reading I read successfully the Dataset ```python show(xr.open_zarr(zarr_path).my_xda) ``` dask.array Coordinates: * label (label) Dimensions: (label: 3, z: 18) Coordinates: * label (label) However, the last alteration, which is removing the `_ARRAY_DIMENSIONS` key-value pair from one of the variables in the `.zmetadata` file present at the root of the zarr, results in an exception when reading. The error message is explicit: `KeyError: '_ARRAY_DIMENSIONS'` ❌ This means xarray cannot open any Zarr file, but only those who possess an xarray's special private attribute, `_ARRAY_DIMENSIONS`. > Because of these choices, Xarray cannot read arbitrary array data, but only Zarr data with valid `_ARRAY_DIMENSIONS` See https://docs.xarray.dev/en/latest/internals/zarr-encoding-spec.html In a first phase, the error message can probably be more explicit (better than a low-level `KeyError`), explaining that xarray cannot yet open arbitrary Zarr data. ```python zmetadata_path = zarr_path / "".zmetadata"" assert zmetadata_path.is_file() zmetadata = load_json(zmetadata_path) zmetadata[""metadata""][""z/.zattrs""] = {} zmetadata_path.write_text(json.dumps(zmetadata, indent=4)) ``` 1925 ```python show(xr.open_zarr(zarr_path)) ``` --------------------------------------------------------------------------- KeyError Traceback (most recent call last) File ~/.cache/pypoetry/virtualenvs/datatree-experimentation-Sa4oWCLA-py3.10/lib/python3.10/site-packages/xarray/backends/zarr.py:212, in _get_zarr_dims_and_attrs(zarr_obj, dimension_key, try_nczarr) 210 try: 211 # Xarray-Zarr --> 212 dimensions = zarr_obj.attrs[dimension_key] 213 except KeyError as e: File ~/.cache/pypoetry/virtualenvs/datatree-experimentation-Sa4oWCLA-py3.10/lib/python3.10/site-packages/zarr/attrs.py:73, in Attributes.__getitem__(self, item) 72 def __getitem__(self, item): ---> 73 return self.asdict()[item] KeyError: '_ARRAY_DIMENSIONS' During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) Cell In[11], line 1 ----> 1 show(xr.open_zarr(zarr_path)) File ~/.cache/pypoetry/virtualenvs/datatree-experimentation-Sa4oWCLA-py3.10/lib/python3.10/site-packages/xarray/backends/zarr.py:900, in open_zarr(store, group, synchronizer, chunks, decode_cf, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, consolidated, overwrite_encoded_chunks, chunk_store, storage_options, decode_timedelta, use_cftime, zarr_version, chunked_array_type, from_array_kwargs, **kwargs) 886 raise TypeError( 887 ""open_zarr() got unexpected keyword arguments "" + "","".join(kwargs.keys()) 888 ) 890 backend_kwargs = { 891 ""synchronizer"": synchronizer, 892 ""consolidated"": consolidated, (...) 
897 ""zarr_version"": zarr_version, 898 } --> 900 ds = open_dataset( 901 filename_or_obj=store, 902 group=group, 903 decode_cf=decode_cf, 904 mask_and_scale=mask_and_scale, 905 decode_times=decode_times, 906 concat_characters=concat_characters, 907 decode_coords=decode_coords, 908 engine=""zarr"", 909 chunks=chunks, 910 drop_variables=drop_variables, 911 chunked_array_type=chunked_array_type, 912 from_array_kwargs=from_array_kwargs, 913 backend_kwargs=backend_kwargs, 914 decode_timedelta=decode_timedelta, 915 use_cftime=use_cftime, 916 zarr_version=zarr_version, 917 ) 918 return ds File ~/.cache/pypoetry/virtualenvs/datatree-experimentation-Sa4oWCLA-py3.10/lib/python3.10/site-packages/xarray/backends/api.py:573, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, chunked_array_type, from_array_kwargs, backend_kwargs, **kwargs) 561 decoders = _resolve_decoders_kwargs( 562 decode_cf, 563 open_backend_dataset_parameters=backend.open_dataset_parameters, (...) 569 decode_coords=decode_coords, 570 ) 572 overwrite_encoded_chunks = kwargs.pop(""overwrite_encoded_chunks"", None) --> 573 backend_ds = backend.open_dataset( 574 filename_or_obj, 575 drop_variables=drop_variables, 576 **decoders, 577 **kwargs, 578 ) 579 ds = _dataset_from_backend_dataset( 580 backend_ds, 581 filename_or_obj, (...) 591 **kwargs, 592 ) 593 return ds File ~/.cache/pypoetry/virtualenvs/datatree-experimentation-Sa4oWCLA-py3.10/lib/python3.10/site-packages/xarray/backends/zarr.py:982, in ZarrBackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, synchronizer, consolidated, chunk_store, storage_options, stacklevel, zarr_version) 980 store_entrypoint = StoreBackendEntrypoint() 981 with close_on_error(store): --> 982 ds = store_entrypoint.open_dataset( 983 store, 984 mask_and_scale=mask_and_scale, 985 decode_times=decode_times, 986 concat_characters=concat_characters, 987 decode_coords=decode_coords, 988 drop_variables=drop_variables, 989 use_cftime=use_cftime, 990 decode_timedelta=decode_timedelta, 991 ) 992 return ds File ~/.cache/pypoetry/virtualenvs/datatree-experimentation-Sa4oWCLA-py3.10/lib/python3.10/site-packages/xarray/backends/store.py:43, in StoreBackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta) 29 def open_dataset( # type: ignore[override] # allow LSP violation, not supporting **kwargs 30 self, 31 filename_or_obj: str | os.PathLike[Any] | BufferedIOBase | AbstractDataStore, (...) 39 decode_timedelta=None, 40 ) -> Dataset: 41 assert isinstance(filename_or_obj, AbstractDataStore) ---> 43 vars, attrs = filename_or_obj.load() 44 encoding = filename_or_obj.get_encoding() 46 vars, attrs, coord_names = conventions.decode_cf_variables( 47 vars, 48 attrs, (...) 55 decode_timedelta=decode_timedelta, 56 ) File ~/.cache/pypoetry/virtualenvs/datatree-experimentation-Sa4oWCLA-py3.10/lib/python3.10/site-packages/xarray/backends/common.py:210, in AbstractDataStore.load(self) 188 def load(self): 189 """""" 190 This loads the variables and attributes simultaneously. 191 A centralized loading function makes it easier to create (...) 207 are requested, so care should be taken to make sure its fast. 
208 """""" 209 variables = FrozenDict( --> 210 (_decode_variable_name(k), v) for k, v in self.get_variables().items() 211 ) 212 attributes = FrozenDict(self.get_attrs()) 213 return variables, attributes File ~/.cache/pypoetry/virtualenvs/datatree-experimentation-Sa4oWCLA-py3.10/lib/python3.10/site-packages/xarray/backends/zarr.py:519, in ZarrStore.get_variables(self) 518 def get_variables(self): --> 519 return FrozenDict( 520 (k, self.open_store_variable(k, v)) for k, v in self.zarr_group.arrays() 521 ) File ~/.cache/pypoetry/virtualenvs/datatree-experimentation-Sa4oWCLA-py3.10/lib/python3.10/site-packages/xarray/core/utils.py:471, in FrozenDict(*args, **kwargs) 470 def FrozenDict(*args, **kwargs) -> Frozen: --> 471 return Frozen(dict(*args, **kwargs)) File ~/.cache/pypoetry/virtualenvs/datatree-experimentation-Sa4oWCLA-py3.10/lib/python3.10/site-packages/xarray/backends/zarr.py:520, in (.0) 518 def get_variables(self): 519 return FrozenDict( --> 520 (k, self.open_store_variable(k, v)) for k, v in self.zarr_group.arrays() 521 ) File ~/.cache/pypoetry/virtualenvs/datatree-experimentation-Sa4oWCLA-py3.10/lib/python3.10/site-packages/xarray/backends/zarr.py:496, in ZarrStore.open_store_variable(self, name, zarr_array) 494 data = indexing.LazilyIndexedArray(ZarrArrayWrapper(name, self)) 495 try_nczarr = self._mode == ""r"" --> 496 dimensions, attributes = _get_zarr_dims_and_attrs( 497 zarr_array, DIMENSION_KEY, try_nczarr 498 ) 499 attributes = dict(attributes) 501 # TODO: this should not be needed once 502 # https://github.com/zarr-developers/zarr-python/issues/1269 is resolved. File ~/.cache/pypoetry/virtualenvs/datatree-experimentation-Sa4oWCLA-py3.10/lib/python3.10/site-packages/xarray/backends/zarr.py:222, in _get_zarr_dims_and_attrs(zarr_obj, dimension_key, try_nczarr) 220 # NCZarr defines dimensions through metadata in .zarray 221 zarray_path = os.path.join(zarr_obj.path, "".zarray"") --> 222 zarray = json.loads(zarr_obj.store[zarray_path]) 223 try: 224 # NCZarr uses Fully Qualified Names 225 dimensions = [ 226 os.path.basename(dim) for dim in zarray[""_NCZARR_ARRAY""][""dimrefs""] 227 ] File ~/.pyenv/versions/3.10.12/lib/python3.10/json/__init__.py:339, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw) 337 else: 338 if not isinstance(s, (bytes, bytearray)): --> 339 raise TypeError(f'the JSON object must be str, bytes or bytearray, ' 340 f'not {s.__class__.__name__}') 341 s = s.decode(detect_encoding(s), 'surrogatepass') 343 if (cls is None and object_hook is None and 344 parse_int is None and parse_float is None and 345 parse_constant is None and object_pairs_hook is None and not kw): TypeError: the JSON object must be str, bytes or bytearray, not dict ## `xr.show_versions()` ```python import warnings warnings.filterwarnings(""ignore"") xr.show_versions() ``` INSTALLED VERSIONS ------------------ commit: None python: 3.10.12 (main, Aug 15 2023, 11:50:32) [GCC 9.4.0] python-bits: 64 OS: Linux OS-release: 5.15.0-92-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 2023.10.1 pandas: 2.1.3 numpy: 1.25.2 scipy: None netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.16.1 cftime: None nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: None dask: 2023.11.0 distributed: None matplotlib: None cartopy: None seaborn: None numbagg: None fsspec: 2023.10.0 cupy: None pint: None sparse: None flox: None numpy_groupies: None 
setuptools: 67.8.0 pip: 23.1.2 conda: None pytest: None mypy: None IPython: 8.17.2 sphinx: None ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8749/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 2117299976,I_kwDOAMm_X85-M28I,8705,"More granularity in the CI, separating code and docs changes?",45271239,open,0,,,7,2024-02-04T20:54:30Z,2024-02-15T14:51:12Z,,CONTRIBUTOR,,,,"### What is your issue? Hi, TLDR: Is there a way to only run relevant CI checks (eg documentation) when a new commit is pushed on a PR's branch? The following issue is written from a naive user point of view. Indeed I do not know how the CI works on this project. I noticed that when updating an existing Pull Request, the whole test battery is re-executed. However, it is a common scenario that someone wants to update only the documentation, for instance. In that case, it might make sense to only retrigger the documentation checks. A little bit like `pre-commit`, which only runs on the updated files. Achieving such a fine level of granularity is not desirable, as even a small code change could make _geographically remote tests_ in the code fail; however, a high-level separation between code and docs, for instance, might relieve the pipelines a little bit. This is assuming the code does not depend at all on the docs. Maybe other separations exist, but the first I can think of is code vs docs. Another separation would be to have an ""order"" / ""dependency system"" in the pipeline. Eg, `A -> B -> C`; if `A` fails, there is no point in spending resources to compute `B` as we know for sure the rest will fail. Such a hierarchy might be difficult for the test matrix, which is unordered (eg Python Version x OS, on this project it seems to be more or less `(3.9, 3.10, 3.11, 3.12) x (Ubuntu, macOS, Windows)`). There is also a notion of frequency and execution time: pipelines' stages that are the most empirically likely to fail and the shortest to run should be run first, to avoid having them fail due to flakiness and out of bad luck when all the other checks passed before. Such a stage exists: ` CI / ubuntu-latest py3.10 flaky` (it is in the name). Taking that into account, the `CI Additional / Mypy ` stage qualifies for both criteria and should be run before everything else, for instance. Indeed, it is static code checking and very likely to fail, something a developer might also run locally before committing / pushing, and it only takes one minute to run (compared to several minutes for each of the stages of the Python Version x OS matrix). The goal here is to save resources (at the cost of losing the ""completeness"" of the CI run) ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8705/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 2128415253,I_kwDOAMm_X85-3QoV,8732,Failing doctest CI Job: `The current Dask DataFrame implementation is deprecated.`,45271239,closed,0,,,1,2024-02-10T13:12:23Z,2024-02-10T23:44:25Z,2024-02-10T23:44:25Z,CONTRIBUTOR,,,,"### What happened? The [doctest CI job for my Pull Request](https://github.com/pydata/xarray/actions/runs/7854959732/job/21436224544?pr=8698) failed. The failure seems at first glance to be unrelated to my code changes. It seems related to a Dask warning.
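For context, the deprecation warning quoted in the log below suggests its own opt-in path; it is restated here as a sketch (assuming `dask-expr` is installed — whether this is the right fix for the xarray CI is left open):

```python
# Opt in to the new Dask DataFrame implementation, as suggested by the warning itself
# (requires: pip install dask-expr)
import dask

dask.config.set({'dataframe.query-planning': True})
import dask.dataframe as dd  # should no longer emit the DeprecationWarning
```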
Note: I create this issue for logging purposes ; it might become relevant only once another unrelated PR is subject to the same bug. ### What did you expect to happen? I expected the `doctest` CI Job to pass. This error happens both on the online CI and locally when running ``` python -m pytest --doctest-modules xarray --ignore xarray/tests --ignore xarray/datatree_ -Werror ``` (the command is taken from the CI definition file: https://github.com/pydata/xarray/actions/runs/7854959732/workflow?pr=8698#L83) ### Minimal Complete Verifiable Example ```Python N/A ``` ### MVCE confirmation - [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [ ] Complete example — the example is self-contained, including all data and the text of any traceback. - [ ] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [ ] New issue — a search of GitHub Issues suggests this is not a duplicate. - [ ] Recent environment — the issue occurs with the latest version of xarray and its dependencies. ### Relevant log output ```Python =================================== FAILURES =================================== _________ [doctest] xarray.core.dataarray.DataArray.to_dask_dataframe __________ 7373 ... dims=(""time"", ""lat"", ""lon""), 7374 ... coords={ 7375 ... ""time"": np.arange(4), 7376 ... ""lat"": [-30, -20], 7377 ... ""lon"": [120, 130], 7378 ... }, 7379 ... name=""eg_dataarray"", 7380 ... attrs={""units"": ""Celsius"", ""description"": ""Random temperature data""}, 7381 ... ) 7382 >>> da.to_dask_dataframe([""lat"", ""lon"", ""time""]).compute() UNEXPECTED EXCEPTION: DeprecationWarning(""The current Dask DataFrame implementation is deprecated. \nIn a future release, Dask DataFrame will use new implementation that\ncontains several improvements including a logical query planning.\nThe user-facing DataFrame API will remain unchanged.\n\nThe new implementation is already available and can be enabled by\ninstalling the dask-expr library:\n\n $ pip install dask-expr\n\nand turning the query planning option on:\n\n >>> import dask\n >>> dask.config.set({'dataframe.query-planning': True})\n >>> import dask.dataframe as dd\n\nAPI documentation for the new implementation is available at\n[https://docs.dask.org/en/stable/dask-expr-api.html\n\nAny](https://docs.dask.org/en/stable/dask-expr-api.html/n/nAny) feedback can be reported on the Dask issue tracker\nhttps://github.com/dask/dask/issues \n"") Traceback (most recent call last): File ""/home/runner/micromamba/envs/xarray-tests/lib/python3.11/doctest.py"", line 1353, in __run exec(compile(example.source, filename, ""single"", File """", line 1, in File ""/home/runner/work/xarray/xarray/xarray/core/dataarray.py"", line 7408, in to_dask_dataframe return ds.to_dask_dataframe(dim_order, set_index) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File ""/home/runner/work/xarray/xarray/xarray/core/dataset.py"", line 7369, in to_dask_dataframe import dask.dataframe as dd File ""/home/runner/micromamba/envs/xarray-tests/lib/python3.11/site-packages/dask/dataframe/__init__.py"", line 162, in warnings.warn( DeprecationWarning: The current Dask DataFrame implementation is deprecated. In a future release, Dask DataFrame will use new implementation that contains several improvements including a logical query planning. 
The user-facing DataFrame API will remain unchanged. The new implementation is already available and can be enabled by installing the dask-expr library: $ pip install dask-expr and turning the query planning option on: >>> import dask >>> dask.config.set({'dataframe.query-planning': True}) >>> import dask.dataframe as dd API documentation for the new implementation is available at https://docs.dask.org/en/stable/dask-expr-api.html Any feedback can be reported on the Dask issue tracker https://github.com/dask/dask/issues /home/runner/work/xarray/xarray/xarray/core/dataarray.py:7382: UnexpectedException =========================== short test summary info ============================ FAILED xarray/core/dataarray.py::xarray.core.dataarray.DataArray.to_dask_dataframe ============= 1 failed, 301 passed, 2 skipped in 78.04s (0:01:18) ============== Error: Process completed with exit code 1. ``` ### Anything else we need to know? _No response_ ### Environment N/A","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8732/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 2123948734,PR_kwDOAMm_X85mT5_9,8719,Test formatting platform,45271239,closed,0,,,2,2024-02-07T21:41:23Z,2024-02-09T03:01:35Z,2024-02-09T03:01:35Z,CONTRIBUTOR,,0,pydata/xarray/pulls/8719,"Follow up to #8702 / https://github.com/pydata/xarray/pull/8702#issuecomment-1932851112 The goal is to remove the inelegant OS-dependent checks introduced during the testing of #8702 A simple way to do so is to use unsigned integers as dtypes for tests involving data array representations on multiple OSes. Indeed, this solves the issue of the default dtypes not being printed in the repr, with the default dtypes varying according to the OS. The tests show that the affected dtypes are `int32` (for the Windows CI) and `int64` (for the Ubuntu and macOS CIs). Using `uint64` should fix both the varying size and the varying numpy array repr. ~~- [ ] Closes #xxxx~~ - [x] Tests added ~~- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`~~ ~~- [ ] New functions/methods are listed in `api.rst`~~ ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8719/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
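As a closing editorial note on #8719, here is a minimal sketch of why a non-default unsigned dtype sidesteps the platform differences (plain NumPy, not part of the PR itself; the reprs shown are those of a typical Linux build):

```python
import numpy as np

# The default integer dtype follows the platform's C long: int32 on Windows,
# int64 on Linux/macOS. Because it is the default, the repr does not spell it out:
print(repr(np.array([1, 2, 3])))  # array([1, 2, 3]) -- same text, different byte sizes per OS

# A non-default dtype such as uint64 is always printed and has the same size everywhere,
# so a repr expected by a test no longer depends on the OS:
print(repr(np.array([1, 2, 3], dtype='uint64')))  # array([1, 2, 3], dtype=uint64)
```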