id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1160062673,I_kwDOAMm_X85FJSbR,6333,Expressing dimension's preferred chunks as tuple of integers causes TypeError,38358698,closed,0,,,0,2022-03-04T21:23:21Z,2022-04-08T17:18:50Z,2022-04-08T17:18:50Z,CONTRIBUTOR,,,,"### What happened?

When opening a dataset containing a variable that has preferred chunks expressed along some dimension as a tuple of integers, *xarray* raises a `TypeError`.

### What did you expect to happen?

I expected to open the dataset with its preferred chunks, as described in the documentation on preferred chunks within ""How to add a new backend"".

### Minimal Complete Verifiable Example

```Python
import xarray as xr


class PassThroughBackendEntrypoint(xr.backends.BackendEntrypoint):
    def open_dataset(self, dataset, *, drop_variables=None):
        return dataset


initial = xr.Dataset(
    {
        ""data"": xr.Variable(
            (""dim"",), [0, 0], encoding={""preferred_chunks"": {""dim"": (1, 1)}}
        )
    }
)
final = xr.open_dataset(initial, engine=PassThroughBackendEntrypoint, chunks={})
```

### Relevant log output

```Python
[Paths simplified.]
Traceback (most recent call last):
  File """", line 1, in
  File ""...\xarray\backends\api.py"", line 501, in open_dataset
    ds = _dataset_from_backend_dataset(
  File ""...\xarray\backends\api.py"", line 317, in _dataset_from_backend_dataset
    ds = _chunk_ds(
  File ""...\xarray\backends\api.py"", line 287, in _chunk_ds
    var_chunks = _get_chunk(var, chunks)
  File ""...\xarray\core\dataset.py"", line 409, in _get_chunk
    _check_chunks_compatibility(var, output_chunks, preferred_chunks)
  File ""...\xarray\core\dataset.py"", line 371, in _check_chunks_compatibility
    if any(s % preferred_chunks_dim for s in chunks_dim):
  File ""...\xarray\core\dataset.py"", line 371, in
    if any(s % preferred_chunks_dim for s in chunks_dim):
TypeError: unsupported operand type(s) for %: 'int' and 'tuple'
```

### Anything else we need to know?

The behavior exhibited above touches on the following related issues:

* The `_check_chunks_compatibility` function assumes that a dimension expresses its preferred chunks only as an integer, not a sequence of integers. In contrast, *Dask* will handle either within the `previous_chunks` argument to its `normalize_chunks` function.
* The examples in the documentation of `""preferred_chunks""` mappings, namely `{“dim1”: 1000, “dim2”: 2000}` and `{“dim1”: [1000, 100], “dim2”: [2000, 2000, 2000]]}`, have syntax errors: The quotation marks are curly instead of straight, and the second example has an extra closing bracket.
* After correcting the syntax errors, the lists in the second example lead to `TypeError: unhashable type: 'list'`.
  *Dask* raises the exception when it tries to test a mutable list for set membership, as in the following (with simplified paths):

  ```python
  >>> dask.array.core.normalize_chunks([[1000, 100], [2000, 2000, 2000]], (1100, 6000))
  Traceback (most recent call last):
    File """", line 1, in
    File ""...\dask\array\core.py"", line 2900, in normalize_chunks
      chunks = tuple(c if c not in {None, -1} else s for c, s in zip(chunks, shape))
    File ""...\dask\array\core.py"", line 2900, in
      chunks = tuple(c if c not in {None, -1} else s for c, s in zip(chunks, shape))
  TypeError: unhashable type: 'list'
  ```

  If one omits the second argument (the shape) to that call, it succeeds. This may be a bug in *Dask*.
* The tests in *xarray* don't exercise behaviors related to preferred chunks.

[Edited for grammar.]

### Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:22:46) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('English_United States', '1252')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 0.20.3.dev52+gd3b6aa6d
pandas: 1.4.1
numpy: 1.21.5
scipy: 1.8.0
netCDF4: 1.5.8
pydap: installed
h5netcdf: 0.13.1
h5py: 3.6.0
Nio: None
zarr: 2.11.0
cftime: 1.5.2
nc_time_axis: 1.4.0
PseudoNetCDF: installed
rasterio: 1.2.10
cfgrib: None
iris: 3.2.0.post0
bottleneck: 1.3.2
dask: 2022.02.0
distributed: 2022.02.0
matplotlib: 3.5.1
cartopy: 0.20.2
seaborn: 0.11.2
numbagg: 0.2.1
fsspec: 2022.01.0
cupy: None
pint: 0.18
sparse: 0.13.0
setuptools: 59.8.0
pip: 22.0.3
conda: None
pytest: 7.0.1
IPython: None
sphinx: None
```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6333/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1160073438,PR_kwDOAMm_X84z-iqi,6334,"In backends, support expressing a dimension's preferred chunk sizes as a tuple of integers",38358698,closed,0,,,5,2022-03-04T21:39:46Z,2022-04-08T17:18:50Z,2022-04-08T17:18:50Z,CONTRIBUTOR,,0,pydata/xarray/pulls/6334,"- [X] Closes #6333
- [X] Tests added
- [X] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6334/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1159961396,PR_kwDOAMm_X84z-K39,6330,"In documentation on adding a new backend, add missing import and tweak headings",38358698,closed,0,,,0,2022-03-04T19:13:30Z,2022-03-07T14:17:29Z,2022-03-07T13:13:50Z,CONTRIBUTOR,,0,pydata/xarray/pulls/6330,"Adding the import resolves the following exception that *sphinx-build* raises when one builds only ""doc/internals/how-to-add-new-backend.rst"":

```
NameError Traceback (most recent call last)
Input In [1], in
----> 1 var = xr.Variable(
      2 dims=(""x"",), data=np.arange(10.0), attrs={""scale_factor"": 10, ""add_offset"": 2}
      3 )
NameError: name 'xr' is not defined
```

While in the file, I revised headings to have only their first letter capitalized (except for case-sensitive code), which seems to be the majority convention in the documentation.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6330/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1124431593,PR_kwDOAMm_X84yGFBx,6237,Enable running sphinx-build on Windows,38358698,closed,0,,,7,2022-02-04T17:15:14Z,2022-03-04T19:15:42Z,2022-03-01T16:00:34Z,CONTRIBUTOR,,0,pydata/xarray/pulls/6237,"- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

This PR enables one to build the documentation on Windows by manually
invoking `sphinx-build`.

The first commit enables *sphinx* to execute the ""conf.py"" file. Before the commit, *sphinx* complains:

```
Configuration error:
There is a programmable error in your configuration file:
```

with the exception `FileNotFoundError: [WinError 2] The system cannot find the file specified` where it tries to invoke *conda*. On Windows, *conda* environments other than the base environment have ""conda.bat"" on their path rather than ""conda.exe"", and one must call `subprocess.run` with ""conda.bat"" instead of merely ""conda"". However, the `CONDA_EXE` environment variable correctly points to the executable and should do so across platforms; it requires *conda* version 4.5 or later for conda/conda#6923.

The second commit enables the build to tolerate exceptions when deleting temporary files and tells Git to ignore more such files. Without the commit, builds occasionally failed at the `os.remove` calls with exceptions such as `PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'example.nc'`.

Should ""whats-new.rst"" mention these changes, which are developer-visible but not user-visible? I welcome any feedback.

Edit: Checked the ""whats-new"" task.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6237/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1158639113,PR_kwDOAMm_X84z5zPC,6326,"Lengthen underline, correct spelling, and reword",38358698,closed,0,,,1,2022-03-03T16:40:55Z,2022-03-03T17:27:42Z,2022-03-03T17:01:15Z,CONTRIBUTOR,,0,pydata/xarray/pulls/6326,"Lengthening the underline resolves the following warning from *sphinx-build*:

```
[...]\xarray\doc\whats-new.rst:20: WARNING: Title underline too short.
v2022.03.1 (unreleased)
---------------------
```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6326/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1150484906,PR_kwDOAMm_X84zd6WA,6305,"On Windows, enable successful test of opening a dataset containing a cftime index",38358698,closed,0,,,3,2022-02-25T14:07:50Z,2022-02-28T16:06:33Z,2022-02-28T09:53:22Z,CONTRIBUTOR,,0,pydata/xarray/pulls/6305,"- [X] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

Previously, on Windows, the subject test unnecessarily failed, and the temporary directory and file remained, because the scheduler in the outer context prevented the temporary directory from being deleted when the directory's inner context exited.
Example failure (with short traceback):

```
================================================ FAILURES =================================================
__________________________ test_open_mfdataset_can_open_files_with_cftime_index ___________________________
C:\Users\stan.west\Programs\Miniconda3-64\envs\xarray-dev\lib\shutil.py:616: in _rmtree_unsafe
    os.unlink(fullname)
E   PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\STAN~1.WES\\AppData\\Local\\Temp\\tmpdpdajrgc\\test.nc'

During handling of the above exception, another exception occurred:
C:\Users\stan.west\Programs\Miniconda3-64\envs\xarray-dev\lib\tempfile.py:802: in onerror
    _os.unlink(path)
E   PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\STAN~1.WES\\AppData\\Local\\Temp\\tmpdpdajrgc\\test.nc'

During handling of the above exception, another exception occurred:
C:\Users\stan.west\Documents\Repositories\xarray\xarray\tests\test_distributed.py:128: in test_open_mfdataset_can_open_files_with_cftime_index
    assert_identical(tf[""test""], da)
C:\Users\stan.west\Programs\Miniconda3-64\envs\xarray-dev\lib\tempfile.py:827: in __exit__
    self.cleanup()
C:\Users\stan.west\Programs\Miniconda3-64\envs\xarray-dev\lib\tempfile.py:831: in cleanup
    self._rmtree(self.name)
C:\Users\stan.west\Programs\Miniconda3-64\envs\xarray-dev\lib\tempfile.py:813: in _rmtree
    _shutil.rmtree(name, onerror=onerror)
C:\Users\stan.west\Programs\Miniconda3-64\envs\xarray-dev\lib\shutil.py:740: in rmtree
    return _rmtree_unsafe(path, onerror)
C:\Users\stan.west\Programs\Miniconda3-64\envs\xarray-dev\lib\shutil.py:618: in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
C:\Users\stan.west\Programs\Miniconda3-64\envs\xarray-dev\lib\tempfile.py:805: in onerror
    cls._rmtree(path)
C:\Users\stan.west\Programs\Miniconda3-64\envs\xarray-dev\lib\tempfile.py:813: in _rmtree
    _shutil.rmtree(name, onerror=onerror)
C:\Users\stan.west\Programs\Miniconda3-64\envs\xarray-dev\lib\shutil.py:740: in rmtree
    return _rmtree_unsafe(path, onerror)
C:\Users\stan.west\Programs\Miniconda3-64\envs\xarray-dev\lib\shutil.py:599: in _rmtree_unsafe
    onerror(os.scandir, path, sys.exc_info())
C:\Users\stan.west\Programs\Miniconda3-64\envs\xarray-dev\lib\shutil.py:596: in _rmtree_unsafe
    with os.scandir(path) as scandir_it:
E   NotADirectoryError: [WinError 267] The directory name is invalid: 'C:\\Users\\STAN~1.WES\\AppData\\Local\\Temp\\tmpdpdajrgc\\test.nc'
========================================= short test summary info =========================================
FAILED xarray/tests/test_distributed.py::test_open_mfdataset_can_open_files_with_cftime_index - NotADirec...
=========================================== 1 failed in 10.77s ============================================
```

The first commit swaps the two contexts to resolve the issue. The second commit replaces the `tempfile.TemporaryDirectory()` context manager with *pytest*'s `tmp_path` fixture, which slightly simplifies the test and relies on *pytest* to remove the directory on a subsequent invocation.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6305/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull