id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 1797233538,I_kwDOAMm_X85rH5uC,7971,Pint errors on python 3.11 and windows,14371165,closed,0,,,2,2023-07-10T17:44:51Z,2024-02-26T17:52:50Z,2024-02-26T17:52:50Z,MEMBER,,,,"### What happened? The CI seems to consistently crash on `test_units.py` now: ``` =========================== short test summary info =========================== FAILED xarray/tests/test_units.py::TestVariable::test_aggregation[int32-method_max] - TypeError: no implementation found for 'numpy.max' on types that implement __array_function__: [] FAILED xarray/tests/test_units.py::TestVariable::test_aggregation[int32-method_min] - TypeError: no implementation found for 'numpy.min' on types that implement __array_function__: [] FAILED xarray/tests/test_units.py::TestDataArray::test_aggregation[float64-function_max] - TypeError: no implementation found for 'numpy.max' on types that implement __array_function__: [] FAILED xarray/tests/test_units.py::TestDataArray::test_aggregation[float64-function_min] - TypeError: no implementation found for 'numpy.min' on types that implement __array_function__: [] FAILED xarray/tests/test_units.py::TestDataArray::test_aggregation[int32-function_max] - TypeError: no implementation found for 'numpy.max' on types that implement __array_function__: [] FAILED xarray/tests/test_units.py::TestDataArray::test_aggregation[int32-function_min] - TypeError: no implementation found for 'numpy.min' on types that implement __array_function__: [] FAILED xarray/tests/test_units.py::TestDataArray::test_aggregation[int32-method_max] - TypeError: no implementation found for 'numpy.max' on types that implement __array_function__: [] FAILED xarray/tests/test_units.py::TestDataArray::test_aggregation[int32-method_min] - TypeError: no implementation found for 'numpy.min' on types that implement __array_function__: [] FAILED xarray/tests/test_units.py::TestDataArray::test_unary_operations[float64-round] - TypeError: no implementation found for 'numpy.round' on types that implement __array_function__: [] FAILED xarray/tests/test_units.py::TestDataArray::test_unary_operations[int32-round] - TypeError: no implementation found for 'numpy.round' on types that implement __array_function__: [] FAILED xarray/tests/test_units.py::TestDataset::test_aggregation[int32-method_max] - TypeError: no implementation found for 'numpy.max' on types that implement __array_function__: [] FAILED xarray/tests/test_units.py::TestDataset::test_aggregation[int32-method_min] - TypeError: no implementation found for 'numpy.min' on types that implement __array_function__: [] = 12 failed, 14880 passed, 1649 skipped, 146 xfailed, 68 xpassed, 574 warnings in 737.19s (0:12:17) = ``` For more details: https://github.com/pydata/xarray/actions/runs/5438369625/jobs/9889561685?pr=7955 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7971/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 1795519181,I_kwDOAMm_X85rBXLN,7969,Upstream CI is failing,14371165,closed,0,,,2,2023-07-09T18:51:41Z,2023-07-10T17:34:12Z,2023-07-10T17:33:12Z,MEMBER,,,,"### What happened? The upstream CI has been failing for a while. 
Here's the latest: https://github.com/pydata/xarray/actions/runs/5501368493/jobs/10024902009#step:7:16 ```python Traceback (most recent call last): File """", line 1, in File ""/home/runner/work/xarray/xarray/xarray/__init__.py"", line 1, in from xarray import testing, tutorial File ""/home/runner/work/xarray/xarray/xarray/testing.py"", line 7, in import numpy as np ModuleNotFoundError: No module named 'numpy' ``` Digging a little into the logs: ``` Installing build dependencies: started Installing build dependencies: finished with status 'error' error: subprocess-exited-with-error × pip subprocess to install build dependencies did not run successfully. │ exit code: 1 ╰─> [3 lines of output] Looking in indexes: https://pypi.anaconda.org/scipy-wheels-nightly/simple ERROR: Could not find a version that satisfies the requirement meson-python==0.13.1 (from versions: none) ERROR: No matching distribution found for meson-python==0.13.1 [end of output] ``` Might this be a numpy problem? Should the CI be robust enough to handle these kinds of errors? Because I suppose we would like the automatic issue to be created anyway?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7969/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 1125040125,I_kwDOAMm_X85DDr_9,6244,Get pyupgrade to update the typing,14371165,closed,0,,,2,2022-02-05T21:56:56Z,2023-03-12T15:38:37Z,2023-03-12T15:38:37Z,MEMBER,,,,"### Is your feature request related to a problem? Use more up-to-date typing styles in all files. This will reduce the number of imports and avoid the big diffs that appear when relatively minor changes happen to trigger pre-commit/pyupgrade. Related to #6240 ### Describe the solution you'd like Add `from __future__ import annotations` to files with a lot of typing. Let pyupgrade do the rest. ### Describe alternatives you've considered _No response_ ### Additional context _No response_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6244/reactions"", ""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 1377088142,I_kwDOAMm_X85SFLKO,7050,Type annotation guidelines,14371165,open,0,,,2,2022-09-18T15:04:54Z,2022-09-23T01:55:19Z,,MEMBER,,,,"Dask has a pretty nice guideline for type hinting, see https://github.com/dask/community/issues/255. Most notable for us: avoid adding types in docstrings, so the information isn't duplicated.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7050/reactions"", ""total_count"": 4, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1182697604,I_kwDOAMm_X85GfoiE,6416,xr.concat removes datetime information,14371165,closed,0,,,2,2022-03-27T23:19:30Z,2022-03-28T16:05:01Z,2022-03-28T16:05:01Z,MEMBER,,,,"### What happened? xr.concat removes datetime information and then can't concatenate the arrays because they no longer have compatible types. ### What did you expect to happen? Successful concatenation with the same type. 
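For context, the failure appears to reduce to NumPy refusing to find a common dtype during the index concat. A minimal NumPy-only sketch of that promotion failure (assuming the two dtypes at play are datetime64 and float64 — the class names are stripped from the log output below):

```python
import numpy as np

# np.result_type has no common dtype for datetime64 and float64 short
# of object, which is the kind of promotion error the traceback below
# shows for the index concat. (Assumption: these are the two dtypes
# involved; the class names are stripped from the log output.)
np.result_type(np.dtype('datetime64[ns]'), np.dtype('float64'))
# TypeError: The DType could not be promoted ...
```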
### Minimal Complete Verifiable Example ```Python import numpy as np import xarray as xr from datetime import datetime month = np.arange(1, 13, 1) data = np.sin(2 * np.pi * month / 12.0) darray = xr.DataArray(data, dims=[""time""]) darray.coords[""time""] = np.array([datetime(2017, m, 1) for m in month]) darray_nan = np.nan * darray.isel(**{""time"": -1}) darray = xr.concat([darray, darray_nan], dim=""time"") ``` ### Relevant log output ```Python Traceback (most recent call last): File """", line 2, in darray = xr.concat([darray, darray_nan], dim=""time"") File ""c:\users\j.w\documents\github\xarray\xarray\core\concat.py"", line 244, in concat return f( File ""c:\users\j.w\documents\github\xarray\xarray\core\concat.py"", line 642, in _dataarray_concat ds = _dataset_concat( File ""c:\users\j.w\documents\github\xarray\xarray\core\concat.py"", line 555, in _dataset_concat combined_idx = indexes[0].concat(indexes, dim, positions) File ""c:\users\j.w\documents\github\xarray\xarray\core\indexes.py"", line 318, in concat coord_dtype = np.result_type(*[idx.coord_dtype for idx in indexes]) File ""<__array_function__ internals>"", line 5, in result_type TypeError: The DType could not be promoted by . This means that no common DType exists for the given inputs. For example they cannot be stored in a single array unless the dtype is `object`. The full list of DTypes is: (, ) ``` ### Anything else we need to know? Similar to #6384. Happens around here: https://github.com/pydata/xarray/blob/728b648d5c7c3e22fe3704ba163012840408bf66/xarray/core/concat.py#L535 ### Environment
INSTALLED VERSIONS ------------------ commit: None python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:37:25) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel byteorder: little LC_ALL: None LANG: en LOCALE: ('Swedish_Sweden', '1252') libhdf5: 1.10.6 libnetcdf: 4.7.4 xarray: 0.16.3.dev99+gc19467fb pandas: 1.3.1 numpy: 1.21.5 scipy: 1.7.1 netCDF4: 1.5.6 pydap: installed h5netcdf: 0.11.0 h5py: 2.10.0 Nio: None zarr: 2.8.3 cftime: 1.5.0 nc_time_axis: 1.3.1 PseudoNetCDF: installed rasterio: 1.2.6 cfgrib: None iris: 3.0.4 bottleneck: 1.3.2 dask: 2021.10.0 distributed: 2021.10.0 matplotlib: 3.4.3 cartopy: 0.19.0.post1 seaborn: 0.11.1 numbagg: 0.2.1 fsspec: 2021.11.1 cupy: None pint: 0.17 sparse: 0.12.0 setuptools: 49.6.0.post20210108 pip: 21.2.4 conda: None pytest: 6.2.4 IPython: 7.31.0 sphinx: 4.3.2
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6416/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 957201551,MDU6SXNzdWU5NTcyMDE1NTE=,5655,Allow .attrs to use dict-likes,14371165,open,0,,,2,2021-07-31T08:31:55Z,2022-01-09T03:32:04Z,,MEMBER,,,," **Is your feature request related to a problem? Please describe.** Reading attributes from h5py-files is rather slow. So instead of retrieving it immediately I wanted to create a lazy dict-class that only retrieves the attribute values when necessary. But this is difficult to achieve since xarray keeps forcing the attrs to dicts in a lot of places. **Describe the solution you'd like** * Replace in https://github.com/pydata/xarray/blob/dddac11b01330791ffab4dfc72d226e71821973e/xarray/core/variable.py#L865 and https://github.com/pydata/xarray/blob/dddac11b01330791ffab4dfc72d226e71821973e/xarray/core/dataset.py#L798 with a `asdict(value)` function that checks if the input is a valid dict-like, if not convert to dict. Things that might be good to check: * `MutableMapping` * `hasattr(dict_like, ""copy"")` * `isinstance(dict_like, dict) == True` * Remove unneccessary conversions to dict. For example https://github.com/pydata/xarray/blob/dddac11b01330791ffab4dfc72d226e71821973e/xarray/core/merge.py#L523 should not be necessary as attrs from variables/dataarrays/datasets have already been forced to dicts when they were initialized. **Describe alternatives you've considered** * One could lazify with dicts as well, for example by replacing the value with a function. This however won't look good in reprs, that's why having a convienence class is nice. * `dict(LazyDict)` always forces to dict, it does not let it pass through unchanged even if `isinstance(LazyDict, dict) == True`. Interesting reading: https://stackoverflow.com/questions/16669367/setup-dictionary-lazily https://stackoverflow.com/questions/3387691/how-to-perfectly-override-a-dict ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5655/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1042698589,I_kwDOAMm_X84-JlFd,5928,Relax GitHub Actions first time contributor approval?,14371165,closed,0,,,2,2021-11-02T18:45:16Z,2021-11-02T21:44:54Z,2021-11-02T21:44:54Z,MEMBER,,,,"A while back GitHub made it so that new contributors cannot trigger GitHub Actions workflows and a maintainer has to hit ""Approve and Run"" every time they push a commit to their PR. This is rather annoying for both the contributor and the maintainer as the back and forth takes time. It however seems possible to relax this constraint: https://twitter.com/metcalfc/status/1448414192285806592?t=maeChQZTSUh2Ph0YFk-hGA&s=19 Shall we relax this constraint? 
ref: https://github.com/dask/community/issues/191","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5928/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 775875024,MDU6SXNzdWU3NzU4NzUwMjQ=,4739,Slow initialization of dataset.interp,14371165,closed,0,,,2,2020-12-29T12:46:05Z,2021-05-05T12:26:01Z,2021-05-05T12:26:01Z,MEMBER,,,," **What happened**: When interpolating a dataset with >2000 dask variables, a lot of time is spent in `da.unify_chunks` because it forces all variables and **coordinates** to dask arrays (see the sketch at the end of this report). xarray, on the other hand, forces coordinates to pd.Index even if the coordinates were dask arrays when the dataset was first created. **What you expected to happen**: If the coords of the dataset were initialized as dask arrays, they should stay lazy. **Minimal Complete Verifiable Example**: ```python import xarray as xr import numpy as np import dask.array as da a = np.arange(0, 2000) b = np.core.defchararray.add(""long_variable_name"", a.astype(str)) coords = dict(time=da.array([0, 1])) data_vars = dict() for v in b: data_vars[v] = xr.DataArray( name=v, data=da.array([3, 4]), dims=[""time""], coords=coords ) ds0 = xr.Dataset(data_vars) ds0 = ds0.interp( time=da.array([0, 0.5, 1]), assume_sorted=True, kwargs=dict(fill_value=None), ) ``` **Anything else we need to know?**: Some thoughts: * Why can't coordinates be lazy? * Can we use dask.dataframe.Index instead of pd.Index when creating IndexVariables? * There's no time saved converting to dask arrays in `missing.interp_func`. But some time could be saved if we could convert them to dask arrays in `xr.Dataset.interp` before the variable loop starts. * Can we still store the dask array in IndexVariable and use a to_dask_array() method to quickly get it? * Initializing the dataarrays will still be slow, though, since xarray still has to force the dask array to pd.Index. **Environment**:
Output of xr.show_versions() xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 libhdf5: 1.10.4 libnetcdf: None xarray: 0.16.2 pandas: 1.1.5 numpy: 1.17.5 scipy: 1.4.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2020.12.0 distributed: 2020.12.0 matplotlib: 3.3.2 cartopy: None seaborn: 0.11.1 numbagg: None pint: None setuptools: 51.0.0.post20201207 pip: 20.3.3 conda: 4.9.2 pytest: 6.2.1 IPython: 7.19.0 sphinx: 3.4.0
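A minimal sketch of the coercion described under **What happened** (this illustrates dask's `unify_chunks` behavior as I understand it, not xarray's exact call path):

```python
import numpy as np
import dask.array as da

x = da.ones(2, chunks=2)   # a dask-backed data variable
t = np.array([0.0, 1.0])   # a plain numpy coordinate

# unify_chunks coerces every argument to a dask array, including small
# numpy coordinates; repeated over >2000 variables, this coercion is
# where the initialization time appears to go. (Assumption: this
# mirrors what happens inside interp's chunk unification.)
chunkss, (x2, t2) = da.core.unify_chunks(x, 'i', t, 'i')
print(type(t2))  # dask.array.core.Array
```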
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4739/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 775322346,MDU6SXNzdWU3NzUzMjIzNDY=,4736,Limit number of data variables shown in repr,14371165,closed,0,,,2,2020-12-28T10:15:26Z,2021-01-04T02:13:52Z,2021-01-04T02:13:52Z,MEMBER,,,," **What happened**: xarray feels very unresponsive when using datasets with >2000 data variables because it has to print all the 2000 variables everytime you print something to console. **What you expected to happen**: xarray should limit the number of variables printed to console. Maximum maybe 25? Same idea probably apply to dimensions, coordinates and attributes as well, pandas only shows 2 for reference, the first and last variables. **Minimal Complete Verifiable Example**: ```python import numpy as np import xarray as xr a = np.arange(0, 2000) b = np.core.defchararray.add(""long_variable_name"", a.astype(str)) data_vars = dict() for v in b: data_vars[v] = xr.DataArray( name=v, data=[3, 4], dims=[""time""], coords=dict(time=[0, 1]) ) ds = xr.Dataset(data_vars) # Everything above feels fast. Printing to console however takes about 13 seconds for me: print(ds) ``` **Anything else we need to know?**: Out of scope brainstorming: Though printing 2000 variables is probably madness for most people it is kind of nice to show all variables because you sometimes want to know what happened to a few other variables as well. Is there already an easy and fast way to create subgroup of the dataset, so we don' have to rely on the dataset printing everything to the console everytime? **Environment**:
Output of xr.show_versions() xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 libhdf5: 1.10.4 libnetcdf: None xarray: 0.16.2 pandas: 1.1.5 numpy: 1.17.5 scipy: 1.4.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2020.12.0 distributed: 2020.12.0 matplotlib: 3.3.2 cartopy: None seaborn: 0.11.1 numbagg: None pint: None setuptools: 51.0.0.post20201207 pip: 20.3.3 conda: 4.9.2 pytest: 6.2.1 IPython: 7.19.0 sphinx: 3.4.0
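As an interim workaround for the console spam (a sketch only — the cap of 25 is arbitrary, and this doesn't fix the repr itself): indexing a Dataset with a list of variable names returns a new Dataset with just those variables, so a quick printable subgroup of the `ds` from the example above is:

```python
# Select the first 25 variable names; the result is a small Dataset
# that prints quickly.
subset = ds[list(ds.data_vars)[:25]]
print(subset)
```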
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4736/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue