id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 1680031454,I_kwDOAMm_X85kIz7e,7780,mypy does not understand output of binary operations,14371165,open,0,,,8,2023-04-23T13:38:55Z,2024-04-28T20:07:04Z,,MEMBER,,,,"### What happened? When doing operations on numpy arrays and xarray variables mypy does not understand that the output is always a xarray variable regardless of the order. See example. ### What did you expect to happen? mypy to pass for the example code. ### Minimal Complete Verifiable Example ```Python import numpy as np import xarray as xr x = np.array([1, 2, 4]) v = xr.Variable([""x""], x) # numpy first: xv = x * v xv.values # error: ""ndarray[Any, dtype[bool_]]"" has no attribute ""values"" [attr-defined] if isinstance(xv, xr.Variable): xv.values # variable first: vx = v * x vx.values if isinstance(vx, xr.Variable): vx.values ``` ### MVCE confirmation - [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [X] Complete example — the example is self-contained, including all data and the text of any traceback. - [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [X] New issue — a search of GitHub Issues suggests this is not a duplicate. ### Relevant log output _No response_ ### Anything else we need to know? Seen in #7741 ### Environment
xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.9.16 (main, Mar 8 2023, 10:39:24) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel byteorder: little LC_ALL: None LANG: en libhdf5: 1.10.6 libnetcdf: None xarray: 2023.4.2 pandas: 2.0.0 numpy: 1.23.5 scipy: 1.10.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: None dask: 2023.4.0 distributed: 2023.4.0 matplotlib: 3.5.3 cartopy: None seaborn: 0.12.2 numbagg: None fsspec: 2023.4.0 cupy: None pint: None sparse: None flox: None numpy_groupies: None setuptools: 67.7.1 pip: 23.1.1 conda: 23.3.1 pytest: 7.3.1 mypy: 1.2.0 IPython: 8.12.0 sphinx: 6.1.3
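Regarding the example above, here is a toy sketch of how the runtime result and the statically inferred type can disagree for a binary operation. `Arr` and `Wrapped` are hypothetical stand-ins, not the actual numpy or xarray classes, and `typing.cast` is only one possible workaround, assuming the reflected operator really does win at runtime.

```python
from __future__ import annotations

from typing import Any, cast


class Arr:
    '''Hypothetical stand-in for an array type used as the left operand.'''

    def __mul__(self, other: Any) -> Arr:
        # The annotation promises Arr, which is all a static checker sees.
        if isinstance(other, Wrapped):
            # At runtime, defer so Python tries Wrapped.__rmul__ instead.
            return NotImplemented
        return Arr()


class Wrapped:
    '''Hypothetical stand-in for a wrapper type that should always win.'''

    def __mul__(self, other: Any) -> Wrapped:
        return Wrapped()

    def __rmul__(self, other: Any) -> Wrapped:
        return Wrapped()


left_first = Arr() * Wrapped()   # checker infers Arr; runtime gives Wrapped
right_first = Wrapped() * Arr()  # checker and runtime agree on Wrapped

# One possible workaround: state the runtime type explicitly for the checker.
left_first_typed = cast(Wrapped, left_first)
```

In this sketch both orderings produce a `Wrapped` instance at runtime, while a checker that only reads the annotation on `Arr.__mul__` would report `Arr` for the first one.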
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7780/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 2215603817,I_kwDOAMm_X86ED25p,8892,ffill's tolerance argument can be strings,14371165,open,0,,,1,2024-03-29T15:49:40Z,2024-04-02T01:50:34Z,,MEMBER,,,,"### What happened? `ffill`, `bfill` `reindex` etc. have tolerance arguments that also supports strings. And we test for it here: https://github.com/pydata/xarray/blob/2120808bbe45f3d4f0b6a01cd43bac4df4039092/xarray/tests/test_groupby.py#L2016-L2025 But our typing assumes it's floats only: https://github.com/pydata/xarray/blob/2120808bbe45f3d4f0b6a01cd43bac4df4039092/xarray/core/resample.py#L69-L94 ### What did you expect to happen? Since our pytests pass, mypy should pass as well. ### Minimal Complete Verifiable Example ```python import numpy as np import pandas as pd import xarray as xr # https://github.com/pydata/xarray/blob/2120808bbe45f3d4f0b6a01cd43bac4df4039092/xarray/tests/test_groupby.py#L2016 # Test tolerance keyword for upsample methods bfill, pad, nearest times = pd.date_range(""2000-01-01"", freq=""1D"", periods=2) times_upsampled = pd.date_range(""2000-01-01"", freq=""6h"", periods=5) array = xr.DataArray(np.arange(2), [(""time"", times)]) # Forward fill actual = array.resample(time=""6h"").ffill(tolerance=""12h"") expected = xr.DataArray([0.0, 0.0, 0.0, np.nan, 1.0], [(""time"", times_upsampled)]) xr.testing.assert_identical(expected, actual) ``` ### Environment master ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8892/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1826978659,I_kwDOAMm_X85s5Xtj,8028,Setting datarrays with non-dimension coordinates errors,14371165,open,0,,,6,2023-07-28T19:20:31Z,2023-08-10T15:25:23Z,,MEMBER,,,,"### What happened? 
I'm not sure if this is a bug or a feature, but I was expecting this example to work since the new coord is just a slight rewrite of the original dimension coordinate: ```python import xarray as xr ds = xr.tutorial.open_dataset(""air_temperature"") # Change the first time value: ds[""air_new""] = ds.air.copy() air_new_changed = ds.air_new[{""time"": 0}] * 3 ds.air_new.loc[air_new_changed.coords] = air_new_changed # Works! :) # Add another coord along the time axis and change # the first time value: ds[""air_new""] = ds.air.copy().assign_coords( {""time_float"": ds.time.astype(float)} ) air_new_changed = ds.air_new[{""time"": 0}] * 4 ds.air_new.loc[air_new_changed.coords] = air_new_changed # Error! :( Traceback (most recent call last): Cell In[25], line 5 ds.air_new.loc[air_new_changed.coords] = air_new_changed File ~\AppData\Local\mambaforge\envs\jw\lib\site-packages\xarray\core\dataarray.py:222 in __setitem__ dim_indexers = map_index_queries(self.data_array, key).dim_indexers File ~\AppData\Local\mambaforge\envs\jw\lib\site-packages\xarray\core\indexing.py:182 in map_index_queries grouped_indexers = group_indexers_by_index(obj, indexers, options) File ~\AppData\Local\mambaforge\envs\jw\lib\site-packages\xarray\core\indexing.py:144 in group_indexers_by_index raise KeyError(f""no index found for coordinate {key!r}"") KeyError: ""no index found for coordinate 'time_float'"" ``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8028/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1537068105,I_kwDOAMm_X85bncxJ,7450,Backend array documentation typo,14371165,open,0,,,0,2023-01-17T21:37:26Z,2023-01-17T21:56:12Z,,MEMBER,,,,"### What happened? 
https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html#indexing-examples I believe there's a typo in the BASIC indexing support example: ```python # shall support integers backend_array._raw_indexing_method(1, 1) ``` Should be: ```python # shall support integers backend_array._raw_indexing_method((1, 1)) ``` Suggested fixes: * Make sure it is a typo. * Create a valid custom MyBackendArray and initialize it, so it is easier to tell whether it's a typo. * Add type hinting so mypy can catch these errors more easily. ### What did you expect to happen? _No response_ ### Minimal Complete Verifiable Example _No response_ ### MVCE confirmation - [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [ ] Complete example — the example is self-contained, including all data and the text of any traceback. - [ ] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [ ] New issue — a search of GitHub Issues suggests this is not a duplicate. ### Relevant log output _No response_ ### Anything else we need to know? _No response_ ### Environment
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7450/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1410534774,I_kwDOAMm_X85UEw12,7170,Scatter plots overlap in facetgrid in 3d,14371165,open,0,,,0,2022-10-16T16:06:56Z,2022-10-16T16:08:55Z,,MEMBER,,,,"### What happened? Any matplotlib gurus have any ideas how to nicely fit 3d plots in facetgrid? ```python ds = xr.tutorial.scatter_example_dataset(seed=42) fg = ds.plot.scatter(x=""A"", y=""B"", z=""z"", hue=""y"", markersize=""x"", row=""x"", col=""w"") ``` ![image](https://user-images.githubusercontent.com/14371165/196045673-ad6322d4-63f4-4f94-aca6-a4cfdc682fea.png) 2d looks fine: ```python fg = ds.plot.scatter(x=""A"", y=""B"", hue=""y"", markersize=""x"", row=""x"", col=""w"") ``` ![image](https://user-images.githubusercontent.com/14371165/196045774-5d850317-c58a-4fac-ae4b-04660a140fd1.png) ### What did you expect to happen? No plots overlapping each other, even if rotating the plots. ### Minimal Complete Verifiable Example _No response_ ### MVCE confirmation - [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [x] Complete example — the example is self-contained, including all data and the text of any traceback. - [x] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [x] New issue — a search of GitHub Issues suggests this is not a duplicate. ### Relevant log output _No response_ ### Anything else we need to know? _No response_ ### Environment
xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:30:19) [MSC v.1929 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel byteorder: little LC_ALL: None LANG: en LOCALE: ('Swedish_Sweden', '1252') libhdf5: 1.12.2 libnetcdf: 4.8.1 xarray: 2022.9.1.dev266+gbd01f9cc.d20221006 pandas: 1.5.0 numpy: 1.23.3 scipy: 1.9.1 netCDF4: 1.6.1 pydap: installed h5netcdf: 1.0.2 h5py: 3.7.0 Nio: None zarr: 2.13.2 cftime: 1.6.2 nc_time_axis: 1.4.1 PseudoNetCDF: 3.2.2 rasterio: 1.3.2 cfgrib: None iris: 3.3.0 bottleneck: 1.3.5 dask: 2022.9.2 distributed: 2022.9.2 matplotlib: 3.6.0 cartopy: 0.21.0 seaborn: 0.12.0 numbagg: 0.2.1 fsspec: 2022.8.2 cupy: None pint: 0.19.2 sparse: 0.13.0 flox: 0.5.10.dev21+g91b6e19 numpy_groupies: 0.9.19 setuptools: 65.4.1 pip: 22.2.2 conda: None pytest: 7.1.3 IPython: 7.33.0 sphinx: 5.2.3 C:\Users\J.W\anaconda3\envs\xarray-tests\lib\site-packages\_distutils_hack\__init__.py:33: UserWarning: Setuptools is replacing distutils. warnings.warn(""Setuptools is replacing distutils."")
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7170/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1377088142,I_kwDOAMm_X85SFLKO,7050,Type annotation guidelines,14371165,open,0,,,2,2022-09-18T15:04:54Z,2022-09-23T01:55:19Z,,MEMBER,,,,"Dask has a pretty nice guideline for type hinting, see https://github.com/dask/community/issues/255. Notable for us is to avoid adding typing in docstrings to avoid duplicating information.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7050/reactions"", ""total_count"": 4, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 597785475,MDU6SXNzdWU1OTc3ODU0NzU=,3962,"Interpolation - Support extrapolation method ""clip""",14371165,open,0,,,4,2020-04-10T09:07:13Z,2022-05-02T13:42:24Z,,MEMBER,,,,"Hello, I would like an option in `da.interp() `that instead of returning NaNs during extrapolation returns the data corresponding to the end of the breakpoint data set range. One way to do this is to limit the new coordinates to the array coordinates minimum and maximum value, I did a simple example with this solution down below. I think this is a rather safe way as we are just modifying the inputs to all the various interpolation classes that xarray is using at the moment. But it does look a little weird when printing the extrapolated value, the coordinates shows the limited value instead of the requested coordinates. Maybe this can be handled elegantly somewhere in the source code? 
MATLAB uses this quite frequently in their interpolation functions: * https://mathworks.com/help/simulink/ug/methods-for-estimating-missing-points.html * https://mathworks.com/help/simulink/slref/2dlookuptable.html #### MCVE Code Sample ```python import numpy as np import xarray as xr def interp(da, coords, extrapolation='clip'): """""" Linear interpolation that clips the inputs to the coords min and max value. Parameters ---------- da : DataArray DataArray to interpolate. coords : dict Coordinates for the interpolated value. """""" if extrapolation == 'clip': for k, v in da.coords.items(): coords[k] = np.maximum(coords[k], np.min(v.values)) coords[k] = np.minimum(coords[k], np.max(v.values)) return da.interp(coords) # Create coordinates: x = np.linspace(1000, 6000, 4) y = np.linspace(100, 1200, 3) # Create data: X = np.meshgrid(*[x, y], indexing='ij') data = X[0] * X[1] # Create DataArray: da = xr.DataArray(data=data, coords=[('x', x), ('y', y)], name='data') # Attempt to extrapolate: datai = interp(da, {'x': 7000, 'y': 375}) ``` #### Expected Output ````python print(datai) array(2250000.) Coordinates: x float64 6e+03 y float64 375.0 ```` #### Versions
Output of `xr.show_versions()` INSTALLED VERSIONS ------------------ commit: None python: 3.7.7 (default, Mar 23 2020, 23:19:08) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel byteorder: little LC_ALL: None LANG: en LOCALE: None.None libhdf5: 1.10.4 libnetcdf: None xarray: 0.15.0 pandas: 1.0.3 numpy: 1.18.1 scipy: 1.4.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.13.0 distributed: 2.13.0 matplotlib: 3.1.3 cartopy: None seaborn: 0.10.0 numbagg: None setuptools: 46.1.3.post20200330 pip: 20.0.2 conda: 4.8.3 pytest: 5.4.1 IPython: 7.13.0 sphinx: 2.4.4
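Coming back to the clipping idea described at the top of this issue, here is a minimal pure-Python sketch of clip extrapolation in one dimension, without touching xarray internals. The helper names `clip` and `interp_clip` and the breakpoint values are made up for this illustration. The point is only that clamping the query to the breakpoint range before interpolating holds the end values.

```python
from bisect import bisect_right


def clip(value, lower, upper):
    # Clamp value into the closed interval [lower, upper].
    return max(lower, min(value, upper))


def interp_clip(x, xp, fp):
    '''Linear interpolation that holds the end values outside xp.

    xp must be sorted ascending and contain at least two points.
    '''
    x = clip(x, xp[0], xp[-1])
    # Index of the left breakpoint of the segment containing x.
    i = min(bisect_right(xp, x) - 1, len(xp) - 2)
    t = (x - xp[i]) / (xp[i + 1] - xp[i])
    return fp[i] + t * (fp[i + 1] - fp[i])


xp = [1000.0, 3500.0, 6000.0]
fp = [1.0, 2.0, 4.0]

inside = interp_clip(2250.0, xp, fp)  # halfway along the first segment
above = interp_clip(7000.0, xp, fp)   # query clipped to xp[-1]
below = interp_clip(0.0, xp, fp)      # query clipped to xp[0]
```

Because only the query point is modified, the interpolation routine itself stays unchanged, which is the same reason the approach above felt safe for the existing interpolation classes.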
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3962/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 957201551,MDU6SXNzdWU5NTcyMDE1NTE=,5655,Allow .attrs to use dict-likes,14371165,open,0,,,2,2021-07-31T08:31:55Z,2022-01-09T03:32:04Z,,MEMBER,,,," **Is your feature request related to a problem? Please describe.** Reading attributes from h5py-files is rather slow. So instead of retrieving it immediately I wanted to create a lazy dict-class that only retrieves the attribute values when necessary. But this is difficult to achieve since xarray keeps forcing the attrs to dicts in a lot of places. **Describe the solution you'd like** * Replace in https://github.com/pydata/xarray/blob/dddac11b01330791ffab4dfc72d226e71821973e/xarray/core/variable.py#L865 and https://github.com/pydata/xarray/blob/dddac11b01330791ffab4dfc72d226e71821973e/xarray/core/dataset.py#L798 with a `asdict(value)` function that checks if the input is a valid dict-like, if not convert to dict. Things that might be good to check: * `MutableMapping` * `hasattr(dict_like, ""copy"")` * `isinstance(dict_like, dict) == True` * Remove unneccessary conversions to dict. For example https://github.com/pydata/xarray/blob/dddac11b01330791ffab4dfc72d226e71821973e/xarray/core/merge.py#L523 should not be necessary as attrs from variables/dataarrays/datasets have already been forced to dicts when they were initialized. **Describe alternatives you've considered** * One could lazify with dicts as well, for example by replacing the value with a function. This however won't look good in reprs, that's why having a convienence class is nice. * `dict(LazyDict)` always forces to dict, it does not let it pass through unchanged even if `isinstance(LazyDict, dict) == True`. 
Interesting reading: https://stackoverflow.com/questions/16669367/setup-dictionary-lazily https://stackoverflow.com/questions/3387691/how-to-perfectly-override-a-dict ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5655/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 779938616,MDU6SXNzdWU3Nzk5Mzg2MTY=,4770,Interpolation always returns floats,14371165,open,0,,,1,2021-01-06T03:16:43Z,2021-01-12T16:30:54Z,,MEMBER,,,," **What happened**: When interpolating datasets, integer arrays are forced to floats. **What you expected to happen**: To retain the same dtype after interpolation. **Minimal Complete Verifiable Example**: ```python import numpy as np import dask.array as da import xarray as xr a = np.arange(0, 2) b = np.core.defchararray.add(""long_variable_name"", a.astype(str)) coords = dict(time=da.array([0, 1])) data_vars = dict() for v in b: data_vars[v] = xr.DataArray( name=v, data=da.array([0, 1], dtype=int), dims=[""time""], coords=coords, ) ds1 = xr.Dataset(data_vars) print(ds1) Out[35]: Dimensions: (time: 4) Coordinates: * time (time) float64 0.0 0.5 1.0 2.0 Data variables: long_variable_name0 (time) int32 dask.array long_variable_name1 (time) int32 dask.array # Interpolate: ds1 = ds1.interp( time=da.array([0, 0.5, 1, 2]), assume_sorted=True, method=""linear"", kwargs=dict(fill_value=""extrapolate""), ) # dask array thinks it's an integer array: print(ds1.long_variable_name0) Out[55]: dask.array Coordinates: * time (time) float64 0.0 0.5 1.0 2.0 # But once computed it turns out to be a float array: print(ds1.long_variable_name0.compute()) Out[38]: array([0. , 0.5, 1. , 2. ]) Coordinates: * time (time) float64 0.0 0.5 1.0 2.0 ``` **Anything else we need to know?**: An easy first step is to also force `np.float_` in `da.blockwise` in `missing.interp_func`. 
The more difficult way is to somehow change the DataArrays back to the old dtype without affecting performance. As a test, I simply added `.astype()` to the returned value in `missing.interp`, and it doubled the calculation time. I was thinking the conversion to floats in scipy could be avoided altogether by adding a (non-)public option to ignore any dtype checks and just let the user handle the ""unsafe"" interpolations. Related: https://github.com/scipy/scipy/issues/11093 **Environment**:
Output of xr.show_versions() xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows libhdf5: 1.10.4 libnetcdf: None xarray: 0.16.2 pandas: 1.1.5 numpy: 1.17.5 scipy: 1.4.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2020.12.0 distributed: 2020.12.0 matplotlib: 3.3.2 cartopy: None seaborn: 0.11.1 numbagg: None pint: None setuptools: 51.0.0.post20201207 pip: 20.3.3 conda: 4.9.2 pytest: 6.2.1 IPython: 7.19.0 sphinx: 3.4.0
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4770/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue