id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
2259316341,I_kwDOAMm_X86Gqm51,8965,Support concurrent loading of variables,2448579,open,0,,,4,2024-04-23T16:41:24Z,2024-04-29T22:21:51Z,,MEMBER,,,,"### Is your feature request related to a problem?

Today, users who want to load multiple variables in a DataArray or Dataset concurrently *have* to use dask. It struck me that it'd be pretty easy for `.load` to gain an `executor` kwarg that accepts anything that follows the [`concurrent.futures` executor](https://docs.python.org/3/library/concurrent.futures.html) interface, and parallelize this loop:

https://github.com/pydata/xarray/blob/b0036749542145794244dee4c4869f3750ff2dee/xarray/core/dataset.py#L853-L857","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8965/reactions"", ""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
2027147099,I_kwDOAMm_X854089b,8523,"tree-reduce the combine for `open_mfdataset(..., parallel=True, combine=""nested"")`",2448579,open,0,,,4,2023-12-05T21:24:51Z,2023-12-18T19:32:39Z,,MEMBER,,,,"### Is your feature request related to a problem?

When `parallel=True` and a distributed client is active, Xarray reads every file in parallel, constructs a Dataset per file with indexed coordinates loaded, and then sends all of that back to the ""head node"" for the combine. Instead, we can tree-reduce the combine ([example](https://gist.github.com/dcherian/345c81c69c3587873a89b49c949d1561)) by switching from `dask.delayed` to `dask.bag`, skipping the overhead of shipping 1000s of copies of an indexed coordinate back to the head node.

1. The downside is that the dask graph is ""worse"", but perhaps that shouldn't stop us.
2. I think this is only feasible for `combine=""nested""`.

cc @TomNicholas","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8523/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1603957501,I_kwDOAMm_X85fmnL9,7573,Add optional min versions to conda-forge recipe (`run_constrained`),2448579,closed,0,,,4,2023-02-28T23:12:15Z,2023-08-21T16:12:34Z,2023-08-21T16:12:21Z,MEMBER,,,,"### Is your feature request related to a problem?

I opened this PR to add minimum versions for our optional dependencies: https://github.com/conda-forge/xarray-feedstock/pull/84/files to prevent issues like #7467. I think we'd need a policy to choose which ones to list. Here's the current list:

```
run_constrained:
  - bottleneck >=1.3
  - cartopy >=0.20
  - cftime >=1.5
  - dask-core >=2022.1
  - distributed >=2022.1
  - flox >=0.5
  - h5netcdf >=0.13
  - h5py >=3.6
  - hdf5 >=1.12
  - iris >=3.1
  - matplotlib-base >=3.5
  - nc-time-axis >=1.4
  - netcdf4 >=1.5.7
  - numba >=0.55
  - pint >=0.18
  - scipy >=1.7
  - seaborn >=0.11
  - sparse >=0.13
  - toolz >=0.11
  - zarr >=2.10
```

Some examples to think about:

1. `iris` seems like a bad one to force. People might use Iris and Xarray independently, and Xarray shouldn't force a minimum version.
2. For backends, I arbitrarily kept `netcdf4`, `h5netcdf` and `zarr`.
3. It seems like we should keep array types: so `dask`, `sparse`, `pint` (see the version-check sketch below).
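Here's a minimal sketch of what enforcing such minimums could look like at runtime (the helper and the warning wording are hypothetical; the pins are copied from the proposed recipe):

```python
# Hypothetical helper: warn when an installed optional dependency is older
# than the proposed minimum. The pins below are a subset of the recipe list.
import warnings
from importlib.metadata import PackageNotFoundError, version

from packaging.version import Version

MIN_VERSIONS = {'dask': '2022.1', 'sparse': '0.13', 'pint': '0.18'}

def check_optional_minimums() -> None:
    for name, minimum in MIN_VERSIONS.items():
        try:
            installed = Version(version(name))
        except PackageNotFoundError:
            continue  # optional dependency absent: nothing to enforce
        if installed < Version(minimum):
            warnings.warn(
                f'{name}={installed} is older than the minimum '
                f'supported version ({minimum})'
            )
```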
### Describe the solution you'd like

_No response_

### Describe alternatives you've considered

_No response_

### Additional context

_No response_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7573/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1789989152,I_kwDOAMm_X85qsREg,7962,Better chunk manager error,2448579,closed,0,,,4,2023-07-05T17:27:25Z,2023-07-24T22:26:14Z,2023-07-24T22:26:13Z,MEMBER,,,,"### What happened?

I just ran into this error in an environment without dask.

```
TypeError: Could not find a Chunk Manager which recognises type 
```

I think we could easily recommend that the user install a package that provides `dask` by looking at `type(array).__name__`. This would make the message a lot friendlier.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7962/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1760733017,I_kwDOAMm_X85o8qdZ,7924,"Migrate from nbsphinx to myst, myst-nb",2448579,open,0,,,4,2023-06-16T14:17:41Z,2023-06-20T22:07:42Z,,MEMBER,,,,"### Is your feature request related to a problem?

I think we should switch to [MyST markdown](https://mystmd.org/) for our docs. I've been using MyST markdown and [MyST-NB](https://myst-nb.readthedocs.io/en/latest/index.html) in docs in other projects and it works quite well. Advantages:

1. We get HTML reprs in the docs ([example](https://cf-xarray.readthedocs.io/en/latest/selecting.html)), which is a big improvement (#6620).
2. I think many find markdown a lot easier to write than RST.

There's a tool to migrate RST to MyST ([RTD's migration guide](https://docs.readthedocs.io/en/stable/guides/migrate-rest-myst.html)).

### Describe the solution you'd like

_No response_

### Describe alternatives you've considered

_No response_

### Additional context

_No response_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7924/reactions"", ""total_count"": 5, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1119738354,I_kwDOAMm_X85Cvdny,6222,test packaging & distribution,2448579,closed,0,,,4,2022-01-31T17:42:40Z,2022-02-03T15:45:17Z,2022-02-03T15:45:17Z,MEMBER,,,,"### Is your feature request related to a problem?

It seems like we should have a test to make sure our dependencies are specified correctly.

### Describe the solution you'd like

For instance, we could add a step to the release workflow: https://github.com/pydata/xarray/blob/b09de8195a9e22dd35d1b7ed608ea15dad0806ef/.github/workflows/pypi-release.yaml#L34-L43 after `twine check`, where we pip install and then try to import xarray (see the sketch after this entry). Alternatively, we could have another test config in our regular CI to build + import. Thoughts? Is this excessive for a somewhat rare problem?

### Describe alternatives you've considered

_No response_

### Additional context

_No response_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6222/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
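A rough sketch of that install-and-import smoke test as a standalone script (the wheel location, env path, and helper name are assumptions, not an existing workflow step):

```python
# Hypothetical smoke test: install the freshly built wheel into a clean
# virtual environment and check that `import xarray` succeeds.
import subprocess
import venv
from pathlib import Path

def smoke_test(wheel_dir: str = 'dist') -> None:
    env_dir = Path('.smoke-env')
    venv.create(env_dir, with_pip=True)  # isolated environment
    python = env_dir / 'bin' / 'python'  # POSIX layout
    wheel = next(Path(wheel_dir).glob('xarray-*.whl'))
    subprocess.run([str(python), '-m', 'pip', 'install', str(wheel)], check=True)
    # the import fails loudly if runtime dependencies are mis-specified
    subprocess.run([str(python), '-c', 'import xarray'], check=True)

if __name__ == '__main__':
    smoke_test()
```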
1072473598,I_kwDOAMm_X84_7KX-,6051,"Check for just ... in stack etc, and raise with a useful error message",2448579,closed,0,,,4,2021-12-06T18:35:27Z,2022-01-03T23:05:23Z,2022-01-03T23:05:23Z,MEMBER,,,,"**Is your feature request related to a problem? Please describe.**

The following doesn't work:

```python
import xarray as xr

da = xr.DataArray([[1, 2], [1, 2]], dims=(""x"", ""y""))
da.stack(flat=...)
```

**Describe the solution you'd like**

This could be equivalent to

```python
da.stack(flat=da.dims)
```

I *think* using `ds.dims` should make it work for Datasets too.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6051/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
502149236,MDU6SXNzdWU1MDIxNDkyMzY=,3371,Add xr.unify_chunks top level method,2448579,closed,0,,,4,2019-10-03T15:49:09Z,2021-06-16T14:56:59Z,2021-06-16T14:56:58Z,MEMBER,,,,"This should handle multiple DataArrays and Datasets. Implemented in #3276 as `Dataset.unify_chunks` and `DataArray.unify_chunks`.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3371/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
636666706,MDU6SXNzdWU2MzY2NjY3MDY=,4146,sparse upstream-dev test failures,2448579,closed,0,,,4,2020-06-11T02:20:11Z,2021-03-17T23:10:45Z,2020-06-16T16:00:10Z,MEMBER,,,,"Full log here: https://dev.azure.com/xarray/xarray/_build/results?buildId=3023&view=logs&jobId=2280efed-fda1-53bd-9213-1fa8ec9b4fa8&j=2280efed-fda1-53bd-9213-1fa8ec9b4fa8&t=175181ee-1928-5a6b-f537-168f7a8b7c2d

Here are three of the errors:

```
/usr/share/miniconda/envs/xarray-tests/lib/python3.8/site-packages/sparse/_coo/umath.py:739: SystemError
_ test_variable_method[obj.where(*(), **{'cond': \n})-True] _
TypeError: expected dtype object, got 'numpy.dtype[uint64]'
```

```
def _match_coo(*args, **kwargs):
    """"""
    Matches the coordinates for any number of input :obj:`COO` arrays.
    Equivalent to ""sparse"" broadcasting for all arrays.

    Parameters
    ----------
    args : Tuple[COO]
        The input :obj:`COO` arrays.
    return_midx : bool
        Whether to return matched indices or matched arrays. Matching only
        supported for two arrays. ``False`` by default.
    cache : dict
        Cache of things already matched. No cache by default.

    Returns
    -------
    matched_idx : List[ndarray]
        The indices of matched elements in the original arrays. Only returned
        if ``return_midx`` is ``True``.
    matched_arrays : List[COO]
        The expanded, matched :obj:`COO` objects. Only returned if
        ``return_midx`` is ``False``.
"""""" from .core import COO from .common import linear_loc cache = kwargs.pop(""cache"", None) return_midx = kwargs.pop(""return_midx"", False) broadcast_shape = kwargs.pop(""broadcast_shape"", None) if kwargs: linear = [idx[s] for idx, s in zip(linear, sorted_idx)] > matched_idx = _match_arrays(*linear) E SystemError: CPUDispatcher() returned a result with an error set ``` ``` _______________________________ test_dask_token ________________________________ @requires_dask def test_dask_token(): import dask s = sparse.COO.from_numpy(np.array([0, 0, 1, 2])) # https://github.com/pydata/sparse/issues/300 s.__dask_tokenize__ = lambda: dask.base.normalize_token(s.__dict__) a = DataArray(s) t1 = dask.base.tokenize(a) t2 = dask.base.tokenize(a) t3 = dask.base.tokenize(a + 1) assert t1 == t2 assert t3 != t2 assert isinstance(a.data, sparse.COO) ac = a.chunk(2) t4 = dask.base.tokenize(ac) t5 = dask.base.tokenize(ac + 1) assert t4 != t5 > assert isinstance(ac.data._meta, sparse.COO) E AssertionError: assert False E + where False = isinstance(array([], dtype=int64), ) E + where array([], dtype=int64) = dask.array, shape=(4,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>._meta E + where dask.array, shape=(4,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray> = \ndask.array, shape=(4,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>\nDimensions without coordinates: dim_0.data E + and = sparse.COO ``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4146/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 636665269,MDU6SXNzdWU2MzY2NjUyNjk=,4145,Fix matplotlib in upstream-dev test config,2448579,closed,0,,,4,2020-06-11T02:15:52Z,2020-06-12T09:11:31Z,2020-06-12T09:11:31Z,MEMBER,,,,"From @keewis comment in #4138 > I just noticed that the rackcdn.org repository doesn't have matplotlib>=3.2.0, so since about late February we don't test against matplotlib upstream anymore.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4145/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 398152613,MDU6SXNzdWUzOTgxNTI2MTM=,2667,datetime interpolation doesn't work,2448579,closed,0,,,4,2019-01-11T06:45:55Z,2019-02-11T09:47:09Z,2019-02-11T09:47:09Z,MEMBER,,,,"#### Code Sample, a copy-pastable example if possible This code doesn't work anymore on master. 
```python
a = xr.DataArray(np.arange(21).reshape(3, 7),
                 dims=['x', 'time'],
                 coords={'x': [1, 2, 3],
                         'time': pd.date_range('01-01-2001', periods=7, freq='D')})
xi = xr.DataArray(np.linspace(1, 3, 50),
                  dims=['time'],
                  coords={'time': pd.date_range('01-01-2001', periods=50, freq='H')})
a.interp(x=xi, time=xi.time)
```

#### Problem description

The above code now raises the error

```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
 in 
      6 dims=['time'],
      7 coords={'time': pd.date_range('01-01-2001', periods=50, freq='H')})
----> 8 a.interp(x=xi, time=xi.time)

~/work/python/xarray/xarray/core/dataarray.py in interp(self, coords, method, assume_sorted, kwargs, **coords_kwargs)
   1032         ds = self._to_temp_dataset().interp(
   1033             coords, method=method, kwargs=kwargs, assume_sorted=assume_sorted,
-> 1034             **coords_kwargs)
   1035         return self._from_temp_dataset(ds)
   1036 

~/work/python/xarray/xarray/core/dataset.py in interp(self, coords, method, assume_sorted, kwargs, **coords_kwargs)
   2008                 in indexers.items() if k in var.dims}
   2009             variables[name] = missing.interp(
-> 2010                 var, var_indexers, method, **kwargs)
   2011         elif all(d not in indexers for d in var.dims):
   2012             # keep unrelated object array

~/work/python/xarray/xarray/core/missing.py in interp(var, indexes_coords, method, **kwargs)
    468     new_dims = broadcast_dims + list(destination[0].dims)
    469     interped = interp_func(var.transpose(*original_dims).data,
--> 470                            x, destination, method, kwargs)
    471 
    472     result = Variable(new_dims, interped, attrs=var.attrs)

~/work/python/xarray/xarray/core/missing.py in interp_func(var, x, new_x, method, kwargs)
    535                           new_axis=new_axis, drop_axis=drop_axis)
    536 
--> 537     return _interpnd(var, x, new_x, func, kwargs)
    538 
    539 

~/work/python/xarray/xarray/core/missing.py in _interpnd(var, x, new_x, func, kwargs)
    558     var = var.transpose(range(-len(x), var.ndim - len(x)))
    559     # stack new_x to 1 vector, with reshape
--> 560     xi = np.stack([x1.values.ravel() for x1 in new_x], axis=-1)
    561     rslt = func(x, var, xi, **kwargs)
    562     # move back the interpolation axes to the last position

~/work/python/xarray/xarray/core/missing.py in (.0)
    558     var = var.transpose(range(-len(x), var.ndim - len(x)))
    559     # stack new_x to 1 vector, with reshape
--> 560     xi = np.stack([x1.values.ravel() for x1 in new_x], axis=-1)
    561     rslt = func(x, var, xi, **kwargs)
    562     # move back the interpolation axes to the last position

AttributeError: 'numpy.ndarray' object has no attribute 'values'
```

I think the issue is this line, which returns a numpy array instead of a Variable.
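A possible defensive tweak, sketched below (a guess at a fix, not the patch that actually closed this issue): coerce each target with `np.asarray`, which accepts both Variables and bare ndarrays.

```python
import numpy as np
import xarray as xr

# mixed entries: a Variable plus a bare ndarray, mimicking the failure mode
new_x = [xr.Variable(('time',), np.arange(3.0)), np.arange(3.0)]

# np.asarray works on both, unlike the `.values` attribute access above
xi = np.stack([np.asarray(x1).ravel() for x1 in new_x], axis=-1)
```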
The offending line was added in the `coarsen` PR (cc @fujiisoup): https://github.com/pydata/xarray/blob/d4c46829b283ab7e7b7db8b86dae77861ce68f3c/xarray/core/utils.py#L636","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2667/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
373955021,MDU6SXNzdWUzNzM5NTUwMjE=,2510,Dataset-wide _FillValue,2448579,closed,0,,,4,2018-10-25T13:44:46Z,2018-10-25T17:39:35Z,2018-10-25T17:37:26Z,MEMBER,,,,"I'm looking at a netCDF file that has the variable

```
float T_20(time, depth, lat, lon) ;
    T_20:name = ""T"" ;
    T_20:long_name = ""TEMPERATURE (C)"" ;
    T_20:generic_name = ""temp"" ;
    T_20:FORTRAN_format = ""f10.2"" ;
    T_20:units = ""C"" ;
    T_20:epic_code = 20 ;
```

and global attributes

```
// global attributes:
    :platform_code = ""8n90e"" ;
    :site_code = ""8n90e"" ;
    :wmo_platform_code = 23007 ;
    :array = ""RAMA"" ;
    :Request_for_acknowledgement = ""If you use these data in publications or presentations, please acknowledge the GTMBA Project Office of NOAA/PMEL. Also, we would appreciate receiving a preprint and/or reprint of publications utilizing the data for inclusion in our bibliography. Relevant publications should be sent to: GTMBA Project Office, NOAA/Pacific Marine Environmental Laboratory, 7600 Sand Point Way NE, Seattle, WA 98115"" ;
    :Data_Source = ""Global Tropical Moored Buoy Array Project Office/NOAA/PMEL"" ;
    :File_info = ""Contact: Dai.C.McClurg@noaa.gov"" ;
    :missing_value = 1.e+35f ;
    :_FillValue = 1.e+35f ;
    :CREATION_DATE = ""13:05 28-JUL-2017"" ;
    :_Format = ""classic"" ;
```

#### Problem description

In this case, the `_FillValue` and `missing_value` attributes are set for the entire dataset and not on each individual variable. `decode_cf_variable` thus fails to insert NaNs. I'm not sure that this is standards-compliant, but is this something we could support? A workaround sketch follows this report.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2510/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
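A minimal workaround sketch for that dataset-level fill value (the file name is hypothetical, and this is a user-side workaround rather than what xarray implements): open without decoding, push the global attribute down to each variable, then decode.

```python
import xarray as xr

ds = xr.open_dataset('rama_8n90e.nc', decode_cf=False)  # hypothetical path
fill = ds.attrs.get('_FillValue')
if fill is not None:
    # note: this also tags coordinate variables; loop over ds.data_vars
    # instead if that matters
    for var in ds.variables.values():
        var.attrs.setdefault('_FillValue', fill)
ds = xr.decode_cf(ds)  # fill values now become NaNs
```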