github: issues: 187 rows where repo = 13221727, type = "issue" and user = 2448579 sorted by updated

187 rows where repo = 13221727, type = "issue" and user = 2448579 sorted by updated_at descending

id	node_id	number	title	user	state	assignee	comments	created_at	updated_at	closed_at	author_association	body	reactions	state_reason	repo	type
1915997507	I_kwDOAMm_X85yM81D	8238	NamedArray tracking issue	dcherian 2448579	open		12	2023-09-27T17:07:58Z	2024-04-30T12:49:17Z		MEMBER	@andersy005 I think it would be good to keep a running list of NamedArray tasks. I'll start with a rough sketch, please update/edit as you like. [x] Refactor out `NamedArray` base class (#8075) [x] publicize design doc: Scientific Python \| Pangeo \| NumPy Mailing List [ ] Migrate `VariableArithmetic` to `NamedArrayArithmetic` (#8244) [ ] Migrate ExplicitlyIndexed array classes to array protocols [x] Migrate from `Indexer` objects to `.oindex` and `.vindex` on ExplicitlyIndexed array classes [ ] https://github.com/pydata/xarray/pull/8870 [ ] Migrate unary ops [ ] Migrate binary ops [ ] Migrate nanops.py [x] Avoid "injecting" reduce methods potentially by using `generate_reductions.py`? (#8304) [ ] reprs and `formatting.py` [x] `parallelcompat.py` [ ] `pycompat.py` (#8244) [ ] https://github.com/pydata/xarray/pull/8276 [ ] have `test_variable.py` test both NamedArray and Variable [x] Arrays with unknown shape #8291 [ ] https://github.com/pydata/xarray/issues/8306 [ ] https://github.com/pydata/xarray/issues/8310 [ ] https://github.com/pydata/xarray/issues/8333 [ ] Try to preserve imports from `xarray.core/` by importing `namedarray` functionality into `xarray.core/*` xref #3981	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8238/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
2259316341	I_kwDOAMm_X86Gqm51	8965	Support concurrent loading of variables	dcherian 2448579	open		4	2024-04-23T16:41:24Z	2024-04-29T22:21:51Z		MEMBER	Is your feature request related to a problem? Today if users have to concurrently load multiple variables in a DataArray or Dataset, they have to use dask. It struck me that it'd be pretty easy for `.load` to gain an `executor` kwarg that accepts anything that follows the `concurrent.futures` executor interface, and parallelize this loop. https://github.com/pydata/xarray/blob/b0036749542145794244dee4c4869f3750ff2dee/xarray/core/dataset.py#L853-L857	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8965/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1574694462	I_kwDOAMm_X85d2-4-	7513	intermittent failures with h5netcdf, h5py on macos	dcherian 2448579	closed		5	2023-02-07T16:58:43Z	2024-04-28T23:35:21Z	2024-04-28T23:35:21Z	MEMBER	What is your issue? cc @hmaarrfk @kmuehlbauer Passed: https://github.com/pydata/xarray/actions/runs/4115923717/jobs/7105298426 Failed: https://github.com/pydata/xarray/actions/runs/4115946392/jobs/7105345290 Versions: `h5netcdf 1.1.0 pyhd8ed1ab_0 conda-forge h5py 3.8.0 nompi_py310h5555e59_100 conda-forge hdf4 4.2.15 h7aa5921_5 conda-forge hdf5 1.12.2 nompi_h48135f9_101 conda-forge` ``` =================================== FAILURES =================================== ___ test_open_mfdataset_manyfiles[h5netcdf-20-True-5-5] ______ [gw1] darwin -- Python 3.10.9 /Users/runner/micromamba-root/envs/xarray-tests/bin/python readengine = 'h5netcdf', nfiles = 20, parallel = True, chunks = 5 file_cache_maxsize = 5 @requires_dask @pytest.mark.filterwarnings("ignore:use make_scale(name) instead") def test_open_mfdataset_manyfiles( readengine, nfiles, parallel, chunks, file_cache_maxsize ): # skip certain combinations skip_if_not_engine(readengine) if ON_WINDOWS: pytest.skip("Skipping on Windows") randdata = np.random.randn(nfiles) original = Dataset({"foo": ("x", randdata)}) # test standard open_mfdataset approach with too many files with create_tmp_files(nfiles) as tmpfiles: writeengine = readengine if readengine != "pynio" else "netcdf4" # split into multiple sets of temp files for ii in original.x.values: subds = original.isel(x=slice(ii, ii + 1)) if writeengine != "zarr": subds.to_netcdf(tmpfiles[ii], engine=writeengine) else: # if writeengine == "zarr": subds.to_zarr(store=tmpfiles[ii]) # check that calculation on opened datasets works properly `with open_mfdataset( tmpfiles, combine="nested", concat_dim="x", engine=readengine, parallel=parallel, chunks=chunks if (not chunks and readengine != "zarr") else "auto", ) as actual:` /Users/runner/work/xarray/xarray/xarray/tests/test_backends.py:3267: 
/Users/runner/work/xarray/xarray/xarray/backends/api.py:991: in open_mfdataset datasets, closers = dask.compute(datasets, closers) /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/base.py:599: in compute results = schedule(dsk, keys, **kwargs) /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/threaded.py:89: in get results = get_async( /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py:511: in get_async raise_exception(exc, tb) /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py:319: in reraise raise exc /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/local.py:224: in execute_task result = _execute_task(task, data) /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/core.py:119: in _execute_task return func(*(_execute_task(a, cache) for a in args)) /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/dask/utils.py:72: in apply return func(*args, **kwargs) /Users/runner/work/xarray/xarray/xarray/backends/api.py:526: in open_dataset backend_ds = backend.open_dataset( /Users/runner/work/xarray/xarray/xarray/backends/h5netcdf_.py:417: in open_dataset ds = store_entrypoint.open_dataset( /Users/runner/work/xarray/xarray/xarray/backends/store.py:32: in open_dataset vars, attrs = store.load() /Users/runner/work/xarray/xarray/xarray/backends/common.py:129: in load (decode_variable_name(k), v) for k, v in self.get_variables().items() /Users/runner/work/xarray/xarray/xarray/backends/h5netcdf_.py:220: in get_variables return FrozenDict( /Users/runner/work/xarray/xarray/xarray/core/utils.py:471: in FrozenDict return Frozen(dict(*args, **kwargs)) /Users/runner/work/xarray/xarray/xarray/backends/h5netcdf_.py:221: in <genexpr> (k, self.open_store_variable(k, v)) for k, v in self.ds.variables.items()
/Users/runner/work/xarray/xarray/xarray/backends/h5netcdf_.py:200: in open_store_variable elif var.compression is not None: /Users/runner/micromamba-root/envs/xarray-tests/lib/python3.10/site-packages/h5netcdf/core.py:394: in compression return self._h5ds.compression self = <[AttributeError("'NoneType' object has no attribute '_root'") raised in repr()] Variable object at 0x151378970> `@property def _h5ds(self): # Always refer to the root file and store not h5py object # subclasses:` `return self._root._h5file[self._h5path]` E AttributeError: 'NoneType' object has no attribute '_h5file' ```	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7513/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
2248614324	I_kwDOAMm_X86GByG0	8952	`isel(multi_index_level_name = MultiIndex.level)` corrupts the MultiIndex	dcherian 2448579	open		1	2024-04-17T15:41:39Z	2024-04-18T13:14:46Z		MEMBER	What happened? From https://github.com/pydata/xarray/discussions/8951 if `d` is a MultiIndex-ed dataset with levels `(x, y, z)`, and `m` is a dataset with a single coord `x` `m.isel(x=d.x)` builds a dataset with a MultiIndex with levels `(y, z)`. This seems like it should work. cc @benbovy What did you expect to happen? No response Minimal Complete Verifiable Example ```Python import pandas as pd, xarray as xr, numpy as np xr.set_options(use_flox=True) test = pd.DataFrame() test["x"] = np.arange(100) % 10 test["y"] = np.arange(100) test["z"] = np.arange(100) test["v"] = np.arange(100) d = xr.Dataset.from_dataframe(test) d = d.set_index(index = ["x", "y", "z"]) print(d) m = d.groupby("x").mean() print(m) print(d.xindexes) print(m.isel(x=d.x).xindexes) xr.align(d, m.isel(x=d.x)) res = d.groupby("x") - m print(res) ``` <xarray.Dataset> Dimensions: (index: 100) Coordinates: * index (index) object MultiIndex * x (index) int64 0 1 2 3 4 5 6 7 8 9 0 1 2 ... 8 9 0 1 2 3 4 5 6 7 8 9 * y (index) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98 99 * z (index) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98 99 Data variables: v (index) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98 99 <xarray.Dataset> Dimensions: (x: 10) Coordinates: * x (x) int64 0 1 2 3 4 5 6 7 8 9 Data variables: v (x) float64 45.0 46.0 47.0 48.0 49.0 50.0 51.0 52.0 53.0 54.0 Indexes: ┌ index PandasMultiIndex │ x │ y └ z Indexes: ┌ index PandasMultiIndex │ y └ z ValueError... MVCE confirmation [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. [x] Complete example — the example is self-contained, including all data and the text of any traceback. 
[x] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result. [x] New issue — a search of GitHub Issues suggests this is not a duplicate. [x] Recent environment — the issue occurs with the latest version of xarray and its dependencies. Relevant log output No response Anything else we need to know? No response Environment	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8952/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
2228319306	I_kwDOAMm_X86E0XRK	8914	swap_dims does not propagate indexes properly	dcherian 2448579	open		0	2024-04-05T15:36:26Z	2024-04-05T15:36:27Z		MEMBER	What happened? Found by hypothesis ``` import xarray as xr import numpy as np var = xr.Variable(dims="2", data=np.array(['1970-01-01T00:00:00.000000000', '1970-01-01T00:00:00.000000002', '1970-01-01T00:00:00.000000001'], dtype='datetime64[ns]')) var1 = xr.Variable(data=np.array([0], dtype=np.uint32), dims=['1'], attrs={}) state = xr.Dataset() state['2'] = var state = state.stack({"0": ["2"]}) state['1'] = var1 state['1_'] = var1#.copy(deep=True) state = state.swap_dims({"1": "1_"}) xr.testing.assertions._assert_internal_invariants(state, False) ``` This swaps simple pandas indexed dims, but the multi-index that is in the dataset and not affected by the swap_dims op ends up broken. cc @benbovy What did you expect to happen? No response Minimal Complete Verifiable Example No response MVCE confirmation [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. [ ] Complete example — the example is self-contained, including all data and the text of any traceback. [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result. [ ] New issue — a search of GitHub Issues suggests this is not a duplicate. [ ] Recent environment — the issue occurs with the latest version of xarray and its dependencies. Relevant log output No response Anything else we need to know? No response Environment	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8914/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
2136709010	I_kwDOAMm_X85_W5eS	8753	Lazy Loading with `DataArray` vs. `Variable`	dcherian 2448579	closed		0	2024-02-15T14:42:24Z	2024-04-04T16:46:54Z	2024-04-04T16:46:54Z	MEMBER	Discussed in https://github.com/pydata/xarray/discussions/8751 <sup>Originally posted by ilan-gold February 15, 2024</sup> My goal is to get a dataset from [custom io-zarr backend lazy-loaded](https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html#how-to-support-lazy-loading). But when I declare a `DataArray` based on the `Variable` which uses `LazilyIndexedArray`, everything is read in. Is this expected? I specifically don't want to have to use dask if possible. I have seen https://github.com/aurghs/xarray-backend-tutorial/blob/main/2.Backend_with_Lazy_Loading.ipynb but it's a little bit different. While I have a custom backend array inheriting from `ZarrArrayWrapper`, this example using `ZarrArrayWrapper` directly still highlights the same unexpected behavior of everything being read in. ```python import zarr import xarray as xr from tempfile import mkdtemp import numpy as np from pathlib import Path from collections import defaultdict class AccessTrackingStore(zarr.DirectoryStore): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self._access_count = {} self._accessed = defaultdict(set) def __getitem__(self, key): for tracked in self._access_count: if tracked in key: self._access_count[tracked] += 1 self._accessed[tracked].add(key) return super().__getitem__(key) def get_access_count(self, key): return self._access_count[key] def set_key_trackers(self, keys_to_track): if isinstance(keys_to_track, str): keys_to_track = [keys_to_track] for k in keys_to_track: self._access_count[k] = 0 def get_subkeys_accessed(self, key): return self._accessed[key] orig_path = Path(mkdtemp()) z = zarr.group(orig_path / "foo.zarr") z['array'] = np.random.randn(1000, 1000) store = AccessTrackingStore(orig_path / "foo.zarr") store.set_key_trackers(['array']) z = 
zarr.group(store) arr = xr.backends.zarr.ZarrArrayWrapper(z['array']) lazy_arr = xr.core.indexing.LazilyIndexedArray(arr) # just `.zarray` var = xr.Variable(('x', 'y'), lazy_arr) print('Variable read in ', store.get_subkeys_accessed('array')) # now everything is read in da = xr.DataArray(var) print('DataArray read in ', store.get_subkeys_accessed('array')) ```	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8753/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
2213636579	I_kwDOAMm_X86D8Wnj	8887	resetting multiindex may be buggy	dcherian 2448579	open		1	2024-03-28T16:23:38Z	2024-03-29T07:59:22Z		MEMBER	What happened? Resetting a MultiIndex dim coordinate preserves the MultiIndex levels as IndexVariables. We should either reset the indexes for the multiindex level variables, or warn asking the users to do so. This seems to be the root cause exposed by https://github.com/pydata/xarray/pull/8809 cc @benbovy What did you expect to happen? No response Minimal Complete Verifiable Example ```Python import numpy as np import xarray as xr # ND DataArray that gets stacked along a multiindex da = xr.DataArray(np.ones((3, 3)), coords={"dim1": [1, 2, 3], "dim2": [4, 5, 6]}) da = da.stack(feature=["dim1", "dim2"]) # Extract just the stacked coordinates for saving in a dataset ds = xr.Dataset(data_vars={"feature": da.feature}) xr.testing.assertions._assert_internal_invariants(ds.reset_index(["feature", "dim1", "dim2"]), check_default_indexes=False) # succeeds xr.testing.assertions._assert_internal_invariants(ds.reset_index(["feature"]), check_default_indexes=False) # fails, but no warning either ```	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8887/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
2066510805	I_kwDOAMm_X857LHPV	8589	Don't overwrite indexes for region writes, always	dcherian 2448579	closed		2	2024-01-04T23:52:18Z	2024-03-27T16:24:37Z	2024-03-27T16:24:36Z	MEMBER	What happened? Currently we don't overwrite indexes when `region="auto"` https://github.com/pydata/xarray/blob/e6ccedb56ed4bc8d0b7c1f16ab325795330fb19a/xarray/backends/api.py#L1769-L1770 I propose we do this for all region writes and completely disallow modifying indexes with a region write. This would match the `map_blocks` model, where all indexes are specified in the `template` and no changes by the mapped function are allowed.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8589/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1471685307	I_kwDOAMm_X85XuCK7	7344	Disable bottleneck by default?	dcherian 2448579	open		11	2022-12-01T17:26:11Z	2024-03-27T00:22:41Z		MEMBER	What is your issue? Our choice to enable bottleneck by default results in quite a few issues about numerical stability and funny dtype behaviour: #7336, #7128, #2370, #1346 (and probably more) Shall we disable it by default?	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7344/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
2188936276	I_kwDOAMm_X86CeIRU	8843	Get ready for pandas 3 copy-on-write	dcherian 2448579	closed		2	2024-03-15T15:51:36Z	2024-03-18T16:00:14Z	2024-03-18T16:00:14Z	MEMBER	What is your issue? This line fails with `pd.set_option("mode.copy_on_write", True)` https://github.com/pydata/xarray/blob/c9d3084e98d38a7a9488380789a8d0acfde3256f/xarray/tests/__init__.py#L329 We'll need to fix this before Pandas 3 is released in April: https://github.com/pydata/xarray/blob/c9d3084e98d38a7a9488380789a8d0acfde3256f/xarray/tests/__init__.py#L329 Here's a test ```python def example(): obj = Dataset() obj["dim2"] = ("dim2", 0.5 * np.arange(9)) obj["time"] = ("time", pd.date_range("2000-01-01", periods=20)) print({k: v.data.flags for k, v in obj.variables.items()}) return obj example() pd.set_option("mode.copy_on_write", True) example() ```	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8843/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
2098659703	I_kwDOAMm_X859FwF3	8659	renaming index variables with `rename_vars` seems buggy	dcherian 2448579	closed		1	2024-01-24T16:35:18Z	2024-03-15T19:21:51Z	2024-03-15T19:21:51Z	MEMBER	What happened? (xref #8658) I'm not sure what the expected behaviour is here: ```python import xarray as xr import numpy as np from xarray.testing import _assert_internal_invariants ds = xr.Dataset() ds.coords["1"] = ("1", np.array([1], dtype=np.uint32)) ds["1_"] = ("1", np.array([1], dtype=np.uint32)) ds = ds.rename_vars({"1": "0"}) ds ``` It looks like this sequence of operations creates a default index. But then ```python from xarray.testing import _assert_internal_invariants _assert_internal_invariants(ds, check_default_indexes=True) ``` fails with ``` ... File ~/repos/xarray/xarray/testing/assertions.py:301, in _assert_indexes_invariants_checks(indexes, possible_coord_variables, dims, check_default) 299 if check_default: 300 defaults = default_indexes(possible_coord_variables, dims) --> 301 assert indexes.keys() == defaults.keys(), (set(indexes), set(defaults)) 302 assert all(v.equals(defaults[k]) for k, v in indexes.items()), ( 303 indexes, 304 defaults, 305 ) AssertionError: ({'0'}, set()) ```	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8659/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
2187659148	I_kwDOAMm_X86CZQeM	8838	remove xfail from `test_dataarray.test_to_dask_dataframe()`	dcherian 2448579	open		2	2024-03-15T03:43:02Z	2024-03-15T15:33:31Z		MEMBER	What is your issue? when dask-expr is fixed. Added in https://github.com/pydata/xarray/pull/8837	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8838/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
2184871888	I_kwDOAMm_X86COn_Q	8830	failing tests, all envs	dcherian 2448579	closed		1	2024-03-13T20:56:34Z	2024-03-15T04:06:04Z	2024-03-15T04:06:04Z	MEMBER	What happened? All tests are failing because of an error in `create_test_data` `from xarray.tests import create_test_data create_test_data()` ``` AssertionError Traceback (most recent call last) Cell In[3], line 2 1 from xarray.tests import create_test_data ----> 2 create_test_data() File ~/repos/xarray/xarray/tests/__init__.py:329, in create_test_data(seed, add_attrs, dim_sizes) 327 obj.coords["numbers"] = ("dim3", numbers_values) 328 obj.encoding = {"foo": "bar"} --> 329 assert all(var.values.flags.writeable for var in obj.variables.values()) 330 return obj AssertionError: ```	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8830/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1308371056	I_kwDOAMm_X85N_Chw	6806	New alignment option: "exact" without broadcasting OR Turn off automatic broadcasting	dcherian 2448579	closed		9	2022-07-18T18:43:31Z	2024-03-13T15:36:35Z	2024-03-13T15:36:35Z	MEMBER	Is your feature request related to a problem? If we have two objects with dims `x` and `x1`, then `xr.align(..., join="exact")` will pass because these dimensions are broadcastable. I'd like a stricter option (`join="strict"`?) that disallows broadcasting. Describe the solution you'd like `python xr.align( xr.DataArray([1], dims="x"), xr.DataArray([1], dims="x1"), join="strict", )` would raise an error. It'd be nice to have this as a built-in option so we can use `python with xr.set_options(arithmetic_join="strict"): ...` Describe alternatives you've considered An alternative would be to allow control over automatic broadcasting through the `set_options` context manager, but that seems like it would be more complicated to implement. Additional context This turns up in staggered grid calculations with xgcm where it is easy to mistakenly construct very high-dimensional arrays because of automatic broadcasting.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6806/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
2149485914	I_kwDOAMm_X86AHo1a	8778	Stricter defaults for concat, combine, open_mfdataset	dcherian 2448579	open		2	2024-02-22T16:43:38Z	2024-02-23T04:17:40Z		MEMBER	Is your feature request related to a problem? The defaults for `concat` are excessively permissive: `data_vars="all", coords="different", compat="no_conflicts", join="outer"`. This comment illustrates why this can be hard to predict or understand: a seemingly unrelated option `decode_cf` controls whether a variable is in `data_vars` or `coords`, and can result in wildly different concatenation behaviour. This always concatenates data_vars along `concat_dim` even if they did not have that dimension to begin with. If the same coordinate var exists in different datasets/files, they will be sequentially compared for equality to decide whether they get concatenated. The outer join (applied along all dimensions that are not `concat_dim`) can result in very large datasets due to small floating-point differences in the indexes, and also questionable behaviour with staggered grid datasets. "no_conflicts" basically picks the first not-NaN value after aligning all datasets, but is quite slow (we should be using `duck_array_ops.nanfirst` here I think). While "convenient" this really just makes the default experience quite bad with hard-to-understand slowdowns. Describe the solution you'd like I propose we migrate to `data_vars="minimal", coords="minimal", join="exact", compat="override"`. This should 1. only concatenate `data_vars` and `coords` variables when they already have `concat_dim`. 2. For any variables that do not have `concat_dim`, it will blindly pick them from the first file. 3. `join="exact"` will prevent ballooning of dimension sizes due to floating-point inequalities. 4. These options will totally avoid any data reads unless explicitly requested by the user. Unfortunately, this has a pretty big blast radius so we'd need a long deprecation cycle. 
Describe alternatives you've considered No response Additional context xref https://github.com/pydata/xarray/issues/4824 xref https://github.com/pydata/xarray/issues/1385 xref https://github.com/pydata/xarray/issues/8231 xref https://github.com/pydata/xarray/issues/5381 xref https://github.com/pydata/xarray/issues/2064 xref https://github.com/pydata/xarray/issues/2217	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8778/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
2135011804	I_kwDOAMm_X85_QbHc	8748	release v2024.02.0	dcherian 2448579	closed	keewis 14808389	0	2024-02-14T19:08:38Z	2024-02-18T22:52:15Z	2024-02-18T22:52:15Z	MEMBER	What is your issue? Thanks to @keewis for volunteering at today's meeting :()	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8748/reactions", "total_count": 3, "+1": 0, "-1": 0, "laugh": 1, "hooray": 0, "confused": 0, "heart": 2, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
2064313690	I_kwDOAMm_X857Cu1a	8580	add py3.12 CI and update pyproject.toml	dcherian 2448579	closed		2	2024-01-03T16:26:47Z	2024-01-17T21:54:13Z	2024-01-17T21:54:13Z	MEMBER	What is your issue? We haven't done this yet! https://github.com/pydata/xarray/blob/d87ba61c957fc3af77251ca6db0f6bccca1acb82/pyproject.toml#L11-L15	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8580/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
2086607437	I_kwDOAMm_X858XxpN	8616	new release 2024.01.0	dcherian 2448579	closed		0	2024-01-17T17:03:20Z	2024-01-17T19:21:12Z	2024-01-17T19:21:12Z	MEMBER	What is your issue? Thanks @TomNicholas for volunteering to drive this release!	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8616/reactions", "total_count": 1, "+1": 0, "-1": 0, "laugh": 1, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
638947370	MDU6SXNzdWU2Mzg5NDczNzA=	4156	writing sparse to netCDF	dcherian 2448579	open		7	2020-06-15T15:33:23Z	2024-01-09T10:14:00Z		MEMBER	I haven't looked at this too closely but it appears that this is a way to save MultiIndexed datasets to netCDF. So we may be able to do `sparse -> multiindex -> netCDF` http://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#compression-by-gathering cc @fujiisoup	{ "url": "https://api.github.com/repos/pydata/xarray/issues/4156/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
2064420057	I_kwDOAMm_X857DIzZ	8581	bump min versions	dcherian 2448579	closed		0	2024-01-03T17:45:10Z	2024-01-05T16:13:16Z	2024-01-05T16:13:15Z	MEMBER	What is your issue? Looks like we can bump a number of min versions: ``` Package Required Policy Status cartopy 0.20 (2021-09-17) 0.21 (2022-09-10) < dask-core 2022.7 (2022-07-08) 2022.12 (2022-12-02) < distributed 2022.7 (2022-07-08) 2022.12 (2022-12-02) < flox 0.5 (2022-05-03) 0.6 (2022-10-12) < iris 3.2 (2022-02-15) 3.4 (2022-12-01) < matplotlib-base 3.5 (2021-11-18) 3.6 (2022-09-16) < numba 0.55 (2022-01-14) 0.56 (2022-09-28) < numpy 1.22 (2022-01-03) 1.23 (2022-06-23) < packaging 21.3 (2021-11-18) 22.0 (2022-12-08) < pandas 1.4 (2022-01-22) 1.5 (2022-09-19) < scipy 1.8 (2022-02-06) 1.9 (2022-07-30) < seaborn 0.11 (2020-09-08) 0.12 (2022-09-06) < typing_extensions 4.3 (2022-07-01) 4.4 (2022-10-07) < zarr 2.12 (2022-06-23) 2.13 (2022-09-27) < ```	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8581/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
2064480451	I_kwDOAMm_X857DXjD	8582	Adopt SPEC 0 instead of NEP-29	dcherian 2448579	open		1	2024-01-03T18:36:24Z	2024-01-03T20:12:05Z		MEMBER	What is your issue? https://docs.xarray.dev/en/stable/getting-started-guide/installing.html#minimum-dependency-versions says that we follow NEP-29, and I think our min versions script also does that. I propose we follow https://scientific-python.org/specs/spec-0000/ In practice, I think this means we mostly drop Python versions earlier.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8582/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
2052952379	I_kwDOAMm_X856XZE7	8568	Raise when assigning attrs to virtual variables (default coordinate arrays)	dcherian 2448579	open		0	2023-12-21T19:24:11Z	2023-12-21T19:24:19Z		MEMBER	Discussed in https://github.com/pydata/xarray/discussions/8567 <sup>Originally posted by matthew-brett December 21, 2023</sup> Sorry for the introductory question, but we (@ivanov and I) ran into this behavior while experimenting: ```python import numpy as np import xarray as xr data = np.zeros((3, 4, 5)) ds = xr.DataArray(data, dims=('i', 'j', 'k')) print(ds['k'].attrs) ``` This shows `{}` as we might reasonably expect. But then: ```python ds['k'].attrs['foo'] = 'bar' print(ds['k'].attrs) ``` This also gives `{}`, which we found surprising. We worked out why that was, after a little experimentation (the default coordinate arrays seem to get created on the fly and garbage collected immediately). But it took us a little while. Is that as intended? Is there a way of making this less confusing? Thanks for any help.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8568/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1954809370	I_kwDOAMm_X850hAYa	8353	Update benchmark suite for asv 0.6.1	dcherian 2448579	open		0	2023-10-20T18:13:22Z	2023-12-19T05:53:21Z		MEMBER	The new asv version comes with decorators for parameterizing and skipping, and the ability to use `mamba` to create environments. https://github.com/airspeed-velocity/asv/releases https://asv.readthedocs.io/en/v0.6.1/writing_benchmarks.html#skipping-benchmarks This might help us reduce benchmark times a bit, or at least simplify the code some.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8353/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
2027147099	I_kwDOAMm_X854089b	8523	tree-reduce the combine for `open_mfdataset(..., parallel=True, combine="nested")`	dcherian 2448579	open		4	2023-12-05T21:24:51Z	2023-12-18T19:32:39Z		MEMBER	Is your feature request related to a problem? When `parallel=True` and a distributed client is active, Xarray reads every file in parallel, constructs a Dataset per file with indexed coordinates loaded, and then sends all of that back to the "head node" for the combine. Instead we can tree-reduce the combine (example) by switching to `dask.bag` instead of `dask.delayed` and skip the overhead of shipping 1000s of copies of an indexed coordinate back to the head node. The downside is the dask graph is "worse" but perhaps that shouldn't stop us. I think this is only feasible for `combine="nested"` cc @TomNicholas	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8523/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1989588884	I_kwDOAMm_X852lreU	8448	mypy 1.7.0 raising errors	dcherian 2448579	closed		0	2023-11-12T21:41:43Z	2023-12-01T22:02:22Z	2023-12-01T22:02:22Z	MEMBER	What happened? xarray/namedarray/core.py:758: error: Value of type Never is not indexable [index] xarray/core/alignment.py:684: error: Unused "type: ignore" comment [unused-ignore] xarray/core/alignment.py:1156: error: Unused "type: ignore" comment [unused-ignore] xarray/core/dataset.py: note: In member "sortby" of class "Dataset": xarray/core/dataset.py:7967: error: Incompatible types in assignment (expression has type "tuple[Alignable, ...]", variable has type "tuple[DataArray, ...]") [assignment] xarray/core/dataset.py:7979: error: "Alignable" has no attribute "isel" [attr-defined]	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8448/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1615596004	I_kwDOAMm_X85gTAnk	7596	illustrate time offset arithmetic	dcherian 2448579	closed		2	2023-03-08T16:54:15Z	2023-11-29T01:31:45Z	2023-11-29T01:31:45Z	MEMBER	Is your feature request related to a problem? We should document changing the time vector using pandas date offsets here This is particularly useful for centering the time stamps after a resampling operation. Related: - CFTime offsets: https://github.com/pydata/xarray/issues/5687 - `loffset` deprecation: https://github.com/pydata/xarray/pull/7444 Describe the solution you'd like No response Describe alternatives you've considered No response Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7596/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1950211465	I_kwDOAMm_X850Pd2J	8333	Should NamedArray be interchangeable with other array types? or Should we support the `axis` kwarg?	dcherian 2448579	open		17	2023-10-18T16:46:37Z	2023-10-31T22:26:33Z		MEMBER	What is your issue? Raising @Illviljan's comment from https://github.com/pydata/xarray/pull/8304#discussion_r1363196597.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8333/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1672288892	I_kwDOAMm_X85jrRp8	7764	Support opt_einsum in xr.dot	dcherian 2448579	closed		7	2023-04-18T03:29:48Z	2023-10-28T03:31:06Z	2023-10-28T03:31:06Z	MEMBER	Is your feature request related to a problem? Shall we support opt_einsum as an optional backend for `xr.dot`? `opt_einsum.contract` is a drop-in replacement for `np.einsum` so this monkey-patch works today `xr.core.duck_array_ops.einsum = opt_einsum.contract` Describe the solution you'd like Add a `backend` kwarg with options `"numpy"` and `"opt_einsum"`, with the default being `"numpy"` Describe alternatives you've considered We could create a new package but it seems a bit silly. Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7764/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1952621896	I_kwDOAMm_X850YqVI	8337	Support rolling with numbagg	dcherian 2448579	open		3	2023-10-19T16:11:40Z	2023-10-23T15:46:36Z		MEMBER	Is your feature request related to a problem? We can do plain reductions, and groupby reductions with numbagg. Rolling is the last one left! I don't think coarsen will benefit since it's basically a reshape and reduce on that view, so it should already be accelerated. There may be small gains in handling the boundary conditions but that's probably it.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8337/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1954445639	I_kwDOAMm_X850fnlH	8350	optimize align for scalars at least	dcherian 2448579	open		5	2023-10-20T14:48:25Z	2023-10-20T19:17:39Z		MEMBER	What happened? Here's a simple rescaling calculation: ```python import numpy as np import xarray as xr ds = xr.Dataset( {"a": (("x", "y"), np.ones((300, 400))), "b": (("x", "y"), np.ones((300, 400)))} ) mean = ds.mean() # scalar std = ds.std() # scalar rescaled = (ds - mean) / std ``` The profile for the last line shows 30% (!!!) time spent in `align` (really `reindex_like`) except there's nothing to reindex when only scalars are involved! This is a small example inspired by a ML pipeline where this normalization is happening very many times in a tight loop. cc @benbovy What did you expect to happen? A fast path for when no reindexing needs to happen.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8350/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1943543755	I_kwDOAMm_X85z2B_L	8310	pydata/xarray as monorepo for Xarray and NamedArray	dcherian 2448579	open		1	2023-10-14T20:34:51Z	2023-10-14T21:29:11Z		MEMBER	What is your issue? As we work through refactoring for NamedArray, it's pretty clear that Xarray will depend pretty closely on many files in `namedarray/`. For example various `utils.py`, `pycompat.py`, `*ops.py`, `formatting.py`, `formatting_html.py` at least. This promises to be quite painful if we did break NamedArray out in to its own repo (particularly around typing, e.g. https://github.com/pydata/xarray/pull/8309) I propose we use pydata/xarray as a monorepo that serves two packages: NamedArray and Xarray. - We can move as much as is needed to have NamedArray be independent of Xarray, but Xarray will depend quite closely on many utility functions in NamedArray. - We can release both at the same time similar to dask and distributed. - We can re-evaluate if and when NamedArray grows its own community.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8310/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1942893480	I_kwDOAMm_X85zzjOo	8306	keep_attrs for NamedArray	dcherian 2448579	open		0	2023-10-14T02:29:54Z	2023-10-14T02:31:35Z		MEMBER	What is your issue? Copying over @max-sixty's comment from https://github.com/pydata/xarray/pull/8304#discussion_r1358873522 I haven't been in touch with the NamedArray discussions so forgive a glib comment, but re https://github.com/pydata/xarray/issues/3891: this would be a "once-in-a-library" opportunity to always retain attrs in aggregations, removing the `keep_attrs` option in methods. (Xarray could still handle them as it wished, so xarray's external interface wouldn't need to change immediately...) @pydata/xarray Should we just delete the `keep_attrs` kwarg completely for NamedArray and always propagate attrs? `obj.attrs.clear()` seems just as easy to type.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8306/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1916012703	I_kwDOAMm_X85yNAif	8239	Address repo-review suggestions	dcherian 2448579	open		7	2023-09-27T17:18:40Z	2023-10-02T20:24:34Z		MEMBER	What is your issue? Here's the output from the Scientific Python Repo Review tool. There's an online version here. On mac I run `pipx run 'sp-repo-review[cli]' --format html --show err gh:pydata/xarray@main \| pbcopy` A lot of these seem fairly easy to fix. I'll note that there's a large number of `mypy` config suggestions. General Detected build backend: `setuptools.build_meta` Detected license(s): Apache Software License <table> <tr><th>?</th><th>Name</th><th>Description</th></tr> <tr style="color: red;"> <td>❌</td> <td>PY007</td> <td> Supports an easy task runner (nox or tox) Projects must have a `noxfile.py` or `tox.ini` to encourage new contributors. </td> </tr> </table> PyProject See https://github.com/pydata/xarray/issues/8239#issuecomment-1739363809 <table> <tr><th>?</th><th>Name</th><th>Description</th></tr> <tr style="color: red;"> <td>❌</td> <td>PP305</td> <td> Specifies xfail_strict `xfail_strict` should be set. You can manually specify if a check should be strict when setting each xfail. `[tool.pytest.ini_options] xfail_strict = true` </td> </tr> <tr style="color: red;"> <td>❌</td> <td>PP308</td> <td> Specifies useful pytest summary `-ra` should be in `addopts = [...]` (print summary of all fails/errors). 
`[tool.pytest.ini_options] addopts = ["-ra", "--strict-config", "--strict-markers"]` </td> </tr> </table> Pre-commit <table> <tr><th>?</th><th>Name</th><th>Description</th></tr> <tr style="color: red;"> <td>❌</td> <td>PC110</td> <td> Uses black Use `https://github.com/psf/black-pre-commit-mirror` instead of `https://github.com/psf/black` in `.pre-commit-config.yaml` </td> </tr> <tr style="color: red;"> <td>❌</td> <td>PC160</td> <td> Uses codespell Must have `https://github.com/codespell-project/codespell` repo in `.pre-commit-config.yaml` </td> </tr> <tr style="color: red;"> <td>❌</td> <td>PC170</td> <td> Uses PyGrep hooks (only needed if RST present) Must have `https://github.com/pre-commit/pygrep-hooks` repo in `.pre-commit-config.yaml` </td> </tr> <tr style="color: red;"> <td>❌</td> <td>PC180</td> <td> Uses prettier Must have `https://github.com/pre-commit/mirrors-prettier` repo in `.pre-commit-config.yaml` </td> </tr> <tr style="color: red;"> <td>❌</td> <td>PC191</td> <td> Ruff show fixes if fixes enabled If `--fix` is present, `--show-fixes` must be too. </td> </tr> <tr style="color: red;"> <td>❌</td> <td>PC901</td> <td> Custom pre-commit CI message Should have something like this in `.pre-commit-config.yaml`: `ci: autoupdate_commit_msg: 'chore: update pre-commit hooks'` </td> </tr> </table> MyPy <table> <tr><th>?</th><th>Name</th><th>Description</th></tr> <tr style="color: red;"> <td>❌</td> <td>MY101</td> <td> MyPy strict mode Must have `strict` in the mypy config. MyPy is best with strict or nearly strict configuration. If you are happy with the strictness of your settings already, ignore this check or set `strict = false` explicitly. `[tool.mypy] strict = true` </td> </tr> <tr style="color: red;"> <td>❌</td> <td>MY103</td> <td> MyPy warn unreachable Must have `warn_unreachable = true` to pass this check. There are occasionally false positives (often due to platform or Python version static checks), so it's okay to ignore this check. 
But try it first - it can catch real bugs too. `[tool.mypy] warn_unreachable = true` </td> </tr> <tr style="color: red;"> <td>❌</td> <td>MY104</td> <td> MyPy enables ignore-without-code Must have `"ignore-without-code"` in `enable_error_code = [...]`. This will force all skips in your project to include the error code, which makes them more readable, and avoids skipping something unintended. `[tool.mypy] enable_error_code = ["ignore-without-code", "redundant-expr", "truthy-bool"]` </td> </tr> <tr style="color: red;"> <td>❌</td> <td>MY105</td> <td> MyPy enables redundant-expr Must have `"redundant-expr"` in `enable_error_code = [...]`. This helps catch useless lines of code, like checking the same condition twice. `[tool.mypy] enable_error_code = ["ignore-without-code", "redundant-expr", "truthy-bool"]` </td> </tr> <tr style="color: red;"> <td>❌</td> <td>MY106</td> <td> MyPy enables truthy-bool Must have `"truthy-bool"` in `enable_error_code = []`. This catches mistakes in using a value as truthy if it cannot be falsey. `[tool.mypy] enable_error_code = ["ignore-without-code", "redundant-expr", "truthy-bool"]` </td> </tr> </table> Ruff <table> <tr><th>?</th><th>Name</th><th>Description</th></tr> <tr style="color: red;"> <td>❌</td> <td>RF101</td> <td> Bugbear must be selected Must select the flake8-bugbear `B` checks. Recommended: `[tool.ruff] select = [ "B", # flake8-bugbear ]` </td> </tr> </table>	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8239/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1908084109	I_kwDOAMm_X85xuw2N	8223	release 2023.09.0	dcherian 2448579	closed		6	2023-09-22T02:29:30Z	2023-09-26T08:12:46Z	2023-09-26T08:12:46Z	MEMBER	We've accumulated a nice number of changes. Can someone volunteer to do a release in the next few days?	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8223/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1217566173	I_kwDOAMm_X85IkpXd	6528	cumsum drops index coordinates	dcherian 2448579	open		5	2022-04-27T16:04:08Z	2023-09-22T07:55:56Z		MEMBER	What happened? cumsum drops index coordinates. Seen in #6525, #3417 What did you expect to happen? Preserve index coordinates Minimal Complete Verifiable Example ```Python import xarray as xr ds = xr.Dataset( {"foo": (("x",), [7, 3, 1, 1, 1, 1, 1])}, coords={"x": [0, 1, 2, 3, 4, 5, 6]}, ) ds.cumsum("x") ``` `<xarray.Dataset> Dimensions: (x: 7) Dimensions without coordinates: x Data variables: foo (x) int64 7 10 11 12 13 14 15` Relevant log output No response Anything else we need to know? No response Environment xarray main	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6528/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1859703572	I_kwDOAMm_X85u2NMU	8095	Support `inline_array` kwarg in `open_zarr`	dcherian 2448579	open		2	2023-08-21T16:09:38Z	2023-09-21T20:37:50Z		MEMBER	cc @TomNicholas What happened? There is no way to specify `inline_array` in `open_zarr`. Instead we have to use `open_dataset`. Minimal Complete Verifiable Example ```Python import xarray as xr xr.Dataset({"a": xr.DataArray([1.0])}).to_zarr("temp.zarr") ``` `python xr.open_zarr('temp.zarr', inline_array=True)` `ValueError: argument inline_array cannot be passed both as a keyword argument and within the from_array_kwargs dictionary` `python xr.open_zarr('temp.zarr', from_array_kwargs=dict(inline_array=True))` `ValueError: argument inline_array cannot be passed both as a keyword argument and within the from_array_kwargs dictionary`	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8095/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1175093771	I_kwDOAMm_X85GCoIL	6391	apply_ufunc and Datasets with variables without the core dimension	dcherian 2448579	closed		5	2022-03-21T09:13:02Z	2023-09-17T08:20:15Z	2023-09-17T08:20:14Z	MEMBER	Is your feature request related to a problem? Consider this example `python ds = xr.Dataset({"a": ("x", [1, 2, 3]), "b": ("y", [1, 2, 3])}) xr.apply_ufunc(np.mean, ds, input_core_dims=[["x"]])` This raises `ValueError: operand to apply_ufunc has required core dimensions ['x'], but some of these dimensions are absent on an input variable: ['x']` because core dimension `x` is missing on variable `b`. This behaviour makes it annoying to use `apply_ufunc` on Datasets. Describe the solution you'd like Add a new kwarg to `apply_ufunc` called `missing_core_dim` that controls how to handle variables without all input core dimensions. This kwarg could take one of three values: 1. `"raise"` - raise an error, current behaviour 2. `"copy"` - skip applying the function and copy the variable from input to output. 3. `"drop"` - skip applying the function and drop the variable. Describe alternatives you've considered No response Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6391/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1874695065	I_kwDOAMm_X85vvZOZ	8125	failing tests with pandas 2.1	dcherian 2448579	closed		10	2023-08-31T02:42:32Z	2023-09-15T13:12:02Z	2023-09-15T13:12:02Z	MEMBER	What happened? See https://github.com/pydata/xarray/pull/8101 `FAILED xarray/tests/test_missing.py::test_interpolate_pd_compat - ValueError: 'fill_value' is not a valid keyword for DataFrame.interpolate FAILED xarray/tests/test_missing.py::test_interpolate_pd_compat_non_uniform_index - ValueError: 'fill_value' is not a valid keyword for DataFrame.interpolate` and this doctest `FAILED xarray/core/dataarray.py::xarray.core.dataarray.DataArray.to_unstacked_dataset` @pydata/xarray can someone take a look please?	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8125/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1812301185	I_kwDOAMm_X85sBYWB	8005	Design for IntervalIndex	dcherian 2448579	open		5	2023-07-19T16:30:50Z	2023-09-09T06:30:20Z		MEMBER	Is your feature request related to a problem? We should add a wrapper for `pandas.IntervalIndex` this would solve a long standing problem around propagating "bounds" variables (CF conventions, https://github.com/pydata/xarray/issues/1475) The CF design CF "encoding" for intervals is to use bounds variables. There is an attribute `"bounds"` on the dimension coordinate, that refers to a second variable (at least 2D). Example: `x` has an attribute `bounds` that refers to `x_bounds`. ```python import numpy as np left = np.arange(0.5, 3.6, 1) right = np.arange(1.5, 4.6, 1) bounds = np.stack([left, right]) ds = xr.Dataset( {"data": ("x", [1, 2, 3, 4])}, coords={"x": ("x", [1, 2, 3, 4], {"bounds": "x_bounds"}), "x_bounds": (("bnds", "x"), bounds)}, ) ds ``` A fundamental problem with our current data model is that we lose `x_bounds` when we extract `ds.data` because there is a dimension `bnds` that is not shared with `ds.data`. Very important metadata is now lost! We would also like to use the "bounds" to enable interval based indexing. `ds.sel(x=1.1)` should give you the value from the appropriate interval. Pandas IntervalIndex All the indexing is easy to implement by wrapping pandas.IntervalIndex, but there is one limitation. `pd.IntervalIndex` saves two pieces of information for each interval (left bound, right bound). CF saves three : left bound, right bound (see `x_bounds`) and a "central" value (see `x`). This should be OK to work around in our wrapper. Fundamental Question To me, a core question is whether `x_bounds` needs to be preserved after creating an `IntervalIndex`. 1. If so, we need a better rule around coordinate variable propagation. In this case, the IntervalIndex would be associated with `x` and `x_bounds`. 
So the rule could be > "propagate all variables necessary to propagate an index associated with any of the dimensions on the extracted variable." So when extracting `ds.data` we propagate all variables necessary to propagate indexes associated with `ds.data.dims` that is `x` which would say "propagate `x`, `x_bounds`, and the IntervalIndex." 2. Alternatively, we could choose to drop `x_bounds` entirely. I interpret this approach as "decoding" the bounds variable to an interval index object. When saving to disk, we would encode the interval index in two variables. (See below) Describe the solution you'd like I've prototyped (2) (approach 1 in this notebook) following @benbovy's suggestion ```python import pandas as pd import xarray as xr from xarray import Variable from xarray.indexes import PandasIndex class XarrayIntervalIndex(PandasIndex): def __init__(self, index, dim, coord_dtype): assert isinstance(index, pd.IntervalIndex) # for PandasIndex self.index = index self.dim = dim self.coord_dtype = coord_dtype @classmethod def from_variables(cls, variables, options): assert len(variables) == 1 (dim,) = tuple(variables) bounds = options["bounds"] assert isinstance(bounds, (xr.DataArray, xr.Variable)) (axis,) = bounds.get_axis_num(set(bounds.dims) - {dim}) left, right = np.split(bounds.data, 2, axis=axis) index = pd.IntervalIndex.from_arrays(left.squeeze(), right.squeeze()) coord_dtype = bounds.dtype return cls(index, dim, coord_dtype) def create_variables(self, variables): from xarray.core.indexing import PandasIndexingAdapter newvars = {self.dim: xr.Variable(self.dim, PandasIndexingAdapter(self.index))} return newvars def __repr__(self): string = f"Xarray{self.index!r}" return string def to_pandas_index(self): return self.index @property def mid(self): return PandasIndex(self.index.mid, self.dim, self.coord_dtype) @property def left(self): return PandasIndex(self.index.left, self.dim, self.coord_dtype) @property def right(self): return PandasIndex(self.index.right, self.dim, self.coord_dtype) ``` `python 
ds1 = ( ds.drop_indexes("x") .set_xindex("x", XarrayIntervalIndex, bounds=ds.x_bounds) .drop_vars("x_bounds") ) ds1` `python ds1.sel(x=1.1)` Describe alternatives you've considered I've tried some approaches in this notebook	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8005/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1888576440	I_kwDOAMm_X85wkWO4	8162	Update group by multi index	dcherian 2448579	open		0	2023-09-09T04:50:29Z	2023-09-09T04:50:39Z		MEMBER	ideally `GroupBy._infer_concat_args()` would return a `xr.Coordinates` object that contains both the coordinate(s) and their (multi-)index to assign to the result (combined) object. The goal is to avoid calling `create_default_index_implicit(coord)` below where `coord` is a `pd.MultiIndex` or a single `IndexVariable` wrapping a multi-index. If `coord` is a `Coordinates` object, we could do `combined = combined.assign_coords(coord)` instead. https://github.com/pydata/xarray/blob/e2b6f3468ef829b8a83637965d34a164bf3bca78/xarray/core/groupby.py#L1573-L1587 There are actually more general issues: The `group` parameter of Dataset.groupby being a single variable or variable name, it won't be possible to do groupby on a full pandas multi-index once we drop its dimension coordinate (#8143). How can we still support it? Maybe passing a dimension name to `group` and check that there's only one index for that dimension? How can we support custom, multi-coordinate indexes with groupby? I don't have any practical example in mind, but in theory just passing a single coordinate name as `group` will invalidate the index. Should we drop the index in the result? Or, like suggested above pass a dimension name as group and check the index? Originally posted by @benbovy in https://github.com/pydata/xarray/issues/8140#issuecomment-1709775666	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8162/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1812504689	I_kwDOAMm_X85sCKBx	8006	Fix documentation about datetime_unit of xarray.DataArray.differentiate	dcherian 2448579	closed		0	2023-07-19T18:31:10Z	2023-09-01T09:37:15Z	2023-09-01T09:37:15Z	MEMBER	Should say that `Y` and `M` cannot be supported with `datetime64` Discussed in https://github.com/pydata/xarray/discussions/8000 <sup>Originally posted by jesieleo July 19, 2023</sup> I have a piece of data that looks like this ``` <xarray.Dataset> Dimensions: (time: 612, LEV: 15, latitude: 20, longitude: 357) Coordinates: * time (time) datetime64[ns] 1960-01-15 1960-02-15 ... 2010-12-15 * LEV (LEV) float64 5.01 15.07 25.28 35.76 ... 149.0 171.4 197.8 229.5 * latitude (latitude) float64 -4.75 -4.25 -3.75 -3.25 ... 3.75 4.25 4.75 * longitude (longitude) float64 114.2 114.8 115.2 115.8 ... 291.2 291.8 292.2 Data variables: u (time, LEV, latitude, longitude) float32 ... Attributes: (12/30) cdm_data_type: Grid Conventions: COARDS, CF-1.6, ACDD-1.3 creator_email: chepurin@umd.edu creator_name: APDRC creator_type: institution creator_url: https://www.atmos.umd.edu/~ocean/ ... ... standard_name_vocabulary: CF Standard Name Table v29 summary: Simple Ocean Data Assimilation (SODA) soda po... 
time_coverage_end: 2010-12-15T00:00:00Z time_coverage_start: 1983-01-15T00:00:00Z title: SODA soda pop2.2.4 [TIME][LEV][LAT][LON] Westernmost_Easting: 118.25 ``` When I try to use xarray.DataArray.differentiate, `data.u.differentiate('time',datetime_unit='M')` produces ``` Traceback (most recent call last): File "<stdin>", line 1, in <module> File "D:\Anaconda3\lib\site-packages\xarray\core\dataarray.py", line 3609, in differentiate ds = self._to_temp_dataset().differentiate(coord, edge_order, datetime_unit) File "D:\Anaconda3\lib\site-packages\xarray\core\dataset.py", line 6372, in differentiate coord_var = coord_var._to_numeric(datetime_unit=datetime_unit) File "D:\Anaconda3\lib\site-packages\xarray\core\variable.py", line 2428, in _to_numeric numeric_array = duck_array_ops.datetime_to_numeric( File "D:\Anaconda3\lib\site-packages\xarray\core\duck_array_ops.py", line 466, in datetime_to_numeric array = array / np.timedelta64(1, datetime_unit) TypeError: Cannot get a common metadata divisor for Numpy datatime metadata [ns] and [M] because they have incompatible nonlinear base time units. ``` Could you please tell me whether this is a bug?	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8006/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1603957501	I_kwDOAMm_X85fmnL9	7573	Add optional min versions to conda-forge recipe (`run_constrained`)	dcherian 2448579	closed		4	2023-02-28T23:12:15Z	2023-08-21T16:12:34Z	2023-08-21T16:12:21Z	MEMBER	Is your feature request related to a problem? I opened this PR to add minimum versions for our optional dependencies: https://github.com/conda-forge/xarray-feedstock/pull/84/files to prevent issues like #7467 I think we'd need a policy to choose which ones to list. Here's the current list: `run_constrained: - bottleneck >=1.3 - cartopy >=0.20 - cftime >=1.5 - dask-core >=2022.1 - distributed >=2022.1 - flox >=0.5 - h5netcdf >=0.13 - h5py >=3.6 - hdf5 >=1.12 - iris >=3.1 - matplotlib-base >=3.5 - nc-time-axis >=1.4 - netcdf4 >=1.5.7 - numba >=0.55 - pint >=0.18 - scipy >=1.7 - seaborn >=0.11 - sparse >=0.13 - toolz >=0.11 - zarr >=2.10` Some examples to think about: 1. `iris` seems like a bad one to force. It seems like people might use Iris and Xarray independently and Xarray shouldn't force a minimum version. 2. For backends, I arbitrarily kept `netcdf4`, `h5netcdf` and `zarr`. 3. It seems like we should keep array types: so `dask`, `sparse`, `pint`. Describe the solution you'd like No response Describe alternatives you've considered No response Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7573/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1824824446	I_kwDOAMm_X85sxJx-	8025	Support Groupby first, last with flox	dcherian 2448579	open		0	2023-07-27T17:07:51Z	2023-07-27T19:08:06Z		MEMBER	Is your feature request related to a problem? flox recently added support for first, last, nanfirst, nanlast. So we should support that on the Xarray GroupBy object.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8025/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1642299599	I_kwDOAMm_X85h44DP	7683	automatically chunk in groupby binary ops	dcherian 2448579	closed		0	2023-03-27T15:14:09Z	2023-07-27T16:41:35Z	2023-07-27T16:41:34Z	MEMBER	What happened? From https://discourse.pangeo.io/t/xarray-unable-to-allocate-memory-how-to-size-up-problem/3233/4 Consider ``` python ds is dataset with big dask arrays mean = ds.groupby("time.day").mean() mean.to_netcdf() mean = xr.open_dataset(...) ds.groupby("time.day") - mean ``` In `GroupBy._binary_op` https://github.com/pydata/xarray/blob/39caafae4452f5327a7cd671b18d4bb3eb3785ba/xarray/core/groupby.py#L616 we will eagerly construct `other` that is of the same size as `ds`. What did you expect to happen? I think the only solution is to automatically chunk if `ds` has dask arrays, and `other` (or `mean`) isn't backed by dask arrays. A chunk size of `1` seems sensible. Minimal Complete Verifiable Example No response MVCE confirmation [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. [ ] Complete example — the example is self-contained, including all data and the text of any traceback. [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result. [ ] New issue — a search of GitHub Issues suggests this is not a duplicate. Relevant log output No response Anything else we need to know? No response Environment	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7683/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1822982776	I_kwDOAMm_X85sqIJ4	8023	Possible autoray integration	dcherian 2448579	open		1	2023-07-26T18:57:59Z	2023-07-26T19:26:05Z		MEMBER	I'm opening this issue for discussion really. I stumbled on autoray (Github) by @jcmgray which provides an abstract interface to a number of array types. What struck me was the very general lazy compute system. This opens up the possibility of lazy-but-not-dask computation. Related: https://github.com/pydata/xarray/issues/2298 https://github.com/pydata/xarray/issues/1725 https://github.com/pydata/xarray/issues/5081	{ "url": "https://api.github.com/repos/pydata/xarray/issues/8023/reactions", "total_count": 2, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 2 }		xarray 13221727	issue
1789989152	I_kwDOAMm_X85qsREg	7962	Better chunk manager error	dcherian 2448579	closed		4	2023-07-05T17:27:25Z	2023-07-24T22:26:14Z	2023-07-24T22:26:13Z	MEMBER	What happened? I just ran in to this error in an environment without dask. `TypeError: Could not find a Chunk Manager which recognises type <class 'dask.array.core.Array'>` I think we could easily recommend the user to install a package that provides `dask` by looking at `type(array).__name__`. This would make the message a lot friendlier	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7962/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1797636782	I_kwDOAMm_X85rJcKu	7976	Explore updating colormap code	dcherian 2448579	closed		0	2023-07-10T21:51:30Z	2023-07-11T13:49:54Z	2023-07-11T13:49:53Z	MEMBER	What is your issue? See https://github.com/matplotlib/matplotlib/issues/16296 Looks like the MPL API may have advanced enough that we can delete some of our use of private attributes.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7976/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1658291950	I_kwDOAMm_X85i14bu	7737	align ignores `copy`	dcherian 2448579	open		2	2023-04-07T02:54:00Z	2023-06-20T23:07:56Z		MEMBER	Is your feature request related to a problem? cc @benbovy xref #7730 ``` python import numpy as np import xarray as xr arr = np.random.randn(10, 10, 365 * 30) time = xr.date_range("2000", periods=30 * 365, calendar="noleap") da = xr.DataArray(arr, dims=("y", "x", "time"), coords={"time": time}) year = da["time.year"] ``` `python xr.align(da, year, join="outer", copy=False)` This should result in no copies, but does Describe the solution you'd like I think we need to check `aligner.copy` and/or `aligner.reindex` (maybe?) before copying here https://github.com/pydata/xarray/blob/f8127fc9ad24fe8b41cce9f891ab2c98eb2c679a/xarray/core/dataset.py#L2805-L2818 Describe alternatives you've considered No response Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7737/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1760733017	I_kwDOAMm_X85o8qdZ	7924	Migrate from nbsphinx to myst, myst-nb	dcherian 2448579	open		4	2023-06-16T14:17:41Z	2023-06-20T22:07:42Z		MEMBER	Is your feature request related to a problem? I think we should switch to MyST markdown for our docs. I've been using MyST markdown and MyST-NB in docs in other projects and it works quite well. Advantages: 1. We get HTML reprs in the docs (example) which is a big improvement. (#6620) 2. I think many find markdown a lot easier to write than RST There's a tool to migrate RST to MyST (RTD's migration guide). Describe the solution you'd like No response Describe alternatives you've considered No response Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7924/reactions", "total_count": 5, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 1, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
756425955	MDU6SXNzdWU3NTY0MjU5NTU=	4648	Comprehensive benchmarking suite	dcherian 2448579	open		6	2020-12-03T18:01:57Z	2023-06-15T16:56:00Z		MEMBER	I think a good "infrastructure" target for the NASA OSS call would be to expand our benchmarking suite (https://pandas.pydata.org/speed/xarray/#/) AFAIK running these in a useful manner on CI is still unsolved (please correct me if I'm wrong). But we can always run it on an NCAR machine using a cron job. Thoughts? cc @scottyhq A quick survey of work needed (please append): - [ ] indexing & slicing #3382 #2799 #2227 - [ ] DataArray construction #4744 - [ ] attribute access #4741, #4742 - [ ] property access #3514 - [ ] reindexing? https://github.com/pydata/xarray/issues/1385#issuecomment-297539517 - [x] alignment #3755, #7738 - [ ] assignment #1771 - [ ] coarsen - [x] groupby #659 #7795 #7796 - [x] resample #4498 #7795 - [ ] weighted #4482 #3883 - [ ] concat #7824 - [ ] merge - [ ] open_dataset, open_mfdataset #1823 - [ ] stack / unstack - [ ] apply_ufunc? - [x] interp #4740 #7843 - [ ] reprs #4744 - [x] to_(dask)_dataframe #7844 #7474 Related: #3514	{ "url": "https://api.github.com/repos/pydata/xarray/issues/4648/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1692597701	I_kwDOAMm_X85k4v3F	7808	Default behaviour of `min_count` wrong with flox	dcherian 2448579	closed		0	2023-05-02T15:04:11Z	2023-05-10T02:39:45Z	2023-05-10T02:39:45Z	MEMBER	What happened? ```python with xr.set_options(display_style="text", use_flox=False): with xr.set_options(use_flox=False): display( xr.DataArray( data=np.array([np.nan, 1, 1, np.nan, 1, 1]), dims="x", coords={"labels": ("x", np.array([1, 2, 3, 1, 2, 3]))}, ) .groupby("labels") .sum() ) `with xr.set_options(use_flox=True): display( xr.DataArray( data=np.array([np.nan, 1, 1, np.nan, 1, 1]), dims="x", coords={"labels": ("x", np.array([1, 2, 3, 1, 2, 3]))}, ) .groupby("labels") .sum() )` ``` ``` without flox <xarray.DataArray (labels: 3)> array([0., 2., 2.]) Coordinates: * labels (labels) int64 1 2 3 with flox <xarray.DataArray (labels: 3)> array([nan, 2., 2.]) Coordinates: * labels (labels) int64 1 2 3 ``` What did you expect to happen? The same answer. We should set `min_count=0` when `min_count is None` Minimal Complete Verifiable Example No response MVCE confirmation [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. [ ] Complete example — the example is self-contained, including all data and the text of any traceback. [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result. [ ] New issue — a search of GitHub Issues suggests this is not a duplicate. Relevant log output No response Anything else we need to know? No response Environment	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7808/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1236174701	I_kwDOAMm_X85Jrodt	6610	Update GroupBy constructor for grouping by multiple variables, dask arrays	dcherian 2448579	open		6	2022-05-15T03:17:54Z	2023-04-26T16:06:17Z		MEMBER	What is your issue? `flox` supports grouping by multiple variables (would fix #324, #1056) and grouping by dask variables (would fix #2852). To enable this in GroupBy we need to update the constructor's signature to 1. Accept multiple "by" variables. 2. Accept "expected group labels" for grouping by dask variables (like `bins` for `groupby_bins` which already supports grouping by dask variables). This lets us construct the output coordinate without evaluating the dask variable. 3. We may also want to simultaneously group by a categorical variable (season) and bin by a continuous variable (air temperature). So we also need a way to indicate whether the "expected group labels" are "bin edges" or categories. The signature in flox is (may be errors!) `python xarray_reduce( obj: Dataset \| DataArray, *by: DataArray \| str, func: str \| Aggregation, expected_groups: Sequence \| np.ndarray \| None = None, isbin: bool \| Sequence[bool] = False, ... )` You would calculate that last example using flox as `python xarray_reduce( ds, "season", "air_temperature", expected_groups=[None, np.arange(21, 30, 1)], isbin=[False, True], ... )` The use of `expected_groups` and `isbin` seems ugly to me (the names could also be better!) I propose we update groupby's signature to 1. change `group: DataArray \| str` to `group: DataArray \| str \| Iterable[str] \| Iterable[DataArray]` 2. We could add a top-level `xr.Bins` object that wraps bin edges + any kwargs to be passed to `pandas.cut`. Note our current groupby_bins signature has a bunch of kwargs passed directly to pandas.cut. 3. Finally add `groups: None \| ArrayLike \| xarray.Bins \| Iterable[None \| ArrayLike \| xarray.Bins]` to pass the "expected group labels". 1. 
If `None`, then groups will be auto-detected from non-dask `group` arrays (if `None` for a dask `group`, then raise error). 1. If `xarray.Bins` indicates binning by the appropriate variables 1. If `ArrayLike` treat as categorical. 1. `groups` is a little too similar to `group` so we should choose a better name. 1. The ordering of `ArrayLike` would let us fix #757 (pass the seasons in the order you want them in the output) So then that example becomes `python ds.groupby( ["season", "air_temperature"], # season is numpy, air_temperature is dask groups=[None, xr.Bins(np.arange(21, 30, 1), closed="right")], )` Thoughts?	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6610/reactions", "total_count": 7, "+1": 7, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1654022522	I_kwDOAMm_X85ilmF6	7716	bad conda solve with pandas 2	dcherian 2448579	closed		18	2023-04-04T14:37:58Z	2023-04-16T17:57:27Z	2023-04-13T17:56:34Z	MEMBER	What happened? Pandas 2 is out. We have a `pandas<2` pin for our latest release, but `mamba` is now returning `xarray=2023.1.0` and `pandas=2.0`, which is making cf-xarray and flox tests fail. It looks like any project that tests `resample` without pinning pandas will fail. I opened the issue here for visibility. It seems we might need a repodata patch to disallow `pandas>=2`? cc @ocefpaf What did you expect to happen? No response Minimal Complete Verifiable Example No response MVCE confirmation [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. [ ] Complete example — the example is self-contained, including all data and the text of any traceback. [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result. [ ] New issue — a search of GitHub Issues suggests this is not a duplicate. Relevant log output No response Anything else we need to know? No response Environment	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7716/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1642317716	I_kwDOAMm_X85h48eU	7685	Add welcome bot?	dcherian 2448579	closed		6	2023-03-27T15:24:25Z	2023-04-06T01:55:55Z	2023-04-06T01:55:55Z	MEMBER	Is your feature request related to a problem? Given all the outreachy interest (and perhaps just in general) it may be nice to enable a welcome bot like on the Jupyter repos Describe the solution you'd like No response Describe alternatives you've considered No response Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7685/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1649611456	I_kwDOAMm_X85iUxLA	7704	follow upstream scipy interpolation improvements	dcherian 2448579	open		0	2023-03-31T15:46:56Z	2023-03-31T15:46:56Z		MEMBER	Is your feature request related to a problem? Scipy 1.10.0 has some great improvements to interpolation (release notes) particularly around the fancier methods like `pchip`. It'd be good to see if we can simplify some of our code (or even enable using these options). Describe the solution you'd like No response Describe alternatives you've considered No response Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7704/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1409811164	I_kwDOAMm_X85UCALc	7162	copy of custom index does not align with original	dcherian 2448579	closed		7	2022-10-14T20:17:22Z	2023-03-24T20:37:13Z	2023-03-24T20:37:12Z	MEMBER	What happened? My prototype CRSIndex is broken on the release version: https://github.com/dcherian/crsindex/blob/main/crsindex.ipynb under heading "BROKEN: Successfully align with a copy of itself" The cell's code is: `copy = newds.copy(deep=True) xr.align(copy, newds)` which should always work. @headtr1ck is https://github.com/pydata/xarray/pull/7140 to blame? Environment INSTALLED VERSIONS ------------------ commit: None python: 3.10.6 \| packaged by conda-forge \| (main, Aug 22 2022, 20:43:44) [Clang 13.0.1 ] python-bits: 64 OS: Darwin OS-release: 21.6.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.2 libnetcdf: 4.8.1 xarray: 2022.10.0 pandas: 1.5.0 numpy: 1.23.3 scipy: 1.9.1 netCDF4: 1.6.0 pydap: None h5netcdf: 1.0.2 h5py: 3.7.0 Nio: None zarr: 2.13.3 cftime: 1.6.2 nc_time_axis: 1.4.1 PseudoNetCDF: 3.2.2 rasterio: 1.3.2 cfgrib: 0.9.10.2 iris: 3.3.1 bottleneck: 1.3.5 dask: 2022.9.2 distributed: 2022.9.2 matplotlib: 3.6.1 cartopy: 0.21.0 seaborn: 0.12.0 numbagg: 0.2.1 fsspec: 2022.8.2 cupy: None pint: 0.19.2 sparse: 0.13.0 flox: 0.6.0 numpy_groupies: 0.9.19 setuptools: 65.5.0 pip: 22.2.2 conda: None pytest: 7.1.3 IPython: 8.5.0 sphinx: None	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7162/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
984555353	MDU6SXNzdWU5ODQ1NTUzNTM=	5754	Variable.stack constructs extremely large chunks	dcherian 2448579	closed		6	2021-09-01T03:08:02Z	2023-03-22T14:51:44Z	2021-12-14T17:31:45Z	MEMBER	Minimal Complete Verifiable Example: Here's a small array with too-small chunk sizes just as an example ```python import dask.array import xarray as xr var = xr.Variable(("x", "y", "z"), dask.array.random.random((4, 18483, 1000), chunks=(1, 183, -1))) ``` Now stack two dimensions, this is a 100x increase in chunk size (in my actual code, 85MB chunks become 8.5GB chunks =) ) `var.stack(new=("x", "y"))` But calling `reshape` on the dask array preserves the original chunk size `var.data.reshape((4 * 18483, -1))` Solution Ah, found it: we transpose then reshape in `Variable._stack_once`. https://github.com/pydata/xarray/blob/f915515d610b4471888fa44dfb00dbae3fd22349/xarray/core/variable.py#L1521-L1527 Writing those steps with pure dask yields the same 100x increase in chunksize `python var.data.transpose([2, 0, 1]).reshape((-1, 4 * 18483))` Anything else we need to know?: Environment: Output of <tt>xr.show_versions()</tt> INSTALLED VERSIONS ------------------ commit: None python: 3.8.6 \| packaged by conda-forge \| (default, Jan 25 2021, 23:21:18) [GCC 9.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-1127.18.2.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_US.UTF-8 LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.10.6 libnetcdf: 4.7.4 xarray: 0.19.0 pandas: 1.3.1 numpy: 1.21.1 scipy: 1.5.3 netCDF4: 1.5.6 pydap: installed h5netcdf: 0.11.0 h5py: 3.3.0 Nio: None zarr: 2.8.3 cftime: 1.5.0 nc_time_axis: 1.3.1 PseudoNetCDF: None rasterio: None cfgrib: None iris: 3.0.4 bottleneck: 1.3.2 dask: 2021.07.2 distributed: 2021.07.2 matplotlib: 3.4.2 cartopy: 0.19.0.post1 seaborn: 0.11.1 numbagg: None pint: 0.17 setuptools: 49.6.0.post20210108 pip: 21.2.2 conda: 4.10.3 pytest: 6.2.4 IPython: 7.26.0 sphinx: 4.1.2	{ "url": 
"https://api.github.com/repos/pydata/xarray/issues/5754/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
344614881	MDU6SXNzdWUzNDQ2MTQ4ODE=	2313	Example on using `preprocess` with `mfdataset`	dcherian 2448579	open		6	2018-07-25T21:31:34Z	2023-03-14T12:35:00Z		MEMBER	I wrote this little notebook today while trying to get some satellite data in form that was nice to work with: https://gist.github.com/dcherian/66269bc2b36c2bc427897590d08472d7 I think it would make a useful example for the docs. A few questions: 1. Do you think it'd be a good addition to the examples? 2. Is this the recommended way of adding meaningful co-ordinates, expanding dims etc.? The main bit is this function: ``` def preprocess(ds): dsnew = ds.copy() dsnew['latitude'] = xr.DataArray(np.linspace(90, -90, 180), dims=['phony_dim_0']) dsnew['longitude'] = xr.DataArray(np.linspace(-180, 180, 360), dims=['phony_dim_1']) dsnew = (dsnew.rename({'l3m_data': 'sss', 'phony_dim_0': 'latitude', 'phony_dim_1': 'longitude'}) .set_coords(['latitude', 'longitude']) .drop('palette')) dsnew['time'] = (pd.to_datetime(dsnew.attrs['time_coverage_start']) + np.timedelta64(3, 'D') + np.timedelta64(12, 'h')) dsnew = dsnew.expand_dims('time').set_coords('time') return dsnew ``` Also open to other feedback...	{ "url": "https://api.github.com/repos/pydata/xarray/issues/2313/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1599044689	I_kwDOAMm_X85fT3xR	7558	shift time using frequency strings	dcherian 2448579	open		2	2023-02-24T17:35:52Z	2023-02-26T15:08:13Z		MEMBER	Discussed in https://github.com/pydata/xarray/discussions/7557 <sup>Originally posted by arfriedman February 24, 2023</sup> Hi, In addition to integer offsets, I was wondering if it is possible to [shift](https://docs.xarray.dev/en/stable/generated/xarray.Variable.shift.html) a variable by a specific time frequency interval as in [pandas](https://pandas.pydata.org/docs/reference/api/pandas.Series.shift.html). For example, something like: ``` import xarray as xr ds = xr.tutorial.load_dataset("air_temperature") air = ds["air"] air.shift(time="1D") ``` Otherwise, is there another xarray function or recommended approach for this type of operation?	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7558/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1599056009	I_kwDOAMm_X85fT6iJ	7559	Support specifying chunk sizes using labels (e.g. frequency string)	dcherian 2448579	open		2	2023-02-24T17:44:03Z	2023-02-25T03:46:49Z		MEMBER	Is your feature request related to a problem? `dask.dataframe` supports repartitioning or rechunking using a frequency string (`freq` kwarg). I think this would be a useful addition to `.chunk`. It would help with some groupby problems (as suggested in this comment) and generally make a few problems amenable to blockwise/map_blocks solutions. Describe the solution you'd like One solution is to allow `.chunk(lon=5, time="MS")`. There is some ugliness in that this syntax mixes up integer index values (`lon=5`) and a label-based frequency string (`time="MS"`). So perhaps a second method `chunk_by_labels` would be useful, where `chunk_by_labels(lon=5, time="MS")` would rechunk the data so that a single chunk contains 5° of longitude points and a month of time. Alternatively, this could be `.chunk(lon=5, time="MS", by="labels")` Describe alternatives you've considered Have the user do this manually, but that's kind of annoying and a bit advanced. Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7559/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1530966360	I_kwDOAMm_X85bQLFY	7434	RTD failure on main	dcherian 2448579	closed		2	2023-01-12T15:57:55Z	2023-01-13T17:38:00Z	2023-01-13T17:38:00Z	MEMBER	What happened? logs sphinx.errors.SphinxParallelError: RuntimeError: Non Expected exception in `/home/docs/checkouts/readthedocs.org/user_builds/xray/checkouts/7433/doc/user-guide/interpolation.rst` line 331 This seems real	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7434/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1119647191	I_kwDOAMm_X85CvHXX	6220	[FEATURE]: Use fast path when grouping by unique monotonic decreasing variable	dcherian 2448579	open		1	2022-01-31T16:24:29Z	2023-01-09T16:48:58Z		MEMBER	Is your feature request related to a problem? See https://github.com/pydata/xarray/pull/6213/files#r795716713 We check whether the `by` variable for groupby is unique and monotonically increasing. But the fast path would also apply to unique and monotonically decreasing variables. Describe the solution you'd like Update the condition to `is_monotonic_increasing or is_monotonic_decreasing` and add a test. Describe alternatives you've considered No response Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6220/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1194945072	I_kwDOAMm_X85HOWow	6447	allow merging datasets where a variable might be a coordinate variable only in a subset of datasets	dcherian 2448579	open		1	2022-04-06T17:53:51Z	2022-11-16T03:46:56Z		MEMBER	Is your feature request related to a problem? Here are two datasets, in one `a` is a data_var, in the other `a` is a coordinate variable. The following fails ``` python import xarray as xr ds1 = xr.Dataset({"a": ('x', [1, 2, 3])}) ds2 = ds1.set_coords("a") ds2.update(ds1) `with` 649 ambiguous_coords = coord_names.intersection(noncoord_names) 650 if ambiguous_coords: --> 651 raise MergeError( 652 "unable to determine if these variables should be " 653 f"coordinates or not in the merged result: {ambiguous_coords}" 654 ) 656 attrs = merge_attrs( 657 [var.attrs for var in coerced if isinstance(var, (Dataset, DataArray))], 658 combine_attrs, 659 ) 661 return _MergeResult(variables, coord_names, dims, out_indexes, attrs) MergeError: unable to determine if these variables should be coordinates or not in the merged result: {'a'} ``` Describe the solution you'd like I think we should replace this error with a warning and arbitrarily choose to either convert `a` to a coordinate variable or a data variable. Describe alternatives you've considered No response Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6447/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1284094480	I_kwDOAMm_X85MiboQ	6722	Avoid loading any data for reprs	dcherian 2448579	closed		5	2022-06-24T19:04:30Z	2022-10-28T16:23:20Z	2022-10-28T16:23:20Z	MEMBER	What happened? For "small" datasets, we load into memory when displaying the repr. For cloud-backed datasets with a large number of "small" variables, this can spend a lot of time sequentially loading O(100) variables just for a repr. https://github.com/pydata/xarray/blob/6c8db5ed005e000b35ad8b6ea9080105e608e976/xarray/core/formatting.py#L548-L549 What did you expect to happen? Fast reprs! Minimal Complete Verifiable Example This dataset has 48 "small" variables ```Python import xarray as xr dc1 = xr.open_dataset('s3://its-live-data/datacubes/v02/N40E080/ITS_LIVE_vel_EPSG32645_G0120_X250000_Y4750000.zarr', engine= 'zarr', storage_options = {'anon':True}) dc1._repr_html_() ``` On `2022.03.0` this repr takes 36.4s If I comment the `array.size` condition I get 6μs. MVCE confirmation [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. [x] Complete example — the example is self-contained, including all data and the text of any traceback. [x] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result. [x] New issue — a search of GitHub Issues suggests this is not a duplicate. Relevant log output No response Anything else we need to know? 
No response Environment INSTALLED VERSIONS ------------------ commit: None python: 3.10.4 \| packaged by conda-forge \| (main, Mar 24 2022, 17:43:32) [Clang 12.0.1 ] python-bits: 64 OS: Darwin OS-release: 21.5.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 2022.3.0 pandas: 1.4.2 numpy: 1.22.4 scipy: 1.8.1 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.11.3 cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: 1.2.10 cfgrib: None iris: None bottleneck: None dask: 2022.05.2 distributed: None matplotlib: 3.5.2 cartopy: 0.20.2 seaborn: 0.11.2 numbagg: None fsspec: 2022.5.0 cupy: None pint: None sparse: None setuptools: 62.3.2 pip: 22.1.2 conda: None pytest: None IPython: 8.4.0 sphinx: 4.5.0	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6722/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1309839509	I_kwDOAMm_X85OEpCV	6810	Convert upstream-dev CI scripts to github Action	dcherian 2448579	closed		2	2022-07-19T17:32:15Z	2022-10-26T09:12:43Z	2022-10-26T09:12:43Z	MEMBER	Is your feature request related to a problem? No. Describe the solution you'd like If possible, I think it'd be nice to move a lot of the upstream-dev CI scripting to its own github action like "ci-trigger". This will make it easier to use in other projects (like those under xarray-contrib). I'd like to use it for flox, cf-xarray. Describe alternatives you've considered No response Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6810/reactions", "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1404926762	I_kwDOAMm_X85TvXsq	7154	nightly failure with h5netcdf indexing	dcherian 2448579	closed		8	2022-10-11T16:32:33Z	2022-10-12T14:11:04Z	2022-10-12T14:11:04Z	MEMBER	What happened? From upstream-dev CI: Workflow Run URL Python 3.10 Test Summary ``` xarray/tests/test_backends.py::TestH5NetCDFData::test_orthogonal_indexing: AssertionError: Left and right Dataset objects are not identical Differing coordinates: L numbers (dim3) int64 0 1 2 0 0 R numbers (dim3) int64 ... L * dim3 (dim3) <U1 'a' 'b' 'c' 'd' 'e' R * dim3 (dim3) object 'a' 'b' 'c' 'd' 'e' Differing data variables: L var3 (dim3, dim1) float64 -0.4059 1.247 -0.3095 ... 0.8073 -0.2758 foo: variable R var3 (dim3, dim1) float64 ... foo: variable L var2 (dim1, dim2) float64 0.3307 -1.768 -1.454 ... -0.6426 2.697 0.4849 foo: variable R var2 (dim1, dim2) float64 ... foo: variable L var1 (dim1, dim2) float64 -1.639 1.625 0.3936 ... -0.8715 0.2285 -0.0473 foo: variable R var1 (dim1, dim2) float64 ... foo: variable xarray/tests/test_backends.py::TestH5NetCDFData::test_vectorized_indexing: AttributeError: 'list' object has no attribute 'stop' xarray/tests/test_backends.py::TestH5NetCDFData::test_isel_dataarray: AssertionError: Left and right Dataset objects are not identical Differing data variables: L var2 (dim1, dim2) float64 0.6563 0.3721 1.274 ... 1.106 -0.2169 1.502 foo: variable R var2 (dim1, dim2) float64 ... foo: variable L var1 (dim1, dim2) float64 0.2482 0.4837 2.044 ... -0.8528 -1.536 -0.3347 foo: variable R var1 (dim1, dim2) float64 ... foo: variable xarray/tests/test_backends.py::TestH5NetCDFData::test_array_type_after_indexing: AssertionError: Left and right Dataset objects are not identical Differing coordinates: L numbers (dim3) int64 0 1 2 0 0 R numbers (dim3) int64 ... L * dim3 (dim3) <U1 'a' 'b' 'c' 'd' 'e' R * dim3 (dim3) object 'a' 'b' 'c' 'd' 'e' Differing data variables: L var3 (dim3, dim1) float64 -0.02351 -2.274 0.9986 ... -1.546 0.1454 foo: variable R var3 (dim3, dim1) float64 ... 
foo: variable L var2 (dim1, dim2) float64 0.7681 1.803 1.406 ... 1.524 0.5592 -0.5456 foo: variable R var2 (dim1, dim2) float64 ... foo: variable L var1 (dim1, dim2) float64 0.8966 -0.1489 0.3954 ... -0.689 -0.9191 foo: variable R var1 (dim1, dim2) float64 ... foo: variable xarray/tests/test_backends.py::TestH5NetCDFFileObject::test_orthogonal_indexing: AssertionError: Left and right Dataset objects are not identical Differing coordinates: L numbers (dim3) int64 0 1 2 0 0 R numbers (dim3) int64 ... L * dim3 (dim3) <U1 'a' 'b' 'c' 'd' 'e' R * dim3 (dim3) object 'a' 'b' 'c' 'd' 'e' Differing data variables: L var3 (dim3, dim1) float64 -0.4183 -0.3932 -0.01572 ... 0.6842 -0.4205 foo: variable R var3 (dim3, dim1) float64 ... foo: variable L var2 (dim1, dim2) float64 1.008 0.4886 -1.046 ... -1.152 -0.8104 1.077 foo: variable R var2 (dim1, dim2) float64 ... foo: variable L var1 (dim1, dim2) float64 -1.11 -0.3574 -1.076 ... 0.7554 0.1688 0.5749 foo: variable R var1 (dim1, dim2) float64 ... foo: variable xarray/tests/test_backends.py::TestH5NetCDFFileObject::test_vectorized_indexing: AttributeError: 'list' object has no attribute 'stop' xarray/tests/test_backends.py::TestH5NetCDFFileObject::test_isel_dataarray: AssertionError: Left and right Dataset objects are not identical Differing data variables: L var2 (dim1, dim2) float64 0.2409 0.5855 1.56 ... 0.4115 -0.4185 0.6749 foo: variable R var2 (dim1, dim2) float64 ... foo: variable L var1 (dim1, dim2) float64 -1.05 0.8272 -1.445 ... 0.3286 -0.05075 0.9352 foo: variable R var1 (dim1, dim2) float64 ... foo: variable xarray/tests/test_backends.py::TestH5NetCDFFileObject::test_array_type_after_indexing: AssertionError: Left and right Dataset objects are not identical Differing coordinates: L numbers (dim3) int64 0 1 2 0 0 R numbers (dim3) int64 ... L * dim3 (dim3) <U1 'a' 'b' 'c' 'd' 'e' R * dim3 (dim3) object 'a' 'b' 'c' 'd' 'e' Differing data variables: L var3 (dim3, dim1) float64 -0.8477 0.8072 0.4219 ... 
0.2703 0.5475 -1.696 foo: variable R var3 (dim3, dim1) float64 ... foo: variable L var2 (dim1, dim2) float64 -0.9968 0.1141 0.7767 ... 0.09977 -0.7788 foo: variable R var2 (dim1, dim2) float64 ... foo: variable L var1 (dim1, dim2) float64 2.949 -0.4085 0.7757 ... -0.2474 2.141 1.753 foo: variable R var1 (dim1, dim2) float64 ... foo: variable xarray/tests/test_formatting.py::test__mapping_repr_recursive: ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part. ``` </details> cc @benbovy @kmuehlbauer Environment INSTALLED VERSIONS ------------------ commit: 8eea8bb67bad0b5ac367c082125dd2b2519d4f52 python: 3.10.6 \| packaged by conda-forge \| (main, Aug 22 2022, 20:35:26) [GCC 10.4.0] python-bits: 64 OS: Linux OS-release: 5.15.0-1020-azure machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: C.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.2 libnetcdf: 4.8.1 xarray: 2022.9.1.dev12+g8eea8bb6 pandas: 1.6.0.dev0+297.g55dc32437e numpy: 1.24.0.dev0+896.g5ecaf36cd scipy: 1.10.0.dev0+2012.5be8bc4 netCDF4: 1.6.0 pydap: installed h5netcdf: 1.1.0.dev5+g1168b4f h5py: 3.7.0 Nio: None zarr: 2.13.4.dev1 cftime: 1.6.2 nc_time_axis: 1.3.1.dev34+g0999938 PseudoNetCDF: 3.2.2 rasterio: 1.4dev cfgrib: 0.9.10.2 iris: 3.3.1 bottleneck: 1.3.5 dask: 2022.9.2+17.g5ba240b9 distributed: 2022.9.2+19.g07e22593 matplotlib: 3.7.0.dev320+g834c89c512 cartopy: 0.21.0 seaborn: 0.12.0 numbagg: None fsspec: 2022.8.2+14.g3969aaf cupy: None pint: 0.19.3.dev87+g052a920 sparse: None flox: 0.5.11.dev3+g031979d numpy_groupies: 0.9.19 setuptools: 65.4.1 pip: 22.2.2 conda: None pytest: 7.1.3 IPython: None sphinx: None	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7154/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1333514579	I_kwDOAMm_X85Pe9FT	6902	Flox based groupby operations don't support `dtype` in mean method	dcherian 2448579	closed		3	2022-08-09T16:38:25Z	2022-10-11T17:45:27Z	2022-10-11T17:45:27Z	MEMBER	Discussed in https://github.com/pydata/xarray/discussions/6901 <sup>Originally posted by tasansal August 9, 2022</sup> We have been using the new groupby logic with Flox and numpy_groupies; however, when we run the following, the dtype is not recognized as a valid argument. This breaks API compatibility for cases where you may not have the acceleration libraries installed. Not sure if this has to be upstream in In addition to base Xarray we have the following extras installed: Flox numpy_groupies Bottleneck We do this because our data is `float32` but we want the accumulator in mean to be `float64` for accuracy. One solution is to cast the variable to float64 before mean, which may cause a copy and spike in memory usage. When Flox and numpy_groupies are not installed, it works as expected. We are working with multi-dimensional time-series of weather forecast models. ```python da = xr.load_mfdataset(...) da.groupby("time.month").mean(dtype='float64').compute() ``` Here is the end of the traceback and it appears it is on Flox. ```shell File "/home/altay_sansal_tgs_com/miniconda3/envs/wind-data-mos/lib/python3.10/site-packages/flox/core.py", line 786, in _aggregate return _finalize_results(results, agg, axis, expected_groups, fill_value, reindex) File "/home/altay_sansal_tgs_com/miniconda3/envs/wind-data-mos/lib/python3.10/site-packages/flox/core.py", line 747, in _finalize_results finalized[agg.name] = agg.finalize(squeezed["intermediates"], *agg.finalize_kwargs) TypeError: <lambda>() got an unexpected keyword argument 'dtype' ``` What is the best way to handle this, maybe fix it in Flox?	
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6902/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1382753751	I_kwDOAMm_X85SayXX	7069	release?	dcherian 2448579	closed		5	2022-09-22T17:00:58Z	2022-10-01T18:25:13Z	2022-10-01T18:25:13Z	MEMBER	What is your issue? It's been 3 months since our last release. We still have quite a few regressions from the last release but @benbovy does have open PRs for a number of them. However, we do have some nice bugfixes and other commits in the mean time. I propose we issue a new release, perhaps after @benbovy merges the PRs he thinks are ready. I'll be out of town for the next few days, so if someone else could volunteer to be release manager that would be great!	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7069/reactions", "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
626591460	MDU6SXNzdWU2MjY1OTE0NjA=	4107	renaming Variable to a dimension name does not convert to IndexVariable	dcherian 2448579	closed	benbovy 4160723	0	2020-05-28T15:11:49Z	2022-09-27T09:33:42Z	2022-09-27T09:33:42Z	MEMBER	Seen in #4103 MCVE Code Sample ```python from xarray.tests import assert_identical coord_1 = xr.DataArray([1, 2], dims=["coord_1"], attrs={"attrs": True}) da = xr.DataArray([1, 0], [coord_1]) obj = da.reset_index("coord_1").rename({"coord_1_": "coord_1"}) assert_identical(da, obj) ``` Expected Output Problem Description ``` AssertionError Traceback (most recent call last) <ipython-input-19-02ef6bd89884> in <module> ----> 1 assert_identical(da, obj) ~/work/python/xarray/xarray/tests/__init__.py in assert_identical(a, b) 160 xarray.testing.assert_identical(a, b) 161 xarray.testing._assert_internal_invariants(a) --> 162 xarray.testing._assert_internal_invariants(b) 163 164 ~/work/python/xarray/xarray/testing.py in _assert_internal_invariants(xarray_obj) 265 _assert_variable_invariants(xarray_obj) 266 elif isinstance(xarray_obj, DataArray): --> 267 _assert_dataarray_invariants(xarray_obj) 268 elif isinstance(xarray_obj, Dataset): 269 _assert_dataset_invariants(xarray_obj) ~/work/python/xarray/xarray/testing.py in _assert_dataarray_invariants(da) 210 assert all( 211 isinstance(v, IndexVariable) for (k, v) in da._coords.items() if v.dims == (k,) --> 212 ), {k: type(v) for k, v in da._coords.items()} 213 for k, v in da._coords.items(): 214 _assert_variable_invariants(v, k) AssertionError: {'coord_1': <class 'xarray.core.variable.Variable'>} ``` Versions Output of <tt>xr.show_versions()</tt>	{ "url": "https://api.github.com/repos/pydata/xarray/issues/4107/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1378174355	I_kwDOAMm_X85SJUWT	7055	Use roundtrip context manager in distributed write tests	dcherian 2448579	open		0	2022-09-19T15:53:40Z	2022-09-19T15:53:40Z		MEMBER	What is your issue? File roundtripping tests in `test_distributed.py` don't use the `roundtrip` context manager (though one uses `create_tmp_file`) so I don't think any created files are being cleaned up. Example: https://github.com/pydata/xarray/blob/09e467a6a3a8ed68c6c29647ebf2b09288145da1/xarray/tests/test_distributed.py#L91-L119	{ "url": "https://api.github.com/repos/pydata/xarray/issues/7055/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1321228754	I_kwDOAMm_X85OwFnS	6845	Do we need to update AbstractArray for duck arrays?	dcherian 2448579	open		6	2022-07-28T16:59:59Z	2022-07-29T17:20:39Z		MEMBER	What happened? I'm calling `cupy.round` on a DataArray wrapping a cupy array and it raises an error here: https://github.com/pydata/xarray/blob/3f7cc2da33d81e76afbfb82da57143b624b03a88/xarray/core/common.py#L155-L156 Traceback below: ``` --> 25 a = _core.array(a, copy=False) 26 return a.round(decimals, out=out) 27 cupy/_core/core.pyx in cupy._core.core.array() cupy/_core/core.pyx in cupy._core.core.array() cupy/_core/core.pyx in cupy._core.core._array_default() ~/miniconda3/envs/gpu/lib/python3.7/site-packages/xarray/core/common.py in __array__(self, dtype) 146 147 def __array__(self: Any, dtype: DTypeLike = None) -> np.ndarray: --> 148 return np.asarray(self.values, dtype=dtype) 149 150 def __repr__(self) -> str: ~/miniconda3/envs/gpu/lib/python3.7/site-packages/xarray/core/dataarray.py in values(self) 644 type does not support coercion like this (e.g. cupy). 645 """ --> 646 return self.variable.values 647 648 @values.setter ~/miniconda3/envs/gpu/lib/python3.7/site-packages/xarray/core/variable.py in values(self) 517 def values(self): 518 """The variable's data as a numpy.ndarray""" --> 519 return _as_array_or_item(self._data) 520 521 @values.setter ~/miniconda3/envs/gpu/lib/python3.7/site-packages/xarray/core/variable.py in _as_array_or_item(data) 257 TODO: remove this (replace with np.asarray) once these issues are fixed 258 """ --> 259 data = np.asarray(data) 260 if data.ndim == 0: 261 if data.dtype.kind == "M": cupy/_core/core.pyx in cupy._core.core.ndarray.__array__() TypeError: Implicit conversion to a NumPy array is not allowed. Please use `.get()` to construct a NumPy array explicitly. ``` What did you expect to happen? Not an error? I'm not sure what's expected `np.round(dataarray)` does actually work successfully. 
My question is: Do we need to update `AbstractArray.__array__` to return the underlying duck array instead of always a numpy array? Minimal Complete Verifiable Example No response MVCE confirmation [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. [ ] Complete example — the example is self-contained, including all data and the text of any traceback. [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result. [ ] New issue — a search of GitHub Issues suggests this is not a duplicate. Relevant log output No response Anything else we need to know? No response Environment xarray v2022.6.0 cupy 10.6.0	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6845/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1315480779	I_kwDOAMm_X85OaKTL	6817	wrong mean of complex values	dcherian 2448579	closed		1	2022-07-22T23:09:47Z	2022-07-23T02:03:11Z	2022-07-23T02:03:11Z	MEMBER	What happened? Seen in #4972 ``` python import xarray as xr import numpy as np array = np.array([0. +0.j, 0.+np.nan * 1j], dtype=np.complex64) var = xr.Variable("x", array) print(var.mean().data) print(array.mean()) ``` `0j (nan+nanj)` What did you expect to happen? No response Minimal Complete Verifiable Example No response MVCE confirmation [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. [ ] Complete example — the example is self-contained, including all data and the text of any traceback. [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result. [ ] New issue — a search of GitHub Issues suggests this is not a duplicate. Relevant log output No response Anything else we need to know? No response Environment	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6817/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1298145215	I_kwDOAMm_X85NYB-_	6763	Map_blocks should raise nice error if provided template has no dask arrays	dcherian 2448579	closed		3	2022-07-07T21:58:06Z	2022-07-14T17:42:26Z	2022-07-14T17:42:26Z	MEMBER	Discussed in https://github.com/pydata/xarray/discussions/6762 <sup>Originally posted by tlsw231 July 7, 2022</sup> I am trying to use `map_blocks` to: ingest a multi-dimensional array as input, reduce along one dimension and add extra dimensions to the output. Is this possible? I am attaching a simple MRE below that gives me a `zip argument #2 must support iteration` error. Any pointers on what I might be doing wrong? [My real example is a 3d-dataset with `(time,lat,lon)` dimensions and I am trying to reduce along `time` while adding two new dimensions to the output. I tried so many things and got so many errors, including the one in the title, that I thought it is better to first understand how `map_blocks` works!] ``` # The goal is to feed in a 2d array, reduce along one dimension and add two new dimensions to the output. chunks={} dummy = xr.DataArray(data=np.random.random([8,100]),dims=['dim1','dim2']).chunk(chunks) def some_func(func): dims=func.dims n1 = len(func[func.dims[1]]) # This is 'dim2', we will average along 'dim1' below in the for loop newdim1 = 2; newdim2 = 5; output = xr.DataArray(np.nan*np.ones([n1,newdim1,newdim2]),dims=[dims[1],'new1','new2']) for n in range(n1): fmean = func.isel(dim2=n).mean(dims[0]).compute() for i in range(newdim1): for j in range(newdim2): output[n,i,j] = fmean return output #out = some_func(dummy) # This works template=xr.DataArray(np.nan*np.ones([len(dummy.dim2),2,5]), dims=['dim2','new1','new2']) out = xr.map_blocks(some_func,dummy,template=template).compute() # gives me the error message in the title ``` [Edit: Fixed a typo in the `n1 = len(func[func.dims[1]])` line, of course getting the same error.]	
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6763/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1289174987	I_kwDOAMm_X85M1z_L	6739	"center" kwarg ignored when manually iterating over DataArrayRolling	dcherian 2448579	closed		0	2022-06-29T19:07:07Z	2022-07-14T17:41:01Z	2022-07-14T17:41:01Z	MEMBER	Discussed in https://github.com/pydata/xarray/discussions/6738 <sup>Originally posted by ckingdon95 June 29, 2022</sup> Hello, I am trying to manually iterate over a DataArrayRolling object, as described [here](https://docs.xarray.dev/en/stable/user-guide/computation.html#rolling-window-operations) in the documentation. I am confused why the following two code chunks do not produce the same sequence of values. I would like to be able to manually iterate over a DataArrayRolling object, and still be given center-justified windows. Is there a way to do this? ```python import xarray as xr import numpy as np my_data = xr.DataArray(np.arange(1,10), dims="x") # Option 1: take a center-justified rolling average result1 = my_data.rolling(x=3, center=True).mean().values result1 ``` This returns the following values, as expected: ``` array([nan, 2., 3., 4., 5., 6., 7., 8., nan]) ``` Whereas when I do it manually, it is not equivalent: ```python # Option 2: try to manually iterate, but the result is not centered my_data_rolling = my_data.rolling(x=3, center=True) result2 = [window.mean().values.item() for label, window in my_data_rolling] result2 ``` This returns ``` [nan, nan, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0] ``` Is this an issue with the window iterator? If it is not an issue, then is there a way for me to get the center-justified windows in the manual iteration?	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6739/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1290524064	I_kwDOAMm_X85M69Wg	6741	some private imports broken on main	dcherian 2448579	closed		6	2022-06-30T18:59:28Z	2022-07-06T03:06:31Z	2022-07-06T03:06:31Z	MEMBER	What happened? Seen over in cf_xarray Using `xr.core.resample.Resample` worked prior to https://github.com/pydata/xarray/pull/6702. Now we need to use `from xarray.core.resample import Resample` I don't know if this is something that needs to be fixed or only worked coincidentally earlier. But I thought it was worth discussing prior to release. Thanks to @aulemahal for spotting	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6741/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
968977385	MDU6SXNzdWU5Njg5NzczODU=	5699	describe options in documentation	dcherian 2448579	closed		0	2021-08-12T14:48:00Z	2022-06-25T20:01:07Z	2022-06-25T20:01:07Z	MEMBER	I think we only describe available options in the API reference for `xr.set_options`. It'd be nice to add a "Configuring Xarray" section in the User Guide.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/5699/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1178907807	I_kwDOAMm_X85GRLSf	6407	Add backend tutorial material	dcherian 2448579	closed		0	2022-03-24T03:44:22Z	2022-06-23T01:51:44Z	2022-06-23T01:51:44Z	MEMBER	What is your issue? @aurghs developed some nice backend tutorial material for the Dask Summit: https://github.com/aurghs/xarray-backend-tutorial It'd be nice to add it either to our main documentation or to https://github.com/xarray-contrib/xarray-tutorial.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6407/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1258338848	I_kwDOAMm_X85LALog	6659	Publish nightly releases to TestPyPI	dcherian 2448579	closed		6	2022-06-02T15:21:24Z	2022-06-07T08:37:02Z	2022-06-06T22:33:15Z	MEMBER	Is your feature request related to a problem? From @keewis in #6645 if anyone can figure out how to create PEP440 (and thus PyPI) compatible development versions I think we can have a CI publish every commit on main to TestPyPI. Describe the solution you'd like No response Describe alternatives you've considered No response Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6659/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1238783899	I_kwDOAMm_X85J1leb	6616	flox breaks multiindex groupby	dcherian 2448579	closed		0	2022-05-17T15:05:00Z	2022-05-17T16:11:18Z	2022-05-17T16:11:18Z	MEMBER	What happened? From @malmans2 ``` python import numpy as np import xarray as xr ds = xr.Dataset( dict(a=(("z",), np.ones(10))), coords=dict(b=(("z"), np.arange(2).repeat(5)), c=(("z"), np.arange(5).repeat(2))), ).set_index(bc=["b", "c"]) grouped = ds.groupby("bc") with xr.set_options(use_flox=False): grouped.sum() # OK with xr.set_options(use_flox=True): grouped.sum() # Error ``` What did you expect to happen? No response Minimal Complete Verifiable Example No response MVCE confirmation [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. [ ] Complete example — the example is self-contained, including all data and the text of any traceback. [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result. [ ] New issue — a search of GitHub Issues suggests this is not a duplicate. 
Relevant log output ```Python tests/test_xarray.py:329: in test_multi_index_groupby_sum actual = xarray_reduce(ds, "bc", func="sum") flox/xarray.py:374: in xarray_reduce actual[k] = v.expand_dims(missing_group_dims) ../xarray/xarray/core/dataset.py:1427: in __setitem__ self.update({key: value}) ../xarray/xarray/core/dataset.py:4432: in update merge_result = dataset_update_method(self, other) ../xarray/xarray/core/merge.py:1070: in dataset_update_method return merge_core( ../xarray/xarray/core/merge.py:722: in merge_core aligned = deep_align( ../xarray/xarray/core/alignment.py:824: in deep_align aligned = align( ../xarray/xarray/core/alignment.py:761: in align aligner.align() ../xarray/xarray/core/alignment.py:550: in align self.assert_unindexed_dim_sizes_equal() ../xarray/xarray/core/alignment.py:450: in assert_unindexed_dim_sizes_equal raise ValueError( E ValueError: cannot reindex or align along dimension 'bc' because of conflicting dimension sizes: {10, 6} (note: an index is found along that dimension with size=10) ____ test_multi_index_groupby_sum[numpy] _______________________________ tests/test_xarray.py:329: in test_multi_index_groupby_sum actual = xarray_reduce(ds, "bc", func="sum") flox/xarray.py:374: in xarray_reduce actual[k] = v.expand_dims(missing_group_dims) ../xarray/xarray/core/dataset.py:1427: in __setitem__ self.update({key: value}) ../xarray/xarray/core/dataset.py:4432: in update merge_result = dataset_update_method(self, other) ../xarray/xarray/core/merge.py:1070: in dataset_update_method return merge_core( ../xarray/xarray/core/merge.py:722: in merge_core aligned = deep_align( ../xarray/xarray/core/alignment.py:824: in deep_align aligned = align( ../xarray/xarray/core/alignment.py:761: in align aligner.align() ../xarray/xarray/core/alignment.py:550: in align self.assert_unindexed_dim_sizes_equal() ../xarray/xarray/core/alignment.py:450: in assert_unindexed_dim_sizes_equal raise ValueError( E ValueError: cannot reindex or align along dimension 'bc' 
because of conflicting dimension sizes: {10, 6} (note: an index is found along that dimension with size=10) ``` Anything else we need to know? No response Environment	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6616/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1235494254	I_kwDOAMm_X85JpCVu	6606	Fix benchmark CI	dcherian 2448579	closed		0	2022-05-13T17:18:32Z	2022-05-14T23:06:44Z	2022-05-14T23:06:44Z	MEMBER	What is your issue? It's failing during setup: https://github.com/pydata/xarray/runs/6424624397?check_suite_focus=true ``` · Discovering benchmarks ·· Uninstalling from conda-py3.8-bottleneck-dask-distributed-flox-netcdf4-numpy-numpy_groupies-pandas-scipy-sparse ·· Building dd20d07f for conda-py3.8-bottleneck-dask-distributed-flox-netcdf4-numpy-numpy_groupies-pandas-scipy-sparse ·· Error running /home/runner/work/xarray/xarray/asv_bench/.asv/env/e8ce5703538597037a298414451d04d2/bin/python -mpip wheel --no-deps --no-index -w /home/runner/work/xarray/xarray/asv_bench/.asv/env/e8ce5703538597037a298414451d04d2/asv-build-cache/dd20d07f4057a9e29222ca132c36cbaaf3fbb242 /home/runner/work/xarray/xarray/asv_bench/.asv/env/e8ce5703538597037a298414451d04d2/project (exit status 1) STDOUT --------> Processing /home/runner/work/xarray/xarray/asv_bench/.asv/env/e8ce5703538597037a298414451d04d2/project STDERR --------> ERROR: Some build dependencies for file:///home/runner/work/xarray/xarray/asv_bench/.asv/env/e8ce5703538597037a298414451d04d2/project are missing: 'setuptools_scm[toml]>=3.4', 'setuptools_scm_git_archive'. 
·· Failed: trying different commit/environment ·· Uninstalling from conda-py3.8-bottleneck-dask-distributed-flox-netcdf4-numpy-numpy_groupies-pandas-scipy-sparse ·· Building c34ef8a6 for conda-py3.8-bottleneck-dask-distributed-flox-netcdf4-numpy-numpy_groupies-pandas-scipy-sparse ·· Error running /home/runner/work/xarray/xarray/asv_bench/.asv/env/e8ce5703538597037a298414451d04d2/bin/python -mpip wheel --no-deps --no-index -w /home/runner/work/xarray/xarray/asv_bench/.asv/env/e8ce5703538597037a298414451d04d2/asv-build-cache/c34ef8a60227720724e90aa11a6266c0026a812a /home/runner/work/xarray/xarray/asv_bench/.asv/env/e8ce5703538597037a298414451d04d2/project (exit status 1) STDOUT --------> Processing /home/runner/work/xarray/xarray/asv_bench/.asv/env/e8ce5703538597037a298414451d04d2/project STDERR --------> ERROR: Some build dependencies for file:///home/runner/work/xarray/xarray/asv_bench/.asv/env/e8ce5703538597037a298414451d04d2/project are missing: 'setuptools_scm[toml]>=3.4', 'setuptools_scm_git_archive'. ```	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6606/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1180334986	I_kwDOAMm_X85GWnuK	6411	Better dask support in polyval	dcherian 2448579	closed		0	2022-03-25T04:35:48Z	2022-05-05T20:17:07Z	2022-05-05T20:17:07Z	MEMBER	Is your feature request related to a problem? polyval does not handle dask inputs well. ```python nt = 8772 // 4 ny = 489 nx = 655 # chunks like the data is stored on disk # small in time, big in space # because the chunk sizes are -1 along lat, lon; # reshaping this array to (time, latlon) prior to fitting is pretty cheap chunks = (8, -1, -1) da = xr.DataArray( dask.array.random.random((nt, ny, nx), chunks=chunks), dims=("ocean_time", "eta_rho", "xi_rho"), ) dim = "ocean_time" deg = 1 p = da.polyfit(dim="ocean_time", deg=1, skipna=False) # create a chunked version of the "ocean_time" dimension chunked_dim = xr.DataArray( dask.array.from_array(da[dim].data, chunks=da.chunksizes[dim]), dims=dim, name=dim ) xr.polyval(chunked_dim, p.polyfit_coefficients) ``` Describe the solution you'd like Here's a partial solution. It does not handle datetime inputs (polyval handles this using `get_clean_interp_index` which computes dask inputs). But I've replaced the call to `np.vander` and used `xr.dot`. ```python def polyval(coord, coeffs, degree_dim="degree"): x = coord.data deg_coord = coeffs[degree_dim] N = int(deg_coord.max()) + 1 lhs = xr.DataArray( np.stack([x ** (N - 1 - i) for i in range(N)], axis=1), dims=(coord.name, degree_dim), coords={coord.name: coord, degree_dim: np.arange(deg_coord.max() + 1)[::-1]}, ) return xr.dot(lhs, coeffs, dims=degree_dim) polyval(chunked_dim, p.polyfit_coefficients) ``` This looks like what I expected cc @aulemahal Describe alternatives you've considered No response Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6411/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1207159549	I_kwDOAMm_X85H88r9	6497	restrict stale bot	dcherian 2448579	closed		1	2022-04-18T15:25:56Z	2022-04-18T16:11:11Z	2022-04-18T16:11:11Z	MEMBER	What is your issue? We have some stale issues but not that many. Can we restrict the bot to only issues that are untagged, or tagged as "usage question" or are not assigned to a "project" instead? This might reduce a lot of the noise.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6497/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
663931851	MDU6SXNzdWU2NjM5MzE4NTE=	4251	expanded attrs makes HTML repr confusing to read	dcherian 2448579	open		2	2020-07-22T17:33:13Z	2022-04-18T03:23:16Z		MEMBER	When the `attrs` are expanded, it can be hard to distinguish between the attrs and the next variable. See `>>> xr.tutorial.open_dataset("air_temperature")` Perhaps the gray background could be applied to attrs associated with a variable too?	{ "url": "https://api.github.com/repos/pydata/xarray/issues/4251/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1203414243	I_kwDOAMm_X85HuqTj	6481	refactor broadcast for flexible indexes	dcherian 2448579	open		0	2022-04-13T14:51:19Z	2022-04-13T14:51:28Z		MEMBER	What is your issue? From @benbovy in https://github.com/pydata/xarray/pull/6477 extract common indexes and explicitly pass them to the Dataset and DataArray constructors (when implemented) that are called in the broadcast helper functions (there are some temporary and ugly hacks in create_default_index_implicit so that it works now with pandas multi-indexes wrapped in coordinate variables without the need to pass those indexes explicitly) extract common indexes based on the dimension(s) of their coordinates and not their name (e.g., case of non-dimension but indexed coordinate)	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6481/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1188406993	I_kwDOAMm_X85G1abR	6430	Bug in broadcasting with multi-indexes	dcherian 2448579	closed		1	2022-03-31T17:25:57Z	2022-04-13T14:49:23Z	2022-04-13T14:49:23Z	MEMBER	What happened? ``` python import numpy as np import xarray as xr ds = xr.Dataset( {"foo": (("x", "y", "z"), np.ones((3, 4, 2)))}, {"x": ["a", "b", "c"], "y": [1, 2, 3, 4]}, ) expected = ds.sum("z") stacked = ds.stack(space=["x", "y"]) broadcasted, _ = xr.broadcast(stacked, stacked.space) stacked.sum("z").unstack("space") # works broadcasted.sum("z").unstack("space") # error ``` ``` ValueError Traceback (most recent call last) Input In [13], in <module> 10 broadcasted, _ = xr.broadcast(stacked, stacked.space) 11 stacked.sum("z").unstack("space") ---> 12 broadcasted.sum("z").unstack("space") File ~/work/python/xarray/xarray/core/dataset.py:4332, in Dataset.unstack(self, dim, fill_value, sparse) 4330 non_multi_dims = set(dims) - set(stacked_indexes) 4331 if non_multi_dims: -> 4332 raise ValueError( 4333 "cannot unstack dimensions that do not " 4334 f"have exactly one multi-index: {tuple(non_multi_dims)}" 4335 ) 4337 result = self.copy(deep=False) 4339 # we want to avoid allocating an object-dtype ndarray for a MultiIndex, 4340 # so we can't just access self.variables[v].data for every variable. 4341 # We only check the non-index variables. 4342 # https://github.com/pydata/xarray/issues/5902 ValueError: cannot unstack dimensions that do not have exactly one multi-index: ('space',) ``` What did you expect to happen? This should work. Minimal Complete Verifiable Example No response Relevant log output No response Anything else we need to know? No response Environment xarray main after the flexible indexes refactor	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6430/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1193704369	I_kwDOAMm_X85HJnux	6444	xr.where with scalar as second argument fails with keep_attrs=True	dcherian 2448579	closed		1	2022-04-05T20:51:18Z	2022-04-12T02:12:39Z	2022-04-12T02:12:39Z	MEMBER	What happened? ``` python import xarray as xr xr.where(xr.DataArray([1, 2, 3]) > 0, 1, 0) ``` fails with ``` 1809 if keep_attrs is True: 1810 # keep the attributes of x, the second parameter, by default to 1811 # be consistent with the `where` method of `DataArray` and `Dataset` -> 1812 keep_attrs = lambda attrs, context: attrs[1] 1814 # alignment for three arguments is complicated, so don't support it yet 1815 return apply_ufunc( 1816 duck_array_ops.where, 1817 cond, (...) 1823 keep_attrs=keep_attrs, 1824 ) IndexError: list index out of range ``` The workaround is to pass `keep_attrs=False` What did you expect to happen? No response Minimal Complete Verifiable Example No response Relevant log output No response Anything else we need to know? No response Environment xarray 2022.3.0	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6444/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
528168017	MDU6SXNzdWU1MjgxNjgwMTc=	3573	rasterio test failure	dcherian 2448579	closed		1	2019-11-25T15:40:19Z	2022-04-09T01:17:32Z	2022-04-09T01:17:32Z	MEMBER	version `rasterio 1.1.1 py36h900e953_0 conda-forge` ``` =================================== FAILURES =================================== ___ TestRasterio.test_rasterio_vrt ____ self = <xarray.tests.test_backends.TestRasterio object at 0x7fc8355c8f60> def test_rasterio_vrt(self): import rasterio # tmp_file default crs is UTM: CRS({'init': 'epsg:32618'} with create_tmp_geotiff() as (tmp_file, expected): with rasterio.open(tmp_file) as src: with rasterio.vrt.WarpedVRT(src, crs="epsg:4326") as vrt: expected_shape = (vrt.width, vrt.height) expected_crs = vrt.crs expected_res = vrt.res # Value of single pixel in center of image lon, lat = vrt.xy(vrt.width // 2, vrt.height // 2) expected_val = next(vrt.sample([(lon, lat)])) xarray/tests/test_backends.py:3966: /usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/rasterio/sample.py:43: in sample_gen data = read(indexes, window=window, masked=masked, boundless=True) ??? E ValueError: WarpedVRT does not permit boundless reads rasterio/_warp.pyx:978: ValueError ```	{ "url": "https://api.github.com/repos/pydata/xarray/issues/3573/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1194790343	I_kwDOAMm_X85HNw3H	6445	map removes non-dimensional coordinate variables	dcherian 2448579	open		0	2022-04-06T15:40:40Z	2022-04-06T15:40:40Z		MEMBER	What happened? ``` python ds = xr.Dataset( {"a": ("x", [1, 2, 3])}, coords={"c": ("x", [1, 2, 3]), "d": ("y", [1, 2, 3, 4])} ) print(ds.coords) mapped = ds.map(lambda x: x) print(mapped.coords) ``` Variable `d` gets dropped in the `map` call. It does not share any dimensions with any of the data variables. ``` Coordinates: c (x) int64 1 2 3 d (y) int64 1 2 3 4 Coordinates: c (x) int64 1 2 3 ``` What did you expect to happen? No response Minimal Complete Verifiable Example No response Relevant log output No response Anything else we need to know? No response Environment xarray 2022.03.0	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6445/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1189140909	I_kwDOAMm_X85G4Nmt	6434	concat along dim with mix of scalar coordinate and array coordinates is not right	dcherian 2448579	closed		3	2022-04-01T02:29:16Z	2022-04-06T01:19:47Z	2022-04-06T01:19:47Z	MEMBER	What happened? Really hard to describe in words =) `concat = xr.concat([da.isel(time=0), da.isel(time=[1])], dim="time") xr.align(concat, da, dim="time")` fails when `concat` and `da` should be identical. This is causing failures in cf-xarray:https://github.com/xarray-contrib/cf-xarray/issues/319 cc @benbovy What did you expect to happen? No response Minimal Complete Verifiable Example ```Python import numpy as np import xarray as xr time = xr.DataArray( np.array( ["2013-01-01T00:00:00.000000000", "2013-01-01T06:00:00.000000000"], dtype="datetime64[ns]", ), dims="time", name="time", ) da = time concat = xr.concat([da.isel(time=0), da.isel(time=[1])], dim="time") xr.align(da, concat, join="exact") # works da = xr.DataArray(np.ones(time.shape), dims="time", coords={"time": time}) concat = xr.concat([da.isel(time=0), da.isel(time=[1])], dim="time") xr.align(da, concat, join="exact") ``` Relevant log output ``` ValueError Traceback (most recent call last) Input In [27], in <module> 17 da = xr.DataArray(np.ones(time.shape), dims="time", coords={"time": time}) 18 concat = xr.concat([da.isel(time=0), da.isel(time=[1])], dim="time") ---> 19 xr.align(da, concat, join="exact") File ~/work/python/xarray/xarray/core/alignment.py:761, in align(join, copy, indexes, exclude, fill_value, objects) 566 """ 567 Given any number of Dataset and/or DataArray objects, returns new 568 objects with aligned indexes and dimension sizes. (...) 751 752 """ 753 aligner = Aligner( 754 objects, 755 join=join, (...) 
759 fill_value=fill_value, 760 ) --> 761 aligner.align() 762 return aligner.results File ~/work/python/xarray/xarray/core/alignment.py:549, in Aligner.align(self) 547 self.find_matching_unindexed_dims() 548 self.assert_no_index_conflict() --> 549 self.align_indexes() 550 self.assert_unindexed_dim_sizes_equal() 552 if self.join == "override": File ~/work/python/xarray/xarray/core/alignment.py:395, in Aligner.align_indexes(self) 393 if need_reindex: 394 if self.join == "exact": --> 395 raise ValueError( 396 "cannot align objects with join='exact' where " 397 "index/labels/sizes are not equal along " 398 "these coordinates (dimensions): " 399 + ", ".join(f"{name!r} {dims!r}" for name, dims in key[0]) 400 ) 401 joiner = self._get_index_joiner(index_cls) 402 joined_index = joiner(matching_indexes) ValueError: cannot align objects with join='exact' where index/labels/sizes are not equal along these coordinates (dimensions): 'time' ('time',) ``` Anything else we need to know? No response Environment xarray main	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6434/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1001197796	I_kwDOAMm_X847rRDk	5804	vectorized groupby binary ops	dcherian 2448579	closed		1	2021-09-20T17:04:47Z	2022-03-29T07:11:28Z	2022-03-29T07:11:28Z	MEMBER	By switching to `numpy_groupies` we are vectorizing our groupby reductions. I think we can do the same for groupby's binary ops. Here's an example array ``` python import numpy as np import xarray as xr %load_ext memory_profiler N = 4 * 2000 da = xr.DataArray( np.random.random((N, N)), dims=("x", "y"), coords={"labels": ("x", np.repeat(["a", "b", "c", "d", "e", "f", "g", "h"], repeats=N//8))}, ) ``` Consider this "anomaly" calculation, anomaly defined relative to the group mean ``` python def anom_current(da): grouped = da.groupby("labels") mean = grouped.mean() anom = grouped - mean return anom ``` With this approach, we loop over each group and apply the binary operation: https://github.com/pydata/xarray/blob/a1635d324753588e353e4e747f6058936fa8cf1e/xarray/core/computation.py#L502-L525 This saves some memory, but becomes slow for large number of groups. We could instead do `def anom_vectorized(da): mean = da.groupby("labels").mean() mean_expanded = mean.sel(labels=da.labels) anom = da - mean_expanded return anom` Now we are faster, but construct an extra array as big as the original array (I think this is an OK tradeoff). ``` %timeit anom_current(da) 1.4 s ± 20.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) %timeit anom_vectorized(da) 937 ms ± 5.26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) ``` (I haven't experimented with dask yet, so the following is just a theory). I think the real benefit comes with dask. Depending on where the groups are located relative to chunking, we could end up creating a lot of tiny chunks by splitting up existing chunks. With the vectorized approach we can do better. Ideally we would reindex the "mean" dask array with a numpy-array-of-repeated-ints such that the chunking of `mean_expanded` exactly matches the chunking of `da` along the grouped dimension. 
~In practice, dask.array.take doesn't allow specifying "output chunks" so we'd end up chunking "mean_expanded" based on dask's automatic heuristics, and then rechunking again for the binary operation.~ Thoughts? cc @rabernat	{ "url": "https://api.github.com/repos/pydata/xarray/issues/5804/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1178949620	I_kwDOAMm_X85GRVf0	6408	backwards incompatible changes in reductions	dcherian 2448579	closed		2	2022-03-24T04:11:00Z	2022-03-26T08:44:43Z	2022-03-26T08:44:43Z	MEMBER	What is your issue? I merged #5950 but forgot that it included some backward-incompatible changes (Sorry! this came up in https://github.com/pydata/xarray/pull/6403 thanks to @mathause for spotting.) 1. Arguments like `keep_attrs`, `axis` are now keyword-only. 2. Some reductions had the 3rd position arg as `keep_attrs` and in other cases it was `axis`. These have been standardized now, and only `dim` is accepted without kwarg-name. @pydata/xarray Should we add a deprecation cycle (https://github.com/pydata/xarray/issues/5531)? Point (2) above will make it a little messy. At the very least we should add a deprecation notice before releasing.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6408/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1174177534	I_kwDOAMm_X85F_Ib-	6381	vectorized indexing with DataArray should not preserve IndexVariable	dcherian 2448579	closed		1	2022-03-19T05:08:39Z	2022-03-21T04:47:47Z	2022-03-21T04:47:47Z	MEMBER	What happened? After vectorized indexing a DataArray with dim `x` by a DataArray `z`, we get a DataArray with dim `z` and `x` as non-dim coordinate. But `x` is still an IndexVariable, not a normal variable. What did you expect to happen? `x` should be a normal variable. Minimal Complete Verifiable Example ```python import xarray as xr xr.set_options(display_style="text") da = xr.DataArray([1, 2, 3], dims="x", coords={"x": [0, 1, 2]}) idxr = xr.DataArray([1], dims="z", name="x", coords={"z": ("z", ["a"])}) da.sel(x=idxr) ``` `<xarray.DataArray (z: 1)> array([2]) Coordinates: x (z) int64 1 * z (z) <U1 'a'` `x` is a non-dim coordinate but is backed by an IndexVariable with the wrong name! `python da.sel(x=idxr).x.variable` `<xarray.IndexVariable 'z' (z: 1)> array([1])` Relevant log output No response Anything else we need to know? No response Environment xarray main but this bug was present prior to the explicit indexes refactor.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6381/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1171932478	I_kwDOAMm_X85F2kU-	6373	Zarr backend should avoid checking for invalid encodings	dcherian 2448579	closed		3	2022-03-17T04:55:35Z	2022-03-18T10:06:01Z	2022-03-18T04:19:48Z	MEMBER	What is your issue? The zarr backend has a list of "valid" encodings that needs to be updated any time zarr adds something new (e.g. https://github.com/pydata/xarray/pull/6348) https://github.com/pydata/xarray/blob/53172cb1e03a98759faf77ef48efaa64676ad24a/xarray/backends/zarr.py#L215-L234 Can we get rid of this? I don't know the backends code well, but won't all our encoding parameters have been removed by this stage? The `raise_on_invalid` kwarg suggests so. @tomwhite points out that zarr will raise a warning: ``` python zarr.create((1,), blah=1) /Users/tom/miniconda/envs/sgkit-3.8/lib/python3.8/site-packages/zarr/creation.py:221: UserWarning: ignoring keyword argument 'blah' warn('ignoring keyword argument %r' % k) <zarr.core.Array (1,) float64> ```	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6373/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1170533154	I_kwDOAMm_X85FxOsi	6363	failing flaky test: rasterio vrt	dcherian 2448579	closed		2	2022-03-16T04:38:53Z	2022-03-17T06:25:22Z	2022-03-17T06:25:22Z	MEMBER	What happened? This test is failing with a 404 error: https://github.com/pydata/xarray/blob/95bb9ae4233c16639682a532c14b26a3ea2728f3/xarray/tests/test_backends.py#L4778-L4802 What did you expect to happen? No response Minimal Complete Verifiable Example No response Relevant log output No response Anything else we need to know? No response Environment N/A	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6363/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
1171916710	I_kwDOAMm_X85F2gem	6372	apply_ufunc + dask="parallelized" + no core dimensions should raise a nicer error about core dimensions being absent	dcherian 2448579	open		0	2022-03-17T04:25:37Z	2022-03-17T05:10:16Z		MEMBER	What happened? From https://github.com/pydata/xarray/discussions/6370 Calling `apply_ufunc(..., dask="parallelized")` with no core dimensions and dask input "works" but raises an error on compute (`ValueError: axes don't match array` from `np.transpose`). `python xr.apply_ufunc( lambda x: np.mean(x), dt, dask="parallelized" )` What did you expect to happen? With numpy data the apply_ufunc call does raise an error: `xr.apply_ufunc( lambda x: np.mean(x), dt.compute(), dask="parallelized" )` `ValueError: applied function returned data with unexpected number of dimensions. Received 0 dimension(s) but expected 1 dimensions with names: ('x',)` Minimal Complete Verifiable Example ``` python import numpy as np import xarray as xr dt = xr.Dataset( data_vars=dict( value=(["x"], [1,1,2,2,2,3,3,3,3,3]), ), coords=dict( lon=(["x"], np.linspace(0,1,10)), ), ).chunk(chunks={'x': tuple([2,3,5])}) # three chunks of different size xr.apply_ufunc( lambda x: np.mean(x), dt, dask="parallelized" ) ``` Relevant log output No response Anything else we need to know? No response Environment N/A	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6372/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
584461380	MDU6SXNzdWU1ODQ0NjEzODA=	3868	What should pad do about IndexVariables?	dcherian 2448579	open		6	2020-03-19T14:40:21Z	2022-02-22T16:02:21Z		MEMBER	Currently `pad` adds NaNs for coordinate labels, which results in substantially reduced functionality. We need to think about 1. Int, Float, Datetime64, CFTime indexes: linearly extrapolate? Should we care whether the index is sorted or not? (I think not) 2. MultiIndexes: ?? 3. CategoricalIndexes: ?? 4. Unindexed dimensions EDIT: Added unindexed dimensions	{ "url": "https://api.github.com/repos/pydata/xarray/issues/3868/reactions", "total_count": 6, "+1": 6, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1119738354	I_kwDOAMm_X85Cvdny	6222	test packaging & distribution	dcherian 2448579	closed		4	2022-01-31T17:42:40Z	2022-02-03T15:45:17Z	2022-02-03T15:45:17Z	MEMBER	Is your feature request related to a problem? It seems like we should have a test to make sure our dependencies are specified correctly. Describe the solution you'd like For instance we could add a step to the release workflow: https://github.com/pydata/xarray/blob/b09de8195a9e22dd35d1b7ed608ea15dad0806ef/.github/workflows/pypi-release.yaml#L34-L43 after `twine check` where we pip install and then try to import xarray. Alternatively we could have another test config in our regular CI to build + import. Thoughts? Is this excessive for a somewhat rare problem? Describe alternatives you've considered No response Additional context No response	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6222/reactions", "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
937266282	MDU6SXNzdWU5MzcyNjYyODI=	5578	Specify minimum versions in setup.cfg	dcherian 2448579	open		2	2021-07-05T17:25:03Z	2022-01-09T03:33:38Z		MEMBER	See https://github.com/pydata/xarray/issues/5342#issuecomment-873660034	{ "url": "https://api.github.com/repos/pydata/xarray/issues/5578/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
1072473598	I_kwDOAMm_X84_7KX-	6051	Check for just ... in stack etc, and raise with a useful error message	dcherian 2448579	closed		4	2021-12-06T18:35:27Z	2022-01-03T23:05:23Z	2022-01-03T23:05:23Z	MEMBER	Is your feature request related to a problem? Please describe. The following doesn't work ``` python import xarray as xr da = xr.DataArray([[1,2],[1,2]], dims=("x", "y")) da.stack(flat=...) ``` Describe the solution you'd like This could be equivalent to `python da.stack(flat=da.dims)` I think using `ds.dims` should be fine for datasets too.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6051/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
514716299	MDU6SXNzdWU1MTQ3MTYyOTk=	3468	failure when roundtripping empty dataset to pandas	dcherian 2448579	open		1	2019-10-30T14:28:31Z	2021-11-13T14:54:09Z		MEMBER	see https://github.com/pydata/xarray/pull/3285	{ "url": "https://api.github.com/repos/pydata/xarray/issues/3468/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
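
The `reactions` column in the rows above holds a JSON object stored as text. As a minimal sketch (assuming one cell has already been extracted from the export as a string; the variable names are illustrative and not part of the export), it can be decoded with the standard library:

```python
import json

# One `reactions` cell, copied from the row for issue 3868 above.
reactions_cell = (
    '{ "url": "https://api.github.com/repos/pydata/xarray/issues/3868/reactions", '
    '"total_count": 6, "+1": 6, "-1": 0, "laugh": 0, "hooray": 0, '
    '"confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }'
)

# Decode the JSON text into a dict.
reactions = json.loads(reactions_cell)

# Keep only the per-emoji counters (drop the API URL and the total).
counts = {k: v for k, v in reactions.items() if k not in ("url", "total_count")}

# In this row the per-emoji counters add up to the reported total.
print(reactions["total_count"], sum(counts.values()))
```

Keeping the counters in a separate dict makes it easy to aggregate reactions across rows without special-casing the `url` and `total_count` keys.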

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
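
The filter shown on this page (repo = 13221727, type = "issue", user = 2448579, sorted by updated_at descending) can be reproduced against this schema. The following is a rough sketch using Python's built-in `sqlite3` module with an in-memory database; it assumes a cut-down version of the table (only a few of the columns, no foreign keys) and three rows copied from the listing above, so it illustrates the query rather than the full export.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Cut-down version of the [issues] table: only the columns used below.
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER,
       [state] TEXT,
       [updated_at] TEXT,
       [repo] INTEGER,
       [type] TEXT
    )
    """
)
conn.execute("CREATE INDEX [idx_issues_repo] ON [issues] ([repo])")
conn.execute("CREATE INDEX [idx_issues_user] ON [issues] ([user])")

# Three rows taken from the listing above, inserted out of order on purpose.
rows = [
    (937266282, 5578, "Specify minimum versions in setup.cfg",
     2448579, "open", "2022-01-09T03:33:38Z", 13221727, "issue"),
    (1170533154, 6363, "failing flaky test: rasterio vrt",
     2448579, "closed", "2022-03-17T06:25:22Z", 13221727, "issue"),
    (1119738354, 6222, "test packaging & distribution",
     2448579, "closed", "2022-02-03T15:45:17Z", 13221727, "issue"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Same filter and ordering as the page title. The timestamps are ISO 8601
# strings in one format, so sorting them as text matches chronological order.
query = """
    SELECT [number], [title], [state], [updated_at]
    FROM [issues]
    WHERE [repo] = ? AND [type] = 'issue' AND [user] = ?
    ORDER BY [updated_at] DESC
"""
results = conn.execute(query, (13221727, 2448579)).fetchall()

for number, title, state, updated_at in results:
    print(number, state, updated_at, title)
```

Storing `updated_at` as TEXT works for sorting here only because every value uses the same UTC `YYYY-MM-DDTHH:MM:SSZ` layout; mixed formats or time zones would need normalizing first.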
