github: pull_requests: 62 rows where user = 4160723

id ▼	node_id	number	state	title	user	body	created_at	updated_at	closed_at	merged_at	merge_commit_sha	draft	head	base	author_association	auto_merge	repo	url
64042989	MDExOlB1bGxSZXF1ZXN0NjQwNDI5ODk=	802	closed	Multi-index indexing	benbovy 4160723	Follows #767. This is incomplete (it still needs some tests and documentation updates), but it is working for both `Dataset` and `DataArray` objects. I also don't know if it is fully compatible with lazy indexing (Dask). Using the example from #767: ``` In [4]: da.sel(band_wavenumber={'band': 'foo'}) Out[4]: <xarray.DataArray (wavenumber: 2)> array([ 0.00017, 0.00014]) Coordinates: * wavenumber (wavenumber) float64 4.05e+03 4.05e+03 ``` As shown in this example, similarly to pandas, it automatically renames the dimension and assigns a new coordinate when the selection doesn't return a `pd.MultiIndex` (here it returns a `pd.FloatIndex`). In some cases this behavior may be unwanted (??), so I added a `drop_level` keyword argument (if `False` it keeps the multi-index and doesn't change the dimension/coordinate names): ``` In [5]: da.sel(band_wavenumber={'band': 'foo'}, drop_level=False) Out[5]: <xarray.DataArray (band_wavenumber: 2)> array([ 0.00017, 0.00014]) Coordinates: * band_wavenumber (band_wavenumber) object ('foo', 4050.2) ('foo', 4050.3) ``` Note that it also works with `DataArray.loc`, but (for now) in that case it always returns the multi-index: ``` In [6]: da.loc[{'band_wavenumber': {'band': 'foo'}}] Out[6]: <xarray.DataArray (band_wavenumber: 2)> array([ 0.00017, 0.00014]) Coordinates: * band_wavenumber (band_wavenumber) object ('foo', 4050.2) ('foo', 4050.3) ``` This is, however, inconsistent with `Dataset.sel` and `Dataset.loc`, which both apply `drop_level=True` by default, due to their different implementation. Two solutions: (1) make `DataArray.loc` apply drop_level by default, or (2) use `drop_level=False` by default everywhere.	
2016-03-24T14:39:38Z	2016-07-19T10:48:56Z	2016-07-19T01:15:42Z	2016-07-19T01:15:41Z	7a9e84b5708d3e8ec270a7415f9b5e54d30f13f7	0	712497c3997e72a36cafc8fb9eaafbecc76af5dc	80abe5dede7bf8a2949139f8ba083a6d74d4e3db	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/802
73465410	MDExOlB1bGxSZXF1ZXN0NzM0NjU0MTA=	879	closed	Multi-index repr	benbovy 4160723	Another item of #719. An example: ``` python >>> index = pd.MultiIndex.from_product((list('ab'), range(10))) >>> index.names= ('a_long_level_name', 'level_1') >>> data = xr.DataArray(range(20), [('x', index)]) >>> data <xarray.DataArray (x: 20)> array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]) Coordinates: * x (x) object MultiIndex - a_long_level_name object 'a' 'a' 'a' 'a' 'a' 'a' 'a' 'a' 'a' 'a' 'b' ... - level_1 int64 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 ``` To be consistent with the displayed coordinates and/or data variables, it displays the actual used level values. Using the `pandas.MultiIndex.get_level_values` method would be expensive for big indexes, so I re-implemented it in xarray so that we can truncate the computation to the first _x_ values, which is very cheap. It still needs testing. Maybe it would be nice to align the level values.	2016-06-11T10:58:13Z	2016-09-02T09:34:49Z	2016-08-31T21:40:59Z			0	4e7793a8d4fb0d5062ad8aab5578aaf3fec43577	450ac8fb16bec935a18ff3155673dff82208d3fe	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/879
73554612	MDExOlB1bGxSZXF1ZXN0NzM1NTQ2MTI=	881	closed	Fix variable copy with multi-index	benbovy 4160723	Fixes #769.	2016-06-13T10:38:46Z	2016-08-01T14:17:17Z	2016-06-16T21:01:07Z	2016-06-16T21:01:07Z	065ea6a3695a58ad6256f79b7712b67a8da6377c	0	9ea8832959a54fed81e7194c18cc024ba0fe9bd1	450ac8fb16bec935a18ff3155673dff82208d3fe	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/881
77953522	MDExOlB1bGxSZXF1ZXN0Nzc5NTM1MjI=	903	closed	fixed multi-index copy test	benbovy 4160723		2016-07-19T10:37:36Z	2016-08-01T14:16:15Z	2016-07-19T14:47:58Z	2016-07-19T14:47:58Z	e8566940a97cd5a11fdbe796cb5f8b0f00864624	0	c863df76651fbc0bae1a02819c7db28eef4f4ae5	7a9e84b5708d3e8ec270a7415f9b5e54d30f13f7	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/903
80229493	MDExOlB1bGxSZXF1ZXN0ODAyMjk0OTM=	947	closed	Multi-index levels as coordinates	benbovy 4160723	Implements 2, 4 and 5 in #719. Demo: ``` In [1]: import numpy as np In [2]: import pandas as pd In [3]: import xarray as xr In [4]: index = pd.MultiIndex.from_product((list('ab'), range(2)), ...: names= ('level_1', 'level_2')) In [5]: da = xr.DataArray(np.random.rand(4, 4), coords={'x': index}, ...: dims=('x', 'y'), name='test') In [6]: da Out[6]: <xarray.DataArray 'test' (x: 4, y: 4)> array([[ 0.15036153, 0.68974802, 0.40082234, 0.94451318], [ 0.26732938, 0.49598123, 0.8679231 , 0.6149102 ], [ 0.3313594 , 0.93857424, 0.73023367, 0.44069622], [ 0.81304837, 0.81244159, 0.37274953, 0.86405196]]) Coordinates: * level_1 (x) object 'a' 'a' 'b' 'b' * level_2 (x) int64 0 1 0 1 * y (y) int64 0 1 2 3 In [7]: da['level_1'] Out[7]: <xarray.DataArray 'level_1' (x: 4)> array(['a', 'a', 'b', 'b'], dtype=object) Coordinates: * level_1 (x) object 'a' 'a' 'b' 'b' * level_2 (x) int64 0 1 0 1 In [8]: da.sel(x='a', level_2=1) Out[8]: <xarray.DataArray 'test' (y: 4)> array([ 0.26732938, 0.49598123, 0.8679231 , 0.6149102 ]) Coordinates: x object ('a', 1) * y (y) int64 0 1 2 3 In [9]: da.sel(level_2=1) Out[9]: <xarray.DataArray 'test' (level_1: 2, y: 4)> array([[ 0.26732938, 0.49598123, 0.8679231 , 0.6149102 ], [ 0.81304837, 0.81244159, 0.37274953, 0.86405196]]) Coordinates: * level_1 (level_1) object 'a' 'b' * y (y) int64 0 1 2 3 ``` Some notes about the implementation: - I slightly modified `Coordinate` so that it allows setting different values for the names of the coordinate and its dimension. There is no breaking change. - I also added a `Coordinate.get_level_coords` method to get independent, single-index coordinates objects from a MultiIndex coordinate. Remaining issues: - `Coordinate.get_level_coords` calls `pandas.MultiIndex.get_level_values` for each level and is itself called each time when indexing and for repr. This can be very costly!! 
It would be …	2016-08-05T11:34:49Z	2016-09-14T15:25:28Z	2016-09-14T03:34:51Z	2016-09-14T03:34:51Z	41654ef5e9da8cd15f3b68f8384f8c45c7fc16e9	0	a447767e8d611d945dc864910a427ef7e3f4db11	3ecfa66613aaefdea8beb15edbd392b9f9d815c6	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/947
87715303	MDExOlB1bGxSZXF1ZXN0ODc3MTUzMDM=	1028	closed	Add `set_index`, `reset_index` and `reorder_levels` methods	benbovy 4160723	Another item in #719. I added tests and updated the docs, so this is ready for review.	2016-10-03T13:22:24Z	2023-08-30T09:28:26Z	2016-12-27T17:03:00Z	2016-12-27T17:03:00Z	7ad254409f97dfe932855445602faaf7324f3d5e	0	c58cb470baf53d1c67971540e1d7c02dbafd212a	34fd2b6cb94dfb824c5371c37b6eb5e70a88260f	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/1028
121942631	MDExOlB1bGxSZXF1ZXN0MTIxOTQyNjMx	1422	closed	xarray.core.variable.as_variable part of the public API	benbovy 4160723	- [x] Closes #1303 - [x] Tests added / passed - [x] Passes ``git diff upstream/master \| flake8 --diff`` (if we ignore messages for .rst files and "imported but not used" messages for `xarray.__init__.py`) - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API Make `xarray.core.variable.as_variable` part of the public API and accessible as a top-level function: `xarray.as_variable`. I changed the docstrings to follow the numpydoc format more closely. I also removed the `copy=False` keyword arguments as apparently it was unused.	2017-05-23T08:44:08Z	2017-06-10T18:33:34Z	2017-06-02T17:55:12Z	2017-06-02T17:55:12Z	b8771934a2ef24fd3ce5a93fc2accb3f6fa12e4e	0	37343de03666f6cac03ce68a7fed60b866338ee7	6b18d77b5581be4d91cb12da95a530f92ab867b5	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/1422
135298867	MDExOlB1bGxSZXF1ZXN0MTM1Mjk4ODY3	1507	closed	Detailed report for testing.assert_equal and testing.assert_identical	benbovy 4160723	- ~~Closes #xxxx~~ - [x] Tests added / passed - [x] Passes ``git diff upstream/master \| flake8 --diff`` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API ~~In addition to `Dataset` repr, the error message also shows the output of `Dataset.info()` for both datasets.~~ ~~This may not be the most elegant solution, but it is helpful when datasets only differ by their attributes attached to coordinates or data variables (not shown in repr). I'm open to any suggestion.~~ The report shows the differences for dimensions, data values (``Variable`` and ``DataArray``), coordinates, data variables and attributes (the latter only for ``testing.assert_identical``). There are currently not many tests for `xarray.testing` functions, but I'm willing to add more if needed. Not sure if it's worth a what's new entry (EDIT: added one).	2017-08-11T09:38:23Z	2019-10-25T15:07:39Z	2019-01-18T09:16:31Z	2019-01-18T09:16:31Z	1d0a2bc4970d9e7337fe307f4519bd936f7d7d89	0	443e59365e5440979421644e50491f7dd323ab95	f13536c965d02bb2845da31e909899a90754b375	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/1507
153118247	MDExOlB1bGxSZXF1ZXN0MTUzMTE4MjQ3	1723	closed	Fix unexpected behavior of .set_index() since pandas 0.21.0	benbovy 4160723	- [x] Closes #1722 - [x] Tests added / passed - [x] Passes ``git diff upstream/master */py \| flake8 --diff`` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API	2017-11-16T18:37:20Z	2019-10-25T15:07:18Z	2017-11-17T00:54:51Z	2017-11-17T00:54:51Z	1a012080e0910f3295d0fc26806ae18885f56751	0	eda038be4f7e4298806ed1e3f92c8fc7bf287a21	8267fdb1093bba3934a172cf71128470698279cd	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/1723
162426756	MDExOlB1bGxSZXF1ZXN0MTYyNDI2NzU2	1820	closed	WIP: html repr	benbovy 4160723	- [x] Closes #1627 - [ ] Tests added - [ ] Tests passed - [ ] Passes ``git diff upstream/master */py \| flake8 --diff`` - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API This is work in progress, although the basic functionality is there. You can see a preview here: http://nbviewer.jupyter.org/gist/benbovy/3009f342fb283bd0288125a1f7883ef2 TODO: - [ ] Add support for Multi-indexes - [ ] Probably good to have some opt-in or fail back system in case where we (or users) know that the rendering will not work - [ ] Add some tests Nice to have (keep this for later): - Clean-up CSS code and HTML template (track CSS [subgrid support](https://caniuse.com/#feat=css-subgrid) in browsers, this may simplify a lot the things here). - Dynamically adapt cell widths (given the length of the names of variables and dimensions). Currently all cells have a fixed width. This is tricky, though, as we don't use a monospace font here. - Integration with jupyterlab/notebook themes (CSS classes) and maybe allow custom CSS. - Integration of Dask arrays HTML repr (+ integration of repr for other array backends). - Maybe find a way (if possible) to include CSS only once in the notebook (currently it is included each time a xarray object is displayed in an output cell, which is not very nice). - Review the rules for collapsing the `Coordinates`, `Data variables` and `Attributes` sections (maybe expose them as global options). - Maybe also define some rules to collapse automatically the data section (DataArray and Variable) when the data repr is too long. - Maybe add rich representation for `Dataset.coords` and `Dataset.data_vars` as well? 
<details> <summary>Other thoughts (old)</summary> A big challenge here is to provide both robust and flexible styling (CSS): - I have tested the current styling in jupyterlab (0.30.6, light theme), notebook (5.2.2) and nbviewer: despite some slight differences it looks quite good! - However, the current CSS code is a bit…	2018-01-11T16:33:07Z	2019-10-25T15:06:58Z	2019-10-24T16:48:46Z		e360d3fc81209d7586de95bc044feb3d4a508657	0	17de08ba4cc2eb7e3326c1451c1257c911a17958	bb87a9441d22b390e069d0fde58f297a054fd98a	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/1820
171631545	MDExOlB1bGxSZXF1ZXN0MTcxNjMxNTQ1	1946	closed	DOC: add main sections to toc	benbovy 4160723	Not a big change, but it adds a little more clarity IMO. I'm open to any suggestions for better section names and/or organization. Also, I left "What's new" at the top, but I'm not sure if "Getting started" is the right section.	2018-02-27T11:13:17Z	2018-02-27T21:16:18Z	2018-02-27T19:04:24Z	2018-02-27T19:04:24Z	4ee244078ea90084624c1b6d006f50285f8f2d21	0	0fe80d06242b7a7392c9c96598dd9c557ca667ad	243093cf814ffaae2a9ce08215632500fbebcf52	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/1946
207277486	MDExOlB1bGxSZXF1ZXN0MjA3Mjc3NDg2	2357	closed	DOC: move xarray related projects to top-level TOC section	benbovy 4160723	Make xarray-related projects more discoverable, as suggested on the xarray mailing list.	2018-08-09T10:57:47Z	2018-08-11T13:41:24Z	2018-08-10T20:13:08Z	2018-08-10T20:13:08Z	846e28f8862b150352512f8e3d05bcb9db57a1a3	0	5bd1b794860b8c8e276d4918bfd40c6bad6e1411	04458670782c0b6fdba7e7021055155b2a6f284a	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/2357
332552507	MDExOlB1bGxSZXF1ZXN0MzMyNTUyNTA3	3448	closed	Add license for the icons used in the html repr	benbovy 4160723		2019-10-25T14:57:20Z	2019-10-25T15:48:52Z	2019-10-25T15:40:46Z	2019-10-25T15:40:46Z	63cc85759ac25605c8398d904d055df5dc538b94	0	372f61d954f4b90222c636757665e747502c38d6	bb0a5a2b1c71f7c2622543406ccc82ddbb290ece	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/3448
416544318	MDExOlB1bGxSZXF1ZXN0NDE2NTQ0MzE4	4053	closed	Fix html repr in untrusted notebooks (plain text fallback)	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #4041 - [x] Tests added - [x] Passes `isort -rc . && black . && mypy . && flake8` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API This is not very elegant (actually plain text repr is already included in the notebook as `text/plain` mime type but it is ignored when `text/html` mime type is present), but it seems to work. I haven't found a better workaround. I don't really know if this can be properly tested (I only added a basic test). Steps to test this fix: - To "untrust" a notebook: open an existing notebook with a simple editor, manually edit one output cell with a xarray object repr, and save the ipynb file. - Open this notebook with the Notebook app, you should see the plain text repr.	2020-05-12T07:38:22Z	2022-03-29T07:10:07Z	2020-05-20T17:06:40Z	2020-05-20T17:06:40Z	cb90d5542bd6868d5548ae8efb5815c249c2c329	0	39299e9f8e71b34ba4587800658204f5b66d9576	3e5dd6ef32b9c69806af69a3a5168edcf3b2e21f	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/4053
582224148	MDExOlB1bGxSZXF1ZXN0NTgyMjI0MTQ4	4979	closed	Flexible indexes refactoring notes	benbovy 4160723	As a preliminary step before I take on the refactoring and implementation of flexible indexes in Xarray for the next few months, I reviewed the status of https://github.com/pydata/xarray/projects/1 and started compiling partially implemented or planned changes, thoughts, etc. into a single document that may serve as a basis for further discussion and implementation work. It's still very much work in progress (I will update it regularly in the forthcoming days) and it is very open to discussion (we can use this PR for that)! I'm not sure if Xarray's root folder is a good place for this document, though. We could move this into a new repository in `xarray-contrib` (that could also host other enhancement proposals) if that's necessary. I'm looking forward to getting started on this and to getting your thoughts/feedback!	2021-03-01T16:57:32Z	2022-03-29T07:09:31Z	2021-03-17T16:47:29Z	2021-03-17T16:47:29Z	d9ba56c22f22ae48ecc53629c2d49f1ae02fcbcb	0	6efcdfe893594fcf493e17f693df1d4816b686ba	48378c4b11c5c2672ff91396d4284743165b4fbe	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/4979
608110624	MDExOlB1bGxSZXF1ZXN0NjA4MTEwNjI0	5102	closed	Flexible indexes: add Index base class and xindexes properties	benbovy 4160723	This PR clears up the path for flexible indexes: - it adds a new ~~`IndexAdapter`~~ `Index` base class that is meant to be inherited by all xarray-compatible indexes (built-in or 3rd-party) - `PandasIndexAdapter` now inherits from ~~`IndexAdapter`~~ `Index` - the `xarray_obj.xindexes` properties return `Index` (`PandasIndexAdapter`) instances. `xarray_obj.indexes` properties still return `pandas.Index` instances. ~~The latter is a breaking change, although I'm not sure if the `indexes` property has been made public yet.~~ This is still work in progress, there are many broken tests that are not fixed yet. (EDIT: all tests should be fixed now). There's a lot of dirty fixes to avoid circular dependencies and in the many places where we still need direct access to the `pandas.Index` objects, but I'd expect that these will be cleaned-up further in the refactoring.	2021-04-02T16:18:07Z	2022-03-29T07:10:07Z	2021-05-11T08:21:26Z	2021-05-11T08:21:26Z	6e14df62f0b01d8ca5b04bd0ed2b5ee45444265d	0	ce59dece723ca49eaae69779dee5da2aa30d0286	234b40a37e484a795e6b12916315c80d70570b27	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/5102
645933827	MDExOlB1bGxSZXF1ZXN0NjQ1OTMzODI3	5322	closed	Internal refactor of label-based data selection	benbovy 4160723	Xarray label-based data selection now relies on a newly added `xarray.Index.query(self, labels: Dict[Hashable, Any]) -> Tuple[Any, Optional[None, Index]]` method where: - `labels` is always a dictionary with coordinate name(s) as key(s) and the corresponding selection label(s) as values - When calling `.sel` with some coordinate(s)/label(s) pairs, those are first grouped by index so that only the relevant pairs are passed to an `Index.query` - the returned tuple contains the positional indexers and (optionally) a new index object For a simple `pd.Index`, `labels` always corresponds to a 1-item dictionary like `{'coord_name': label_values}`, which is not very useful in this case, but this format is useful for `pd.MultiIndex` and will likely be for other, custom indexes. Moving the label->positional indexer conversion logic into `PandasIndex.query()`, I've tried to separate `pd.Index` vs `pd.MultiIndex` concerns by adding a new `PandasMultiIndex` wrapper class (it will probably be useful for other things as well) and refactoring the complex logic that was implemented in `convert_label_indexer`. Hopefully it is a bit clearer now. 
Working towards a more flexible/generic system, we still need to figure out how to: - pass index query extra arguments like `method` and `tolerance` for `pd.Index` but in a more generic way - handle several positional indexers over multiple dimensions possibly returned by a custom "meta-index" (e.g., staggered grid index) - handle the case of positional indexers returned from querying >1 indexes along the same dimension (e.g., multiple coordinates along `x` with a simple `pd.Index`) - pandas indexes don't need information like the names or shapes of their corresponding coordinate(s) to perform label-based selection, but this kind of information will probably be needed for other indexes (we actually need it for advanced point-wise selection using tree-based indexes in [xoak](https://github.com/xarray-contrib/xoak)). This could be done in follow-up PRs.. Side note: I'…	2021-05-17T14:52:49Z	2022-03-29T07:10:07Z	2021-06-08T09:35:54Z	2021-06-08T09:35:54Z	9daf9b13648c9a02bddee3640b80fe95ea1fff61	0	fda484988c074bfd371ed490641a383c9429c43a	2b38adc1bdd1dd97934fb061d174149c73066f19	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/5322
655109484	MDExOlB1bGxSZXF1ZXN0NjU1MTA5NDg0	5385	closed	Cast PandasIndex to pd.(Multi)Index	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #5384 - [x] Tests added - [x] Passes `pre-commit run --all-files` - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`	2021-05-27T15:15:41Z	2022-03-29T07:09:31Z	2021-05-28T08:28:11Z	2021-05-28T08:28:11Z	2b38adc1bdd1dd97934fb061d174149c73066f19	0	b81931cf852432b7a7857aec4b38566d7e3e0b6e	a6a1e48b57499f91db7e7c15593aadc7930020e8	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/5385
697307477	MDExOlB1bGxSZXF1ZXN0Njk3MzA3NDc3	5636	closed	Refactor index vs. coordinate variable(s)	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #5553 - [x] Tests added - [x] Passes `pre-commit run --all-files` - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` This implements option 3 (sort of) described in https://github.com/pydata/xarray/issues/5553#issue-933551030: - the goal is to avoid wrapping an `xarray.Index` into an `xarray.Variable` and keep those two concepts distinct from each other. - the `xarray.Index.from_variables` class constructor accepts a dictionary of `xarray.Variable` objects as argument and may (or should?) also return corresponding `xarray.IndexVariable` objects to ensure immutability. - for `PandasIndex`, the new returned `xarray.IndexVariable` wraps the underlying `pd.Index` via a `PandasIndexingAdapter` (this reverts some changes made in #5102). - for `PandasMultiIndex`, this PR adds `PandasMultiIndexingAdapter` so that we can wrap the pandas multi-index in separate coordinate variables objects: one for the dimension + one for each level. The level coordinates data internally hold a reference to the dimension coordinate data to avoid indexing the same underlying `pd.MultiIndex` for each of those coordinates (`PandasMultiIndexingAdapter.__getitem__` is memoized for that purpose). This is very much work in progress, I need to update (or revert) all related parts of Xarray's internals, update tests, etc. At this stage any comment on the approach described above is welcome.	2021-07-26T19:54:25Z	2023-08-30T09:21:55Z	2021-08-09T07:56:56Z	2021-08-09T07:56:56Z	4bb9d9c6df77137f05e85c7cc6508fe7a93dc0e4	0	e5f2502c07bd7ad449f9f6acfd0e6ac3ede92fb9	8b95da8e21a9d31de9f79cb0506720595f49e1dd	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/5636
709187466	MDExOlB1bGxSZXF1ZXN0NzA5MTg3NDY2	5692	closed	Explicit indexes	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes many issues: - [x] closes #1366 - [x] closes #1408 - [x] closes #2489 - [x] closes #3432 - [x] closes #4542 - [x] closes #4955 - [x] closes #5202 - [x] closes #5645 - [x] closes #5691 - [x] closes #5697 - [x] closes #5700 - [x] closes #5727 - [x] closes #5953 - [x] closes #6183 - [x] closes #6313 - [x] Tests added - [x] Passes `pre-commit run --all-files` - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - New functions/methods are listed in `api.rst` (new `Index` and `Indexes` API not public yet) Follow-up on #5636 (work in progress), supersedes #2195. This is likely to be going big, sorry in advance! It'll be safer to make a release before merging this PR. Current progress: - [x] create (default) indexes using the `Index` classes - [x] refactor default indexes created when 1st accessing `.xindexes` or `.indexes` - [x] support for non-default indexes (no public API yet) - [x] remove multi-index virtual coordinates (replace it by regular coordinates) - [x] refactor internal (text / html) formatting functions - [x] internal refactor of location-based selection (`.isel()`) - [x] internal refactor of label-based selection (`.sel()`) - [x] internal refactor of `.rename()` - Some changes in behavior (see comments below) - see #4108 - see #4107 - see #4417 - [x] internal refactor of `set_index` / `reset_index` - [x] internal refactor of `stack` / `unstack` - Some changes in behavior (see comments below) - [x] internal refactor of `Dataset.to_stacked_array` - [x] internal refactor of `swap_dims` - [x] internal refactor of `expand_dims` - [x] internal refactor of alignment - [x] internal refactor of `reindex` and `reindex_like` - [x] internal refactor of `interp` and `interp_like` - [x] internal refactor of merge - [x] internal refactor of concat - [x] 
internal refactor of compu…	2021-08-11T15:57:41Z	2023-08-30T09:26:37Z	2022-03-17T17:11:44Z	2022-03-17T17:11:40Z	3ead17ea9e99283e2511b65b9d864d1c7b10b3c4	0	77fdaf0e3a268d1d1fbdb6c7aef9abfd07bf0d32	29a87cc110f1a1ff7b21c308ba7277963b51ada3	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/5692
884210772	PR_kwDOAMm_X840s_xU	6385	closed	Fix concat with scalar coordinate	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6384 - [x] Tests added	2022-03-20T16:46:48Z	2022-03-29T07:09:30Z	2022-03-21T04:49:23Z	2022-03-21T04:49:22Z	83f238a05a82fc85dcd7346f758ba3bea0416181	0	a91e6ee2728bb5b2768184d4e0cf1c261113f93e	073512ed3f997c0589af97eaf3d4b20796b18cf8	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6385
884214603	PR_kwDOAMm_X840tAtL	6386	closed	Fix Dataset groupby returning a DataArray	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6379 - [x] Tests added	2022-03-20T17:06:13Z	2022-03-29T07:09:30Z	2022-03-20T18:55:27Z	2022-03-20T18:55:26Z	fed852073eee883c0ed1e13e28e508ff0cf9d5c1	0	f4e8d48c4040f9165622baf48322771c376af39c	073512ed3f997c0589af97eaf3d4b20796b18cf8	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6386
884218819	PR_kwDOAMm_X840tBvD	6387	closed	Fix concat with variable or dataarray as dim (propagate attrs)	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6380 - [x] Tests added	2022-03-20T17:27:41Z	2022-03-29T07:09:29Z	2022-03-20T18:53:46Z	2022-03-20T18:53:46Z	03b6ba1e779b0d1829ca7b2e8f5da4d9c39ece6f	0	cd2ab9e1d605d6469178b24a39a14634f97b5c22	073512ed3f997c0589af97eaf3d4b20796b18cf8	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6387
884252480	PR_kwDOAMm_X840tJ9A	6388	closed	isel: convert IndexVariable to Variable if index is dropped	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6381 - [x] Tests added	2022-03-20T20:29:58Z	2022-03-29T07:10:08Z	2022-03-21T04:47:48Z	2022-03-21T04:47:47Z	067b2e86e6311e9c37e0def0c83cdb9a1a367a74	0	626f27966a52a5162f026ac042ccd18ec1592a22	fed852073eee883c0ed1e13e28e508ff0cf9d5c1	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6388
884259571	PR_kwDOAMm_X840tLrz	6389	closed	Re-index: fix missing variable metadata	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6382 - [x] Tests added	2022-03-20T21:11:38Z	2022-03-29T07:09:31Z	2022-03-21T07:53:05Z	2022-03-21T07:53:04Z	c604ee1fe852d51560100df6af79b4c28660f6b5	0	86b920ac931c9a78b067e08a84e3c587ec905047	fed852073eee883c0ed1e13e28e508ff0cf9d5c1	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6389
884923775	PR_kwDOAMm_X840vt1_	6394	closed	Fix DataArray groupby returning a Dataset	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6393 - [x] Tests added	2022-03-21T14:43:21Z	2022-03-29T07:09:30Z	2022-03-21T15:26:20Z	2022-03-21T15:26:20Z	321c5608a3be3cd4b6a4de3b658d1e2d164c0409	0	6123ae884795d08db6f4de736e5d52ef90648991	c604ee1fe852d51560100df6af79b4c28660f6b5	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6394
886017261	PR_kwDOAMm_X840z4zt	6400	closed	Speed-up multi-index html repr + add display_values_threshold option	benbovy 4160723	This adds `PandasMultiIndexingAdapter._repr_html_` that can greatly speed-up the html repr of Xarray objects with multi-indexes. This optimized `_repr_html_` implementation is now used for formatting the array detailed view of all multi-index coordinates in the html repr, instead of converting the full index and each levels to numpy arrays before formatting them. ```python import xarray as xr ds = xr.tutorial.load_dataset("air_temperature") da = ds["air"].stack(z=[...]) da.shape # (3869000,) %timeit -n 1 -r 1 da._repr_html_() # 9.96 ms ! ``` <!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #5529 - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`	2022-03-22T12:57:37Z	2022-03-29T07:10:22Z	2022-03-29T07:05:32Z	2022-03-29T07:05:32Z	d8fc34660f409d4c6a7ce9fe126d126e4f76c7fd	0	b8f732c61a86be5d1e8efbf3a906f9a5f69c31fd	728b648d5c7c3e22fe3704ba163012840408bf66	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6400
891741295	PR_kwDOAMm_X841JuRv	6418	closed	Fix concat with scalar coordinate (dtype)	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6416 - [x] Tests added	2022-03-28T12:22:50Z	2022-03-29T07:06:46Z	2022-03-28T16:05:01Z	2022-03-28T16:05:01Z	009b15461bf1ad4567e57742e44db4efa4e44cc7	0	5711dc21ff0711559214bde147cf3a20f6880f8e	728b648d5c7c3e22fe3704ba163012840408bf66	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6418
900624195	PR_kwDOAMm_X841rm9D	6443	closed	Fix concat with scalar coordinate (wrong index type)	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6434 - [x] Tests added	2022-04-05T19:16:30Z	2022-12-08T09:36:50Z	2022-04-06T01:19:48Z	2022-04-06T01:19:47Z	facafac359c39c3e940391a3829869b4a3df5d70	0	185b79199d25ff83dfdea944fa200342afc5e144	2eef20b74c69792bad11e5bfda2958dc8365513c	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6443
998719144	PR_kwDOAMm_X847hz6o	6800	closed	(scipy 2022 branch) Add an "options" argument to Index.from_variables()	benbovy 4160723	It allows passing options to the constructor of a custom `Index` subclass, in case there are any relevant build options to expose to users. This could for example be the distance metric chosen for an index based on `sklearn.neighbors.BallTree`, or the CRS definition for a geospatial index. The `**options` arguments of `Dataset.set_xindex()` are passed through. An alternative way would be to pass options via coordinate metadata, like the `spatial_ref` coordinate in rioxarray. Perhaps both alternatives may co-exist? This PR also adds type annotations to `set_xindex()`.	2022-07-17T20:01:00Z	2022-12-08T09:38:50Z	2022-09-02T13:54:46Z		f4b214279bd34fe6c5bdebfb7f8f76e63e53d40c	0	46e19d493a18fc81f44129ff65441925080297b3	a5f068e0f6cb4d5ba8de5e10844ae2bfc4a56655	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6800
1013692836	PR_kwDOAMm_X848a7mk	6857	closed	Fix aligned index variable metadata side effect	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6852 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`	2022-08-01T10:57:16Z	2022-12-08T09:36:49Z	2022-08-31T07:16:14Z	2022-08-31T07:16:14Z	4880012ddee9e43e3e18e95551876e9c182feafb	0	c39cdaa63f0d55b34cca1d04a24b1621801cc8e6	434f9e8929942afc2380eab52a07e77d30cc7885	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6857
1042357878	PR_kwDOAMm_X84-IR52	6971	closed	Add set_xindex and drop_indexes methods	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6849 - [x] Supersedes #6800 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [x] New functions/methods are listed in `api.rst` This PR adds Dataset and DataArray `.set_xindex` and `.drop_indexes` methods (the latter is also discussed in #4366). I've cherry-picked the relevant commits in the `scipy22` branch and added a few more commits. This PR also allows passing build options to any `Index`. Some comments and open questions: - Should we make the `index_cls` argument of `set_xindex` optional? - I.e., `set_index(coord_names, index_cls=None, **options)` where a pandas index is created by default (or a pandas multi-index if several coordinate names are given), provided that the coordinate(s) are valid 1-d candidates. - This would be redundant with the existing `set_index` method, but this would be convenient if we later deprecate it. - Should we deprecate `set_index` and `reset_index`? I think we should, but probably not at this point yet. - There's a special case for multi-indexes where `set_xindex(["foo", "bar"], PandasMultiIndex)` adds a dimension coordinate in addition to the "foo" and "bar" level coordinates so that it is consistent with the rest of Xarray. I find it a bit annoying, though. Probably another motivation for deprecating this dimension coordinate. - In this PR I also imported the `Index` base class in Xarray's root namespace. - It is needed for custom indexes and it's just a little more convenient than importing it from `xarray.core.indexes`. - Should we do the same for `PandasIndex` and `PandasMultiIndex` subclasses? Maybe if one wants to create a custom index inheriting from it. 
`PandasMultiIndex` factory methods could also be useful if we deprecate passing `pd.MultiIndex` objects as DataArray / Dataset coordinates.	2022-08-31T12:54:35Z	2022-12-08T09:38:13Z	2022-09-28T07:25:15Z	2022-09-28T07:25:15Z	e678a1d7884a3c24dba22d41b2eef5d7fe5258e7	0	b598447ba2e9c98bb1186719dc9bc6be95e13042	a042ae69c0444912f94bb4f29c93fa05046893ed	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6971
1043726871	PR_kwDOAMm_X84-NgIX	6975	closed	Add documentation on custom indexes	benbovy 4160723	This PR documents the API of the `Index` base class and adds a guide for creating custom indexes (reworked from https://hackmd.io/Zxw_zCa7Rbynx_iJu6Y3LA). Hopefully it will help anyone experimenting with this feature. @pydata/xarray your feedback would be very much appreciated! I've been into this for quite some time, so there may be things that seem obvious to me but that you can still find very confusing or non-intuitive. It would then deserve some extra or better explanation. More specifically, I'm open to any suggestion on how to better illustrate this with clear and succinct examples. There are other parts of the documentation that still need to be updated regarding the indexes refactor (e.g., "dimension" coordinates, `xindexes` property, set/drop indexes, etc.). But I suggest to do that in separate PRs and focus here on creating custom indexes.	2022-09-01T13:20:00Z	2023-08-30T09:10:34Z	2023-07-17T23:23:22Z	2023-07-17T23:23:22Z	7234603781768728b3fd544cdcaca991466d4a44	0	07814bc579a0687ddc4deef0a1825c16ba02333e	647376d1d2db3210c142d8204c1c3a7431b85b9a	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6975
1046566934	PR_kwDOAMm_X84-YVgW	6992	closed	Review (re)set_index	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes - [x] fixes #6946 - [x] fixes #6989 - [x] fixes #6959 - [x] fixes #6969 - [x] fixes #7036 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` Restore behavior prior to the explicit indexes refactor (i.e., refactored but without breaking changes). TODO: - [x] review `set_index` - [x] review `reset_index` For `reset_index`, the only behavior that is not restored here is the coordinate renamed with a `_` suffix when dropping a single index. This was originally to prevent any coordinate with no index matching a dimension name, which is now irrelevant. That is quite a dirty workaround and I don't know who is relying on it (no complaints yet), but I'm open to restoring it if needed (esp. considering that we may later deprecate `reset_index` completely in favor of `drop_indexes` #6971).	2022-09-05T15:07:43Z	2023-08-30T09:05:10Z	2022-09-27T10:35:38Z	2022-09-27T10:35:38Z	a042ae69c0444912f94bb4f29c93fa05046893ed	0	ca01949cb889ee38aae33560b02de1f7625fd921	45c0a114e2b7b27b83c9618bc05b36afac82183c	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6992
1047776643	PR_kwDOAMm_X84-c82D	6999	closed	Raise UserWarning when rename creates a new dimension coord	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6607 - [x] Closes #4107 - [x] Closes #6229 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` Currently implemented "fix": raise a `UserWarning` and suggest using `swap_dims` () Alternatively, we could: - revert the breaking change (i.e., create the index again) and raise a `DeprecationWarning` instead - raise an error instead of a warning I don't have strong opinions on this, I'm happy to implement another alternative. The downside of reverting the breaking change now is that unfortunately it will introduce a breaking change in the next release, while workarounds are pretty straightforward. () from https://github.com/pydata/xarray/issues/6607#issuecomment-1126587818, doing `ds.set_coords(['lon']).rename(x='lon').set_index(lon='lon')` is working too. With #6971, `.set_xindex('lon')` could work as well.	2022-09-06T16:16:17Z	2022-12-08T09:38:13Z	2022-09-27T09:33:40Z	2022-09-27T09:33:40Z	45c0a114e2b7b27b83c9618bc05b36afac82183c	0	486f9b876c212cc3f2df7dd1438d1832ce5df03b	1f4be33365573da19a684dd7f2fc97ace5d28710	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/6999
1048613040	PR_kwDOAMm_X84-gJCw	7003	closed	Misc. fixes for Indexes with pd.Index objects	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6987 - [x] Tests added	2022-09-07T11:05:02Z	2022-12-08T09:36:51Z	2022-09-23T07:30:38Z	2022-09-23T07:30:38Z	9d1499e22e2748eeaf088e6a2abc5c34053bf37c	0	54271bd4cda67c5f5b8703095798c122b7e96b0c	5bec4662a7dd4330eca6412c477ca3f238323ed2	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/7003
1048884296	PR_kwDOAMm_X84-hLRI	7004	open	Rework PandasMultiIndex.sel internals	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6838 - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` This PR hopefully improves how the labels provided for multi-index level coordinates are handled in `.sel()`. More specifically, slices are handled in a cleaner way and it is now allowed to provide array-like labels. `PandasMultiIndex.sel()` relies on the underlying `pandas.MultiIndex` methods like this: - use ``get_loc`` when every level is provided with a scalar label (no slice, no array) - always drops the index and returns scalar coordinates for each multi-index level - use ``get_loc_level`` when only a subset of levels are provided with scalar labels only - may collapse one or more levels of the multi-index (dropped levels result in scalar coordinates) - if only one level remains: renames the dimension and the corresponding dimension coordinate - use ``get_locs`` for all other cases. - always keeps the multi-index and its coordinates (even if only one item or one level is selected) This yields a predictable behavior: as soon as one of the provided labels is a slice or array-like, the multi-index and all its level coordinates are kept in the result. 
Some cases illustrated below (I compare this PR with an older release due to the errors reported in #6838): ```python import xarray as xr import pandas as pd midx = pd.MultiIndex.from_product([list("abc"), range(4)], names=("one", "two")) ds = xr.Dataset(coords={"x": midx}) # <xarray.Dataset> # Dimensions: (x: 12) # Coordinates: # * x (x) object MultiIndex # * one (x) object 'a' 'a' 'a' 'a' 'b' 'b' 'b' 'b' 'c' 'c' 'c' 'c' # * two (x) int64 0 1 2 3 0 1 2 3 0 1 2 3 # Data variables: # empty ``` ```python ds.sel(one="a", two=0) # this PR # # <xarray.Dataset> # Dimensions: () # Coordinates: # x object ('a', 0) # one <U1 'a' # t…	2022-09-07T14:57:29Z	2022-09-22T20:38:41Z			0a4b1aafbe66a857de627cf180eba8713ca9a85d	0	00baaddefae0a189874ca64d9f4be4d2d83cc744	5bec4662a7dd4330eca6412c477ca3f238323ed2	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/7004
1070271669	PR_kwDOAMm_X84_ywy1	7101	closed	Fix Dataset.assign_coords overwriting multi-index	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #7097 - [x] Tests added @dcherian the `DeprecationWarning` was ignored by default for `.assign_coords()` because of https://github.com/pydata/xarray/pull/6798#discussion_r924653224. I changed it to `FutureWarning` so that it is shown for both `.assign()` and `.assign_coords()`.	2022-09-28T16:21:48Z	2022-12-08T09:36:50Z	2022-09-28T18:02:16Z	2022-09-28T18:02:16Z	513ee34f16cc8f9250a72952e33bf9b4c95d33d1	0	ee9b027c0e41de15fc4960dde9e4c551d7d2a9df	e678a1d7884a3c24dba22d41b2eef5d7fe5258e7	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/7101
1071450326	PR_kwDOAMm_X84_3QjW	7105	closed	Fix to_index(): return multiindex level as single index	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6836 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`	2022-09-29T14:44:22Z	2022-12-08T09:36:51Z	2022-10-12T14:12:48Z	2022-10-12T14:12:48Z	f93b467db5e35ca94fefa518c32ee9bf93232475	0	e9a75b746d68fba12216a1f455252cd9fa4c3ebf	50ea159bfd0872635ebf4281e741f3c87f0bef6b	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/7105
1090510499	PR_kwDOAMm_X85A_96j	7182	open	add MultiPandasIndex helper class	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [ ] Closes #xxxx - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` This PR adds a `xarray.indexes.MultiPandasIndex` helper class for building custom, meta-indexes that encapsulate multiple `PandasIndex` instances. Unlike `PandasMultiIndex`, the meta-index classes inheriting from this helper class may encapsulate loosely coupled (pandas) indexes, with coordinates of arbitrary dimensions (each coordinate must be 1-dimensional but an Xarray index may be created from coordinates with differing dimensions). Early prototype in this [notebook](https://notebooksharing.space/view/3d599addf8bd6b06a6acc241453da95e28c61dea4281ecd194fbe8464c9b296f#displayOptions=) TODO / TO FIX: - How to allow custom `__init__` options in subclasses to be passed to all the `type(self)(new_indexes)` calls inside the `MultiPandasIndex` "base" class? This could be done via `**kwargs` passed through... However, mypy will certainly complain (Liskov Substitution Principle). - Is `MultiPandasIndex` a good name for this helper class?	2022-10-18T09:42:58Z	2023-08-23T16:30:28Z			6633615eca663c879bba4e9a144050c4aaa7555f	1	e4d753c3bf3ffdc30864510885c68fdb2e8349a2	ab726c536464fbf4d8878041f950d2b0ae09b862	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/7182
1098978950	PR_kwDOAMm_X85BgRaG	7214	closed	Pass indexes directly to the DataArray and Dataset constructors	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6392 - [x] Closes #6633 ? - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` From https://github.com/pydata/xarray/issues/6392#issuecomment-1290454937: I'm thinking of only accepting one or more instances of [Indexes](https://github.com/pydata/xarray/blob/e678a1d7884a3c24dba22d41b2eef5d7fe5258e7/xarray/core/indexes.py#L1030) as indexes argument in the Dataset and DataArray constructors. The only exception is when `fastpath=True` a mapping can be given directly. Also, when an empty collection of indexes is passed this skips the creation of default pandas indexes for dimension coordinates. - It is much easier to handle: just check that keys returned by `Indexes.variables` do not conflict with the coordinate names in the `coords` argument - It is slightly safer: it requires the user to explicitly create an `Indexes` object, thus with less chance to accidentally provide coordinate variables and index objects that do not relate to each other (we could probably add some safeguards in the `Indexes` class itself) - It is more convenient: an Xarray `Index` may provide a factory method that returns an instance of `Indexes` that we just need to pass as indexes, and we could also do something like `ds = xr.Dataset(indexes=other_ds.xindexes)`	2022-10-25T14:16:44Z	2023-08-30T09:11:56Z	2023-07-18T11:52:11Z		b3a3fd5a537d8000baf8ece3093a60ea14406ecc	1	ddd505e6af5270e143ee814485d5b4665456d77f	6e77f5e8942206b3e0ab08c3621ade1499d8235b	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/7214
1142893563	PR_kwDOAMm_X85EHyv7	7347	closed	Fix assign_coords resetting all dimension coords to default index	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #7346 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`	2022-12-02T08:19:01Z	2022-12-08T09:36:49Z	2022-12-02T16:32:40Z	2022-12-02T16:32:40Z	8938d390a969a94275a4d943033a85935acbce2b	0	23d9889d11b181c94db2b5e8fe33073a1328be1f	92e7cb5b21a6dee7f7333c66e41233205c543bc1	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/7347
1154470307	PR_kwDOAMm_X85Ez9Gj	7368	closed	Expose "Coordinates" as part of Xarray's public API	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #7214 - [x] Closes #6392 - [x] xref #6633 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [x] New functions/methods are listed in `api.rst` This is a rework of #7214. It follows the suggestions made in https://github.com/pydata/xarray/pull/7214#issuecomment-1295283938, https://github.com/pydata/xarray/pull/7214#issuecomment-1297046405 and https://github.com/pydata/xarray/pull/7214#issuecomment-1293774799: - No `indexes` argument is added to `Dataset.__init__`, and the `indexes` argument of `DataArray.__init__` is kept private (i.e., valid only if fastpath=True) - When a `Coordinates` object is passed to a new Dataset or DataArray via the `coords` argument, both coordinate variables and indexes are copied/extracted and added to the new object - This PR also adds ~~an `IndexedCoordinates` subclass~~ `Coordinates` public constructors used to create Xarray coordinates and indexes from non-Xarray objects. For example, the `Coordinates.from_pandas_multiindex()` class method creates a new set of index and coordinates from an existing `pd.MultiIndex`. EDIT: `IndexedCoordinates` has been merged with `Coordinates` EDIT2: it ended up as a pretty big refactor with the promotion of `Coordinates` as a 2nd-class Xarray container that supports alignment like Dataset and DataArray. It is still quite advanced API, useful for passing coordinate variables and indexes around. Internally, `Coordinates` objects are still "virtual" containers (i.e., proxies for coordinate variables and indexes stored in their corresponding DataArray or Dataset objects). For now, a "stand-alone" `Coordinates` object created from scratch wraps a Dataset with no data variables. 
Some examples of usage: ```python import pandas as pd import xarray as xr midx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("one", "two")) coords = xr.Coordinates.from_pandas_multiinde…	2022-12-08T16:59:29Z	2023-08-30T09:11:57Z	2023-07-21T20:40:03Z	2023-07-21T20:40:03Z	4441f9915fa978ad5b276096ab67ba49602a09d2	0	4ef5f17db6d2aefd91fb02485ab7a815fe460b47	6b1ff6d13bf360df786500dfa7d62556d23e6df9	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/7368
1166747288	PR_kwDOAMm_X85FiyaY	7382	closed	Some alignment optimizations	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Benchmark added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` May fix some performance regressions, e.g., see https://github.com/pydata/xarray/issues/7376#issuecomment-1352989233. @ravwojdyla with this PR `ds.assign(foo=~ds["d3"])` in your example should be much faster (on par with version 2022.3.0).	2022-12-15T12:54:56Z	2023-08-30T09:05:24Z	2023-01-05T21:25:55Z	2023-01-05T21:25:55Z	d6d24507793af9bcaed79d7f8d3ac910e176f1ce	0	95be2d07403a8e061df19f682db42ad273c62745	b93dae4079daf0fc4c042fd0d699c16624430cdc	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/7382
1465015830	PR_kwDOAMm_X85XUl4W	8051	open	Allow setting (or skipping) new indexes in open_dataset	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #6633 - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` This PR introduces a new boolean parameter `set_indexes=True` to `xr.open_dataset()`, which may be used to skip the creation of default (pandas) indexes when opening a dataset. Currently works with the Zarr backend: ```python import numpy as np import xarray as xr # example dataset (real dataset may be much larger) arr = np.random.random(size=1_000_000) xr.Dataset({"x": arr}).to_zarr("dataset.zarr") xr.open_dataset("dataset.zarr", set_indexes=False, engine="zarr") # <xarray.Dataset> # Dimensions: (x: 1000000) # Coordinates: # x (x) float64 ... # Data variables: # empty xr.open_zarr("dataset.zarr", set_indexes=False) # <xarray.Dataset> # Dimensions: (x: 1000000) # Coordinates: # x (x) float64 ... # Data variables: # empty ``` I'll add it to the other Xarray backends as well, but I'd like to get your thoughts about the API first. 1. Do we want to add yet another keyword parameter to `xr.open_dataset()`? There are already many... 2. Do we want to add this parameter to the `BackendEntrypoint.open_dataset()` API? - I'm afraid we must do it if we want this parameter in `xr.open_dataset()` - this would also make it possible to skip the creation of custom indexes (if any) in custom IO backends - con: if we require `set_indexes` in the signature in addition to the `drop_variables` parameter, this is a breaking change for all existing 3rd-party backends. Or should we group `set_indexes` with the other xarray decoder kwargs? This would feel a bit odd to me as setting indexes is different from decoding data. 3. Or should we leave this up to the backends? 
- pros: no breaking change, more flexible (3rd party backends may want to offer more control like choosing between cus…	2023-08-07T10:53:46Z	2024-02-03T19:12:48Z			0b37c66130416f202c3b8ee2302ee9ea517bdadd	0	eae983bb6b7ee916e5c8956b6af42c2207ad48d1	c9ba2be2690564594a89eb93fb5d5c4ae7a9253c	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8051
1482940936	PR_kwDOAMm_X85YY-II	8094	closed	Refactor update coordinates to better handle multi-coordinate indexes	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #7563 - [x] Closes #8039 - [x] Closes #8056 - [x] Closes #7885 - [x] Closes #7921 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` This refactor should better handle multi-coordinate indexes when updating (or assigning) new coordinates. It also fixes, better isolates and better warns a bunch of deprecated pandas multi-index special cases (i.e., directly passing `pd.MultiIndex` objects or updating a multi-index dimension coordinate). I very much look forward to seeing support for those cases dropped :).	2023-08-21T13:57:38Z	2023-08-30T09:06:28Z	2023-08-29T14:23:29Z	2023-08-29T14:23:29Z	1fedfd86604f87538d1953b01d6990c2c89fcbf3	0	748ee246821f5c308fc52e29c5d6b1d5f628cacf	42d42bab5811702e56c638b9489665d3c505a0c1	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8094
1486052929	PR_kwDOAMm_X85Yk15B	8102	closed	Add `Coordinates.assign()` method	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [x] New functions/methods are listed in `api.rst` This is consistent with the Dataset and DataArray `assign` methods (now that `Coordinates` is also exposed as public API). This allows writing: ```python midx = pd.MultiIndex.from_arrays([["a", "a", "b", "b"], [0, 1, 0, 1]]) midx_coords = xr.Coordinates.from_pandas_multiindex(midx, "x") ds = xr.Dataset(coords=midx_coords.assign(y=[1, 2])) ``` which is quite common (at least in the tests) and a bit nicer than ```python ds = xr.Dataset(coords=midx_coords.merge({"y": [1, 2]}).coords) ```	2023-08-23T09:15:51Z	2023-09-01T13:28:16Z	2023-09-01T13:28:16Z	2023-09-01T13:28:16Z	71177d481eb0c3547cb850a4b3e866af6d4fded7	0	6f1dfed9dac9bcecb6b9b8bd1abd20d5cb388f68	1043a9e13574e859ec08d19425341b2e359d2802	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8102
1486710446	PR_kwDOAMm_X85YnWau	8104	closed	Fix merge with compat=minimal (coord names)	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #7405 - [x] Closes #7588 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`	2023-08-23T16:20:48Z	2023-08-30T09:11:18Z	2023-08-30T07:57:35Z	2023-08-30T07:57:35Z	b136fcb679e9e70fd44b60688d96e75d4e3f8dcb	0	613eb1337d38f6b92434feaffb12b4f99e597cf0	1fedfd86604f87538d1953b01d6990c2c89fcbf3	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8104
1487073982	PR_kwDOAMm_X85YovK-	8107	closed	Better default behavior of the Coordinates constructor	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` After working more on `Coordinates` I realize that the default behavior of its constructor could be more consistent with other Xarray objects. This PR changes this default behavior such that: - Pandas indexes are created for dimension coordinates if `indexes=None` (default). To create dimension coordinates with no index, just pass `indexes={}`. - If another `Coordinates` object is passed as input, its indexes are also added to the new created object. Since we don't support alignment / merge here, the following call raises an error: `xr.Coordinates(coords=xr.Coordinates(...), indexes={...})`. This PR introduces a breaking change since `Coordinates` are now exposed in v2023.8.0, which has just been released. It is a bit unfortunate but I think it may be OK for a fresh feature, especially if the next release will be soon after this one.	2023-08-23T21:42:51Z	2024-02-04T18:32:42Z	2023-08-31T07:35:47Z	2023-08-31T07:35:47Z	0f9f790c7e887bbfd13f4026fd1d37e4cd599ff1	0	bce000cff6be4cf9d42454da4c370685e9dad051	42d42bab5811702e56c638b9489665d3c505a0c1	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8107
1487590692	PR_kwDOAMm_X85YqtUk	8109	closed	Better error message when trying to set an index from a scalar coordinate	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #4091 - [x] Tests added The message suggests using `.expand_dims()`.	2023-08-24T08:18:13Z	2023-08-30T09:27:27Z	2023-08-30T07:13:15Z	2023-08-30T07:13:15Z	e5a38f6837ae9b9aa28a4bd063620a1cd802e093	0	a1d70aa0aca1fb33b611a23697e5af04b34b2c7c	42d42bab5811702e56c638b9489665d3c505a0c1	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8109
1488345780	PR_kwDOAMm_X85Ytlq0	8111	open	Alignment: allow flexible index coordinate order	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #7002 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` This PR relaxes some of the rules used in alignment for finding the indexes to compare or join together. Those indexes must still be of the same type and must relate to the same set of coordinates (and dimensions), but the order of coordinates is now ignored. It is up to the index to implement the equal / join logic if it needs to care about that order. Regarding `pandas.MultiIndex`, it seems that the level names are ignored when comparing indexes: ```python midx = pd.MultiIndex.from_product([["a", "b"], [0, 1]], names=("one", "two")) midx2 = pd.MultiIndex.from_product([["a", "b"], [0, 1]], names=("two", "one")) midx.equals(midx2) # True ``` However, in Xarray the names of the multi-index levels (and their order) matter since each level has its own xarray coordinate. In this PR, `PandasMultiIndex.equals()` and `PandasMultiIndex.join()` thus check that the level names match.	2023-08-24T16:18:49Z	2023-09-28T15:58:38Z			79103728908c37d32bc902cd7bcc583363ce9bd9	0	0645c4b813908104c27ace51fce16ac053c6e1e8	42d42bab5811702e56c638b9489665d3c505a0c1	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8111
1492188700	PR_kwDOAMm_X85Y8P4c	8118	open	Add Coordinates `set_xindex()` and `drop_indexes()` methods	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - Complements #8102 - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` I don't think that we need to copy most API from Dataset / DataArray to `Coordinates`, but I find it convenient to have some relevant methods there too. For example, building Coordinates from scratch (with custom indexes) before passing the whole coords + indexes bundle around: ```python import dask.array as da import numpy as np import xarray as xr coords = ( xr.Coordinates( coords={"x": da.arange(100_000_000), "y": np.arange(100)}, indexes={}, ) .set_xindex("x", DaskIndex) .set_xindex("y", xr.indexes.PandasIndex) ) ds = xr.Dataset(coords=coords) # <xarray.Dataset> # Dimensions: (x: 100000000, y: 100) # Coordinates: # * x (x) int64 dask.array<chunksize=(16777216,), meta=np.ndarray> # * y (y) int64 0 1 2 3 4 5 6 7 8 9 10 ... 90 91 92 93 94 95 96 97 98 99 # Data variables: # empty # Indexes: # x DaskIndex ```	2023-08-28T14:28:24Z	2023-09-19T01:53:18Z			664b100ba033d892b0894c82c49c18fc71b3f7be	0	13ebc667add99d53fe5619de8206ce745e453829	828ea08aa74d390519f43919a0e8851e29091d00	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8118
1496182200	PR_kwDOAMm_X85ZLe24	8124	open	More flexible index variables	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [ ] Closes #xxxx - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` The goal of this PR is to provide a more general solution to indexed coordinate variables, i.e., support arbitrary dimensions and/or duck arrays for those variables while at the same time prevent them from being updated in a way that would invalidate their index. This would solve problems like the one mentioned here: https://github.com/pydata/xarray/issues/1650#issuecomment-1697237429 @shoyer I've tried to implement what you have suggested in https://github.com/pydata/xarray/pull/4979#discussion_r589798510. It would be nice indeed if eventually we could get rid of `IndexVariable`. It won't be easy to deprecate it until we finish the index refactor (i.e., all methods listed in #6293), though. Also, I didn't find an easy way to refactor that class as it has been designed too closely around a 1-d variable backed by a `pandas.Index`. So the approach implemented in this PR is to keep using `IndexVariable` for PandasIndex until we can deprecate / remove it later, and for the other cases use `Variable` with data wrapped in a custom `IndexedCoordinateArray` object. The latter solution (wrapper) doesn't always work nicely, though. For example, several methods of `Variable` expect that `self._data` directly returns a duck array (e.g., a dask array or a chunked duck array). A wrapped duck array will result in unexpected behavior there. We could probably add some checks / indirection or extend the wrapper API... But I wonder if there wouldn't be a more elegant approach? More generally, which operations should we allow / forbid / skip for an indexed coordinate variable? - Set array items in-place? Do not allow. - Replace data? Do not allow. 
- (Re)Chunk? - Load lazy data? - ... ? (Note: we could add `Index.chunk()` and `Index.load()` metho…	2023-08-30T21:45:12Z	2023-08-31T16:02:20Z			8b84dc392e5443f9ada245cb6a6f31d8f19327df	1	09f3ed0acd119fcefa07652bbc40dff96db2f66c	0f9f790c7e887bbfd13f4026fd1d37e4cd599ff1	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8124
1497266410	PR_kwDOAMm_X85ZPnjq	8128	open	Add Index.load() and Index.chunk() methods	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [ ] Closes #xxxx - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` As mentioned in #8124, it gives more control to custom Xarray indexes on what best to do when the Dataset / DataArray `load()` and `chunk()` counterpart methods are called. `PandasIndex.load()` and `PandasIndex.chunk()` always return self (no action required). For a DaskIndex, we might want to return a PandasIndex (or another non-lazy index) from `load()` and rebuild a DaskIndex object from `chunk()` (rechunk).	2023-08-31T14:16:27Z	2023-08-31T15:49:06Z			a1842563887f8375fb3a03824189a75e6f080c96	1	4506cb600caba75f163c088171f590b67f59264b	1043a9e13574e859ec08d19425341b2e359d2802	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8128
1500283634	PR_kwDOAMm_X85ZbILy	8140	open	Deprecate passing pd.MultiIndex implicitly	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - Follow-up #8094 - [x] Closes #6481 - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` This PR should normally raise a warning each time indexed coordinates are created implicitly from a `pd.MultiIndex` object. I updated the tests to create coordinates explicitly using `Coordinates.from_pandas_multiindex()`. I also refactored some parts where a `pd.MultiIndex` could still be passed and promoted internally, with the exception of: - `swap_dims()`: it should raise a warning! Right now the warning message is a bit confusing for this case, but instead of adding a special case we should probably deprecate the whole method? As it is suggested as a TODO comment... This method was to circumvent the limitations of dimension coordinates, which isn't needed anymore (`rename_dims` and/or `set_xindex` is equivalent and less confusing). - `xr.DataArray(pandas_obj_with_multiindex, dims=...)`: I guess it should raise a warning too? - `da.stack(z=...).groupby("z")`: it shouldn't raise a warning, but this requires a (heavy?) refactoring of groupby. While building the "grouper" objects, `grouper.group1d` or `grouper.unique_coord` may still be built by extracting only the multi-index dimension coordinate. I'd greatly appreciate if anyone familiar with the groupby implementation could help me with this! @dcherian ?	2023-09-03T14:01:18Z	2023-11-15T20:15:00Z			ddb96c1f3a6fc2bcddea2432af311c5cbfcfc492	0	ef7dae0893f6701a203f8ec3c2e655bff7944b91	e2b6f3468ef829b8a83637965d34a164bf3bca78	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8140
1500744603	PR_kwDOAMm_X85Zc4ub	8141	closed	Fix doctests: pandas 2.1 MultiIndex repr with nan	benbovy 4160723		2023-09-04T07:08:55Z	2023-09-05T08:35:37Z	2023-09-05T08:35:36Z	2023-09-05T08:35:36Z	f13da94db8ab4b564938a5e67435ac709698f1c9	0	445e6c923d112d584c714df3bf3ba2fbab004d3e	e9c1962f31a7b5fd7a98ee4c2adf2ac147aabbcf	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8141
1500931269	PR_kwDOAMm_X85ZdmTF	8142	closed	Dirty workaround for mypy 1.5 error	benbovy 4160723	I wanted to fix the following error with mypy 1.5: ``` xarray/core/dataset.py:505: error: Definition of "__eq__" in base class "DatasetOpsMixin" is incompatible with definition in base class "Mapping" [misc] ``` Which looks similar to https://github.com/python/mypy/issues/9319. It is weird that here it worked with mypy versions < 1.5, though. I don't know if there is a better fix, but I thought that redefining `__eq__` in `Dataset` would be a bit less dirty workaround than adding `type: ignore` in the class declaration.	2023-09-04T09:21:18Z	2023-09-07T16:04:55Z	2023-09-07T08:21:12Z	2023-09-07T08:21:12Z	e2b6f3468ef829b8a83637965d34a164bf3bca78	0	46bd88fbea07d52f06eab5d11ca3f72b547af263	f13da94db8ab4b564938a5e67435ac709698f1c9	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8142
1501219392	PR_kwDOAMm_X85ZespA	8143	open	Deprecate the multi-index dimension coordinate	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` This PR adds a `future_no_mindex_dim_coord=False` option that, if set to True, enables the future behavior of `PandasMultiIndex` (i.e., no added dimension coordinate with tuple values): ```python import xarray as xr ds = xr.Dataset(coords={"x": ["a", "b"], "y": [1, 2]}) ds.stack(z=["x", "y"]) # <xarray.Dataset> # Dimensions: (z: 4) # Coordinates: # * z (z) object MultiIndex # * x (z) <U1 'a' 'a' 'b' 'b' # * y (z) int64 1 2 1 2 # Data variables: # empty with xr.set_options(future_no_mindex_dim_coord=True): ds.stack(z=["x", "y"]) # <xarray.Dataset> # Dimensions: (z: 4) # Coordinates: # * x (z) <U1 'a' 'a' 'b' 'b' # * y (z) int64 1 2 1 2 # Dimensions without coordinates: z # Data variables: # empty ``` There are a few other things that we'll need to adapt or deprecate: - Dropping the multi-index dimension coordinate de facto allows having several multi-indexes along the same dimension. Normally `stack` should already take this into account, but there may be other places where this is not yet supported or where we should raise an explicit error. - Deprecate `Dataset.reorder_levels`: API is not compatible with the absence of dimension coordinate and several multi-indexes along the same dimension. I think it is OK to deprecate such an edge case, which alternatively could be done by extracting the pandas index, updating it and then re-assigning it to the dataset with `assign_coords(xr.Coordinates.from_pandas_multiindex(...))` - The text-based repr: in the example above, `Dimensions without coordinates: z` doesn't make much sense - ... ? I started updating the tests, although this will be much easier once #8140 is merged. This is something that we could also easily split into multiple PRs. 
It is probably OK if some features are (t…	2023-09-04T12:32:36Z	2023-09-04T12:32:48Z			d0709f6d90e3f71d78e562c15b1662a423d8e3e9	0	87d5bf72e766b101db32dc65e6a79957368812ee	71177d481eb0c3547cb850a4b3e866af6d4fded7	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8143
1509661685	PR_kwDOAMm_X85Z-5v1	8170	open	Dataset.from_dataframe: optionally keep multi-index unexpanded	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #8166 - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` I added both the `unstack` and `dim` arguments but we can change that. - [ ] update `DataArray.from_series()`	2023-09-11T06:20:17Z	2023-09-11T06:20:17Z			d3c6c4785be4a17946c88907176833e8bdabcd67	1	1afef691db8879526212a504bb42dbfc6f81878a	2951ce0215f14a8a79ecd0b5fc73a02a34b9b86b	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8170
1696970326	PR_kwDOAMm_X85lJbZW	8672	closed	Fix multiindex level serialization after reset_index	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #8628 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`	2024-01-26T10:40:42Z	2024-02-23T01:22:17Z	2024-01-31T17:42:29Z	2024-01-31T17:42:29Z	f9f4c730254073f0f5a8fce65f4bbaa0eefec5fd	0	72f319f5c4259c19aabf223faa1d9a51ba035887	ca4f12133e9643c197facd17b54d5040a1bda002	MEMBER	{ "enabled_by": { "login": "dcherian", "id": 2448579, "node_id": "MDQ6VXNlcjI0NDg1Nzk=", "avatar_url": "https://avatars.githubusercontent.com/u/2448579?v=4", "gravatar_id": "", "url": "https://api.github.com/users/dcherian", "html_url": "https://github.com/dcherian", "followers_url": "https://api.github.com/users/dcherian/followers", "following_url": "https://api.github.com/users/dcherian/following{/other_user}", "gists_url": "https://api.github.com/users/dcherian/gists{/gist_id}", "starred_url": "https://api.github.com/users/dcherian/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/dcherian/subscriptions", "organizations_url": "https://api.github.com/users/dcherian/orgs", "repos_url": "https://api.github.com/users/dcherian/repos", "events_url": "https://api.github.com/users/dcherian/events{/privacy}", "received_events_url": "https://api.github.com/users/dcherian/received_events", "type": "User", "site_admin": false }, "merge_method": "squash", "commit_title": "Fix multiindex level serialization after reset_index (#8672)", "commit_message": "* fix serialize multi-index level coord after reset\r\n\r\n* add regression test\r\n\r\n* update what's new\r\n\r\n---------\r\n\r\nCo-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>" }	xarray 13221727	https://github.com/pydata/xarray/pull/8672
1797701340	PR_kwDOAMm_X85rJr7c	8888	open	to_base_variable: coerce multiindex data to numpy array	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [x] Closes #8887, and probably supersedes #8809 - [x] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - ~~New functions/methods are listed in `api.rst`~~ @slevang this should also make your test case added in #8809 work. I haven't added it here, instead I added a basic check that should be enough. I don't really understand why the serialization backends (zarr?) do not seem to work with the `PandasMultiIndexingAdapter.__array__()` implementation, which should normally coerce the multi-index levels into numpy arrays as needed. Anyway, I guess that coercing it early like in this PR doesn't hurt and may avoid the confusion of a non-indexed, isolated coordinate variable that still wraps a pandas.MultiIndex.	2024-03-29T10:10:42Z	2024-03-29T15:54:19Z			0f5c78efff8fdc024de20a178acf3ae7ac62f84e	0	dd9c3b4ad88b6694b6e737e86e80ad1dcfa1527c	2120808bbe45f3d4f0b6a01cd43bac4df4039092	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8888
1808774743	PR_kwDOAMm_X85rz7ZX	8911	open	Refactor swap dims	benbovy 4160723	<!-- Feel free to remove check-list items aren't relevant to your change --> - [ ] Attempt at fixing #8646 - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` I've tried here re-implementing `swap_dims` using `rename_dims`, `drop_indexes` and `set_xindex`. This fixes the example in #8646 but unfortunately this fails at handling the pandas multi-index special case (i.e., a single non-dimension coordinate wrapping a `pd.MultiIndex` that is promoted to a dimension coordinate in `swap-dims` auto-magically results in a `PandasMultiIndex` with both dimension and level coordinates).	2024-04-05T08:45:49Z	2024-04-17T16:46:34Z			36231f3beea60c788054877f91689d3469f84cbc	1	4102b9f67e5c28b85a154cf7ff0749e1f8f1a258	56182f73c56bc619a18a9ee707ef6c19d54c58a2	MEMBER		xarray 13221727	https://github.com/pydata/xarray/pull/8911

CREATE TABLE [pull_requests] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [state] TEXT,
   [locked] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [body] TEXT,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [merged_at] TEXT,
   [merge_commit_sha] TEXT,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [draft] INTEGER,
   [head] TEXT,
   [base] TEXT,
   [author_association] TEXT,
   [auto_merge] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [url] TEXT,
   [merged_by] INTEGER REFERENCES [users]([id])
);
CREATE INDEX [idx_pull_requests_merged_by]
    ON [pull_requests] ([merged_by]);
CREATE INDEX [idx_pull_requests_repo]
    ON [pull_requests] ([repo]);
CREATE INDEX [idx_pull_requests_milestone]
    ON [pull_requests] ([milestone]);
CREATE INDEX [idx_pull_requests_assignee]
    ON [pull_requests] ([assignee]);
CREATE INDEX [idx_pull_requests_user]
    ON [pull_requests] ([user]);
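
As a rough illustration of how the `user` filter shown on this page maps onto the schema above, here is a minimal Python sketch using the standard-library `sqlite3` module. It rests on several assumptions: the real database file is not named on this page, so the sketch uses an in-memory database; the table is trimmed to a few columns and the foreign-key references are dropped; the two sample rows are copied from the listing above; and the helper `pull_requests_by_user` is a hypothetical name, not part of any tool.

```python
import sqlite3

# In-memory stand-in: the actual database file is not named on this page.
conn = sqlite3.connect(":memory:")

# Trimmed copy of the schema above (assumption: only the columns this
# sketch reads, without the REFERENCES clauses to other tables).
conn.executescript(
    """
    CREATE TABLE [pull_requests] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [state] TEXT,
       [title] TEXT,
       [user] INTEGER,
       [merged_at] TEXT
    );
    CREATE INDEX [idx_pull_requests_user]
        ON [pull_requests] ([user]);
    """
)

# Two rows copied from the listing above; an unmerged PR has no merged_at.
sample_rows = [
    (1500931269, 8142, "closed", "Dirty workaround for mypy 1.5 error",
     4160723, "2023-09-07T08:21:12Z"),
    (1501219392, 8143, "open", "Deprecate the multi-index dimension coordinate",
     4160723, None),
]
conn.executemany(
    "INSERT INTO pull_requests VALUES (?, ?, ?, ?, ?, ?)", sample_rows
)


def pull_requests_by_user(connection, user_id):
    """Hypothetical helper: return one user's pull requests, newest id first."""
    query = (
        "SELECT number, state, title, merged_at "
        "FROM pull_requests WHERE [user] = ? ORDER BY id DESC"
    )
    # The user id is passed as a bound parameter, not interpolated into the SQL.
    return connection.execute(query, (user_id,)).fetchall()


result = pull_requests_by_user(conn, 4160723)
for number, state, title, merged_at in result:
    print(number, state, title, merged_at)
```

The `[idx_pull_requests_user]` index is what lets a filter on `user` avoid scanning the whole table; whether SQLite actually picks it for a given query depends on the query planner.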