github: issue_comments: 523 rows where user = 4160723 sorted by updated

523 rows where user = 4160723 sorted by updated_at descending

id	html_url	issue_url	node_id	user	created_at	updated_at	author_association	body	reactions	issue
1259228475	https://github.com/pydata/xarray/issues/6293#issuecomment-1259228475	https://api.github.com/repos/pydata/xarray/issues/6293	IC_kwDOAMm_X85LDk07	benbovy 4160723	2022-09-27T09:22:04Z	2023-08-24T11:42:53Z	MEMBER	Following thoughts and discussions in various issues (e.g., #6836), I'd like to suggest adding another section to the ones in the top comment: Deprecate `pandas.MultiIndex` special cases in Xarray: (1) remove the multi-index “dimension” coordinate (tuple elements); (2) do not automatically promote `pandas.MultiIndex` objects as dimension + level coordinates, e.g., like in `xr.Dataset(coords={"x": pd_midx})`, but instead treat them as a single duck-array; (3) do not accept `pandas.MultiIndex` as `dim` argument in `xarray.concat()` (#7148); (4) remove `obj.to_index()` for all xarray objects?; (5) (EDIT) remove `Dataset.reset_index()` and `DataArray.reset_index()`. These special cases are a source of many problems and complexities in Xarray internals (many regressions reported since the index refactor were related to them) and I'm not sure that the value they add is really worth the trouble. Also, in the long term the special treatment of `PandasMultiIndex` vs. other Xarray multi-indexes may add some confusion. Some of those features are widely used (e.g., the creation of Dataset / DataArray from pandas multi-indexes is used in many places in unit tests), so we would need convenient alternatives and a smooth transition.	{ "total_count": 5, "+1": 5, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Explicit indexes: next steps 1148021907
1504975778	https://github.com/pydata/xarray/issues/6836#issuecomment-1504975778	https://api.github.com/repos/pydata/xarray/issues/6836	IC_kwDOAMm_X85ZtBui	benbovy 4160723	2023-04-12T09:42:39Z	2023-04-12T09:42:39Z	MEMBER	A special-case sounds reasonable to me as well as a temporary fix before looking into if/how we can refactor groupby so that it works with multiple kinds of built-in and/or custom indexes.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1318992926
1480906129	https://github.com/pydata/xarray/pull/7653#issuecomment-1480906129	https://api.github.com/repos/pydata/xarray/issues/7653	IC_kwDOAMm_X85YRNWR	benbovy 4160723	2023-03-23T10:01:35Z	2023-03-23T10:01:35Z	MEMBER	For the html repr an option that is easy to implement would be to add `max-height` and `overflow-y: scroll` CSS properties here: https://github.com/pydata/xarray/blob/1e361ccb9123fe25acfd9e3364c911c1eec7d9db/xarray/static/css/style.css#L256-L261 I don't think the default browser scrollbar will look very pretty inside the repr, but it might be OK if we don't set max-height to a too small value. A "click to expand" UI would certainly look prettier, but I doubt it would be easy to implement that in pure-CSS. "Expand on hover" is easier but that would be quite annoying UX I think.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	limit lines in html repr of dataset attrs 1633513067
1463633814	https://github.com/pydata/xarray/issues/7563#issuecomment-1463633814	https://api.github.com/repos/pydata/xarray/issues/7563	IC_kwDOAMm_X85XPUeW	benbovy 4160723	2023-03-10T10:59:07Z	2023-03-10T10:59:07Z	MEMBER	Thanks for the report @lkugler ! Directly assigning a multi-index like `mda['position'] = midx` is now ambiguous because all levels of the multi-index are now exposed as actual coordinates. We should provide a temporary fix or at least issue a warning. A proper way to assign a pandas multi-index is implemented in #7368. In the meantime, the workaround below should work for your example (it might stop working in the future, though): `mda.coords.update(xr.Dataset(coords={"position": midx}))`	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	MultiIndex coordinates do not exist updating v2022.3 to v2022.12 1600983717
1440178393	https://github.com/pydata/xarray/pull/7530#issuecomment-1440178393	https://api.github.com/repos/pydata/xarray/issues/7530	IC_kwDOAMm_X85V12DZ	benbovy 4160723	2023-02-22T14:51:32Z	2023-02-22T14:51:32Z	MEMBER	I've imported the generated PDF in inkscape, fixed the font and converted it to paths, added a small margin and exported it as svg. I attach the file here, @dcherian feel free to add it in this PR.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	[skip-ci] Add PDF of Xarray logo 1584791395
1438377578	https://github.com/pydata/xarray/issues/7539#issuecomment-1438377578	https://api.github.com/repos/pydata/xarray/issues/7539	IC_kwDOAMm_X85Vu-Zq	benbovy 4160723	2023-02-21T12:13:18Z	2023-02-21T12:13:18Z	MEMBER	In general I also find that `xr.concat` is a powerful feature (incl. auto-alignment and merge options) at the expense that it may sometimes (often?) be hard to reason about. Would it make sense to have a simpler version? To avoid making `xr.concat` signature even more complicated, maybe another top-level function like `xr.concat_noalign`? Or any suggestion in #7045 to deactivate auto-alignment Xarray-wise. Or indeed at least make it clearer in the docs that something like `drop_indexes` or `reset_coords` should be used first in order to skip auto-alignment for some variables. I don't really know what I would prefer to happen with the coordinates. I guess to have created a time coordinate of size {new: 2, time: 4, cols: 2}, but then I don't know what that implies for the underlying index. @benbovy do you have any thoughts? I guess easiest for a concat version with no auto-alignment would be to drop the index when such case happens. (note: one problem in your example is that the Xarray data model still does not allow having a multi-dimensional "time" variable with "time" as also one of its dimensions, but this could be now relaxed). I've been also wondering whether some kind of `NDPandasIndex` would make any sense, i.e., a n-d coordinate variable with an internal 1-d (flattened) pandas index and some logic to convert between those n-d vs. 1-d spaces. This is the kind of approach used in xoak for using a kd-tree with coordinates of arbitrary dimensions, where labels in the form of nd-arrays for each coordinate are mapped into the `[n_points, n_coords]` shape (and inversely for getting the integer indices back as nd-arrays). This works well for point-wise indexing, but I doubt it would be very useful beyond that (e.g., slicing, etc.).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Concat doesn't concatenate dimension coordinates along new dims 1588461863
1431496828	https://github.com/pydata/xarray/issues/7076#issuecomment-1431496828	https://api.github.com/repos/pydata/xarray/issues/7076	IC_kwDOAMm_X85VUuh8	benbovy 4160723	2023-02-15T14:54:27Z	2023-02-15T14:54:27Z	MEMBER	@ACHMartin the issue is when you do `newds['z'] = stacked.z`. In recent versions of Xarray, multi-index levels each have their own (real) coordinates. For consistency and clarity, we soon won't support assigning a multi-index to a single coordinate of a Dataset / DataArray like that. I think that in other places we still do support it with a deprecation notice, but apparently in your example this is not the case. `unstack` doesn't work because the multi-index(es) and the coordinates of `newds` are not consistent. I don't know exactly what your real problem is, but from now on you should avoid implicitly assigning a multi-index with `xr_obj["my_coord"] = ...` or `xr_obj.assign(my_coord=...)`. Instead you should re-create the multi-index, e.g., in your minimal example `newds = newds.set_index(z=["across", "along"])`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Can't unstack concatenated DataArrays 1384465119
1427538729	https://github.com/pydata/xarray/issues/7463#issuecomment-1427538729	https://api.github.com/repos/pydata/xarray/issues/7463	IC_kwDOAMm_X85VFoMp	benbovy 4160723	2023-02-13T08:31:49Z	2023-02-13T09:26:10Z	MEMBER	There are two issues: (1) whether we should continue allowing IndexVariable data to be updated in place via the `.data` property. IMO we should really deprecate it, especially now that it is possible to have custom, possibly expensive index structures built from one or more coordinates; (2) whether `deep=True` should deep copy the Xarray index objects. I don't have a strong opinion on this. There is a similar discussion on the pandas side: https://github.com/pandas-dev/pandas/issues/19862. I wonder if we reverted the change here because some high-level operations in Xarray were by default deep copying the indexes? I don't think we would want such behavior unless the user explicitly sets `deep=True` somewhere?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Coordinates not deep copy 1550792876
1426311006	https://github.com/pydata/xarray/issues/7463#issuecomment-1426311006	https://api.github.com/repos/pydata/xarray/issues/7463	IC_kwDOAMm_X85VA8de	benbovy 4160723	2023-02-10T20:31:10Z	2023-02-10T20:38:48Z	MEMBER	Yes I think we should, but I might have missed the rationale behind allowing it if this is intentional. EDIT: perhaps better to issue a warning first to avoid some breaking change. We could also try to fix it (make a deep copy) at the same time as deprecating it, but that might be tricky without again introducing performance regressions.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Coordinates not deep copy 1550792876
1426299770	https://github.com/pydata/xarray/issues/7463#issuecomment-1426299770	https://api.github.com/repos/pydata/xarray/issues/7463	IC_kwDOAMm_X85VA5t6	benbovy 4160723	2023-02-10T20:25:12Z	2023-02-10T20:25:12Z	MEMBER	I think that the reverting change in IndexVariable came after refactoring copy in Xarray introduced some performance regression (https://github.com/pydata/xarray/pull/7209#issuecomment-1305593478). I didn't see #1463 (https://github.com/pydata/xarray/issues/1463#issuecomment-340454702), though. It feels weird to me that we can mutate an IndexVariable via its `data` property, considering that the underlying index is immutable. IIUC `xarr2.x.data[0] = 45` replaces the full index with a new one? I'm not sure if it is a good idea to allow this. For a pandas index that's probably OK (it is reasonably cheap to rebuild a new index) but for a custom index that is expensive to build (e.g., kd-tree) I don't think this behavior is desirable.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Coordinates not deep copy 1550792876
1422518769	https://github.com/pydata/xarray/issues/2028#issuecomment-1422518769	https://api.github.com/repos/pydata/xarray/issues/2028	IC_kwDOAMm_X85Uyenx	benbovy 4160723	2023-02-08T12:29:27Z	2023-02-08T12:41:00Z	MEMBER	@gewitterblitz there is a kdtree-based index example in #7041 that works with multi-dimensional coordinates. You could also have a look at https://xoak.readthedocs.io/en/latest/ (it doesn't use Xarray indexes - soon hopefully - so the current API is via Xarray accessors). EDIT: seeing your previous https://github.com/pydata/xarray/issues/2028#issuecomment-921926536, not sure how you could use slices for label selection using those indexes as I don't think the wrapped scipy / sklearn kdtree objects support range queries. Other spatial indexes may support it (e.g., there's an example in https://github.com/martinfleis/xvec of selecting points using a `shapely.box`, although currently it only supports 1-d geometry coordinates).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	slice using non-index coordinates 309691307
1421222703	https://github.com/pydata/xarray/issues/2028#issuecomment-1421222703	https://api.github.com/repos/pydata/xarray/issues/2028	IC_kwDOAMm_X85UtiMv	benbovy 4160723	2023-02-07T18:01:39Z	2023-02-07T18:01:39Z	MEMBER	@aberges-grd If your non-index coordinate supports it (I guess it does?), you could assign a default index to the coordinate with `set_xindex` and then use slices for selection like any other (dimension) coordinate backed by a pandas index.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	slice using non-index coordinates 309691307
1384164579	https://github.com/pydata/xarray/issues/7405#issuecomment-1384164579	https://api.github.com/repos/pydata/xarray/issues/7405	IC_kwDOAMm_X85SgKzj	benbovy 4160723	2023-01-16T14:42:23Z	2023-01-16T14:42:23Z	MEMBER	Yes thanks for the report. Looks like `Dataset._coord_names` got out of sync somehow.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Test for variable name in coords True after xr.merge with compat="minimal" 1512708767
1382070832	https://github.com/pydata/xarray/pull/7368#issuecomment-1382070832	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85SYLow	benbovy 4160723	2023-01-13T16:13:16Z	2023-01-13T16:13:16Z	MEMBER	Thanks for the review @shoyer. I addressed your comments. Everything seems OK except a rather annoying mypy error that I'm struggling with: The `DataAlignable` type variable should now encompass both `DataWithCoords` and `Coordinates`, since in this PR we add alignment support for the latter. I somewhat naively tried the options below without success: `DataAlignable = TypeVar("DataAlignable", bound=DataWithCoords | Coordinates)` -> doesn't work since we cannot mix DataWithCoords and Coordinates when aligning each object (input type = output type) `DataAlignable = TypeVar("DataAlignable", bound=DataWithCoords, Coordinates)` -> doesn't work with subclasses `DataAlignable = TypeVar("DataAlignable", Dataset, DataArray, Coordinates)` -> doesn't work with generic types `T_Dataset`, etc.? I even tried using a Protocol @headtr1ck @Illviljan any idea?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1372908509	https://github.com/pydata/xarray/pull/7418#issuecomment-1372908509	https://api.github.com/repos/pydata/xarray/issues/7418	IC_kwDOAMm_X85R1Ovd	benbovy 4160723	2023-01-05T23:08:15Z	2023-01-05T23:08:15Z	MEMBER	Again, there are likely more good reasons for merging the Datatree code with Xarray than for not doing it, but IMHO such a decision should be made very carefully. You certainly do know better than me what positive vs. negative impacts it would have here! I'm just speaking generally from my experience of having struggled while doing some heavy refactoring in Xarray recently :)	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Import datatree in xarray? 1519552711
1372888139	https://github.com/pydata/xarray/pull/7418#issuecomment-1372888139	https://api.github.com/repos/pydata/xarray/issues/7418	IC_kwDOAMm_X85R1JxL	benbovy 4160723	2023-01-05T22:46:05Z	2023-01-05T22:46:05Z	MEMBER	I don't have strong opinions for or against including datatree in Xarray. It indeed makes sense if it is using many Xarray internals and if there are many existing or potential applications for it. Additional load (CI) is fine if datatree doesn't bring any extra dependency and won't do so in the near future (which seems to be the case). Datatree should become a first-class Xarray object Since Datatree sits above DataArray and Dataset, it should not interfere with any of our existing API. Would it mean that if someone wants to later add any feature "x" or "y" into Xarray, they just need implementing the feature for Dataset (and possibly DataArray) and it will be guaranteed to work with Datatree? (I guess so but I'm not familiar enough with Datatree to know it for sure). Otherwise, if there is any extra implementation effort required to make feature "x" or "y" work with Datatree, then I'm concerned about the additional burden or obstacle for future contributors and maintainers. Or we could say that this is OK to leave datatree support and wait for someone to take care of it later, but I don't think it is ideal to have such non-synchronized state within Xarray itself.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Import datatree in xarray? 1519552711
1359003371	https://github.com/pydata/xarray/pull/7368#issuecomment-1359003371	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85RAL7r	benbovy 4160723	2022-12-20T08:34:06Z	2022-12-20T08:34:06Z	MEMBER	I'm wondering if instead of `Coordinates.from_pandas_multiindex()` we might want to provide a more generic constructor available as an extension point? For example: `Coordinates.from_index(index_obj: Any, *, factory=None, **kwargs=None)`. `factory` could be guessed from the type of `index_obj`. Xarray would support by default the `pandas.MultiIndex` and `pandas.Index` types. Like for IO backends, we could provide a `CoordinatesFactoryEntrypoint` so that it could support other index types. One downside is that specific (mandatory?) options like `dim` for a pandas (multi-)index are not directly visible. Would it be useful or is it overkill?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1357719218	https://github.com/pydata/xarray/pull/7382#issuecomment-1357719218	https://api.github.com/repos/pydata/xarray/issues/7382	IC_kwDOAMm_X85Q7Say	benbovy 4160723	2022-12-19T14:03:56Z	2022-12-19T14:03:56Z	MEMBER	I don't know if the optimizations added here will benefit a large set of use cases (it took 6 months before seeing an issue report), but it is worth it for at least a few of them. This is ready I think (added some benchmarks).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Some alignment optimizations 1498386428
1353034657	https://github.com/pydata/xarray/pull/7382#issuecomment-1353034657	https://api.github.com/repos/pydata/xarray/issues/7382	IC_kwDOAMm_X85Qpauh	benbovy 4160723	2022-12-15T13:05:55Z	2022-12-15T13:05:55Z	MEMBER	Quick benchmark taking the example in #7376 (it seems even much faster than in version 2022.3.0!) ```python version 2022.3.0 %timeit ds.assign(foo=~ds["d3"]) 22.5 ms ± 1.96 ms per loop (mean ± std. dev. of 7 runs, 100 loops each) main branch %timeit ds.assign(foo=~ds["d3"]) 193 ms ± 1.35 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) this PR %timeit ds.assign(foo=~ds["d3"]) 1.01 ms ± 10.7 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each) ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Some alignment optimizations 1498386428
1352989233	https://github.com/pydata/xarray/issues/7376#issuecomment-1352989233	https://api.github.com/repos/pydata/xarray/issues/7376	IC_kwDOAMm_X85QpPox	benbovy 4160723	2022-12-15T12:27:37Z	2022-12-15T12:27:37Z	MEMBER	Thanks @benbovy! Are you also aware of the issue with plain assign being slower on MultiIndex (comment above: https://github.com/pydata/xarray/issues/7376#issuecomment-1350446546)? Do you know what could be the issue there by any chance? I see that in `ds.assign(foo=~ds["d3"])`, the coordinates of `~ds["d3"]` are dropped (#2087), which triggers re-indexing of the multi-index when aligning `ds` with `~ds["d3"]`. This is a quite expensive operation. It is not clear to me what would be a clean fix (see, e.g., #2180), but we could probably optimize the alignment logic so that when all unindexed dimension sizes match with indexed dimension sizes (like your example) no re-indexing is performed.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	groupby+map performance regression on MultiIndex dataset 1495605827
1352874809	https://github.com/pydata/xarray/pull/7368#issuecomment-1352874809	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85Qozs5	benbovy 4160723	2022-12-15T10:42:59Z	2022-12-15T10:42:59Z	MEMBER	OK this is now ready for review (cc @shoyer).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1352818155	https://github.com/pydata/xarray/pull/7368#issuecomment-1352818155	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85Qol3r	benbovy 4160723	2022-12-15T09:59:03Z	2022-12-15T09:59:03Z	MEMBER	Maybe there's some way to optimize that? I don't know if we can completely avoid it with the solution implemented in this PR, though. Promoting Coordinates is pretty clean and future proof IMO (assuming that we'll further refactor Coordinates to actually store variables and indexes, i.e., not as a proxy anymore). Is the (minor? temporary?) regression in performance acceptable and can we just leave it like that for now? Fixed in 193dad3 (with some reasonable special case added in `merge_core`).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1352310432	https://github.com/pydata/xarray/pull/7368#issuecomment-1352310432	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85Qmp6g	benbovy 4160723	2022-12-14T22:33:23Z	2022-12-15T01:08:41Z	MEMBER	I did some profiling to find the cause of the decrease in performance reported in the benchmarks (dataset creation). In summary, this is explained by a `Coordinates` object (built from the `coords` mapping) that is now included in objects to align when merging data vars and coordinates. Previously all non DataArray objects in the `coords` mapping were excluded from alignment (in `deep_align`). The introduced overhead comes from a call to `Coordinates._reindex_callback()`, which (I think?) should do no more than shallow copies and/or xarray wrapping stuff. In the benchmark report this is only marked as significant when creating small datasets (1.5-2x slower), and it becomes insignificant for datasets with more data variables. Maybe there's some way to optimize that? I don't know if we can completely avoid it with the solution implemented in this PR, though. Promoting `Coordinates` is pretty clean and future proof IMO (assuming that we'll further refactor `Coordinates` to actually store variables and indexes, i.e., not as a proxy anymore). Is the (minor? temporary?) regression in performance acceptable and can we just leave it like that for now? More details about the new workflow implemented in this PR when creating a new Dataset: (1) if Dataset's `coords` argument is a "simple" mapping, it is first internally converted into a `Coordinates` object, with the creation of default indexes for dimension coordinates; if one or more DataArray objects are given in `coords`, their coordinates (variables + indexes) are extracted and merged with the other input coordinates; see the implementation in `xarray.core.coordinates.create_coords_with_default_indexes`; (2) otherwise, just reuse the `Coordinates` object passed as `coords`; (3) coordinates are then merged with data variables: the `Coordinates` object is aligned with every other "alignable" object found in `data_vars`; coordinate indexes (if any) are passed explicitly to `align` so they are used in priority; explicitly using a `Coordinates` object skips the creation of default indexes during merging (in `collect_variables_and_indexes()`).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1352318926	https://github.com/pydata/xarray/issues/7376#issuecomment-1352318926	https://api.github.com/repos/pydata/xarray/issues/7376	IC_kwDOAMm_X85Qmr_O	benbovy 4160723	2022-12-14T22:43:11Z	2022-12-14T22:47:37Z	MEMBER	Are you aware of any workarounds for this issue with the current code (assuming I would like to preserve MultiIndex). Unfortunately I don't know about any workaround that would preserve the MultiIndex. Depending on how you use the multi-index, you could instead set two single indexes for "i1" and "i2" respectively (it is supported now, use `set_xindex()`). I think that groupby will work well in that case. If you really need a multi-index, you could still build it afterwards from the groupby result.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	groupby+map performance regression on MultiIndex dataset 1495605827
1350738301	https://github.com/pydata/xarray/issues/7376#issuecomment-1350738301	https://api.github.com/repos/pydata/xarray/issues/7376	IC_kwDOAMm_X85QgqF9	benbovy 4160723	2022-12-14T09:40:57Z	2022-12-14T09:40:57Z	MEMBER	Thanks for the report @ravwojdyla. Since #5692, multi-index levels each have their own coordinate variable, so copying takes a bit more time as we need to create more variables. Not sure what's happening with `_maybe_cast_to_cftimeindex`, though. The real issue here, however, is the same as in #6836. In your example, `.groupby("i1")` creates 400 000 groups whereas it should create only 4 groups.	{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 1 }	groupby+map performance regression on MultiIndex dataset 1495605827
1349321538	https://github.com/pydata/xarray/pull/7368#issuecomment-1349321538	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QbQNC	benbovy 4160723	2022-12-13T18:03:17Z	2022-12-13T18:03:17Z	MEMBER	I think this is ready for review!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1347327518	https://github.com/pydata/xarray/pull/7368#issuecomment-1347327518	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QTpYe	benbovy 4160723	2022-12-12T21:05:56Z	2022-12-12T21:05:56Z	MEMBER	In order to skip creating default indexes when passing a `Coordinates` object, I first tried a small refactor but in the end I found that the cleanest way to do it was to support alignment for `Coordinates`. I think it makes sense now that Coordinates is part of Xarray's public API as a "stand-alone" container like Dataset and DataArray. The "no default index with Coordinates" behavior should be consistent Xarray-wise, i.e., for DataArray / Dataset constructors and also `assign_coords`, `update`, etc. Sorry this PR is getting big, but hopefully this is almost ready (still a few tests to fix or to add).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1346344694	https://github.com/pydata/xarray/pull/7368#issuecomment-1346344694	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QP5b2	benbovy 4160723	2022-12-12T11:55:10Z	2022-12-12T11:55:10Z	MEMBER	My suggestion would be: coords passed as a dict: create default indexes; coords passed as IndexedCoordinates: do not create defaults. So if we already have some coordinate data as a dict but don't want any default index, we would need to do this: `ds = xr.Dataset(coords=xr.Coordinates(my_coord_dict))` instead of this: `ds = xr.Dataset(coords=my_coord_dict)`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1346091151	https://github.com/pydata/xarray/pull/7368#issuecomment-1346091151	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QO7iP	benbovy 4160723	2022-12-12T08:36:09Z	2022-12-12T08:36:09Z	MEMBER	Thanks @shoyer, I've been thinking about similar short/long term plans although so far I haven't figured out how to implement your point 3. I'll give it another try.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1345314909	https://github.com/pydata/xarray/pull/7368#issuecomment-1345314909	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QL-Bd	benbovy 4160723	2022-12-10T16:59:44Z	2022-12-10T16:59:44Z	MEMBER	Long term, do you think it would make sense to merge together Indexes, Coordinates and IndexedCoordinates? They are sort of all containers for the same thing. Yes I think so. I'm actually trying to merge `IndexedCoordinates` with `Coordinates` but I'm stuck: the latter is abstract and I don't really see how I could refactor it together with `DatasetCoordinates` and `DataArrayCoordinates`. Do you have any idea on how best to proceed? Ideally, I'd see `Coordinates` be exposed in Xarray's main namespace with at least the two following constructors: ```python class Coordinates: def __init__( self, coords: Mapping[Any, Any] | None = None, indexes: Mapping[Any, Index] | None = None, ): # Similar to Dataset.__init__ but without the need # to merge coords and data vars... # Probably ok to allow more flexibility / less safety here? ... @classmethod def from_pandas_multiindex(cls, index: pd.MultiIndex, dim: str): ... ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1344046801	https://github.com/pydata/xarray/pull/7368#issuecomment-1344046801	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QHIbR	benbovy 4160723	2022-12-09T09:13:24Z	2022-12-09T09:16:35Z	MEMBER	I added `IndexedCoordinates.merge_coords` so that it is easier to combine different coordinates to pass to a new Dataset / DataArray, e.g., ```python midx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("one", "two")) coords = xr.IndexedCoordinates.from_pandas_multiindex(midx, "x") coords = coords.merge_coords({"y": [0, 1, 2]}) Coordinates: * x (x) object MultiIndex * one (x) object 'a' 'a' 'b' 'b' * two (x) int64 1 2 1 2 * y (y) int64 0 1 2 ds = xr.Dataset(coords=coords) <xarray.Dataset> Dimensions: (x: 4) Coordinates: * x (x) object MultiIndex * one (x) object 'a' 'a' 'b' 'b' * two (x) int64 1 2 1 2 * y (y) int64 0 1 2 Data variables: empty ``` `IndexedCoordinates.merge_coords` is very much like `Coordinates.merge` except that it returns a new Coordinates object instead of a Dataset. Or should we just use `merge`? It would require that: `Coordinates.merge` accepts `Mapping[Any, Any]` for its `other` argument. Only changing the type hint is enough here since the implementation already accepts any input passed to Dataset. When a Dataset is passed as `coords` argument to a new Dataset and DataArray, both variables and indexes should be extracted. It is already the case for Dataset but I think it only works for PandasIndex and PandasMultiIndex (default indexes & backwards compatibility).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1344004727	https://github.com/pydata/xarray/pull/7368#issuecomment-1344004727	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QG-J3	benbovy 4160723	2022-12-09T08:32:28Z	2022-12-09T09:14:17Z	MEMBER	`IndexedCoordinates` and `Indexes` have a lot of overlap. At some point we might consider merging the two classes, like @shoyer suggests in https://github.com/pydata/xarray/pull/7214#issuecomment-1295283938. The main difference is that one is a mapping of coordinates and the other is a mapping of indexes. `IndexedCoordinates` is mostly reusing `Indexes` and `Dataset` under the hood, it is only a facade. Alternatively to an `IndexedCoordinates` subclass I was wondering if we could reuse the `Coordinates` base class? There's some benefit of providing a subclass: besides specific constructors like `.from_pandas_multiindex()` it has a generic `__init__` for advanced use cases. Not sure it is a good idea to add this constructor to the base class? unlike Coordinates, IndexedCoordinates is immutable. What if the `Indexes` class was a facade based on `IndexedCoordinates` instead of the other way around? It would probably make more sense but it would also be a bigger refactor. I've chosen the easy way :).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1335509983	https://github.com/pydata/xarray/pull/7347#issuecomment-1335509983	https://api.github.com/repos/pydata/xarray/issues/7347	IC_kwDOAMm_X85PmkPf	benbovy 4160723	2022-12-02T16:33:59Z	2022-12-02T16:33:59Z	MEMBER	Great! (I was worried that it would mess up #7345).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Fix assign_coords resetting all dimension coords to default index 1472483025
1334986216	https://github.com/pydata/xarray/pull/7347#issuecomment-1334986216	https://api.github.com/repos/pydata/xarray/issues/7347	IC_kwDOAMm_X85PkkXo	benbovy 4160723	2022-12-02T09:35:42Z	2022-12-02T09:35:42Z	MEMBER	@dcherian we can merge this after #7345 to make things easier for the release?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Fix assign_coords resetting all dimension coords to default index 1472483025
1326262197	https://github.com/pydata/xarray/issues/7045#issuecomment-1326262197	https://api.github.com/repos/pydata/xarray/issues/7045	IC_kwDOAMm_X85PDSe1	benbovy 4160723	2022-11-24T10:35:02Z	2022-11-24T10:35:02Z	MEMBER	I find the analogy with relational databases quite meaningful! Rectangular grids likely have been the primary use case in Xarray for a long time, but I wonder to which extent it is the case nowadays. Probably a good question to ask for the next user survey? Interestingly, the 2021 user survey results (*) show that "interoperability with pandas" is not a critical feature while "label-based indexing, interpolation, groupby, reindexing, etc." is most important, although the description of the latter is rather broad. It would be interesting to compute the correlation between these two variables. The results also show that "more flexible indexing (selection, alignment)" is very useful or critical for 2/3 of the participants. Not sure how to interpret those results within the context of this discussion, though. (*) The 2022 user survey results don't show significant differences in general. suppose one could in principle have an array with coordinates such that none of the coordinates aligned with any particular axis, but it seems improbable. Not that improbable for unstructured meshes, curvilinear grids, staggered grids, etc. Xarray is often chosen to handle them too (e.g., uxarray, xgcm).	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Should Xarray stop doing automatic index-based alignment? 1376109308
1324753837	https://github.com/pydata/xarray/issues/7297#issuecomment-1324753837	https://api.github.com/repos/pydata/xarray/issues/7297	IC_kwDOAMm_X85O9iOt	benbovy 4160723	2022-11-23T09:17:33Z	2022-11-23T09:17:33Z	MEMBER	But does this still work properly with broadcasting? For example, let's say there is another data variable b (midx) and an operation is done like ds_stacked['c'] = ds_stacked.a + ds_stacked.b. Then it should be that c (midx) and a (x) should be "repeated" to midx.x I think it would keep things much simpler if we consider "x" and "midx" as two separate dimensions in the stacked Dataset, i.e., ds_stacked['c'] would result in a 2-d array (x, midx). There's no such thing as a "midx.x" dimension in Xarray.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041
1323849354	https://github.com/pydata/xarray/issues/7297#issuecomment-1323849354	https://api.github.com/repos/pydata/xarray/issues/7297	IC_kwDOAMm_X85O6FaK	benbovy 4160723	2022-11-22T15:24:53Z	2022-11-22T15:36:46Z	MEMBER	The last example in your comment is probably the most meaningful one: ``` <xarray.Dataset> Dimensions: (x: 2, midx: 4) Coordinates: * midx (midx) object MultiIndex * x (midx) int32 1 1 2 2 * y (midx) int32 3 4 3 4 Data variables: a (x) int32 6 7 ``` To avoid name conflicts, we could just discard the original dimension coordinates x and y. Like here above, "x" becomes a dimension without coordinate. In that example, when unstacking we would retrieve the "x" dimension coordinate like in the original dataset. (note: I think it is now possible to have a dimension "x" and a coordinate "x" with different dimensions, but I haven't checked).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041
1323478134	https://github.com/pydata/xarray/issues/7297#issuecomment-1323478134	https://api.github.com/repos/pydata/xarray/issues/7297	IC_kwDOAMm_X85O4qx2	benbovy 4160723	2022-11-22T10:50:01Z	2022-11-22T10:50:01Z	MEMBER	Interesting! I don't think that when adding stack / unstack we were thinking that variables with only a subset of the stacked dimensions would be a common use case. I guess it would be possible to add some option to stack only the variables that have all the dimensions to be stacked, and leave the other variables unchanged? However, one problem with keeping the original dimension coordinates is that we would have name conflicts between the single index coordinates and the multi-index coordinates. In your expected example, the "x" coordinate is part of the multi-index but it doesn't have the same dimension "midx"? I find it rather confusing.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041
1316230358	https://github.com/pydata/xarray/issues/7278#issuecomment-1316230358	https://api.github.com/repos/pydata/xarray/issues/7278	IC_kwDOAMm_X85OdBTW	benbovy 4160723	2022-11-16T02:57:48Z	2022-11-16T02:57:48Z	MEMBER	👍 Use it at your own risk 😉	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	remap_label_indexers removed without deprecation update? 1444752393
1313866757	https://github.com/pydata/xarray/issues/7250#issuecomment-1313866757	https://api.github.com/repos/pydata/xarray/issues/7250	IC_kwDOAMm_X85OUAQF	benbovy 4160723	2022-11-14T14:45:39Z	2022-11-14T14:45:39Z	MEMBER	That's a bug in this method: https://github.com/pydata/xarray/blob/6f9e33e94944f247a5c5c5962a865ff98a654b30/xarray/core/indexing.py#L1528-L1532 Xarray array wrappers for pandas indexes keep track of the original dtype and should restore it when converted into numpy arrays. Something like this should work for the same method: ```python def __array__(self, dtype: DTypeLike = None) -> np.ndarray: if dtype is None: dtype = self.dtype if self.level is not None: return np.asarray( self.array.get_level_values(self.level).values, dtype=dtype ) else: return super().__array__(dtype) ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	stack casts int32 dtype coordinate to int64 1433998942
1313748084	https://github.com/pydata/xarray/issues/6836#issuecomment-1313748084	https://api.github.com/repos/pydata/xarray/issues/6836	IC_kwDOAMm_X85OTjR0	benbovy 4160723	2022-11-14T13:55:02Z	2022-11-14T13:55:02Z	MEMBER	"we can fix that in safe_cast_to_index()" ...we cannot fix that in `safe_cast_to_index()` (or we can add a parameter to specify the desired result).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1318992926
1313741685	https://github.com/pydata/xarray/issues/7282#issuecomment-1313741685	https://api.github.com/repos/pydata/xarray/issues/7282	IC_kwDOAMm_X85OTht1	benbovy 4160723	2022-11-14T13:51:21Z	2022-11-14T13:51:21Z	MEMBER	Thanks @jjpr-mit and @mschrimpf for the report. See https://github.com/pydata/xarray/issues/6836#issuecomment-1313739883.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	groupby and mean on a MultiIndex level raises ValueError 1445905299
1313739883	https://github.com/pydata/xarray/issues/6836#issuecomment-1313739883	https://api.github.com/repos/pydata/xarray/issues/6836	IC_kwDOAMm_X85OThRr	benbovy 4160723	2022-11-14T13:49:47Z	2022-11-14T13:49:47Z	MEMBER	From #7282 it looks like we need to convert the multi-index level to a single index when casting the group to an index. And from #7105 we can fix that in `safe_cast_to_index()` (sometimes the full multi-index is expected) so we probably need a special case in `groupby`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1318992926
1311942192	https://github.com/pydata/xarray/issues/7278#issuecomment-1311942192	https://api.github.com/repos/pydata/xarray/issues/7278	IC_kwDOAMm_X85OMqYw	benbovy 4160723	2022-11-11T16:52:54Z	2022-11-11T16:52:54Z	MEMBER	You may look at the logic implemented in the `map_index_queries()` function in `xarray.core.indexing`. This function is still not public API, but it calls `.sel()` for each index object, which should be more stable (although experimental). Eventually we'll probably make `merge_sel_results()` public too. It might be useful for third-party indexes.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	remap_label_indexers removed without deprecation update? 1444752393
1305780610	https://github.com/pydata/xarray/issues/6308#issuecomment-1305780610	https://api.github.com/repos/pydata/xarray/issues/6308	IC_kwDOAMm_X85N1KGC	benbovy 4160723	2022-11-07T15:28:35Z	2022-11-07T15:28:35Z	MEMBER	The kind of data wrapped in an Xarray Dataset (e.g., a Numpy array, a Dask array or any other array #5648) is already something useful that `xr.doctor` or `xr.describe` may tell! From my experience of introducing Xarray to new users, they often completely ignore what is under the hood until something or someone makes them aware, likely after they experience some weird behavior or performance issue that is hard to figure out by themselves. Xarray objects are flexible container wrappers connected to a wide range of other Python libraries, such that it is hard to give a short introduction that covers all the important aspects (lazy / non-lazy, chunked / non-chunked, etc.). For example, it may be possible that someone who has never heard of Dask or Zarr follows an Xarray tutorial that starts by opening a chunked dataset from a zarr store. In this case the rich repr of the Xarray Dataset doesn't even help. Rather than a performance report or a profiling tool, the proposal here (still very elusive) is to provide a helper function that returns some information and explanation in plain English (why not with some hyperlinks, pretty printing, etc.) that would help users make sense of an Xarray object and its wrapped data/metadata. Some kind of interactive documentation very specific to the actual Xarray object. Some kind of smart tool that would partially "replace" custom (though very basic) user support.	{ "total_count": 2, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 2, "rocket": 0, "eyes": 0 }	xr.doctor(): diagnostics on a Dataset / DataArray ? 1151751524
1305593478	https://github.com/pydata/xarray/pull/7209#issuecomment-1305593478	https://api.github.com/repos/pydata/xarray/issues/7209	IC_kwDOAMm_X85N0caG	benbovy 4160723	2022-11-07T13:09:05Z	2022-11-07T13:09:05Z	MEMBER	The change in `Variable.to_index_variable` seems sensible (not sure when one wants a deep copy of an `IndexVariable` or an Xarray / Pandas index). `to_index_variable` may be called in some core functions of Xarray internals (e.g., in `as_variable()`) so it might be tricky to benchmark its effect Xarray-wise. Perhaps it would be good to track it down in the original issue #7181?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Optimize some copying 1421441672
1297046405	https://github.com/pydata/xarray/pull/7214#issuecomment-1297046405	https://api.github.com/repos/pydata/xarray/issues/7214	IC_kwDOAMm_X85NT1uF	benbovy 4160723	2022-10-31T12:54:50Z	2022-10-31T12:54:50Z	MEMBER	Thanks for the suggestion @shoyer, in general I like it very much! "Coordinates possibly backed by one or more indexes" feels much more natural than "indexes and their corresponding coordinates". Even though indexes have been promoted as 1st class citizens in the data model, their right place should still be in the background compared to coordinates. So having a `Coordinates` object that encapsulates the indexes makes a lot of sense to me. My main concern is about the timing, as such a broader refactor might postpone some work in progress on the public API and the documentation. Ideally this shouldn't discourage users from starting to experiment with custom indexes and building an ecosystem around them, as soon as possible. There might be a fast path towards your suggestion, at least regarding the public facing API (your points 1 and 4): Keep the constructor of `Indexes` "private" and keep it immutable. Add a new `IndexedCoordinates(Coordinates)` class. Unlike `DatasetCoordinates` and `DataArrayCoordinates`, it would have a public constructor and/or alternative class methods (e.g., `.from_pandas_multi_index()` suggested by @dcherian). In general, passing any `Coordinates` object to `coords` would assign both the coordinates and the indexes. This would leave us the possibility of carrying out a broader (mostly internal) refactor of `Indexes` and `Coordinates` objects later without the risk of introducing too many breaking changes. Alternatively, we could just wait for that refactor to finish before implementing explicit assignment of coordinates and indexes. We already have `.set_xindex()` and `.drop_indexes()` that are relevant and we could wait before deprecating `xr.Dataset(coords={"x": pandas_midx})`. Not sure when such a big refactor will happen, though, the wait could be long.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes directly to the DataArray and Dataset constructors 1422543378
1294783661	https://github.com/pydata/xarray/pull/7214#issuecomment-1294783661	https://api.github.com/repos/pydata/xarray/issues/7214	IC_kwDOAMm_X85NLNSt	benbovy 4160723	2022-10-28T09:49:02Z	2022-10-28T09:49:02Z	MEMBER	not necessarily do consistency checks (beyond verifying that the coordinate variables exist). I'd just want to add that, from my experience with debugging multi-index issues, it is hard even for advanced users to see what's going wrong when coordinates and indexes are not consistent.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes directly to the DataArray and Dataset constructors 1422543378
1294771427	https://github.com/pydata/xarray/pull/7214#issuecomment-1294771427	https://api.github.com/repos/pydata/xarray/issues/7214	IC_kwDOAMm_X85NLKTj	benbovy 4160723	2022-10-28T09:38:22Z	2022-10-28T09:38:22Z	MEMBER	Maybe a more generic Indexes class method that could be reused by 3rd-party indexes too? E.g., via some kind of hook or entrypoint... An `Indexes` accessor? Or this is going too far? 🙂	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes directly to the DataArray and Dataset constructors 1422543378
1293946521	https://github.com/pydata/xarray/pull/7214#issuecomment-1293946521	https://api.github.com/repos/pydata/xarray/issues/7214	IC_kwDOAMm_X85NIA6Z	benbovy 4160723	2022-10-27T19:04:19Z	2022-10-27T19:52:21Z	MEMBER	Explicitly providing indexes is an advanced user feature. Agreed. However, `xr.Dataset(coords={"x": pandas_midx})` is something that presumably a lot of users rely on (it is used extensively in Xarray's tests) and that we should really deprecate IMO. If we don't provide a convenient alternative, I expect many of those users will complain. it's easier to explicitly manipulate indexes in the form of a dict While generally I also prefer handling plain `dict` objects over custom dict-like objects, here I don't see much reason to manipulate Xarray index objects independently of their coordinate variables. `Indexes` allows keeping them tied together, and it is already returned by `.xindexes`. EDIT -- For more context: initially an `Indexes` object was almost equivalent to a `Frozen(obj._indexes)`. In #5692 I tried hard and struggled to keep dealing with separate dicts of indexes and indexed variables, but in the end it made things much easier to encapsulate the variables in `Indexes`, which is also used internally in different places.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes directly to the DataArray and Dataset constructors 1422543378
1293902008	https://github.com/pydata/xarray/pull/7214#issuecomment-1293902008	https://api.github.com/repos/pydata/xarray/issues/7214	IC_kwDOAMm_X85NH2C4	benbovy 4160723	2022-10-27T18:21:02Z	2022-10-27T18:21:02Z	MEMBER	How about Indexes.from_pandas_multi_index() classmethod? Yes that would make sense. However, it would be adding another `pandas.MultiIndex` special case while we'd like to remove them in Xarray. Maybe a more generic `Indexes` class method that could be reused by 3rd-party indexes too? E.g., via some kind of hook or entrypoint... The tricky thing is that arguments would probably differ much from one index type to another. does indexes get merged with existing ._indexes? Indexes are not merged together but the new / replaced coordinate variables must be compatible with the other variables of the dataset. `Dataset.assign_indexes(indexes)` is actually implemented like this: ```python def assign_indexes(self, indexes: Indexes[Index]): ds_indexes = Dataset(indexes=indexes) return ( self # prepare drop-in index / coordinate replacement .drop_vars(indexes, errors="ignore") # ensure the new indexes / coordinates are compatible with the Dataset .merge( ds_indexes, compat="minimal", # probably not the right option? join="override", # fastest option? (no real effect because of `drop_vars`) combine_attrs="no_conflicts", ) ) ``` Can we extract enough information from Index to have xr.merge(Indexes) -> Indexes work? That is actually a good idea for https://github.com/pydata/xarray/pull/7214#issuecomment-1292089179! Not sure I would reuse `xr.merge()` for this as it would make the API messy, but why not an `xr.merge_indexes()` top-level function or an `Indexes.merge()` method?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes directly to the DataArray and Dataset constructors 1422543378
1293860075	https://github.com/pydata/xarray/pull/7221#issuecomment-1293860075	https://api.github.com/repos/pydata/xarray/issues/7221	IC_kwDOAMm_X85NHrzr	benbovy 4160723	2022-10-27T17:40:52Z	2022-10-27T17:40:52Z	MEMBER	Thanks @hmaarrfk! I haven't fully understood why we had that code though? Me neither. I don't remember ever seeing this assertion error raised while refactoring things. Any idea @shoyer?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Remove debugging slow assert statement 1423312198
1293624950	https://github.com/pydata/xarray/pull/7222#issuecomment-1293624950	https://api.github.com/repos/pydata/xarray/issues/7222	IC_kwDOAMm_X85NGyZ2	benbovy 4160723	2022-10-27T14:37:10Z	2022-10-27T14:37:10Z	MEMBER	Thanks @hmaarrfk! I think the rapid return, helps by about 40% is still pretty good. Yes definitely. I think we just forgot to add it. However, I will argue that Aligner should really not be a class. The reason for using a class is mainly better code readability and also easier refactoring later. The alignment logic is really complex with lots of intermediate objects that are created and/or used at various stages. Probably using functions with some custom containers would have achieved the same goal, to be fair. This part of Xarray internals still deserves to be improved, but that would be a lot of work especially for such a critical piece of code in Xarray.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Actually make the fast code path return early for Aligner.align 1423321834
1293531607	https://github.com/pydata/xarray/pull/7214#issuecomment-1293531607	https://api.github.com/repos/pydata/xarray/issues/7214	IC_kwDOAMm_X85NGbnX	benbovy 4160723	2022-10-27T13:31:24Z	2022-10-27T13:42:44Z	MEMBER	I also added an `.assign_indexes()` method that may be quite convenient. Like for the constructors, it only accepts an `Indexes` instance. ```python ds = xr.Dataset(coords={"x": [4, 5, 6, 7]}) ds2 = xr.Dataset(coords={"x": [1, 2, 3, 4]}) ds.assign_indexes(ds2.xindexes) <xarray.Dataset> Dimensions: (x: 4) Coordinates: * x (x) int64 1 2 3 4 Data variables: empty midx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("one", "two")) indexes = wrap_pandas_multiindex(midx, "x") ds.assign_indexes(indexes) <xarray.Dataset> Dimensions: (x: 4) Coordinates: * x (x) object MultiIndex * one (x) object 'a' 'a' 'b' 'b' * two (x) int64 1 2 1 2 Data variables: empty ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes directly to the DataArray and Dataset constructors 1422543378
1293545325	https://github.com/pydata/xarray/pull/7214#issuecomment-1293545325	https://api.github.com/repos/pydata/xarray/issues/7214	IC_kwDOAMm_X85NGe9t	benbovy 4160723	2022-10-27T13:41:50Z	2022-10-27T13:41:50Z	MEMBER	@pydata/xarray I'd be very happy if you could share your thoughts about the examples shown in the last three comments. If you think the API looks good like that, then I will work on adding some tests and on the documentation.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes directly to the DataArray and Dataset constructors 1422543378
1292089179	https://github.com/pydata/xarray/pull/7214#issuecomment-1292089179	https://api.github.com/repos/pydata/xarray/issues/7214	IC_kwDOAMm_X85NA7db	benbovy 4160723	2022-10-26T13:54:22Z	2022-10-26T13:54:22Z	MEMBER	Passing multiple indexes: ```python midx1 = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("one", "two")) midx2 = pd.MultiIndex.from_product([["c", "d"], [3, 4]], names=("three", "four")) indexes1 = wrap_pandas_multiindex(midx1, "x") indexes2 = wrap_pandas_multiindex(midx2, "y") indexes = Indexes( indexes=dict(**indexes1, **indexes2), variables=dict(**indexes1.variables, **indexes2.variables) ) ds = xr.Dataset(indexes=indexes) <xarray.Dataset> Dimensions: (x: 4, y: 4) Coordinates: * x (x) object MultiIndex * one (x) object 'a' 'a' 'b' 'b' * two (x) int64 1 2 1 2 * y (y) object MultiIndex * three (y) object 'c' 'c' 'd' 'd' * four (y) int64 3 4 3 4 Data variables: empty ``` That's not looking super nice, but probably we can add some convenience function or `Indexes` method.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes directly to the DataArray and Dataset constructors 1422543378
1291911349	https://github.com/pydata/xarray/pull/7214#issuecomment-1291911349	https://api.github.com/repos/pydata/xarray/issues/7214	IC_kwDOAMm_X85NAQC1	benbovy 4160723	2022-10-26T11:47:57Z	2022-10-26T12:14:23Z	MEMBER	I implemented option 3. We can still change or revert it later if it's not the best one. A few examples: ```python import pandas as pd import xarray as xr from xarray.indexes import wrap_pandas_multiindex midx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("one", "two")) ``` It is now possible to pass a pandas multi-index to a Dataset like this: ```python # this returns an `Indexes` object (indexes + coordinates) indexes = wrap_pandas_multiindex(midx, "x") ds = xr.Dataset(indexes=indexes) <xarray.Dataset> Dimensions: (x: 4) Coordinates: * x (x) object MultiIndex * one (x) object 'a' 'a' 'b' 'b' * two (x) int64 1 2 1 2 Data variables: empty ``` IMO the above should be preferred over passing it as a coordinate (should we deprecate it now?): ```python ds_deprecated = xr.Dataset(coords={"x": midx}) ds_deprecated.identical(ds) True # eventually this would behave like this: ds_midx_as_array = xr.Dataset(coords={"x": midx}) <xarray.Dataset> Dimensions: (x: 4) Coordinates: * x (x) object ('a', 1) ('a', 2) ('b', 1) ('b', 2) Data variables: empty ``` We can pass indexes around from one Xarray object to another, e.g., ```python da = xr.DataArray([1, 2, 3, 4], dims="x", indexes=ds.xindexes) <xarray.DataArray (x: 4)> array([1, 2, 3, 4]) Coordinates: * x (x) object MultiIndex * one (x) object 'a' 'a' 'b' 'b' * two (x) int64 1 2 1 2 ``` Skip creating pandas indexes for dimension coordinates: ```python ds_noindex = xr.Dataset(coords={"x": [0, 1, 2]}, indexes={}) <xarray.Dataset> Dimensions: (x: 3) Coordinates: x (x) int64 0 1 2 Data variables: empty ds_noindex.xindexes Indexes: empty ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes directly to the DataArray and Dataset constructors 1422543378
1291638319	https://github.com/pydata/xarray/pull/7214#issuecomment-1291638319	https://api.github.com/repos/pydata/xarray/issues/7214	IC_kwDOAMm_X85M_NYv	benbovy 4160723	2022-10-26T07:52:35Z	2022-10-26T07:52:35Z	MEMBER	For passing multiple indexes at once we could probably expand the Indexes API, e.g., with an .update() method. Maybe with something else than `.update()` (let's keep `Indexes` an immutable collection?)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes directly to the DataArray and Dataset constructors 1422543378
1291059643	https://github.com/pydata/xarray/pull/7214#issuecomment-1291059643	https://api.github.com/repos/pydata/xarray/issues/7214	IC_kwDOAMm_X85M9AG7	benbovy 4160723	2022-10-25T19:50:57Z	2022-10-25T19:50:57Z	MEMBER	Hmm I'm wondering what would be best between the options below regarding the types for the `indexes` argument: (1) `Indexes[Index] | Sequence[Indexes[Index]] | None` (2) `Indexes[Index] | None` (3) `Mapping[Any, Index] | None` Any other suggestion? Option 1 is nice for passing multiple indexes, e.g., ```python pd_midx1 = pd.MultiIndex.from_arrays(..., names=("one", "two")) pd_midx2 = pd.MultiIndex.from_arrays(..., names=("three", "four")) indexes1 = PandasMultiIndex.from_pandas_index(pd_midx1, "x") indexes2 = PandasMultiIndex.from_pandas_index(pd_midx2, "y") ds = xr.Dataset(indexes=[indexes1, indexes2]) ``` With option 1 it feels odd passing an empty list in order to avoid creating default indexes: `ds = xr.Dataset(indexes=[])`. Not really better in this regard with option 2: `ds = xr.Dataset(indexes=Indexes())`. Option 3 is better IMO: `ds = xr.Dataset(indexes={})`. Option 3 actually works in all cases since `Indexes[Index]` is a sub-type of `Mapping[Any, Index]`. However, it is not clear from this generic type that any non-empty mapping must be an instance of `Indexes` (because the latter also contains the coordinate variables). I'm leaning towards option 3. For passing multiple indexes at once we could probably expand the `Indexes` API, e.g., with an `.update()` method. What do people think?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes directly to the DataArray and Dataset constructors 1422543378
1290454937	https://github.com/pydata/xarray/issues/6392#issuecomment-1290454937	https://api.github.com/repos/pydata/xarray/issues/6392	IC_kwDOAMm_X85M6seZ	benbovy 4160723	2022-10-25T12:19:52Z	2022-10-25T12:19:52Z	MEMBER	I'm thinking of only accepting one or more instances of `Indexes` as `indexes` argument in the Dataset and DataArray constructors. The only exception is when `fastpath=True`, in which case a mapping can be given directly. It is much easier to handle: just check that keys returned by `Indexes.variables` do not conflict with the coordinate names in the `coords` argument. It is slightly safer: it requires the user to explicitly create an `Indexes` object, thus with less chance to accidentally provide coordinate variables and index objects that do not relate to each other (we could probably add some safeguards in the `Indexes` class itself). It is more convenient: an Xarray `Index` may provide a factory method that returns an instance of `Indexes` that we just need to pass as `indexes`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes to the Dataset and DataArray constructors 1175329407
1285038821	https://github.com/pydata/xarray/pull/7185#issuecomment-1285038821	https://api.github.com/repos/pydata/xarray/issues/7185	IC_kwDOAMm_X85MmCLl	benbovy 4160723	2022-10-20T06:59:04Z	2022-10-20T06:59:04Z	MEMBER	🚀	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	indexes section in the HTML repr 1413425793
1283994902	https://github.com/pydata/xarray/pull/7185#issuecomment-1283994902	https://api.github.com/repos/pydata/xarray/issues/7185	IC_kwDOAMm_X85MiDUW	benbovy 4160723	2022-10-19T13:13:39Z	2022-10-19T13:13:39Z	MEMBER	LGTM, that's awesome! It will be super handy for quick debugging and experimenting with custom indexes.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	indexes section in the HTML repr 1413425793
1283897249	https://github.com/pydata/xarray/pull/7183#issuecomment-1283897249	https://api.github.com/repos/pydata/xarray/issues/7183	IC_kwDOAMm_X85Mhreh	benbovy 4160723	2022-10-19T11:59:08Z	2022-10-19T11:59:08Z	MEMBER	Looks all good to me! Do you want to add a what's new entry here or add it in #7185 with a link to this PR?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	use `_repr_inline_` for indexes that define it 1412926287
1283103957	https://github.com/pydata/xarray/pull/7185#issuecomment-1283103957	https://api.github.com/repos/pydata/xarray/issues/7185	IC_kwDOAMm_X85MepzV	benbovy 4160723	2022-10-18T22:57:16Z	2022-10-18T22:57:16Z	MEMBER	Thanks @keewis for opening this PR. I added some commits (hope you don't mind) to fix the CSS. I also grouped the items in the indexes section by unique index with index coordinates separated by line return, so it looks like the coordinate section while the multi-coordinate indexes are clearly visible.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	indexes section in the HTML repr 1413425793
1283038653	https://github.com/pydata/xarray/pull/7182#issuecomment-1283038653	https://api.github.com/repos/pydata/xarray/issues/7182	IC_kwDOAMm_X85MeZ29	benbovy 4160723	2022-10-18T21:40:49Z	2022-10-18T21:40:49Z	MEMBER	I wonder if it is possible to create a generic MultiIndex? Hmm that could be possible but I think there are just too many possible edge cases for something generic like that. In your specific example ```python ds.set_xindex( ["a", "b"], MultiIndex([("a", PandasIndex), ("b", PandasIndex), (["a", "b"], BallTreeIndex)]), ) ``` we could probably use the BallTreeIndex for point-wise indexing (i.e., with `ds.sel(a=xr.DataArray(...), b=xr.DataArray(...))`) and use the two PandasIndex instances for other kinds of selection (e.g., with slices, scalars, etc.) so there's no conflict, but I doubt this would be what we want in other cases. I guess your suggestion is a way around the constraint in the Xarray data model that a coordinate cannot have multiple indexes? I'm afraid there's no easy solution that is generic enough. Maybe some cache to avoid rebuilding the indexes? I.e., `.set_xindex()` doesn't drop the pre-existing index(es) but rather disables them so that it is possible to re-enable them later with another `.set_xindex()` call (`.xindexes` only returns the "active" indexes but there may be other "inactive" indexes attached to a dataset).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	add MultiPandasIndex helper class 1412901282
1282295471	https://github.com/pydata/xarray/pull/7183#issuecomment-1282295471	https://api.github.com/repos/pydata/xarray/issues/7183	IC_kwDOAMm_X85Mbkav	benbovy 4160723	2022-10-18T12:19:56Z	2022-10-18T12:19:56Z	MEMBER	Yeah I think we could let the whole line after the 1st column (coordinate names) be customized by the index.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	use `_repr_inline_` for indexes that define it 1412926287
1282151989	https://github.com/pydata/xarray/pull/7183#issuecomment-1282151989	https://api.github.com/repos/pydata/xarray/issues/7183	IC_kwDOAMm_X85MbBY1	benbovy 4160723	2022-10-18T10:11:46Z	2022-10-18T10:11:46Z	MEMBER	Great @keewis! One question: should we let `_repr_inline_` display the class name or should we reserve a column for this and use `_repr_inline_` for other things? I.e., like variables have a dtype column and another column for values preview or other inline info.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	use `_repr_inline_` for indexes that define it 1412926287
1282016895	https://github.com/pydata/xarray/issues/7162#issuecomment-1282016895	https://api.github.com/repos/pydata/xarray/issues/7162	IC_kwDOAMm_X85MagZ_	benbovy 4160723	2022-10-18T08:35:29Z	2022-10-18T08:49:47Z	MEMBER	Indexes.copy_indexes might also require some update that includes the memo argument. But not sure if that will solve the issue here. That's a possible cause. Alignment may fail early because `.xindexes` returns different mappings of coordinates vs. index objects. It's worth checking if after copying the dataset, `copy.xindexes` returns the same CRSIndex object for its "x", "y" and "spatial_ref" coordinates. EDIT: checking `copy.xindexes.group_by_index()` is more convenient.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	copy of custom index does not align with original 1409811164
1282024919	https://github.com/pydata/xarray/issues/7162#issuecomment-1282024919	https://api.github.com/repos/pydata/xarray/issues/7162	IC_kwDOAMm_X85MaiXX	benbovy 4160723	2022-10-18T08:41:08Z	2022-10-18T08:41:08Z	MEMBER	The refactored alignment logic could be improved (cf. #7002). The error raised in the method below is not very helpful. https://github.com/pydata/xarray/blob/ab726c536464fbf4d8878041f950d2b0ae09b862/xarray/core/alignment.py#L294-L333	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	copy of custom index does not align with original 1409811164
1277301954	https://github.com/pydata/xarray/issues/6807#issuecomment-1277301954	https://api.github.com/repos/pydata/xarray/issues/6807	IC_kwDOAMm_X85MIhTC	benbovy 4160723	2022-10-13T09:22:27Z	2022-10-13T09:22:27Z	MEMBER	Not really a generic and parallel execution back-end, but Open-EO looks like an interesting use case too (it is a framework for managing remote execution of processing tasks on multiple big Earth observation cloud back-ends via a common API). I've suggested the idea of reusing the Xarray API here: https://github.com/Open-EO/openeo-python-client/issues/334.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Alternative parallel execution frameworks in xarray 1308715638
1276685925	https://github.com/pydata/xarray/pull/7150#issuecomment-1276685925	https://api.github.com/repos/pydata/xarray/issues/7150	IC_kwDOAMm_X85MGK5l	benbovy 4160723	2022-10-12T20:17:09Z	2022-10-12T20:17:09Z	MEMBER	Thank you @lukasbindreiter! Merging. I notice that this is your first contribution to Xarray, welcome!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Update open_dataset backend to ensure compatibility with new explicit index model 1403144601
1276433539	https://github.com/pydata/xarray/pull/6795#issuecomment-1276433539	https://api.github.com/repos/pydata/xarray/issues/6795	IC_kwDOAMm_X85MFNSD	benbovy 4160723	2022-10-12T16:19:34Z	2022-10-12T16:19:34Z	MEMBER	Looks good to me @keewis. Thanks for your work on the indexes repr! Yes I think we can skip displaying default indexes for now... The question is which indexes are considered as default, i.e., all `PandasIndex` and `PandasMultiIndex` instances (like in this PR) or just the single pandas indexes automatically created for the dimension coordinates? We can decide this later, though, it's not a problem adding more indexes in the text repr later (we'll probably need it when dropping the multi-index dimension coordinate with tuple elements). For the html repr it's easier: we could display all indexes and collapse the section by default. but I thought "dimension coordinates" (and in particular their indexes) are still used for alignment? Yes that's a good point. Let's keep "dimensions without coordinates".	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	display the indexes in the string reprs 1306887842
1272966573	https://github.com/pydata/xarray/issues/7139#issuecomment-1272966573	https://api.github.com/repos/pydata/xarray/issues/7139	IC_kwDOAMm_X85L3-2t	benbovy 4160723	2022-10-10T08:35:22Z	2022-10-10T08:35:22Z	MEMBER	Looks like the backend logic needs some updates to make it compatible with the new xarray data model with explicit indexes (i.e., possible indexed coordinates with name != dimension like for multi-index levels now), e.g., here: https://github.com/pydata/xarray/blob/8eea8bb67bad0b5ac367c082125dd2b2519d4f52/xarray/backends/api.py#L234-L241	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	xarray.open_dataset has issues if the dataset returned by the backend contains a multiindex 1400949778
1272944063	https://github.com/pydata/xarray/issues/7148#issuecomment-1272944063	https://api.github.com/repos/pydata/xarray/issues/7148	IC_kwDOAMm_X85L35W_	benbovy 4160723	2022-10-10T08:16:37Z	2022-10-10T08:16:37Z	MEMBER	Looks like passing a `pandas.MultiIndex` object as `dim` argument to `concat` was forgotten during the explicit indexes refactor. While this can be fixed (could be tricky), we should deprecate it: it is convenient but probably too neat now that multi-index levels have their own, "real" coordinates (see https://github.com/pydata/xarray/issues/6293#issuecomment-1259228475). It should be preferred to explicitly chain `concat` with `assign_coords` (and `set_index`) like the last line in your example.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Concatenate using Multiindex cannot be unstacked anymore 1402168223
1271555410	https://github.com/pydata/xarray/issues/7139#issuecomment-1271555410	https://api.github.com/repos/pydata/xarray/issues/7139	IC_kwDOAMm_X85LymVS	benbovy 4160723	2022-10-07T12:55:17Z	2022-10-07T12:55:17Z	MEMBER	Hi @lukasbindreiter, could you add the whole error traceback please?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	xarray.open_dataset has issues if the dataset returned by the backend contains a multiindex 1400949778
1271519573	https://github.com/pydata/xarray/pull/7105#issuecomment-1271519573	https://api.github.com/repos/pydata/xarray/issues/7105	IC_kwDOAMm_X85LydlV	benbovy 4160723	2022-10-07T12:20:49Z	2022-10-07T12:20:49Z	MEMBER	Tests should be ok now, although this is not a super clean workaround. IndexVariable still needs some more refactoring anyway.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Fix to_index(): return multiindex level as single index 1390999159
1267580535	https://github.com/pydata/xarray/issues/7121#issuecomment-1267580535	https://api.github.com/repos/pydata/xarray/issues/7121	IC_kwDOAMm_X85Ljb53	benbovy 4160723	2022-10-04T21:08:20Z	2022-10-04T21:08:20Z	MEMBER	Hi @veenstrajelmer, In principle with the recent explicit indexes refactor there is no need anymore to have this restriction. Although we still need to relax this constraint (see #6293 point 2), hopefully this shouldn't be hard work now.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Add rename_variables argument to xr.open_dataset() to workaround vars with same names as dims 1395962467
1266073388	https://github.com/pydata/xarray/issues/7108#issuecomment-1266073388	https://api.github.com/repos/pydata/xarray/issues/7108	IC_kwDOAMm_X85Ldr8s	benbovy 4160723	2022-10-03T21:28:43Z	2022-10-03T21:28:43Z	MEMBER	I suppose re-projecting it on a 0-360 would be the only way around this specific issue. A custom Xarray index would help, e.g., `PeriodicBoundaryIndex` (#7031) or a `GeographicIndex` leveraging libraries like S2Geometry or H3.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	.sel return errors when using floats for no apparent reason 1391699976
1266068474	https://github.com/pydata/xarray/pull/7105#issuecomment-1266068474	https://api.github.com/repos/pydata/xarray/issues/7105	IC_kwDOAMm_X85Ldqv6	benbovy 4160723	2022-10-03T21:22:42Z	2022-10-03T21:22:42Z	MEMBER	Yes I agree it would be nice if we can roll back this breaking change. However, it really conflicts with `.xindexes` that returns the same index instance for each of its corresponding coordinates. This roll back seems to mostly break things where we need to be smart while handling multi-index coordinates passed to DataArray / Dataset constructors. This might be tricky to solve. It would probably be easier to do it after #6392.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Fix to_index(): return multiindex level as single index 1390999159
1265252754	https://github.com/pydata/xarray/issues/2028#issuecomment-1265252754	https://api.github.com/repos/pydata/xarray/issues/2028	IC_kwDOAMm_X85LajmS	benbovy 4160723	2022-10-03T10:38:57Z	2022-10-03T16:45:35Z	MEMBER	With the last release v2022.09.0, this is now possible via `.set_xindex()`: ```python a = a.set_xindex("currency") a.sel(currency="EUR") <xarray.DataArray (country: 2)> array([20, 30]) Coordinates: * country (country) <U7 'Germany' 'France' * currency (country) <U3 'EUR' 'EUR' ``` Closed in #6971 (although `set_xindex` still needs to be documented in the User Guide).	{ "total_count": 9, "+1": 0, "-1": 0, "laugh": 0, "hooray": 5, "confused": 0, "heart": 3, "rocket": 1, "eyes": 0 }	slice using non-index coordinates 309691307
1265012286	https://github.com/pydata/xarray/issues/7108#issuecomment-1265012286	https://api.github.com/repos/pydata/xarray/issues/7108	IC_kwDOAMm_X85LZo4-	benbovy 4160723	2022-10-03T06:57:17Z	2022-10-03T06:57:17Z	MEMBER	TBH, I had to do some research before figuring out what was going on :).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	.sel return errors when using floats for no apparent reason 1391699976
1263548977	https://github.com/pydata/xarray/issues/7108#issuecomment-1263548977	https://api.github.com/repos/pydata/xarray/issues/7108	IC_kwDOAMm_X85LUDox	benbovy 4160723	2022-09-30T13:03:26Z	2022-09-30T13:03:26Z	MEMBER	It looks like the error is because of the non-monotonic coordinate labels for the "lon" coordinate in `nc_bug` rather than a float precision issue. The "lon" coordinate seems monotonic for `nc_ok` so it works. When a `slice` is given as indexer, Xarray internally calls `pandas.Index.slice_indexer()`, which requires that the index must be ordered and unique (docs). Unfortunately, Pandas does not mention it while it raises a KeyError. Should we first check the index in Xarray and raise a nicer error message if it is not unique / ordered?	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	.sel return errors when using floats for no apparent reason 1391699976
1262007838	https://github.com/pydata/xarray/issues/7075#issuecomment-1262007838	https://api.github.com/repos/pydata/xarray/issues/7075	IC_kwDOAMm_X85LOLYe	benbovy 4160723	2022-09-29T09:20:59Z	2022-09-29T09:20:59Z	MEMBER	What happens if you create `Dataset` objects fully in memory instead of loading data from files? Is there a significant slowdown when you increase the size of the Dataset dimensions? Could you measure the time it takes at a more fine-grained level? I.e., loading files vs. extracting a slice vs. converting to dataframe. This would help better identify the possible source of slowdown.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Convert xarray dataset to pandas dataframe is much slower in newest xarray version 1384226112
1261998233	https://github.com/pydata/xarray/issues/7104#issuecomment-1261998233	https://api.github.com/repos/pydata/xarray/issues/7104	IC_kwDOAMm_X85LOJCZ	benbovy 4160723	2022-09-29T09:12:54Z	2022-09-29T09:12:54Z	MEMBER	Maybe we should check `pandas.MultiIndex.is_unique` in `Dataset.unstack()` Better to check this in `PandasMultiIndex.unstack()` actually.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Duplicate values on unstack 1390228572
1261996160	https://github.com/pydata/xarray/issues/7104#issuecomment-1261996160	https://api.github.com/repos/pydata/xarray/issues/7104	IC_kwDOAMm_X85LOIiA	benbovy 4160723	2022-09-29T09:11:05Z	2022-09-29T09:11:05Z	MEMBER	Thanks for the report @znichollscr. Maybe we should check `pandas.MultiIndex.is_unique` in `Dataset.unstack()` like in `Dataset.from_dataframe()`? ```python df = ds.drop_vars("lat").to_dataframe() xr.Dataset.from_dataframe(df) ValueError: cannot convert a DataFrame with a non-unique MultiIndex into xarray ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Duplicate values on unstack 1390228572
1261356747	https://github.com/pydata/xarray/issues/7069#issuecomment-1261356747	https://api.github.com/repos/pydata/xarray/issues/7069	IC_kwDOAMm_X85LLsbL	benbovy 4160723	2022-09-28T19:12:50Z	2022-09-28T19:12:50Z	MEMBER	I think we can go ahead with the release. The remaining regressions seem to affect only a limited number of use cases; they could wait for the following release if we are not waiting too long between the two. I'd also wait for an announcement about indexes. It has been already announced at the previous release, and it'd probably be better to communicate about it (maybe via a blog post?) after improving the docs and experimenting a bit more with custom indexes...	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	release? 1382753751
1261049239	https://github.com/pydata/xarray/issues/7097#issuecomment-1261049239	https://api.github.com/repos/pydata/xarray/issues/7097	IC_kwDOAMm_X85LKhWX	benbovy 4160723	2022-09-28T15:03:36Z	2022-09-28T15:03:36Z	MEMBER	Hi @znichollscr, thanks for the report. Indeed it looks like `_coord_names` are not updated properly.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Broken state when using assign_coords with multiindex 1389148779
1261015002	https://github.com/pydata/xarray/issues/7099#issuecomment-1261015002	https://api.github.com/repos/pydata/xarray/issues/7099	IC_kwDOAMm_X85LKY_a	benbovy 4160723	2022-09-28T14:39:10Z	2022-09-28T14:39:10Z	MEMBER	Or use `Indexer` objects to group labels + options? This is slightly different than what you suggest: ```python class Dataset: def sel( self, indexers: Mapping[Any, Any] | Indexer | Iterable[Indexer], **indexers_kwargs: Any, ): ... class Indexer: def __init__(self, labels=None, options=None, **label_kwargs): ... ``` Let's assume a Dataset with `lat` / `lon` coordinates both sharing the same geographic index + another `time` dimension coordinate, then we could write: ```python indexers = [ Indexer(lon=[2, 15], lat=[45, 48], options={"foo": "bar"}), Indexer(time="2022-01-01"), ] ds.sel(indexers) ``` This could also be used to avoid code duplication when using common selection options for different indexes.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass arbitrary options to sel() 1389295853
1260892017	https://github.com/pydata/xarray/issues/7099#issuecomment-1260892017	https://api.github.com/repos/pydata/xarray/issues/7099	IC_kwDOAMm_X85LJ69x	benbovy 4160723	2022-09-28T13:11:01Z	2022-09-28T13:11:01Z	MEMBER	Or we could simply decide that `.sel()` should not accept arbitrary options and handle special cases, e.g., via accessors. It would actually make sense to have something like `.my_accessor.sel_k_neighbors()`. Not so great to have a separate method just for an optimization option, though.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass arbitrary options to sel() 1389295853
1260618693	https://github.com/pydata/xarray/issues/6392#issuecomment-1260618693	https://api.github.com/repos/pydata/xarray/issues/6392	IC_kwDOAMm_X85LI4PF	benbovy 4160723	2022-09-28T09:13:00Z	2022-09-28T12:52:01Z	MEMBER	How would we handle creating xarray objects from pandas objects where they have a multiindex? For `pandas.Series` / `pandas.DataFrame` objects, `DataArray.from_series()` / `Dataset.from_dataframe()` already expand multi-index levels as dimensions. For a `pandas.MultiIndex`, we could do like below but it is a bit tedious: ```python import pandas as pd import xarray as xr from xarray.indexes import PandasMultiIndex pd_idx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("foo", "bar")) idx = PandasMultiIndex(pd_idx, "x") indexes = {"x": idx, "foo": idx, "bar": idx} coords = idx.create_variables() ds = xr.Dataset(coords=coords, indexes=indexes) ``` For more convenience, we could add a class method to `PandasMultiIndex`, e.g., ```python # this calls PandasMultiIndex.__init__() and PandasMultiIndex.create_variables() internally indexes, coords = PandasMultiIndex.from_pandas_index(pd_idx, "x") ds = xr.Dataset(coords=coords, indexes=indexes) ``` Instead of `indexes, coords` raw dictionaries, we could return an instance of the Indexes class (also returned by `Dataset.xindexes`), which encapsulates the coordinate variables: ```python xmidx = PandasMultiIndex.from_pandas_index(pd_idx, "x") ds = xr.Dataset(coords=xmidx.variables, indexes=xmidx) ``` For even more convenience, I think it might be reasonable to support special handling of `Indexes` instances given in Dataset / DataArray constructors and in `.update()`, i.e., ```python # both cases below will implicitly add the coordinates found in `xmidx` (if there's no conflict with other coordinates) ds = xr.Dataset(indexes=xmidx) ds2 = xr.Dataset() ds2.update(xmidx) ``` The same approach could be used for `pandas.IntervalIndex` (as discussed in #4579).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes to the Dataset and DataArray constructors 1175329407
1260859023	https://github.com/pydata/xarray/issues/7099#issuecomment-1260859023	https://api.github.com/repos/pydata/xarray/issues/7099	IC_kwDOAMm_X85LJy6P	benbovy 4160723	2022-09-28T12:50:25Z	2022-09-28T12:50:25Z	MEMBER	Another difficulty regarding multi-coordinate indexes: ideally options should be set per index, not per coordinate.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass arbitrary options to sel() 1389295853
1260806288	https://github.com/pydata/xarray/issues/4090#issuecomment-1260806288	https://api.github.com/repos/pydata/xarray/issues/4090	IC_kwDOAMm_X85LJmCQ	benbovy 4160723	2022-09-28T12:06:03Z	2022-09-28T12:06:03Z	MEMBER	@JimmyGao0204 this is not supported by Xarray itself but the xoak package has been developed for that purpose. I'm going to close this issue as Xarray now provides everything needed for selecting data using 2D lat/lon coordinates (i.e., advanced indexing, flexible indexes), and it is likely that this specific case will be further maintained in a 3rd party library like `xoak`. Feel free to comment / re-open if you think this should be built into Xarray.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Error with indexing 2D lat/lon coordinates 623804131
1260794423	https://github.com/pydata/xarray/issues/475#issuecomment-1260794423	https://api.github.com/repos/pydata/xarray/issues/475	IC_kwDOAMm_X85LJjI3	benbovy 4160723	2022-09-28T11:55:04Z	2022-09-28T11:55:04Z	MEMBER	There hasn't been much activity here for quite some time. Meanwhile, there has been the development of the xoak package that supports point-wise indexing of Xarray objects with various indexes (either generic like `scipy.spatial.cKDTree` or more specific like pys2index's `S2PointIndex` for lat/lon point data). `xoak` leverages Xarray's advanced indexing capabilities and supports selection using both coordinates and indexers with an arbitrary number of dimensions. With the forthcoming Xarray release, it will be possible to create and assign custom indexes to DataArray / Dataset objects. The plan for `xoak` is then to just provide some custom indexes so that we can perform point-wise selection directly with `Dataset.sel()` instead of `Dataset.xoak.sel()`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API design for pointwise indexing 95114700
1260551056	https://github.com/pydata/xarray/issues/6573#issuecomment-1260551056	https://api.github.com/repos/pydata/xarray/issues/6573	IC_kwDOAMm_X85LInuQ	benbovy 4160723	2022-09-28T08:17:09Z	2022-09-28T08:17:09Z	MEMBER	I also like the idea of alignment with some tolerance. There is an open PR #4489, which needs to be reworked in the context of the explicit index refactor. Alternatively to a new kwarg we could add an index build option, e.g., `ds.set_xindex("x", index_cls=PandasIndex, align_tolerance=1e-6)`, but then it is not obvious how to handle different tolerance values given for the indexes to compare. Maybe this could depend on the given `join` method? E.g., pick the smallest tolerance for join=inner, the largest for join=outer, the tolerance of the left index for join=left, etc.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	32- vs 64-bit coordinates coordinates in where() 1226272301
1260497579	https://github.com/pydata/xarray/issues/5874#issuecomment-1260497579	https://api.github.com/repos/pydata/xarray/issues/5874	IC_kwDOAMm_X85LIaqr	benbovy 4160723	2022-09-28T07:26:55Z	2022-09-28T07:26:55Z	MEMBER	Closed in #6971.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Need a way to speciefy the names of coordinates from the indices which droped by DataArray.reset_index. 1029088776
1259615513	https://github.com/pydata/xarray/issues/4579#issuecomment-1259615513	https://api.github.com/repos/pydata/xarray/issues/4579	IC_kwDOAMm_X85LFDUZ	benbovy 4160723	2022-09-27T14:45:19Z	2022-09-27T14:46:41Z	MEMBER	Perhaps Xarray has been too clever so far regarding how it handles pandas objects passed directly as coordinate data? `pandas.MultiIndex` objects are handled in a specific way too, which is often hard to deal with. Expanding on @max-sixty's suggestion, we could: treat all coordinate data as duck arrays, i.e., in the example above handle `da1` just like `da2` (no more special cases for pandas objects) provide an `xarray.indexes.PandasIntervalIndex` wrapper, which would inherit from `xarray.indexes.PandasIndex` with a few additional options and features, e.g., like the ones @dcherian suggests in https://github.com/pydata/xarray/discussions/6783#discussioncomment-3149033 build an interval index from an existing coordinate using, e.g., `da.set_xindex("x", PandasIntervalIndex, closed="right")` figure out how to assign both a coordinate and an index from an existing `pandas.IntervalIndex` object in a convenient but more explicit way	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Invisible differences between arrays using IntervalIndex 741806260
1259441952	https://github.com/pydata/xarray/issues/5646#issuecomment-1259441952	https://api.github.com/repos/pydata/xarray/issues/5646	IC_kwDOAMm_X85LEY8g	benbovy 4160723	2022-09-27T12:34:20Z	2022-09-27T12:34:20Z	MEMBER	This is fixed in v2022.6.0 ```python xr.testing.assert_allclose(b, c) AssertionError: Left and right DataArray objects are not close Coordinates only on the left object: * x (z) int64 0 * y (z) int64 0 Coordinates only on the right object: * not-y (z) int64 0 * not-x (z) int64 0 print(b == c, "\n") ValueError: cannot re-index or align objects with conflicting indexes found for the following coordinates: 'z' (2 conflicting indexes) Conflicting indexes may occur when - they relate to different sets of coordinate and/or dimension names - they don't have the same type - they may be used to reindex data along common dimensions ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Level names in multi-level index are ignored 955617411
1259415933	https://github.com/pydata/xarray/issues/2280#issuecomment-1259415933	https://api.github.com/repos/pydata/xarray/issues/2280	IC_kwDOAMm_X85LESl9	benbovy 4160723	2022-09-27T12:12:05Z	2022-09-27T12:12:05Z	MEMBER	This is fixed in v2022.6.0. Xarray's `PandasMultiIndex` wrapper keeps track of the level coordinate dtypes.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	string coords are converted to object dtype when using MultiIndex / stacking 340316108
1259415318	https://github.com/pydata/xarray/issues/907#issuecomment-1259415318	https://api.github.com/repos/pydata/xarray/issues/907	IC_kwDOAMm_X85LEScW	benbovy 4160723	2022-09-27T12:11:35Z	2022-09-27T12:11:35Z	MEMBER	This is fixed in v2022.6.0. Xarray's `PandasMultiIndex` wrapper keeps track of the level coordinate dtypes.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	unstack() treats string coords as objects 166441031
1259349072	https://github.com/pydata/xarray/pull/6971#issuecomment-1259349072	https://api.github.com/repos/pydata/xarray/issues/6971	IC_kwDOAMm_X85LECRQ	benbovy 4160723	2022-09-27T11:14:07Z	2022-09-27T11:14:07Z	MEMBER	In the last commit I added the `xarray.indexes` namespace from which we can import `Index`, `PandasIndex` and `PandasMultiIndex`. Thanks everyone for the feedback and review! I think this is ready to merge, if we agree to address the `coord_names` typing issue in another PR?	{ "total_count": 4, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Add set_xindex and drop_indexes methods 1357296406
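
The join-dependent tolerance rule floated in the comment on issue #6573 above (smallest tolerance for an inner join, largest for an outer join, the left index's tolerance for a left join) could be sketched roughly as follows. This is only an illustration: `pick_align_tolerance` is a hypothetical helper, not part of Xarray, and the `join="right"` branch is an assumption filling in the comment's "etc.".

```python
def pick_align_tolerance(left_tol: float, right_tol: float, join: str) -> float:
    """Choose one tolerance when two indexes carry different values.

    Hypothetical helper: the name, signature and the "right" branch are
    assumptions made for illustration only.
    """
    if join == "inner":
        # strictest option: keep only labels that match under both tolerances
        return min(left_tol, right_tol)
    if join == "outer":
        # most permissive option
        return max(left_tol, right_tol)
    if join == "left":
        return left_tol
    if join == "right":
        # assumed by symmetry with join="left"
        return right_tol
    raise ValueError(f"unsupported join method: {join!r}")


print(pick_align_tolerance(1e-6, 1e-3, "inner"))
print(pick_align_tolerance(1e-6, 1e-3, "outer"))
```

Other join methods such as `"exact"` or `"override"` are deliberately left unsupported here, since the comment does not say how a tolerance should apply to them.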


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
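
For readers who want to reproduce the listing locally, the view shown here ("rows where user = 4160723 sorted by updated_at descending") corresponds to a simple filtered, ordered query against this schema. Below is a minimal sketch using Python's standard `sqlite3` module with an in-memory database. The three `benbovy` rows reuse ids and timestamps from the table above with their bodies shortened; the fourth row (id `1`, user `42`) is made up purely to show the filter, and `comments_by_user` is a hypothetical helper name.

```python
import sqlite3

SCHEMA = """
CREATE TABLE [issue_comments] (
   [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY,
   [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT,
   [body] TEXT, [reactions] TEXT, [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
"""


def comments_by_user(conn, user_id):
    """Return (id, updated_at) pairs for one user, most recently updated first."""
    query = (
        "SELECT [id], [updated_at] FROM [issue_comments] "
        "WHERE [user] = ? ORDER BY [updated_at] DESC"
    )
    return conn.execute(query, (user_id,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO [issue_comments] ([id], [user], [created_at], [updated_at], [body], [issue]) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1282024919, 4160723, "2022-10-18T08:41:08Z", "2022-10-18T08:41:08Z", "...", 1409811164),
        (1282016895, 4160723, "2022-10-18T08:35:29Z", "2022-10-18T08:49:47Z", "...", 1409811164),
        (1283103957, 4160723, "2022-10-18T22:57:16Z", "2022-10-18T22:57:16Z", "...", 1413425793),
        # made-up row from a different (hypothetical) user; the filter drops it
        (1, 42, "2022-10-19T00:00:00Z", "2022-10-19T00:00:00Z", "...", 1413425793),
    ],
)
rows = comments_by_user(conn, 4160723)
for row in rows:
    print(row)
```

Because the timestamps are ISO 8601 strings in UTC, plain text ordering matches chronological ordering. It also explains why comment 1282016895 is listed before 1282024919 above: it was created earlier but edited later.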

