github: issue_comments: 16 rows where issue = 1485037066 sorted by updated

16 rows where issue = 1485037066 sorted by updated_at descending

Search:

descending

id	html_url	issue_url	node_id	user	created_at	updated_at ▲	author_association	body	reactions	issue
1464927490	https://github.com/pydata/xarray/pull/7368#issuecomment-1464927490	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85XUQUC	headtr1ck 43316012	2023-03-11T14:50:02Z	2023-03-11T14:50:24Z	COLLABORATOR	DataAlignable = TypeVar("DataAlignable", bound=DataWithCoords \| Coordinates) -> doesn't work since we cannot mix DataWithCoords and Coordinates when aligning each object (input type = output type) I think thats exactly correct. The whold idea of using a TypeVar for the inputs means that all inputs must have the same type. Consider the following example: ```python class A: ... class B(A): ... class C: ... T = TypeVar("T", bound=Union[A, C]) def f(*x: T) -> T: ... f(A(), A()) # OK f(A(), B()) # OK f(C(), C()) # OK f(A(), C()) # not ok, because the common type of A and C is object ``` Also: sorry for the late reply.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1382070832	https://github.com/pydata/xarray/pull/7368#issuecomment-1382070832	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85SYLow	benbovy 4160723	2023-01-13T16:13:16Z	2023-01-13T16:13:16Z	MEMBER	Thanks for the review @shoyer. I addressed your comments. Everything seems OK except a rather annoying mypy error that I'm struggling with: The `DataAlignable` type variable should now encompass both `DataWithCoords` and `Coordinates`, since in this PR we add alignment support for the latter. I somewhat naively tried the options below without success: `DataAlignable = TypeVar("DataAlignable", bound=DataWithCoords \| Coordinates)` -> doesn't work since we cannot mix DataWithCoords and Coordinates when aligning each object (input type = output type) `DataAlignable = TypeVar("DataAlignable", bound=DataWithCoords, Coordinates)` -> doesn't work with subclasses `DataAlignable = TypeVar("DataAlignable", Dataset, DataArray, Coordinates)` -> doesn't work with generic types `T_Dataset`, etc.? I even tried using a Protocol @headtr1ck @Illviljan any idea?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1359003371	https://github.com/pydata/xarray/pull/7368#issuecomment-1359003371	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85RAL7r	benbovy 4160723	2022-12-20T08:34:06Z	2022-12-20T08:34:06Z	MEMBER	I'm wondering if instead of `Coordinates.from_pandas_multiindex()` we might want to provide a more generic constructor available as an extension point? For example: `Coordinates.from_index(index_obj: Any, , factory=None, *kwargs=None)` `factory` could be guessed from the type of `index_obj`. Xarray would support by default the `pandas.MultiIndex` and `pandas.Index` types. Like for IO backends, we could provide a `CoordinatesFactoryEntrypoint` so that it could support other index types. One downside is that specific (mandatory?) options like `dim` for a pandas (multi-)index are not directly visible. Would it be useful or is it overkill?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1352874809	https://github.com/pydata/xarray/pull/7368#issuecomment-1352874809	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85Qozs5	benbovy 4160723	2022-12-15T10:42:59Z	2022-12-15T10:42:59Z	MEMBER	OK this is now ready for review (cc @shoyer).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1352818155	https://github.com/pydata/xarray/pull/7368#issuecomment-1352818155	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85Qol3r	benbovy 4160723	2022-12-15T09:59:03Z	2022-12-15T09:59:03Z	MEMBER	Maybe there's some way to optimize that? I don't know if we can completely avoid it with the solution implemented in this PR, though. Promoting Coordinates is pretty clean and future proof IMO (assuming that we'll further refactor Coordinates to actually store variables and indexes, i.e., not as a proxy anymore). Is the (minor? temporary?) regression in performance acceptable and can we just leave it like that for now? Fixed in 193dad3 (with some reasonable special case added in `merge_core`).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1352310432	https://github.com/pydata/xarray/pull/7368#issuecomment-1352310432	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85Qmp6g	benbovy 4160723	2022-12-14T22:33:23Z	2022-12-15T01:08:41Z	MEMBER	I did some profiling to find the cause of the decrease in performance reported in the benchmarks (dataset creation). In summary, this is explained by a `Coordinates` object (built from the `coords` mapping) that is now included in objects to align when merging data vars and coordinates. Previously all non DataArray objects in the `coords` mapping were excluded from alignment (in `deep_align`). The introduced overhead comes from a call to `Coordinates._reindex_callback()`, which (I think?) should do no more than shallow copies and/or xarray wrapping stuff. In the benchmark report this is only marked as significant when creating small datasets (1.5-2x slower), and it becomes insignificant for datasets with more data variables. Maybe there's some way to optimize that? I don't know if we can completely avoid it with the solution implemented in this PR, though. Promoting `Coordinates` is pretty clean and future proof IMO (assuming that we'll further refactor `Coordinates` to actually store variables and indexes, i.e., not as a proxy anymore). Is the (minor? temporary?) regression in performance acceptable and can we just leave it like that for now? More details about the new workflow implemented in this PR when creating a new Dataset: if Dataset's `coords` argument is a "simple" mapping, it is first internally converted into a `Coordinates` object, with the creation of default indexes for dimension coordinates if one or more DataArray objects are given in `coords`, their coordinates (variables + indexes) are extracted and merged with the other input coordinates see the implementation in `xarray.core.coordinates.create_coords_with_default_indexes` otherwise, just reuse the `Coordinates` object passed as `coords` coordinates are then merged with data variables the `Coordinates` object is aligned with every other "alignable" object found in `data_vars` coordinate indexes (if any) are passed explicitly to `align` so they are used in priority explicitly using a `Coordinates` object skips the creation of default indexes during merging (in `collect_variables_and_indexes()`)	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1349321538	https://github.com/pydata/xarray/pull/7368#issuecomment-1349321538	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QbQNC	benbovy 4160723	2022-12-13T18:03:17Z	2022-12-13T18:03:17Z	MEMBER	I think this is ready for review!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1347327518	https://github.com/pydata/xarray/pull/7368#issuecomment-1347327518	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QTpYe	benbovy 4160723	2022-12-12T21:05:56Z	2022-12-12T21:05:56Z	MEMBER	In order to skip creating default indexes when passing a `Coordinates` object, I first tried a small refactor but in the end I found that the cleanest way to do it was to support alignment for `Coordinates`. I think it makes sense now that Coordinates is part of Xarray's public API as a "stand-alone" container like Dataset and DataArray. The "no default index with Coordinates" behavior should be consistent Xarray-wise, i.e., for DataArray / Dataset constructors and also `assign_coords`, `update`, etc. Sorry this PR is getting big, but hopefully this is almost ready (still a few tests to fix or to add).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1346344694	https://github.com/pydata/xarray/pull/7368#issuecomment-1346344694	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QP5b2	benbovy 4160723	2022-12-12T11:55:10Z	2022-12-12T11:55:10Z	MEMBER	My suggestion would be: coords passed as a dict: create default indexes coords passed as IndexedCoordinates: do not create defaults So if we already have some coordinate data as a dict but don't want any default index, we would need to do this: `python ds = xr.Dataset(coords=xr.Coordinates(my_coord_dict))` instead of this: `python ds = xr.Dataset(coords=my_coord_dict)`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1346091151	https://github.com/pydata/xarray/pull/7368#issuecomment-1346091151	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QO7iP	benbovy 4160723	2022-12-12T08:36:09Z	2022-12-12T08:36:09Z	MEMBER	Thanks @shoyer, I've been thinking about similar short/long term plans although so far I haven't figured out how to implement your point 3. I'll give it another try.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1345646743	https://github.com/pydata/xarray/pull/7368#issuecomment-1345646743	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QNPCX	shoyer 1217238	2022-12-11T20:17:15Z	2022-12-11T20:17:15Z	MEMBER	I'm actually trying to merge `IndexedCoordinates` with `Coordinates` but I'm stuck: the latter is abstract and I don't really see how I could refactor it together with `DatasetCoordinates` and `DataArrayCoordinates` `Coordinates` is abstract because in the (current) Xarray data model, it doesn't actually store any data -- coordinates are stored in private attributes of the original `Dataset` (`._variables` and `._coord_names`) or `DataArray` (`._coords`). So `Coordinates` needs to serve as a proxy for the data. In the long term, I think we should refactor `Dataset`/`DataArray` to actually store data (coordinate variables, indexes and dimension sizes) on `Coordinates`, but that's a bigger refactor. For now, it's worth noting that the current `Coordinates` class isn't actually exposed in Xarray's public API, just the `DatasetCoordinates` and `DataArrayCoordinates` classes (and not even their constructors). So the intermdiate step I would try is: 1. Rename the current `Coordinates` baseclass to `AbstractCoordinates`. 2. Rename your `IndexedCoordinates` class to `Coordinates`. Expose it in the public API. Make sure it can handle `DatasetCoordinates` and `DataArrayCoordinates` in the constructor. 3. Maybe: use some Python magic to make DatasetCorodinates/DataArrayCoordinates subclasses of the new Coordinates. Or maybe make them actual subclassses, overriding many of the methods (including the constructor).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1345314909	https://github.com/pydata/xarray/pull/7368#issuecomment-1345314909	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QL-Bd	benbovy 4160723	2022-12-10T16:59:44Z	2022-12-10T16:59:44Z	MEMBER	Long term, do you think it would make sense to merge together Indexes, Coordinates and IndexedCoordinates? They are sort of all containers for the same thing. Yes I think so. I'm actually trying to merge `IndexedCoordinates` with `Coordinates` but I'm stuck: the latter is abstract and I don't really see how I could refactor it together with `DatasetCoordinates` and `DataArrayCoordinates`. Do you have any idea on how best to proceed? Ideally, I'd see `Coordinates` be exposed in Xarray's main namespace with at least the two following constructors: ```python class Coordinates: `def __init__( self, coords: Mapping[Any, Any] \| None = None, indexes: Mapping[Any, Index] \| None = None, ): # Similar to Dataset.__init__ but without the need # to merge coords and data vars... # Probably ok to allow more flexibility / less safety here? ... @classmethod from_pandas_multiindex(cls, index: pd.MultiIndex, dim: str): ...` ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1344968954	https://github.com/pydata/xarray/pull/7368#issuecomment-1344968954	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QKpj6	shoyer 1217238	2022-12-10T01:37:35Z	2022-12-10T01:37:35Z	MEMBER	Long term, do you think it would make sense to merge together Indexes, Coordinates and IndexedCoordinates? They are sort of all containers for the same thing.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1344944917	https://github.com/pydata/xarray/pull/7368#issuecomment-1344944917	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QKjsV	shoyer 1217238	2022-12-10T00:31:46Z	2022-12-10T00:31:46Z	MEMBER	what do you think about the approach proposed here? I'd like to check that with you before going further with tests, docs, etc. Generally this looks great to me! How to avoid building any default index? It seems silly to add or use the `indexes` argument just for that purpose? We could address that later. My suggestion would be: `coords` passed as a `dict`: create default indexes `coords` passed as `IndexedCoordinates`: do not create defaults Alternatively to an IndexedCoordinates subclass I was wondering if we could reuse the Coordinates base class? Yes, this makes more sense to me! What if the Indexes class was a facade based on IndexedCoordinates instead of the other way around? Yes, I also agree! This makes more sense.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1344046801	https://github.com/pydata/xarray/pull/7368#issuecomment-1344046801	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QHIbR	benbovy 4160723	2022-12-09T09:13:24Z	2022-12-09T09:16:35Z	MEMBER	I added `IndexedCoordinates.merge_coords` so that it is easier to combine different coordinates to pass to a new Dataset / DataArray, e.g., ```python midx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("one", "two")) coords = xr.IndexedCoordinates.from_pandas_multiindex(midx, "x") coords = coords.merge_coords({"y": [0, 1, 2]}) Coordinates: * x (x) object MultiIndex * one (x) object 'a' 'a' 'b' 'b' * two (x) int64 1 2 1 2 * y (y) int64 0 1 2 ds = xr.Dataset(coords=coords) <xarray.Dataset> Dimensions: (x: 4) Coordinates: * x (x) object MultiIndex * one (x) object 'a' 'a' 'b' 'b' * two (x) int64 1 2 1 2 * y (y) int64 0 1 2 Data variables: empty ``` `IndexedCoordinates.merge_coords` is very much like `Coordinates.merge` except that it returns a new Coordinates object instead of a Dataset. Or should we just use `merge`? It would require that: `Coordinates.merge` accepts `Mapping[Any, Any]` for its `other` argument. Only changing the type hint is enough here since the implementation already accepts any input passed to Dataset. When a Dataset is passed as `coords` argument to a new Dataset and DataArray, both variables and indexes should be extracted. It is already the case for Dataset but I think it only works for PandasIndex and PandasMultiIndex (default indexes & backwards compatibility).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066
1344004727	https://github.com/pydata/xarray/pull/7368#issuecomment-1344004727	https://api.github.com/repos/pydata/xarray/issues/7368	IC_kwDOAMm_X85QG-J3	benbovy 4160723	2022-12-09T08:32:28Z	2022-12-09T09:14:17Z	MEMBER	`IndexedCoordinates` and `Indexes` have a lot of overlap. At some point we might consider merging the two classes, like @shoyer suggests in https://github.com/pydata/xarray/pull/7214#issuecomment-1295283938. The main difference is that one is a mapping of coordinates and the other is a mapping of indexes. `IndexedCoordinates` is mostly reusing `Indexes` and `Dataset` under the hood, it is only a facade. Alternatively to an `IndexedCoordinates` subclass I was wondering if we could reuse the `Coordinates` base class? There's some benefit of providing a subclass: besides specific constructors like `.from_pandas_multiindex()` it has a generic `__init__` for advanced use cases. Not sure it is a good idea to add this constructor to the base class? unlike Coordinates, IndexedCoordinates is immutable. What if the `Indexes` class was a facade based on `IndexedCoordinates` instead of the other way around? It would probably make more sense but it would also be a bigger refactor. I've chosen the easy way :).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Expose "Coordinates" as part of Xarray's public API 1485037066

Advanced export

JSON shape: default, array, newline-delimited, object

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);

issue_comments

16 rows where issue = 1485037066 sorted by updated_at descending

Coordinates:

* x (x) object MultiIndex

* one (x) object 'a' 'a' 'b' 'b'

* two (x) int64 1 2 1 2

* y (y) int64 0 1 2

<xarray.Dataset>

Dimensions: (x: 4)

Coordinates:

* x (x) object MultiIndex

* one (x) object 'a' 'a' 'b' 'b'

* two (x) int64 1 2 1 2

* y (y) int64 0 1 2

Data variables:

empty

Advanced export