issue_comments
16 rows where author_association = "MEMBER" and issue = 1422543378 sorted by updated_at descending
This data as json, CSV (advanced)
Suggested facets: reactions, created_at (date), updated_at (date)
issue 1
- Pass indexes directly to the DataArray and Dataset constructors · 16 ✖
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1297046405 | https://github.com/pydata/xarray/pull/7214#issuecomment-1297046405 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NT1uF | benbovy 4160723 | 2022-10-31T12:54:50Z | 2022-10-31T12:54:50Z | MEMBER | Thanks for the suggestion @shoyer, in general I like it very much! "Coordinates possibly baked by one or more indexes" feels much more natural than "indexes and their corresponding coordinates". Even though indexes have been promoted as 1st class citizens in the data model, their right place should still be in the background compared to coordinates. So having a My main concern is about the timing, as such a broader refactor might postpone some work in progress on the public API and the documentation. Ideally this shouldn't discourage users to start experimenting with custom indexes and building an ecosystem around it, as soon as possible. There might be a fast path towards your suggestion, at least regarding the public facing API (your points 1 and 4):
This would let us the possibility to achieve a broader (mostly internal) refactor of Alternatively, we could just wait for that refactor to finish before implementing explicit assignment of coordinates and indexes. We already have |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1295283938 | https://github.com/pydata/xarray/pull/7214#issuecomment-1295283938 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NNHbi | shoyer 1217238 | 2022-10-28T17:49:10Z | 2022-10-28T17:49:10Z | MEMBER |
I agree -- we should support this for backwards compatibility (even if we deprecate it).
OK, this totally makes sense. I don't love that it is possible to express invalid states in Xarray's data model. This motivated the creation of I wonder if we should consider the broader refactor of merging the This would have a number of benefits:
|
{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1295217001 | https://github.com/pydata/xarray/pull/7214#issuecomment-1295217001 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NM3Fp | dcherian 2448579 | 2022-10-28T16:41:19Z | 2022-10-28T16:41:36Z | MEMBER |
True, but its neatly encapsulated in one place. And if we are going to support many Index classes that build on numpy, scipy, then support for pandas indexes seems entirely in-scope.
Should we expose |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1294783661 | https://github.com/pydata/xarray/pull/7214#issuecomment-1294783661 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NLNSt | benbovy 4160723 | 2022-10-28T09:49:02Z | 2022-10-28T09:49:02Z | MEMBER |
I'd just want to add that, from my experience with debugging multi-index issues, it is hard even for advanced users to see what's going wrong when coordinates and indexes are not consistent. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1294771427 | https://github.com/pydata/xarray/pull/7214#issuecomment-1294771427 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NLKTj | benbovy 4160723 | 2022-10-28T09:38:22Z | 2022-10-28T09:38:22Z | MEMBER |
An |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1293946521 | https://github.com/pydata/xarray/pull/7214#issuecomment-1293946521 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NIA6Z | benbovy 4160723 | 2022-10-27T19:04:19Z | 2022-10-27T19:52:21Z | MEMBER |
Agreed. However,
While generally I also prefer handling plain EDIT -- For more context: initially an |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1293909730 | https://github.com/pydata/xarray/pull/7214#issuecomment-1293909730 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NH37i | shoyer 1217238 | 2022-10-27T18:28:40Z | 2022-10-27T18:28:40Z | MEMBER |
I would lean against this, only because it's easier to explicitly manipulate indexes in the form of a Explicitly providing indexes is an advanced user feature. I think it's OK to require users to do a bit more work in this case and to not necessarily do consistency checks (beyond verifying that the coordinate variables exist). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1293902008 | https://github.com/pydata/xarray/pull/7214#issuecomment-1293902008 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NH2C4 | benbovy 4160723 | 2022-10-27T18:21:02Z | 2022-10-27T18:21:02Z | MEMBER |
Yes that would make sense. However, it would be adding another
Indexes are not merged together but the new / replaced coordinate variables must be compatible with the other variables of the dataset.
That is actually a good idea for https://github.com/pydata/xarray/pull/7214#issuecomment-1292089179! Not sure I would reuse |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1293774799 | https://github.com/pydata/xarray/pull/7214#issuecomment-1293774799 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NHW_P | dcherian 2448579 | 2022-10-27T16:21:53Z | 2022-10-27T16:21:53Z | MEMBER | How about I like the last spelling: |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1293531607 | https://github.com/pydata/xarray/pull/7214#issuecomment-1293531607 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NGbnX | benbovy 4160723 | 2022-10-27T13:31:24Z | 2022-10-27T13:42:44Z | MEMBER | I also added an ```python ds = xr.Dataset(coords={"x": [4, 5, 6, 7]}) ds2 = xr.Dataset(coords={"x": [1, 2, 3, 4]}) ds.assign_indexes(ds2.xindexes) <xarray.Dataset>Dimensions: (x: 4)Coordinates:* x (x) int64 1 2 3 4Data variables:emptymidx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("one", "two")) indexes = wrap_pandas_multiindex(midx, "x") ds.assign_indexes(indexes) <xarray.Dataset>Dimensions: (x: 4)Coordinates:* x (x) object MultiIndex* one (x) object 'a' 'a' 'b' 'b'* two (x) int64 1 2 1 2Data variables:empty``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1293545325 | https://github.com/pydata/xarray/pull/7214#issuecomment-1293545325 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NGe9t | benbovy 4160723 | 2022-10-27T13:41:50Z | 2022-10-27T13:41:50Z | MEMBER | @pydata/xarray I'd be very happy if you could share your thoughts about the examples shown in the last three comments. If you think the API looks good like that, then I will work on adding some tests and on the documentation. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1292089179 | https://github.com/pydata/xarray/pull/7214#issuecomment-1292089179 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NA7db | benbovy 4160723 | 2022-10-26T13:54:22Z | 2022-10-26T13:54:22Z | MEMBER | Passing multiple indexes: ```python midx1 = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("one", "two")) midx2 = pd.MultiIndex.from_product([["c", "d"], [3, 4]], names=("three", "four")) indexes1 = wrap_pandas_multiindex(midx1, "x") indexes2 = wrap_pandas_multiindex(midx2, "y") indexes = Indexes( indexes=dict(indexes1, indexes2), variables=dict(indexes1.variables, indexes2.variables) ) ds = xr.Dataset(indexes=indexes) <xarray.Dataset>Dimensions: (x: 4, y: 4)Coordinates:* x (x) object MultiIndex* one (x) object 'a' 'a' 'b' 'b'* two (x) int64 1 2 1 2* y (y) object MultiIndex* three (y) object 'c' 'c' 'd' 'd'* four (y) int64 3 4 3 4Data variables:empty``` That's not looking super nice, but probably we can add some convenience function or |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1291911349 | https://github.com/pydata/xarray/pull/7214#issuecomment-1291911349 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85NAQC1 | benbovy 4160723 | 2022-10-26T11:47:57Z | 2022-10-26T12:14:23Z | MEMBER | I implemented option 3. We can still change or revert it later if it's not the best one. A few examples: ```python import pandas as pd import xarray as xr from xarray.indexes import wrap_pandas_multiindex midx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("one", "two")) ``` It is now possible to pass a pandas multi-index to a Dataset like this: ```python this returns an
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1291638319 | https://github.com/pydata/xarray/pull/7214#issuecomment-1291638319 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85M_NYv | benbovy 4160723 | 2022-10-26T07:52:35Z | 2022-10-26T07:52:35Z | MEMBER |
Maybe with something else than |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1291059643 | https://github.com/pydata/xarray/pull/7214#issuecomment-1291059643 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85M9AG7 | benbovy 4160723 | 2022-10-25T19:50:57Z | 2022-10-25T19:50:57Z | MEMBER | Hmm I'm wondering what would be best between the options below regarding the types for the
Option 1 is nice for passing multiple indexes, e.g., ```python pd_midx1 = pd.MultiIndex.from_arrays(..., names=("one", "two")) pd_midx2 = pd.MultiIndex.from_arrays(..., , names=("three", "four")) indexes1 = PandasMultiIndex.from_pandas_index(pd_midx1, "x") indexes2 = PandasMultiIndex.from_pandas_index(pd_midx2, "y") ds = xr.Dataset(indexes=[indexes1, indexes2]) ``` With option 1 it feels odd passing an empty list in order to avoid creating default indexes: Option 3 actually works in all cases since I'm leaning towards option 3. For passing multiple indexes at once we could probably expand the What do people think? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 | |
1290896977 | https://github.com/pydata/xarray/pull/7214#issuecomment-1290896977 | https://api.github.com/repos/pydata/xarray/issues/7214 | IC_kwDOAMm_X85M8YZR | TomNicholas 35968931 | 2022-10-25T17:25:48Z | 2022-10-25T17:25:48Z | MEMBER | Big moves!! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Pass indexes directly to the DataArray and Dataset constructors 1422543378 |
Advanced export
JSON shape: default, array, newline-delimited, object
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [performed_via_github_app] TEXT, [issue] INTEGER REFERENCES [issues]([id]) ); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
user 4