
issue_comments


10 rows where author_association = "MEMBER" and issue = 231308952 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
454165639 https://github.com/pydata/xarray/pull/1426#issuecomment-454165639 https://api.github.com/repos/pydata/xarray/issues/1426 MDEyOklzc3VlQ29tbWVudDQ1NDE2NTYzOQ== fujiisoup 6815844 2019-01-14T21:20:27Z 2019-01-14T21:20:27Z MEMBER

I'll close this in favor of the recent discussion about MultiIndex.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  scalar_level in MultiIndex 231308952
310032997 https://github.com/pydata/xarray/pull/1426#issuecomment-310032997 https://api.github.com/repos/pydata/xarray/issues/1426 MDEyOklzc3VlQ29tbWVudDMxMDAzMjk5Nw== benbovy 4160723 2017-06-21T10:08:07Z 2017-06-21T10:58:29Z MEMBER

Although I haven't thought about all the details regarding this, I think that in the case of multi-dimensional coordinates a "super index" would rather allow directly using these coordinates for indexing, which is currently not possible.

In your 'rasm' example, it would rather look like

<xarray.Dataset>
Dimensions:        (time: 36, x: 275, y: 205)
Coordinates:
  * time           (time) float64 7.226e+05 7.226e+05 7.227e+05 7.227e+05 ...
  * spatial_index  (y, x) KDTree
    - xc           (y, x) float64 189.2 189.4 189.6 189.7 189.9 190.1 190.2 190.4 ...
    - yc           (y, x) float64 16.53 16.78 17.02 17.27 17.51 17.76 18.0 18.25 ...
Dimensions without coordinates: x, y
Data variables:
    Tair           (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan ...
Attributes:
    ...

and it would allow writing

In [1]: ds.sel(xc=<...>, yc=<...>, method='nearest')

Note that x and y dimensions still don't have coordinates.
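
A minimal sketch of the manual workflow such a KDTree-backed index could wrap (synthetic coordinates; the spatial_index / ds.sel(xc=..., yc=..., method='nearest') API above is only a proposal, and the sketch assumes scipy is available):

import numpy as np
import xarray as xr
from scipy.spatial import cKDTree

# Synthetic stand-in for the 'rasm' curvilinear grid: 2-D xc/yc over dims (y, x).
ny, nx = 4, 5
yy, xx = np.meshgrid(np.linspace(16.0, 19.0, ny), np.linspace(189.0, 191.0, nx), indexing="ij")
ds = xr.Dataset(
    {"Tair": (("y", "x"), np.random.rand(ny, nx))},
    coords={"xc": (("y", "x"), xx), "yc": (("y", "x"), yy)},
)

# Build a KDTree over the flattened (xc, yc) pairs and query the nearest grid cell.
tree = cKDTree(np.column_stack([ds["xc"].values.ravel(), ds["yc"].values.ravel()]))
_, flat = tree.query([190.0, 17.5])
iy, ix = np.unravel_index(flat, (ny, nx))
nearest = ds.isel(y=int(iy), x=int(ix))  # the lookup a 2-D "nearest" index would hide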

That's actually what @shoyer suggested here.

The proposal above is more about having the same API for groups of coordinates that can be indexed using a "wrapped" index object (maybe "wrapped index" is a better name than "super index"?), but the logic can be very different from one index object to another.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  scalar_level in MultiIndex 231308952
309268208 https://github.com/pydata/xarray/pull/1426#issuecomment-309268208 https://api.github.com/repos/pydata/xarray/issues/1426 MDEyOklzc3VlQ29tbWVudDMwOTI2ODIwOA== fujiisoup 6815844 2017-06-18T10:11:33Z 2017-06-18T10:11:33Z MEMBER

@benbovy Sorry for my late reply.

I think I like your proposal, which unifies several xarray concepts such as MultiIndex and multi-dimensional coordinates and may result in a simpler API. But I can't yet fully picture how your proposal works with multi-dimensional coordinates (maybe because I am not very accustomed to them).

Currently, the 'rasm' example looks like this:

In [1]: import xarray as xr
In [2]: xr.tutorial.load_dataset('rasm', decode_times=False)
Out[2]:
<xarray.Dataset>
Dimensions:  (time: 36, x: 275, y: 205)
Coordinates:
  * time     (time) float64 7.226e+05 7.226e+05 7.227e+05 7.227e+05 ...
    xc       (y, x) float64 189.2 189.4 189.6 189.7 189.9 190.1 190.2 190.4 ...
    yc       (y, x) float64 16.53 16.78 17.02 17.27 17.51 17.76 18.0 18.25 ...
Dimensions without coordinates: x, y
Data variables:
    Tair     (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan ...
Attributes:
    ...

Does your proposal (automatically) change this to something like the following?

<xarray.Dataset>
Dimensions:  (time: 36, xy: 56375)
Coordinates:
  * time     (time) float64 7.226e+05 7.226e+05 7.227e+05 7.227e+05 ...
    xc       (xy) float64 189.2 189.0 188.7 188.5 188.2 187.9 187.7 187.4 ...
    yc       (xy) float64 16.53 16.69 16.85 17.01 17.17 17.32 17.48 17.63 ...
  * xy       (xy) SuperIndex
    - x      (xy) int64 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
    - y      (xy) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
Data variables:
    Tair     (time, xy) float64 nan nan nan nan nan nan nan nan nan nan nan ...
Attributes:
    ...
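
For comparison, a small sketch (synthetic data shaped like 'rasm') of what ds.stack already does today; it is the closest existing analogue of the automatic conversion asked about above, with a pandas.MultiIndex in place of the hypothetical SuperIndex:

import numpy as np
import xarray as xr

# Stand-in for the 'rasm' layout: 2-D xc/yc over dims (y, x) that carry no coordinates.
ds = xr.Dataset(
    {"Tair": (("time", "y", "x"), np.random.rand(3, 2, 4))},
    coords={
        "time": np.arange(3.0),
        "xc": (("y", "x"), np.random.rand(2, 4)),
        "yc": (("y", "x"), np.random.rand(2, 4)),
    },
)

# stack collapses (y, x) into a single "xy" dimension backed by a pandas.MultiIndex
# whose levels default to integer positions; xc and yc become 1-D along "xy".
print(ds.stack(xy=("y", "x")))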

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  scalar_level in MultiIndex 231308952
305520522 https://github.com/pydata/xarray/pull/1426#issuecomment-305520522 https://api.github.com/repos/pydata/xarray/issues/1426 MDEyOklzc3VlQ29tbWVudDMwNTUyMDUyMg== benbovy 4160723 2017-06-01T15:00:06Z 2017-06-01T15:00:06Z MEMBER

@fujiisoup I agree that, given your example, proposal 2 might be more intuitive; however, IMHO implicit indexes do seem a bit too magical. Although I don't have any concrete example in mind, I suspect it would sometimes be hard to really understand what's going on.

Exposing fewer concepts to users would indeed be an improvement, unless it makes things too implicit or magical.

Let me try to give a more detailed proposal than in my previous comment, which generalizes to potential features like multi-dimensional indexers (see @shoyer's comment, which I'd be happy to start working on soon).

It is actually very much like proposal 1, with only one additional concept (called "super index" below).

  • DataArray and Dataset objects may have coordinates, which are the variables listed in da.coords or ds.coords. These variables may be 1-dimensional or n-dimensional.

  • Among these coordinates, some are "indexed" coordinates. These are marked by * in the repr and can be used in .sel and .isel as keyword arguments.

  • Some coordinates may be grouped together and wrapped by some kind of "super index". These super indexes are also marked by * in the repr, and the coordinates that are part of them are shown just below with the - marker. Each coordinate wrapped by a super index is considered an indexed coordinate: it is still listed in da.coords or ds.coords and it can also be used in .sel and .isel as a keyword argument. This is different for the super index itself, which is not listed in .coords. If needed, we might make super indexes accessible as virtual coordinates: they would then return arrays of tuples with the values of the wrapped coordinates.

Examples of super indexes:

  • KDTree. It allows multi-dimensional coordinates to be indexed using a KDTree.
  • Similarly, BallTree or RTree...
  • MultiIndex (or CoordinateGroup, or any better name). It allows explicitly defining multiple indexes for a given dimension and specifying the behavior when, for example, we select data with conflicting labels in different coordinates. It also naturally converts to a pandas.MultiIndex when we want to convert to a DataFrame.

"Super index" is an additional concept that has to be understood by users, which is in principle bad, but here I think it's worth as it potentially gives a good generic model for explicit handling of various, advanced indexes that involve multiple coordinates.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  scalar_level in MultiIndex 231308952
305062228 https://github.com/pydata/xarray/pull/1426#issuecomment-305062228 https://api.github.com/repos/pydata/xarray/issues/1426 MDEyOklzc3VlQ29tbWVudDMwNTA2MjIyOA== fujiisoup 6815844 2017-05-31T02:12:13Z 2017-05-31T02:12:13Z MEMBER

@shoyer I personally think 2 is more intuitive for users, because it might be difficult to distinguish

<xarray.Dataset>
Dimensions:  (yx: 6)
Coordinates:
    y        (yx) object 'a' 'a' 'a' 'b' 'b' 'b'
Data variables:
    foo      (yx) int64 1 2 3 4 5 6

(which may be generated by indexing from x in your example) from

<xarray.Dataset>
Dimensions:  (y: 6)
Coordinates:
  * y        (y) object 'a' 'a' 'a' 'b' 'b' 'b'
Data variables:
    foo      (y) int64 1 2 3 4 5 6

What is the possible confusion if we adopt 2?
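
A small sketch (toy data) of how the two reprs above already differ in behaviour: in the first, y is a plain coordinate with no index, while in the second it is a dimension coordinate backed by an index:

import xarray as xr

# First repr: "y" labels the "yx" dimension but is not indexed (no * marker).
a = xr.Dataset(
    {"foo": ("yx", [1, 2, 3, 4, 5, 6])},
    coords={"y": ("yx", ["a", "a", "a", "b", "b", "b"])},
)

# Second repr: "y" is a dimension coordinate, so it gets an index (* marker).
b = xr.Dataset(
    {"foo": ("y", [1, 2, 3, 4, 5, 6])},
    coords={"y": ["a", "a", "a", "b", "b", "b"]},
)

print(list(a.indexes))  # []    -- no label-based selection on "y" here
print(list(b.indexes))  # ['y'] -- so b.sel(y='a') works
print(b.sel(y="a"))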

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  scalar_level in MultiIndex 231308952
305058643 https://github.com/pydata/xarray/pull/1426#issuecomment-305058643 https://api.github.com/repos/pydata/xarray/issues/1426 MDEyOklzc3VlQ29tbWVudDMwNTA1ODY0Mw== shoyer 1217238 2017-05-31T01:46:35Z 2017-05-31T01:46:35Z MEMBER

If my understanding is correct, does it mean that we will support ds.sel(x='a'), ds.isel(x=[0, 1]) and ds.mean(dim='x') with your example data? Will it raise an error if a coordinate is more than one-dimensional? How about ds.sel(x='a', y=[1, 2])?

I was only thinking about .sel() (as works currently with MultiIndex). I'm not sure about the others yet.

@benbovy although a CoordinateGroup is definitely better than MultiIndex-scalar, it still feels like a very similar notion. It could make for a nice internal clean-up, but from a user perspective I think it's about as confusing as a MultiIndex -- it's just as many terms to keep track of.

Right now, our user-facing API in xarray exposes three related concepts:

  • Coordinate
  • Index
  • MultiIndex

Eliminating any of these concepts would be an improvement.

To this end, I have two (vague) proposals:

  1. Eliminate MultiIndex. We only have a notion of "indexed" coordinates, marked by * in the repr, which don't necessarily correspond to dimensions. Indexed coordinates, which are immutable, can have any number of dimensions, and you can have any number of "indexed" coordinates per dimension. Indexing, concatenating and expanding dimensions should not change their nature.
  2. Eliminate both MultiIndex and explicit indexes. Indexes required for efficient operations are created on the fly when necessary. This might be too magical.
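
To make those three concepts concrete, a short sketch (toy data) of where each one surfaces in today's API:

import xarray as xr

da = xr.DataArray(
    [[1, 2, 3], [4, 5, 6]],
    coords={"y": ["a", "b"], "x": [1, 2, 3]},
    dims=("y", "x"),
)

print(da.coords)                                    # Coordinate variables
print(type(da.indexes["y"]))                        # pandas.Index backing "y"
print(type(da.stack(yx=("y", "x")).indexes["yx"]))  # pandas.MultiIndex after stacking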

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  scalar_level in MultiIndex 231308952
305039117 https://github.com/pydata/xarray/pull/1426#issuecomment-305039117 https://api.github.com/repos/pydata/xarray/issues/1426 MDEyOklzc3VlQ29tbWVudDMwNTAzOTExNw== benbovy 4160723 2017-05-30T23:38:05Z 2017-05-30T23:38:05Z MEMBER

I also fully agree that using multiple coordinate (index) variables instead of a MultiIndex would greatly simplify things both internally and for users!

A dimension with a single 'real' coordinate (i.e., an IndexVariable) that wraps a MultiIndex with multiple 'levels' that can be accessed (and indexed) as 'virtual' coordinates indeed represents a lot of unnecessary complexity!! A dimension having multiple 'real' coordinates that can be used with .sel - or even .isel - is much simpler to understand and maybe to implement.

Using multiple 'real' coordinates, I don't see any reason why ds.sel(x='a'), ds.isel(x=[0, 1]) or ds.sel(x='a', y=[1, 2]) would not be supported. However, we need to choose what to do in case of conflicts, e.g., ds.isel(x=[0, 1], y=[1, 2]). Raise an error? Return a result equivalent to ds.isel(yx=1) (the "and" case), or equivalent to ds.isel(x=[0, 1, 2]) (the "or" case)?

The important practical difference is that here there are no labels along the yx dimension, so ds['yx'][0] would not return a tuple. Also, we would need to figure out some way to explicitly signal what should become part of a MultiIndex when we convert to a pandas DataFrame.

I'm thinking about something like this:

<xarray.Dataset>
Dimensions:  (yx: 6)
Coordinates:
  * yx       (yx) CoordinateGroup
    - y      (yx) object 'a' 'a' 'a' 'b' 'b' 'b'
    - x      (yx) int64 1 2 3 1 2 3
Data variables:
    foo      (yx) int64 1 2 3 4 5 6

It may present several advantages:

  • Instead of being listed as a dimension without coordinates (which is not true), yx would have a CoordinateGroup: a lightweight object that only contains references to the x and y coordinates.

  • CoordinateGroup may behave like a virtual coordinate so that ds['yx'][0] still returns a tuple (there may not be many use cases for this, though).

  • set_index, reset_index and reorder_levels can still be used to explicitly create, modify or remove a CoordinateGroup for a given dimension.

  • It is trivial to convert a CoordinateGroup to a MultiIndex when we convert to a pandas DataFrame (see the sketch after this list). According to @fmaussion's comment above, I think that using a name like CoordinateGroup here is much easier for xarray users to understand than using the name MultiIndex.

  • In repr(), x and y are still shown next to each other.
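
A short sketch (toy data) of the DataFrame-conversion point referenced above: with coordinates explicitly grouped along "yx" the levels survive as a pandas.MultiIndex, while without the grouping they become ordinary columns:

import xarray as xr

ds = xr.Dataset(
    {"foo": ("yx", [1, 2, 3, 4, 5, 6])},
    coords={
        "y": ("yx", ["a", "a", "a", "b", "b", "b"]),
        "x": ("yx", [1, 2, 3, 1, 2, 3]),
    },
)

# Grouped via set_index: to_dataframe keeps (y, x) as a pandas.MultiIndex.
print(ds.set_index(yx=["y", "x"]).to_dataframe())

# Ungrouped: "yx" gets a default integer index and y/x end up as ordinary columns --
# exactly the "what should become part of a MultiIndex?" question raised earlier.
print(ds.to_dataframe())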

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  scalar_level in MultiIndex 231308952
304904137 https://github.com/pydata/xarray/pull/1426#issuecomment-304904137 https://api.github.com/repos/pydata/xarray/issues/1426 MDEyOklzc3VlQ29tbWVudDMwNDkwNDEzNw== fmaussion 10050469 2017-05-30T14:53:48Z 2017-05-30T14:53:48Z MEMBER

It occurs to me that if we had full support for indexing on coordinate levels, we might not need a notion of a "MultiIndex" in the public API at all.

This would be awesome and so much clearer for many users, including me, who understand "coordinates" much better than "MultiIndex".

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  scalar_level in MultiIndex 231308952
304902053 https://github.com/pydata/xarray/pull/1426#issuecomment-304902053 https://api.github.com/repos/pydata/xarray/issues/1426 MDEyOklzc3VlQ29tbWVudDMwNDkwMjA1Mw== fujiisoup 6815844 2017-05-30T14:47:36Z 2017-05-30T14:47:36Z MEMBER

@shoyer Thanks for the comment.

It occurs to me that if we had full support for indexing on coordinate levels, we might not need a notion of a "MultiIndex" in the public API at all.

Actually I am not yet fully comfortable with my implementation, and I like your idea as this might be much cleaner and simpler than mine.

If my understanding is correct, does it mean that we will support ds.sel(x='a'), ds.isel(x=[0, 1]) and ds.mean(dim='x') with your example data? Will it raise an error if a coordinate is more than one-dimensional? How about ds.sel(x='a', y=[1, 2])?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  scalar_level in MultiIndex 231308952
304778433 https://github.com/pydata/xarray/pull/1426#issuecomment-304778433 https://api.github.com/repos/pydata/xarray/issues/1426 MDEyOklzc3VlQ29tbWVudDMwNDc3ODQzMw== shoyer 1217238 2017-05-30T05:29:11Z 2017-05-30T05:29:11Z MEMBER

Sorry for the delay getting back to you here -- I'm still thinking through the implications of this change.

This does make the handling of MultiIndex type data much more consistent, but calling scalars MultiIndex-scalar seems quite confusing to me. I think of the data-type here as closer to NumPy's structured types, except without the implied storage format for the data.

However, taking a step back, I wonder if this is the right approach. In many ways, structured dtypes are similar to xarray's existing data structures, so supporting them fully means a lot of duplicated functionality. MultiIndexes (especially with scalars) should work similarly to separate variables, but they are implemented very differently under the hood (all the data lives in one variable).

(See https://github.com/pandas-dev/pandas/issues/3443 for related discussion about pandas and why it doesn't support structured dtypes.)

It occurs to me that if we had full support for indexing on coordinate levels, we might not need a notion of a "MultiIndex" in the public API at all. To make this more concrete, what if this was the repr() for the result of ds.stack(yx=['y', 'x']) in your first example?

<xarray.Dataset>
Dimensions:  (yx: 6)
Coordinates:
    y        (yx) object 'a' 'a' 'a' 'b' 'b' 'b'
    x        (yx) int64 1 2 3 1 2 3
Data variables:
    foo      (yx) int64 1 2 3 4 5 6

If we supported MultiIndex-like indexing for x and y, this could be nearly equivalent to a MultiIndex with much less code duplication. The important practical difference is that here there are no labels along the yx dimension, so ds['yx'][0] would not return a tuple. Also, we would need to figure out some way to explicitly signal what should become part of a MultiIndex when we convert to a pandas DataFrame.

Pandas has MultiIndex because it needed a way to group multiple arrays together into a single index array. In xarray, this is less necessary, because we have multiple coordinates to represent levels, and xarray itself no longer needs a MultiIndex notion because we no longer require coordinate labels for every dimension (as of v0.9).
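
A tiny sketch of that last point (behaviour since v0.9, toy data): a dimension can exist without any coordinate labels, and positional indexing still works:

import numpy as np
import xarray as xr

ds = xr.Dataset({"foo": ("yx", np.arange(6))})
print(ds)                     # repr lists "Dimensions without coordinates: yx"
print(len(ds.coords))         # 0 -- no labels and no index for "yx"
print(ds.isel(yx=[0, 2, 4]))  # positional indexing is still available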

CC @benbovy

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  scalar_level in MultiIndex 231308952
