home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

8 rows where issue = 169588316 and user = 4160723 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 1

  • benbovy · 8 ✖

issue 1

  • Multi-index levels as coordinates · 8 ✖

author_association 1

  • MEMBER 8
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
246678848 https://github.com/pydata/xarray/pull/947#issuecomment-246678848 https://api.github.com/repos/pydata/xarray/issues/947 MDEyOklzc3VlQ29tbWVudDI0NjY3ODg0OA== benbovy 4160723 2016-09-13T13:21:53Z 2016-09-13T13:21:53Z MEMBER

I think the main (only?) thing left to do here is to remove the name argument you added to IndexVariable

Just removed it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multi-index levels as coordinates 169588316
244539699 https://github.com/pydata/xarray/pull/947#issuecomment-244539699 https://api.github.com/repos/pydata/xarray/issues/947 MDEyOklzc3VlQ29tbWVudDI0NDUzOTY5OQ== benbovy 4160723 2016-09-03T10:44:58Z 2016-09-03T10:44:58Z MEMBER

I've made changes according to your comments.

I don't think it would be good idea to return a dict here because computing level values can be expensive and the levels aren't always necessary (they're stored internally as an Index for each level and integer codes).

I just thought that returning a dict could be a little more convenient (i.e., using a single call) if we need to get either one particular or several or all level(s) as IndexVariable object(s). However, I admit that this is certainly overkill and that it is actually not related to removing the name attr.

BTW, I will be away over the holiday weekend (in the US), but I expect we will probably be able to merge this shortly after I get back.

Happy holidays! (I'll be on holiday next week too).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multi-index levels as coordinates 169588316
244514745 https://github.com/pydata/xarray/pull/947#issuecomment-244514745 https://api.github.com/repos/pydata/xarray/issues/947 MDEyOklzc3VlQ29tbWVudDI0NDUxNDc0NQ== benbovy 4160723 2016-09-03T00:21:06Z 2016-09-03T00:21:06Z MEMBER

Rather than adding an independent name attribute to IndexVariable, let's deprecate/remove IndexVariable.name instead (I can do this in a follow-up PR). Clearly it's useful to be able to use IndexVariable objects even for objects that do not correspond to ticks along a dimension, but for such objects, name is misleading. IndexVariable.name was never more than a convenient shortcut, and it's outlived it's usefulness.

I agree that name is misleading here. If we remove IndexVariable.name, then we can modify IndexVariable.get_level_variable such that it accepts one or more level names and returns an OrderedDict of {level_name: IndexVariable} items. That would be even better than the current implementation. Maybe we should then rename get_level_variable to get_levels_as_variables?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multi-index levels as coordinates 169588316
244507224 https://github.com/pydata/xarray/pull/947#issuecomment-244507224 https://api.github.com/repos/pydata/xarray/issues/947 MDEyOklzc3VlQ29tbWVudDI0NDUwNzIyNA== benbovy 4160723 2016-09-02T23:07:14Z 2016-09-02T23:07:14Z MEMBER

@shoyer this is ready for another round of review. I don't see any remaining issue, I added some tests and I updated the docs.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multi-index levels as coordinates 169588316
244087601 https://github.com/pydata/xarray/pull/947#issuecomment-244087601 https://api.github.com/repos/pydata/xarray/issues/947 MDEyOklzc3VlQ29tbWVudDI0NDA4NzYwMQ== benbovy 4160723 2016-09-01T14:00:02Z 2016-09-01T14:02:14Z MEMBER

Not sure how to write the tests for this PR, as there are quite many small changes spread in the API (e.g., repr, data object and coordinate properties, etc.). Should I write new tests specific to multi-index or should I modify existing tests (e.g., TestDataset.test_modify_inplace, TestDataset.tes_coord_properties, TestDataset.test_coords_modify, etc.) to include the multi-index case?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multi-index levels as coordinates 169588316
239159377 https://github.com/pydata/xarray/pull/947#issuecomment-239159377 https://api.github.com/repos/pydata/xarray/issues/947 MDEyOklzc3VlQ29tbWVudDIzOTE1OTM3Nw== benbovy 4160723 2016-08-11T13:23:52Z 2016-08-11T15:33:50Z MEMBER

I just made some updates.

I would add an attribute level_names to Coordinate

Done.

It's much cheaper to call .get_level_values() after slicing a MultiIndex than before

Done, although it currently slices an arbitrary number (30) of first elements rather than calculating the number of elements needed for display.

It seems like a fine rule to say that you cannot call .sel on both a level and the dimension name at the same time

Done.

check uniqueness of level names at Dataset/DataArray creation (which is a good idea!)

I tried but it broke some existing tests. It actually triggered data loading for Coordinate objects (via calls to to_index()). I need to further investigate this.

there is not much need for level indexing with a dictionary at all

Right, but in the current implementation this is still used internally.

I would suggest putting the logic to create new variables for levels in the private _get_virtual_variable in dataset.py. [...] If possible, it would be nice if ds['time.day'] works even if time is a multi-index level.

Done. It should also work with multi-index levels although not tested yet.

I'm currently leaning toward Option 3A...

I've chosen option 3A for the repr, but I can change it depending on others' opinions.

What happens when you write ds.coords['level_1'] = ...? [...] probably better to raise an error and note that the MultiIndex should be modified instead.

Done.

Should levels appear in ds.keys() or ds.coords.keys()?

They don't appear in there. If we keep Option 3A for the repr, I also think that we can avoid changes here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multi-index levels as coordinates 169588316
238549129 https://github.com/pydata/xarray/pull/947#issuecomment-238549129 https://api.github.com/repos/pydata/xarray/issues/947 MDEyOklzc3VlQ29tbWVudDIzODU0OTEyOQ== benbovy 4160723 2016-08-09T13:16:50Z 2016-08-09T13:16:50Z MEMBER

It seems like a fine rule to say that you cannot call .sel on both a level and the dimension name at the same time

Agreed. I have always the tendency to want to make things as flexible as possible, but this is definitely not needed here.

I'm currently leaning toward Option 3A, but I don't have a strong opinion.

I'm also +1 for Option 3A. Maybe one little argument in favor of Options 3E or 3F is that they still show a consistent repr when a scalar is returned from the multi-index (see below), even though I don't like how they would display duplicate information.

```

da.sel(level_1='a', level_2=0) <xarray.DataArray 'test' (y: 4)> array([ 0.66068869, 0.20374398, 0.43967166, 0.09447679]) Coordinates: x object ('a', 0) * y (y) int64 0 1 2 3 ```

Possibly it would be a good idea to rename Coordinate -> IndexVariable

As a recent xarray user, I indeed remember that I initially found confusing to have Dataset or DataArray "coordinates" that can be either Coordinate or Variable objects. I find that the name IndexVariable is more representative of the object.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multi-index levels as coordinates 169588316
237997724 https://github.com/pydata/xarray/pull/947#issuecomment-237997724 https://api.github.com/repos/pydata/xarray/issues/947 MDEyOklzc3VlQ29tbWVudDIzNzk5NzcyNA== benbovy 4160723 2016-08-06T01:15:02Z 2016-08-06T01:15:02Z MEMBER

In the example above, DataArray.level_1 now returns a DataArray object, although I haven't found another way than creating a new coordinates.DataArrayLevelCoordinates class for this.

I wasn't happy with the initial implementation of using levels in .sel, it really added too much complexity. I re-implemented it and now I find it much cleaner. However, it is not possible anymore to mix level indexers and non-dict dimension indexers, e.g., da.sel(x='a', level_2=1) doesn't work but da.sel(x={'level_1': 'a'}, level_2=1) does, which is already quite flexible though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multi-index levels as coordinates 169588316

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 41.713ms · About: xarray-datasette