home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

10 rows where author_association = "MEMBER" and issue = 188113943 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 3

  • crusaderky 5
  • max-sixty 4
  • fmaussion 1

issue 1

  • Better support for subclasses: tests, docs and API · 10 ✖

author_association 1

  • MEMBER · 10 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
523898016 https://github.com/pydata/xarray/issues/1097#issuecomment-523898016 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzg5ODAxNg== max-sixty 5635139 2019-08-22T13:07:44Z 2019-08-22T13:07:44Z MEMBER

The biggest problem is with all the Dataset methods and accessors that return a DataArray, and vice versa. Anybody who wants to create a coupled pair of Dataset and DataArray subclasses will need to hunt down all methods and accessors that return the other class in the pair and override them.

This is correct - functions which convert between DataArray & Dataset wouldn't retain type. That said, this can still be helpful without the ability to change type.

There's a bigger piece of work which would solve this too, at the cost of abstraction: have class attributes which define _array_type=DataArray on Dataset and similar for DataArray. pandas used this

May I ask what are the practical use cases for subclassing? In several years worth of day-to-day use of xarray I always found that encapsulation felt much more natural.

Right, good question and we should catch ourselves from adding every abstraction. I have one specific use-case we've already found helpful: we have an object that is mostly a Dataset, with the addition of some behaviors and constructors - for example from_[dataset_x_and_y] or assert_invariants. Alternatives: - A class which had the dataset as an attribute - but this means 90%+ of object use is x.data, and the object can't be passed into methods expecting a dataset. - Prior to using xarray, we frequently used this pattern to aggregate lots of pandas dataframes (and I've seen this in the wild too) - A normal dataset with some associated functions in a module - but makes discovering the functions harder - The current state - an object inherited from Dataset with methods

We don't use accessors because the behaviors are specific to a class, rather than every xarray object.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523885536 https://github.com/pydata/xarray/issues/1097#issuecomment-523885536 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzg4NTUzNg== crusaderky 6213168 2019-08-22T12:32:37Z 2019-08-22T12:32:50Z MEMBER

Nevermind __slots__. I just tried and there is no noticeable speedup.

I had thought the primary saving was memory (and fairly significant with lots of objects)

It is one of the two savings - the other theoretically being attribute access time. I think I'll go on with a pull request for further discussion.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523881963 https://github.com/pydata/xarray/issues/1097#issuecomment-523881963 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzg4MTk2Mw== max-sixty 5635139 2019-08-22T12:22:02Z 2019-08-22T12:22:26Z MEMBER

Nevermind __slots__. I just tried and there is no noticeable speedup.

I had thought the primary saving was memory (and fairly significant with lots of objects)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523878745 https://github.com/pydata/xarray/issues/1097#issuecomment-523878745 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzg3ODc0NQ== crusaderky 6213168 2019-08-22T12:12:42Z 2019-08-22T12:12:42Z MEMBER

Nevermind __slots__. I just tried and there is no noticeable speedup. Code at https://github.com/crusaderky/xarray/commit/26e0477759ce2ef9d1f8fd9d1b23234741517c5a

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523842624 https://github.com/pydata/xarray/issues/1097#issuecomment-523842624 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzg0MjYyNA== crusaderky 6213168 2019-08-22T10:17:27Z 2019-08-22T10:17:27Z MEMBER

There's also the argument that I would love, at some point, to migrate the xarray objects to use __slots__. From a quick benchmark on my mac, that speeds up every single attribute access by 7ns. I suspect that would very quickly add up. However, any subclass that defines extra attributes (as opposed to just methods) would break on the transition between __dict__ an __slots__.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523834809 https://github.com/pydata/xarray/issues/1097#issuecomment-523834809 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzgzNDgwOQ== crusaderky 6213168 2019-08-22T09:53:00Z 2019-08-22T09:53:45Z MEMBER

There's also a funny, pickle-friendly hack that allows you to add methods to a Dataset without subclassing it - thanks to the Dataset.__getattr__ magic: ``` def custom(self: Dataset, ...): ...

ds = Dataset(...) ds.attrs['custom'] = partial(custom, ds) ds.custom(...) ```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523833526 https://github.com/pydata/xarray/issues/1097#issuecomment-523833526 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzgzMzUyNg== crusaderky 6213168 2019-08-22T09:49:00Z 2019-08-22T09:49:00Z MEMBER

The biggest problem is with all the Dataset methods and accessors that return a DataArray, and vice versa. Anybody who wants to create a coupled pair of Dataset and DataArray subclasses will need to hunt down all methods and accessors that return the other class in the pair and override them.

May I ask what are the practical use cases for subclassing? In several years worth of day-to-day use of xarray I always found that encapsulation felt much more natural.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523750630 https://github.com/pydata/xarray/issues/1097#issuecomment-523750630 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzc1MDYzMA== max-sixty 5635139 2019-08-22T05:12:08Z 2019-08-22T05:12:08Z MEMBER

For that case, you could put a breakpoint in and see what's calling it. It is bemusing

For subclass support, you could see whether there are methods that return DataArray / Dataset rather than self.__class__. Maybe we could add a generalized test that could run over a number of methods and assert they return a subclass

I think we're in an equilibrium where subclassing isn't supported for most operations, so it's not used, so we don't hear about it's failures. A moderate push could move us out of that equilibrium!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523600557 https://github.com/pydata/xarray/issues/1097#issuecomment-523600557 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzYwMDU1Nw== max-sixty 5635139 2019-08-21T18:50:22Z 2019-08-21T18:50:22Z MEMBER

I think we should be able improve subclass support. (We use it internally at times, mainly for encapsulating logic on specific objects we inherit from Dataset).

@arokem if you're keen to investigate further, would be interesting to know what's happening there. It's possible it's an issue with DataArray's repr rather than the subclass itself

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
342274111 https://github.com/pydata/xarray/issues/1097#issuecomment-342274111 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDM0MjI3NDExMQ== fmaussion 10050469 2017-11-06T20:21:27Z 2017-11-06T20:21:27Z MEMBER

Do we have examples of such subclasses? It would be interesting to document the cases where the accessors are not good enough.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 13.524ms · About: xarray-datasette