home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

14 rows where issue = 188113943 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 5

  • crusaderky 5
  • max-sixty 4
  • arokem 2
  • lxkain 2
  • fmaussion 1

author_association 2

  • MEMBER 10
  • NONE 4

issue 1

  • Better support for subclasses: tests, docs and API · 14 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
523898016 https://github.com/pydata/xarray/issues/1097#issuecomment-523898016 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzg5ODAxNg== max-sixty 5635139 2019-08-22T13:07:44Z 2019-08-22T13:07:44Z MEMBER

The biggest problem is with all the Dataset methods and accessors that return a DataArray, and vice versa. Anybody who wants to create a coupled pair of Dataset and DataArray subclasses will need to hunt down all methods and accessors that return the other class in the pair and override them.

This is correct - functions which convert between DataArray & Dataset wouldn't retain type. That said, this can still be helpful without the ability to change type.

There's a bigger piece of work which would solve this too, at the cost of abstraction: have class attributes which define _array_type=DataArray on Dataset and similar for DataArray. pandas used this

May I ask what are the practical use cases for subclassing? In several years worth of day-to-day use of xarray I always found that encapsulation felt much more natural.

Right, good question and we should catch ourselves from adding every abstraction. I have one specific use-case we've already found helpful: we have an object that is mostly a Dataset, with the addition of some behaviors and constructors - for example from_[dataset_x_and_y] or assert_invariants. Alternatives: - A class which had the dataset as an attribute - but this means 90%+ of object use is x.data, and the object can't be passed into methods expecting a dataset. - Prior to using xarray, we frequently used this pattern to aggregate lots of pandas dataframes (and I've seen this in the wild too) - A normal dataset with some associated functions in a module - but makes discovering the functions harder - The current state - an object inherited from Dataset with methods

We don't use accessors because the behaviors are specific to a class, rather than every xarray object.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523885536 https://github.com/pydata/xarray/issues/1097#issuecomment-523885536 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzg4NTUzNg== crusaderky 6213168 2019-08-22T12:32:37Z 2019-08-22T12:32:50Z MEMBER

Nevermind __slots__. I just tried and there is no noticeable speedup.

I had thought the primary saving was memory (and fairly significant with lots of objects)

It is one of the two savings - the other theoretically being attribute access time. I think I'll go on with a pull request for further discussion.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523881963 https://github.com/pydata/xarray/issues/1097#issuecomment-523881963 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzg4MTk2Mw== max-sixty 5635139 2019-08-22T12:22:02Z 2019-08-22T12:22:26Z MEMBER

Nevermind __slots__. I just tried and there is no noticeable speedup.

I had thought the primary saving was memory (and fairly significant with lots of objects)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523878745 https://github.com/pydata/xarray/issues/1097#issuecomment-523878745 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzg3ODc0NQ== crusaderky 6213168 2019-08-22T12:12:42Z 2019-08-22T12:12:42Z MEMBER

Nevermind __slots__. I just tried and there is no noticeable speedup. Code at https://github.com/crusaderky/xarray/commit/26e0477759ce2ef9d1f8fd9d1b23234741517c5a

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523842624 https://github.com/pydata/xarray/issues/1097#issuecomment-523842624 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzg0MjYyNA== crusaderky 6213168 2019-08-22T10:17:27Z 2019-08-22T10:17:27Z MEMBER

There's also the argument that I would love, at some point, to migrate the xarray objects to use __slots__. From a quick benchmark on my mac, that speeds up every single attribute access by 7ns. I suspect that would very quickly add up. However, any subclass that defines extra attributes (as opposed to just methods) would break on the transition between __dict__ an __slots__.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523834809 https://github.com/pydata/xarray/issues/1097#issuecomment-523834809 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzgzNDgwOQ== crusaderky 6213168 2019-08-22T09:53:00Z 2019-08-22T09:53:45Z MEMBER

There's also a funny, pickle-friendly hack that allows you to add methods to a Dataset without subclassing it - thanks to the Dataset.__getattr__ magic: ``` def custom(self: Dataset, ...): ...

ds = Dataset(...) ds.attrs['custom'] = partial(custom, ds) ds.custom(...) ```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523833526 https://github.com/pydata/xarray/issues/1097#issuecomment-523833526 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzgzMzUyNg== crusaderky 6213168 2019-08-22T09:49:00Z 2019-08-22T09:49:00Z MEMBER

The biggest problem is with all the Dataset methods and accessors that return a DataArray, and vice versa. Anybody who wants to create a coupled pair of Dataset and DataArray subclasses will need to hunt down all methods and accessors that return the other class in the pair and override them.

May I ask what are the practical use cases for subclassing? In several years worth of day-to-day use of xarray I always found that encapsulation felt much more natural.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523750630 https://github.com/pydata/xarray/issues/1097#issuecomment-523750630 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzc1MDYzMA== max-sixty 5635139 2019-08-22T05:12:08Z 2019-08-22T05:12:08Z MEMBER

For that case, you could put a breakpoint in and see what's calling it. It is bemusing

For subclass support, you could see whether there are methods that return DataArray / Dataset rather than self.__class__. Maybe we could add a generalized test that could run over a number of methods and assert they return a subclass

I think we're in an equilibrium where subclassing isn't supported for most operations, so it's not used, so we don't hear about it's failures. A moderate push could move us out of that equilibrium!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523723363 https://github.com/pydata/xarray/issues/1097#issuecomment-523723363 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzcyMzM2Mw== arokem 118582 2019-08-22T02:41:55Z 2019-08-22T02:41:55Z NONE

Yeah: I'd be happy to dig around a bit. Where should I look?

Also, @mbeyeler might be able to share a bit more about sub-classing woes. I think that he made a commendable

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523600557 https://github.com/pydata/xarray/issues/1097#issuecomment-523600557 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzYwMDU1Nw== max-sixty 5635139 2019-08-21T18:50:22Z 2019-08-21T18:50:22Z MEMBER

I think we should be able improve subclass support. (We use it internally at times, mainly for encapsulating logic on specific objects we inherit from Dataset).

@arokem if you're keen to investigate further, would be interesting to know what's happening there. It's possible it's an issue with DataArray's repr rather than the subclass itself

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
523593685 https://github.com/pydata/xarray/issues/1097#issuecomment-523593685 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDUyMzU5MzY4NQ== arokem 118582 2019-08-21T18:33:04Z 2019-08-21T18:33:04Z NONE

Just to add a puzzling attempt to subclass DataArray (provided by @mbeyeler) :

Are we doing it wrong? Is there documentation that shows how to do something this dumb "the right way"? Would be great to have that (which is essentially a 👍 for this issue).

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
342282646 https://github.com/pydata/xarray/issues/1097#issuecomment-342282646 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDM0MjI4MjY0Ng== lxkain 15930505 2017-11-06T20:52:41Z 2017-11-06T20:52:41Z NONE

Initially, I have tried this:

class Signal(Track): # 1D wave (no coords) and 1D time-value combined (with coords) def __init__(self, data: np.ndarray, coords=None, dims=None, name: str=None, attrs: dict=None): assert data.ndim == 2 assert data.shape[1] == 1 if dims is None: dims = ('time', 'amplitude') if coords is not None: assert 'time' in coords super().__init__(data, coords=coords, dims=dims, name=name, attrs=attrs)

However, I just know discovered the accessors and will have a look.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
342274111 https://github.com/pydata/xarray/issues/1097#issuecomment-342274111 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDM0MjI3NDExMQ== fmaussion 10050469 2017-11-06T20:21:27Z 2017-11-06T20:21:27Z MEMBER

Do we have examples of such subclasses? It would be interesting to document the cases where the accessors are not good enough.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943
342271714 https://github.com/pydata/xarray/issues/1097#issuecomment-342271714 https://api.github.com/repos/pydata/xarray/issues/1097 MDEyOklzc3VlQ29tbWVudDM0MjI3MTcxNA== lxkain 15930505 2017-11-06T20:12:41Z 2017-11-06T20:12:41Z NONE

Agreed. Just purely for information, I made a very simple subclass, but then a simple print statement didn't work:

Traceback (most recent call last): File "", line 1, in <module> File "/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/common.py", line 97, in repr return formatting.array_repr(self) File "/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/formatting.py", line 392, in array_repr summary.append(repr(arr.coords)) File "/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/formatting.py", line 63, in repr return ensure_valid_repr(self.unicode()) File "/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/coordinates.py", line 46, in unicode return formatting.coords_repr(self) File "/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/formatting.py", line 319, in coords_repr col_width = _calculate_col_width(_get_col_items(coords)) File "/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/formatting.py", line 281, in _get_col_items for k, v in mapping.items(): File "/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/_collections_abc.py", line 744, in iter yield (key, self._mapping[key]) File "/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/coordinates.py", line 191, in getitem return self._data._getitem_coord(key) File "/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/dataarray.py", line 465, in _getitem_coord return self._replace_maybe_drop_dims(var, name=key) File "/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/dataarray.py", line 257, in _replace_maybe_drop_dims return self._replace(variable, coords, name) File "/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/dataarray.py", line 248, in _replace return type(self)(variable, coords, name=name, fastpath=True) TypeError: init() got an unexpected keyword argument 'fastpath'

The last statement basically called my subclass, instead of DataArray, but my subclass didn't work well with that because it restricted data to be two-dimensional, but the library code required a dimension of 1 at this point.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Better support for subclasses: tests, docs and API 188113943

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 13.601ms · About: xarray-datasette