html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1097#issuecomment-523898016,https://api.github.com/repos/pydata/xarray/issues/1097,523898016,MDEyOklzc3VlQ29tbWVudDUyMzg5ODAxNg==,5635139,2019-08-22T13:07:44Z,2019-08-22T13:07:44Z,MEMBER,"> The biggest problem is with all the Dataset methods _and accessors_ that return a DataArray, and vice versa. Anybody who wants to create a coupled pair of Dataset and DataArray subclasses will need to hunt down all methods and accessors that return the other class in the pair and override them.
This is correct - functions which convert between `DataArray` & `Dataset` wouldn't retain type. That said, this can still be helpful without the ability to change type.
There's a bigger piece of work which would solve this too, at the cost of abstraction: have class attributes which define `_array_type=DataArray` on `Dataset` and similar for `DataArray`. pandas used this
> May I ask what are the practical use cases for subclassing? In several years worth of day-to-day use of xarray I always found that encapsulation felt much more natural.
Right, good question and we should catch ourselves from adding every abstraction. I have one specific use-case we've already found helpful: we have an object that is _mostly_ a Dataset, with the addition of some behaviors and constructors - for example `from_[dataset_x_and_y]` or `assert_invariants`. Alternatives:
- A class which had the dataset as an attribute - but this means 90%+ of object use is `x.data`, and the object can't be passed into methods expecting a dataset.
- Prior to using xarray, we frequently used this pattern to aggregate lots of pandas dataframes (and I've seen this in the wild too)
- A normal dataset with some associated functions in a module - but makes discovering the functions harder
- The current state - an object inherited from `Dataset` with methods
We don't use accessors because the behaviors are specific to a class, rather than every xarray object.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943
https://github.com/pydata/xarray/issues/1097#issuecomment-523885536,https://api.github.com/repos/pydata/xarray/issues/1097,523885536,MDEyOklzc3VlQ29tbWVudDUyMzg4NTUzNg==,6213168,2019-08-22T12:32:37Z,2019-08-22T12:32:50Z,MEMBER,"> > Nevermind `__slots__`. I just tried and there is no noticeable speedup.
>
> I had thought the primary saving was memory (and fairly significant with lots of objects)
It is one of the two savings - the other _theoretically_ being attribute access time. I think I'll go on with a pull request for further discussion.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943
https://github.com/pydata/xarray/issues/1097#issuecomment-523881963,https://api.github.com/repos/pydata/xarray/issues/1097,523881963,MDEyOklzc3VlQ29tbWVudDUyMzg4MTk2Mw==,5635139,2019-08-22T12:22:02Z,2019-08-22T12:22:26Z,MEMBER,"> Nevermind `__slots__`. I just tried and there is no noticeable speedup.
I had thought the primary saving was memory (and fairly significant with lots of objects)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943
https://github.com/pydata/xarray/issues/1097#issuecomment-523878745,https://api.github.com/repos/pydata/xarray/issues/1097,523878745,MDEyOklzc3VlQ29tbWVudDUyMzg3ODc0NQ==,6213168,2019-08-22T12:12:42Z,2019-08-22T12:12:42Z,MEMBER,"Nevermind ``__slots__``. I just tried and there is no noticeable speedup.
Code at https://github.com/crusaderky/xarray/commit/26e0477759ce2ef9d1f8fd9d1b23234741517c5a","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943
https://github.com/pydata/xarray/issues/1097#issuecomment-523842624,https://api.github.com/repos/pydata/xarray/issues/1097,523842624,MDEyOklzc3VlQ29tbWVudDUyMzg0MjYyNA==,6213168,2019-08-22T10:17:27Z,2019-08-22T10:17:27Z,MEMBER,"There's also the argument that I would love, at some point, to migrate the xarray objects to use ``__slots__``. From a quick benchmark on my mac, that speeds up _every single attribute access_ by 7ns. I suspect that would very quickly add up. However, any subclass that defines extra attributes (as opposed to just methods) would break on the transition between ``__dict__`` an ``__slots__``.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943
https://github.com/pydata/xarray/issues/1097#issuecomment-523834809,https://api.github.com/repos/pydata/xarray/issues/1097,523834809,MDEyOklzc3VlQ29tbWVudDUyMzgzNDgwOQ==,6213168,2019-08-22T09:53:00Z,2019-08-22T09:53:45Z,MEMBER,"There's also a funny, pickle-friendly hack that allows you to add methods to a Dataset without subclassing it - thanks to the ``Dataset.__getattr__`` magic:
```
def custom(self: Dataset, ...):
...
ds = Dataset(...)
ds.attrs['custom'] = partial(custom, ds)
ds.custom(...)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943
https://github.com/pydata/xarray/issues/1097#issuecomment-523833526,https://api.github.com/repos/pydata/xarray/issues/1097,523833526,MDEyOklzc3VlQ29tbWVudDUyMzgzMzUyNg==,6213168,2019-08-22T09:49:00Z,2019-08-22T09:49:00Z,MEMBER,"The biggest problem is with all the Dataset methods _and accessors_ that return a DataArray, and vice versa. Anybody who wants to create a coupled pair of Dataset and DataArray subclasses will need to hunt down all methods and accessors that return the other class in the pair and override them.
May I ask what are the practical use cases for subclassing? In several years worth of day-to-day use of xarray I always found that encapsulation felt much more natural.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943
https://github.com/pydata/xarray/issues/1097#issuecomment-523750630,https://api.github.com/repos/pydata/xarray/issues/1097,523750630,MDEyOklzc3VlQ29tbWVudDUyMzc1MDYzMA==,5635139,2019-08-22T05:12:08Z,2019-08-22T05:12:08Z,MEMBER,"For that case, you could put a breakpoint in and see what's calling it. It is bemusing
For subclass support, you could see whether there are methods that return `DataArray` / `Dataset` rather than `self.__class__`.
Maybe we could add a generalized test that could run over a number of methods and assert they return a subclass
I think we're in an equilibrium where subclassing isn't supported for most operations, so it's not used, so we don't hear about it's failures. A moderate push could move us out of that equilibrium! ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943
https://github.com/pydata/xarray/issues/1097#issuecomment-523723363,https://api.github.com/repos/pydata/xarray/issues/1097,523723363,MDEyOklzc3VlQ29tbWVudDUyMzcyMzM2Mw==,118582,2019-08-22T02:41:55Z,2019-08-22T02:41:55Z,NONE,"Yeah: I'd be happy to dig around a bit. Where should I look?
Also, @mbeyeler might be able to share a bit more about sub-classing woes. I think that he made a commendable ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943
https://github.com/pydata/xarray/issues/1097#issuecomment-523600557,https://api.github.com/repos/pydata/xarray/issues/1097,523600557,MDEyOklzc3VlQ29tbWVudDUyMzYwMDU1Nw==,5635139,2019-08-21T18:50:22Z,2019-08-21T18:50:22Z,MEMBER,"I think we should be able improve subclass support. (We use it internally at times, mainly for encapsulating logic on specific objects we inherit from `Dataset`).
@arokem if you're keen to investigate further, would be interesting to know what's happening there. It's possible it's an issue with `DataArray`'s `repr` rather than the subclass itself","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943
https://github.com/pydata/xarray/issues/1097#issuecomment-523593685,https://api.github.com/repos/pydata/xarray/issues/1097,523593685,MDEyOklzc3VlQ29tbWVudDUyMzU5MzY4NQ==,118582,2019-08-21T18:33:04Z,2019-08-21T18:33:04Z,NONE,"Just to add a puzzling attempt to subclass `DataArray` (provided by @mbeyeler) :

Are we doing it wrong? Is there documentation that shows how to do something this dumb ""the right way""? Would be great to have that (which is essentially a 👍 for this issue).
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943
https://github.com/pydata/xarray/issues/1097#issuecomment-342282646,https://api.github.com/repos/pydata/xarray/issues/1097,342282646,MDEyOklzc3VlQ29tbWVudDM0MjI4MjY0Ng==,15930505,2017-11-06T20:52:41Z,2017-11-06T20:52:41Z,NONE,"Initially, I have tried this:
```
class Signal(Track): # 1D wave (no coords) and 1D time-value combined (with coords)
def __init__(self,
data: np.ndarray,
coords=None,
dims=None,
name: str=None,
attrs: dict=None):
assert data.ndim == 2
assert data.shape[1] == 1
if dims is None:
dims = ('time', 'amplitude')
if coords is not None:
assert 'time' in coords
super().__init__(data, coords=coords, dims=dims, name=name, attrs=attrs)
```
However, I just know discovered the accessors and will have a look.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943
https://github.com/pydata/xarray/issues/1097#issuecomment-342274111,https://api.github.com/repos/pydata/xarray/issues/1097,342274111,MDEyOklzc3VlQ29tbWVudDM0MjI3NDExMQ==,10050469,2017-11-06T20:21:27Z,2017-11-06T20:21:27Z,MEMBER,Do we have examples of such subclasses? It would be interesting to document the cases where the [accessors](http://xarray.pydata.org/en/latest/internals.html#extending-xarray) are not good enough.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943
https://github.com/pydata/xarray/issues/1097#issuecomment-342271714,https://api.github.com/repos/pydata/xarray/issues/1097,342271714,MDEyOklzc3VlQ29tbWVudDM0MjI3MTcxNA==,15930505,2017-11-06T20:12:41Z,2017-11-06T20:12:41Z,NONE,"Agreed. Just purely for information, I made a very simple subclass, but then a simple print statement didn't work:
Traceback (most recent call last):
File """", line 1, in
File ""/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/common.py"", line 97, in __repr__
return formatting.array_repr(self)
File ""/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/formatting.py"", line 392, in array_repr
summary.append(repr(arr.coords))
File ""/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/formatting.py"", line 63, in __repr__
return ensure_valid_repr(self.__unicode__())
File ""/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/coordinates.py"", line 46, in __unicode__
return formatting.coords_repr(self)
File ""/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/formatting.py"", line 319, in coords_repr
col_width = _calculate_col_width(_get_col_items(coords))
File ""/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/formatting.py"", line 281, in _get_col_items
for k, v in mapping.items():
File ""/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/_collections_abc.py"", line 744, in __iter__
yield (key, self._mapping[key])
File ""/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/coordinates.py"", line 191, in __getitem__
return self._data._getitem_coord(key)
File ""/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/dataarray.py"", line 465, in _getitem_coord
return self._replace_maybe_drop_dims(var, name=key)
File ""/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/dataarray.py"", line 257, in _replace_maybe_drop_dims
return self._replace(variable, coords, name)
File ""/Users/kain/CloudStation/CSLU/sci/timeview/miniconda/envs/timeview/lib/python3.6/site-packages/xarray/core/dataarray.py"", line 248, in _replace
return type(self)(variable, coords, name=name, fastpath=True)
TypeError: __init__() got an unexpected keyword argument 'fastpath'
The last statement basically called my subclass, instead of DataArray, but my subclass didn't work well with that because it restricted data to be two-dimensional, but the library code required a dimension of 1 at this point.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188113943