issues: 374460958

  • id: 374460958
  • node_id: MDU6SXNzdWUzNzQ0NjA5NTg=
  • number: 2517
  • title: Treat accessor dataarrays as members of parent dataset
  • user: 6153603
  • state: closed
  • locked: 0
  • assignee: (empty)
  • milestone: (empty)
  • comments: 5
  • created_at: 2018-10-26T16:37:45Z
  • updated_at: 2018-11-05T22:40:46Z
  • closed_at: 2018-11-05T22:40:46Z
  • author_association: CONTRIBUTOR
  • active_lock_reason: (empty)
  • draft: (empty)
  • pull_request: (empty)

Code Sample

```python
import xarray as xr
import pandas as pd

# What I'm doing with comparison, I'd like to do with actual
comparison = xr.Dataset({'data': (['time'], [100, 30, 10, 3, 1]),
                         'altitude': (['time'], [5, 10, 15, 20, 25])},
                        coords={'time': pd.date_range('2014-09-06', periods=5, freq='1s')})

# With altitude as a data var, I can do the following:
comparison.swap_dims({'time': 'altitude'}).interp(altitude=12.0).data

# And
for (time, g) in comparison.groupby('time'):
    print(time)
    print(g.altitude.values)


@xr.register_dataset_accessor('acc')
class Accessor(object):
    def __init__(self, xarray_ds):
        self._ds = xarray_ds
        self._altitude = None

    @property
    def altitude(self):
        """ An expensive calculation that results in data that not everyone needs. """
        if self._altitude is None:
            self._altitude = xr.DataArray([5, 10, 15, 20, 25],
                                          coords=[('time', self._ds.time)])
        return self._altitude


actual = xr.Dataset({'data': (['time'], [100, 30, 10, 3, 1])},
                    coords={'time': pd.date_range('2014-09-06', periods=5, freq='1s')})

# This doesn't work:
actual.swap_dims({'time': 'altitude'}).interp(altitude=12.0).data

# Neither does this:
for (time, g) in actual.groupby('time'):
    print(time)
    print(g.acc.altitude.values)
```

Problem description

I've been using accessors to extend xarray with some custom computation. The altitude in the above dataset is not used every time the data is loaded, but when it is, it is an expensive computation to make (which is why I put it in as an accessor; if it isn't needed, it isn't computed).

The problem is that once it has been computed, I'd like to be able to use it as if it were a regular data_var of the dataset: for example, to interp on the newly computed variable, or to use it in a groupby.
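One possible workaround, sketched here rather than offered as a recommended pattern, is to have the accessor return a copy of the dataset with the computed values attached through `Dataset.assign`. After that, the result behaves like the `comparison` dataset above. The accessor name `acc_sketch` and the method `with_altitude` are hypothetical names chosen for this illustration, the altitude values are the same stand-ins as in the code sample, and `interp` is assumed to have scipy available.

```python
import pandas as pd
import xarray as xr


# 'acc_sketch' is a hypothetical accessor name used only for this illustration.
@xr.register_dataset_accessor('acc_sketch')
class SketchAccessor:
    def __init__(self, xarray_ds):
        self._ds = xarray_ds
        self._altitude = None

    @property
    def altitude(self):
        """Stand-in for the expensive calculation; computed once, then cached."""
        if self._altitude is None:
            self._altitude = xr.DataArray(
                [5, 10, 15, 20, 25],
                dims='time',
                coords={'time': self._ds['time'].values},
            )
        return self._altitude

    def with_altitude(self):
        """Hypothetical helper: return a new Dataset with altitude as a data variable."""
        return self._ds.assign(altitude=self.altitude)


actual = xr.Dataset(
    {'data': (['time'], [100, 30, 10, 3, 1])},
    coords={'time': pd.date_range('2014-09-06', periods=5, freq='1s')},
)

# Nothing is computed until the helper is called; `actual` itself is left unchanged.
augmented = actual.acc_sketch.with_altitude()

# altitude is now a regular data variable, so it can become a dimension.
interpolated = augmented.swap_dims({'time': 'altitude'}).interp(altitude=12.0)

# It is also carried along into each group of a groupby.
altitudes = [int(g['altitude'].values.item()) for _, g in augmented.groupby('time')]
```

The cost of this approach is that the caller has to opt in explicitly by calling the helper, and the cached value lives on the accessor of the original dataset, not on the returned copy.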

Please advise if I'm going about this in the wrong way and how I should think about this problem instead.

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.18.16-arch1-1-ARCH
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8

xarray: 0.10.8
pandas: 0.23.1
numpy: 1.14.5
scipy: 1.1.0
netCDF4: 1.4.0
h5netcdf: 0.6.1
h5py: 2.8.0
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.5
distributed: 1.21.8
matplotlib: 2.2.2
cartopy: None
seaborn: None
setuptools: 39.2.0
pip: 9.0.3
conda: None
pytest: 3.6.1
IPython: None
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2517/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  • performed_via_github_app: (empty)
  • state_reason: completed
  • repo: 13221727
  • type: issue
