issue_comments

8 rows where issue = 124915222 sorted by updated_at descending

user 5

  • shoyer 2
  • lesommer 2
  • fmaussion 2
  • rabernat 1
  • rafa-guedes 1

author_association 3

  • MEMBER 5
  • NONE 2
  • CONTRIBUTOR 1

issue 1

  • Subclassing Dataset and DataArray · 8
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
221418034 https://github.com/pydata/xarray/issues/706#issuecomment-221418034 https://api.github.com/repos/pydata/xarray/issues/706 MDEyOklzc3VlQ29tbWVudDIyMTQxODAzNA== lesommer 7727985 2016-05-24T22:14:30Z 2016-05-24T22:14:30Z NONE

@shoyer oops, just found that the new functionality has already been pulled. Thanks.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Subclassing Dataset and DataArray 124915222
221417290 https://github.com/pydata/xarray/issues/706#issuecomment-221417290 https://api.github.com/repos/pydata/xarray/issues/706 MDEyOklzc3VlQ29tbWVudDIyMTQxNzI5MA== lesommer 7727985 2016-05-24T22:11:17Z 2016-05-24T22:11:17Z NONE

@shoyer: the approach you propose for registering additional methods for Datasets or DataArrays would certainly open up very nice applications for xarray. This is, for instance, something that would be very useful to the library we have discussed here (see e.g. this issue about oocgcm). Is there a way I could contribute to making this register functionality available in xarray?

@rabernat: your idea of a spectral analysis package on top of xarray is interesting. I am happy to contribute to this (probably within the framework of the library mentioned above?). Like many others, I guess, I have my own script for this (here), but having more robust, shared code is certainly a good way to go.

Julien

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Subclassing Dataset and DataArray 124915222
192190921 https://github.com/pydata/xarray/issues/706#issuecomment-192190921 https://api.github.com/repos/pydata/xarray/issues/706 MDEyOklzc3VlQ29tbWVudDE5MjE5MDkyMQ== fmaussion 10050469 2016-03-04T08:57:02Z 2016-03-04T08:57:02Z MEMBER

Thanks, this looks very good. Any timeline for the xarray.register_accessor() functionality? ;)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Subclassing Dataset and DataArray 124915222
192114256 https://github.com/pydata/xarray/issues/706#issuecomment-192114256 https://api.github.com/repos/pydata/xarray/issues/706 MDEyOklzc3VlQ29tbWVudDE5MjExNDI1Ng== shoyer 1217238 2016-03-04T05:39:43Z 2016-03-04T05:39:43Z MEMBER

> This would already be quite cool! But would the mechanism allow to pass arguments to the MyLibGis class at construction time? This might also be wordy, maybe something like `ds = xray.DataArray(data, gis={'arg1':42})`?

My suggested approach here would be to simply write functions instead, e.g.,

    def make_gis_array(data, gis=None):
        data = xr.DataArray(data)
        data.attrs['gis'] = gis  # or whatever
        return data

This is similar to how I would suggest inserting lazy variables, i.e., write your own functions using dask.array:

    def add_lazy_vars(data):
        if 'P' in data and 'PB' in data:
            data['TP'] = data['P'].chunk() + data['PB'].chunk()
        return data

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Subclassing Dataset and DataArray 124915222
191697006 https://github.com/pydata/xarray/issues/706#issuecomment-191697006 https://api.github.com/repos/pydata/xarray/issues/706 MDEyOklzc3VlQ29tbWVudDE5MTY5NzAwNg== fmaussion 10050469 2016-03-03T10:29:08Z 2016-03-03T10:29:08Z MEMBER

I find @shoyer's suggestion about custom accessor attributes very interesting!

The simplest of my use cases would be quite easy to implement:

``` python
# MyLib

class MyLibGis(object):
    def __init__(self, xray_obj):
        self.obj = xray_obj
        self.georef = read_georef(xray_obj)

    def subset(self, shapefile=None, roi=None):
        """Return a subset of DataSet (or DataArray)"""
        # compute regions of interests
        slicex, slicey = self.georef(stuff...)
        # return a sel of DataSet
        return self.obj.sel(x=slicex, y=slicey)

xray.register_accessor('gis', MyLibGis)

# user code

import mylib
import xray
ds = xray.DataArray(...)
ds = ds.gis.subset(shapefile='/path/to/shape')
```

This would already be quite cool! But would the mechanism allow passing arguments to the MyLibGis class at construction time? This might also be wordy, maybe something like

ds = xray.DataArray(data, gis={'arg1':42})?

I guess that with these two mechanisms, I would be able to do almost everything I want to do with my netcdf files.

However, one other very important use case for me would be to add lazy "diagnostic" variables to a netcdf dataset. For example, if an atmospheric model output file contains the variables P and PB, then the dataset automatically offers a new variable TP, which is the sum of P and PB. From the user's perspective, this variable is no different from a variable on file. Of course, the data should be computed only on demand. It doesn't seem possible to do this without subclassing, but maybe I missed something?
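To make the "computed only on demand" behaviour concrete, here is a minimal plain-Python sketch. It does not use xarray or dask; `LazyDataset`, `add_derived`, and `total_pressure` are hypothetical names made up for illustration, and the variables are plain lists rather than arrays.

```python
class LazyDataset:
    """Hypothetical container: stored variables plus derived ones built on first access."""

    def __init__(self, variables):
        self._variables = dict(variables)
        self._derived = {}  # name -> (function, names of required inputs)

    def add_derived(self, name, func, requires):
        """Declare a derived variable without computing it."""
        self._derived[name] = (func, tuple(requires))

    def __contains__(self, name):
        if name in self._variables:
            return True
        if name in self._derived:
            # A derived variable is only offered if all of its inputs exist.
            _, requires = self._derived[name]
            return all(r in self for r in requires)
        return False

    def __getitem__(self, name):
        if name in self._variables:
            return self._variables[name]
        if name not in self:
            raise KeyError(name)
        func, requires = self._derived[name]
        value = func(*(self[r] for r in requires))
        self._variables[name] = value  # cache so later accesses do not recompute
        return value


calls = []  # records each time the derived variable is actually computed


def total_pressure(p, pb):
    calls.append("TP")
    return [a + b for a, b in zip(p, pb)]


ds = LazyDataset({"P": [1.0, 2.0], "PB": [10.0, 20.0]})
ds.add_derived("TP", total_pressure, requires=("P", "PB"))

has_tp = "TP" in ds            # the variable is visible...
computed_before = list(calls)  # ...but nothing has been computed yet
tp = ds["TP"]                  # first access triggers the computation
tp_again = ds["TP"]            # second access reuses the cached value
```

With array-backed data, the same "declare now, evaluate later" effect would come from the array library's own lazy evaluation rather than from a wrapper like this one.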

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Subclassing Dataset and DataArray 124915222
170173475 https://github.com/pydata/xarray/issues/706#issuecomment-170173475 https://api.github.com/repos/pydata/xarray/issues/706 MDEyOklzc3VlQ29tbWVudDE3MDE3MzQ3NQ== rafa-guedes 7799184 2016-01-09T00:59:14Z 2016-01-09T00:59:14Z CONTRIBUTOR

Cool, thanks @shoyer. Yes @rabernat, I totally agree with you, and I would be very keen to collaborate on a library like that. I think it would be useful for many people.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Subclassing Dataset and DataArray 124915222
169099306 https://github.com/pydata/xarray/issues/706#issuecomment-169099306 https://api.github.com/repos/pydata/xarray/issues/706 MDEyOklzc3VlQ29tbWVudDE2OTA5OTMwNg== shoyer 1217238 2016-01-05T19:05:57Z 2016-01-05T19:06:06Z MEMBER

Back when I was doing spectroscopy in grad school, I wrote some routines to keep track of the units in Fourier transforms. I put this up on GitHub last year: https://github.com/shoyer/fourier-transform. I'm sure I'm not the only person to have written this code, but it still might be a useful point of departure.

As for xray, I agree that the full extent of what you're describing is probably out of scope for xarray itself. However, a basic labeled FFT does seem like it would be a useful addition to the core library.

Nevertheless, I am very interested in supporting external packages like this, either via subclassing or a similar mechanism.

One possibility would be a mechanism for registering "namespace" packages that define additional methods (as I have mentioned previously). You could write something like:

``` python
# this code exists in your library "specarray"

class SpecArray(object):
    def __init__(self, xray_obj):
        self.obj = xray_obj

    def fft(self):
        ...
        return freq, transformed_obj

xray.register_accessor('spec', SpecArray)

# this is what user code looks like

import specarray
import xray
ds = xray.DataArray(...)
ds.spec.fft()  # calls the SpecArray.fft method
```

This might be easier than maintaining a full subclass, which tends to require a lot of work and presents backwards compatibility issues when we update internal methods.
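For readers wondering how such a registration mechanism could work under the hood, here is a minimal sketch using a plain Python descriptor on a toy class. `ToyArray`, `CachedAccessor`, `SpecAccessor`, and this three-argument `register_accessor` are hypothetical names for illustration only; the mechanism that xarray actually ships may differ in both signature and detail.

```python
class CachedAccessor:
    """Descriptor that builds the accessor on first use and caches it per instance."""

    def __init__(self, name, accessor_cls):
        self._name = name
        self._accessor_cls = accessor_cls

    def __get__(self, obj, owner=None):
        if obj is None:
            # Looked up on the class itself: hand back the accessor class.
            return self._accessor_cls
        accessor = self._accessor_cls(obj)
        # Store on the instance; since this descriptor defines no __set__,
        # the instance attribute takes precedence on later lookups.
        obj.__dict__[self._name] = accessor
        return accessor


def register_accessor(name, accessor_cls, target_cls):
    """Attach accessor_cls to target_cls under the attribute `name`."""
    if hasattr(target_cls, name):
        raise AttributeError(
            f"{target_cls.__name__} already has an attribute {name!r}"
        )
    setattr(target_cls, name, CachedAccessor(name, accessor_cls))


class ToyArray:
    """Hypothetical stand-in for DataArray: just holds a list of numbers."""

    def __init__(self, values):
        self.values = list(values)


class SpecAccessor:
    """Hypothetical namespace of extra methods, analogous to SpecArray above."""

    def __init__(self, obj):
        self.obj = obj

    def total(self):
        return sum(self.obj.values)


register_accessor("spec", SpecAccessor, ToyArray)

arr = ToyArray([1, 2, 3])
result = arr.spec.total()  # calls SpecAccessor.total on the wrapped object
```

Because the extra methods live on a separate object that only holds a reference to the array, the library never has to subclass the array type or touch its internals.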

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Subclassing Dataset and DataArray 124915222
169008093 https://github.com/pydata/xarray/issues/706#issuecomment-169008093 https://api.github.com/repos/pydata/xarray/issues/706 MDEyOklzc3VlQ29tbWVudDE2OTAwODA5Mw== rabernat 1197350 2016-01-05T13:57:34Z 2016-01-05T13:57:34Z MEMBER

Hi Rafael,

I do lots of multidimensional spectral analysis on geophysical data (mostly ocean satellite fields; see this paper, http://journals.ametsoc.org/doi/abs/10.1175/JPO-D-14-0160.1, for example), and I have recently started passing some of these calculations through xray. An example is in this notebook, https://gist.github.com/rabernat/be4526e157eb1fc69f50, where I define a function to compute an isotropic power spectrum over specified dimensions.

One huge source of confusion for students starting out with such calculations is the question: what are the spectral coordinates that come out of the FFT? (E.g., is it "shifted"? Is there a 2 pi factor in the units?) Because of xray's data model, these difficulties can be completely bypassed by including verbose descriptions of the dimensions and coordinates.
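As a small illustration of the two questions above (ordering and the 2 pi factor), the sketch below builds the frequency coordinate by hand in plain Python. It follows the same conventions as `numpy.fft.fftfreq` and `numpy.fft.fftshift`; the helper names `fft_frequencies` and `shift`, and the sample values, are made up for this example.

```python
import math


def fft_frequencies(n, d=1.0):
    """Frequencies in cycles per unit of d, in the order an unshifted FFT returns them.

    Same convention as numpy.fft.fftfreq: non-negative frequencies first,
    followed by the negative ones.
    """
    positive = list(range(0, (n - 1) // 2 + 1))
    negative = list(range(-(n // 2), 0))
    return [k / (n * d) for k in positive + negative]


def shift(values):
    """Rotate so the zero-frequency entry sits in the middle, like numpy.fft.fftshift."""
    half = len(values) // 2
    return values[-half:] + values[:-half]


n = 8        # number of samples
d = 0.5      # sample spacing, e.g. in seconds

freqs = fft_frequencies(n, d)                # cycles per second, unshifted order
shifted = shift(freqs)                       # monotonically increasing order
angular = [2 * math.pi * f for f in freqs]   # radians per second
```

Attaching `freqs` (or `shifted`) as a named coordinate, together with a units attribute saying whether it is in cycles or radians, is what removes the ambiguity.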

My view is that spectral analysis is out of scope for xray. However, I think there is a need for a domain-specific spectral analysis package focused on geophysical data, which would naturally be built on xray. (As a comparison, consider the nitime http://nipy.org/nitime/ package for neuroimaging timeseries analysis.) This is something that I, and probably many others, would be interested in collaborating on. Some features I would like to see are:

- wrapping of numpy fft to work on xray DataArrays, including proper handling of coordinates (pretty easy)
- support for different windowing / multitaper methods
- proper treatment of errors
- built-in plotting
- parallelization for out-of-core data (this is a hard one with fft but would be very useful)

I think such a package would really take off in popularity and would help to displace MATLAB for this very common type of analysis. The question is whether there really is enough common interest among different scientists to justify a new package, as opposed to everyone just "rolling their own" solution. Based on your email, it sounds like you might be interested in such an effort.

Cheers, Ryan Abernathey

On Tue, Jan 5, 2016 at 2:55 AM, Rafael Guedes notifications@github.com wrote:

Hi guys,

I have started writing a SpecArray class which inherits from DataArray and defines some methods useful for dealing with wave spectra, such as calculating spectral wave statistics (significant wave height, peak wave period, etc.), interpolating, splitting, and performing some other tasks. I'd like to ask:

- Is this something you would be interested in adding to your library?
- Is there a simple way to ensure the methods I am defining are preserved when creating a Dataset out of this SpecArray object? Currently I can create or add to a Dataset using this new object, but all the new methods get lost by doing that.

Thanks, Rafael

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Subclassing Dataset and DataArray 124915222

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 14.491ms · About: xarray-datasette