home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

6 rows where issue = 210704949 and user = 1217238 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 1

  • shoyer · 6 ✖

issue 1

  • Add trapz to DataArray for mathematical integration · 6 ✖

author_association 1

  • MEMBER 6
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
359543819 https://github.com/pydata/xarray/issues/1288#issuecomment-359543819 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDM1OTU0MzgxOQ== shoyer 1217238 2018-01-22T19:50:25Z 2018-01-22T19:50:25Z MEMBER

I opened https://github.com/pydata/xarray/issues/1850 to discuss xarray-contrib.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
293982053 https://github.com/pydata/xarray/issues/1288#issuecomment-293982053 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI5Mzk4MjA1Mw== shoyer 1217238 2017-04-13T18:24:07Z 2017-04-13T18:24:07Z MEMBER

Perhaps a new package would be in order?

I would also be very happy to include many of these in a submodule inside xarray, e.g., xarray.scipy for wrappers of the scipy API. This would make it easier to use internal methods like apply_ufunc (though hopefully that will be public API soon).

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
287840030 https://github.com/pydata/xarray/issues/1288#issuecomment-287840030 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4Nzg0MDAzMA== shoyer 1217238 2017-03-20T17:43:12Z 2017-03-20T17:43:12Z MEMBER

By the way, the cumtrapz implementation I pasted above matches the scipy version when initial=0, which I also think would be a more sane default for integration.

Yes, I agree with both of you that we should fix initial=0. (I don't know if I would even bother with adding the option.)

As far as implementation is concerned. Is there any performance downside to using xarrays shift operators versus delving deeper into dask with map_blocks, etc? I looked into using dasks cumreduction function, but am not sure it is possible to implement the trapezoid method in that way without changing dask.

From a performance perspective, it would be totally fine to implement this either in terms of high level xarray operations like shift/sum/cumsum (manipulating full xarray objects) or in terms of high level dask.array operations like dask.array.cumsum (manipulating dask arrays). I would whatever is easiest. I'm pretty sure there is no reason why you need to get into dask's low-level API like map_blocks and cumreduction.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
287680627 https://github.com/pydata/xarray/issues/1288#issuecomment-287680627 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4NzY4MDYyNw== shoyer 1217238 2017-03-20T05:22:10Z 2017-03-20T05:22:10Z MEMBER

Sorry for letting this lapse.

Yes, we absolutely want this functionality in some form.

My concern is that this doesn't feel like functionality that inherently belongs as a method on a DataArray--if doesn't need to be a method, it shouldn't be. In numpy and scipy, these are separate functions and I think they work fine that way.

This is a fair point, and I agree with you from a purist OO-programming/software-engineering perspective (TensorFlow, for example, takes this approach). But with xarray, we have been taking a different path, putting methods on objects for the convenience of method chaining (like pandas). So from a consistency perspective, I think it's fine to keep these as methods. This is somewhat similar even to NumPy, where a number of the most commonly used functions are also methods.

Perhaps allow generic extension of da.integrate by letting the method keyword of da.integrate accept a function as an argument that performs the actual integration?

I don't see a big advantage to adding such an extension point. Almost assuredly it's less text and more clear to simply write ds.pipe(my_integrate, 'x') or my_integrate(ds, 'x') rather than ds.integrate('x', my_integrate).

Maybe this could be implemented by adding an optional cumulative flag.

I normally don't like adding flags for switching functionality entirely but maybe that would make sense here if there's enough shared code (e.g., simply substituting cumsum for sum). The alternative is something like cum_integrate which sounds kind of awkward and is one more additional method.

One thing that can be useful to do before writing code is to write out a docstring with all the bells and whistles we might eventually add. So let's give that a shot here and see if integrate still makes sense: ``` integrate(dim, method='trapz', cumulative=False)

Arguments

dim : str or DataArray DataArray or reference to an existing coordinate, labeling what to integrate over. cumulative : bool, optional Whether to do a non-cumulative (default) or cumulative integral. method : 'trapz' or 'simps', optional Whether to use the trapezoidal rule or Simpson's rule. ```

I could also imagine possibly adding a bounds or limits argument that specifies multiple limits for controlling multiple integrals at once (e.g., dim='x' and bounds=[0, 10, 20, 30, 40, 50] would result in an x dimension of length 5). This would certainly be useful for some of my current work. But maybe we should save this sort of add for later...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
283109247 https://github.com/pydata/xarray/issues/1288#issuecomment-283109247 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4MzEwOTI0Nw== shoyer 1217238 2017-02-28T17:34:05Z 2017-02-28T19:00:00Z MEMBER

As usual @rabernat raises some excellent points!

I weakly prefer not to use the name integrate and instead keep the standard scipy names because they make clear the numerical algorithm that is being applied.

Yes, this is a totally valid concern, if a user might expect integrate to be calculating something different.

One point in favor of calling this integrate is that the name is highly searchable, which provides an excellent place to include documentation about how to integrate in general (including links to other packages, like pangeo's vector calculus package). But we know that nobody reads documentation ;).

But where does it end? Why not implement the rest of the scipy.ode module?

Looking at the rest of scipy.integrate, in some ways the functionality of trapz/cumtrapz/simps is uniquely well suited for xarray: they are focused on data ("given fixed samples") rather than solving a system of equations ("given a function").

numpy.gradient feels very complementary as well, so I could see that as also in scope, but there are similar concerns for the name. There might be some value in complementary names for integrals/gradients.

As a community we need to develop a roadmap that clearly defines the scope of xarray.

I doubt we'll be able to come up with hard and fast rules, but maybe we can enumerate some principles, e.g.,

  • Features should be useful to users in multiple fields.
  • Features should be primarily about working with labeled data.
  • We are aiming for the 20% of functionality that covers 80% of use cases, not the long tail.
  • We don't want implementations of any complex numerical methods in xarray (like NumPy rather than SciPy).
  • Sometimes it's OK to include a feature in xarray because it makes logical sense with the rest of the package even if it's slightly domain specific (e.g., CF-conventions for netCDF files).
{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
282968309 https://github.com/pydata/xarray/issues/1288#issuecomment-282968309 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4Mjk2ODMwOQ== shoyer 1217238 2017-02-28T07:55:14Z 2017-02-28T08:09:23Z MEMBER

I agree that the API should mostly copy the mean/sum reduce methods (and in fact the implementation could probably share much of the logic). But there's still a question of whether the API should expose multiple methods like DataArray.trapz/DataArray.simps or a single method like DataArray.integrate (with method='simps'/method='trapz').

As long as there isn't something else we'd want to reserve the name for, I like the sound of integrate a little better, because it's more self-descriptive. trapz is only obvious if you know the name of the NumPy method. In contrast, integrate is the obvious way to approximate an integral. I would only hold off on using integrate if there is different functionality that comes to mind with the same.

It looks like SciPy implements Simpson's rule with the same API (see scipy.integrate.simps), so that would be easy to support, too. Given how prevalent SciPy is these days, I would have no compunctions about making scipy required for this method and defaulting to method='simps' for DataArray.integrate.

It would be useful to have dask.array versions of these functions, too, but that's not essential for a first pass. The implementation of trapz is very simple, so this would be quite easy to add to dask.

CC @spencerahill @rabernat @lesommer in case any of you have opinions about this

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 176.364ms · About: xarray-datasette