issue_comments

26 rows where issue = 210704949 sorted by updated_at descending

user 13

  • shoyer 6
  • nbren12 3
  • fmaussion 3
  • jeffbparker 3
  • rabernat 2
  • lamorton 2
  • dopplershift 1
  • gajomi 1
  • scollis 1
  • marberi 1
  • yvikhlya 1
  • spencerahill 1
  • roxyboy 1

author_association 3

  • MEMBER 11
  • NONE 9
  • CONTRIBUTOR 6

issue 1

  • Add trapz to DataArray for mathematical integration · 26 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
359543819 https://github.com/pydata/xarray/issues/1288#issuecomment-359543819 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDM1OTU0MzgxOQ== shoyer 1217238 2018-01-22T19:50:25Z 2018-01-22T19:50:25Z MEMBER

I opened https://github.com/pydata/xarray/issues/1850 to discuss xarray-contrib.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
359536283 https://github.com/pydata/xarray/issues/1288#issuecomment-359536283 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDM1OTUzNjI4Mw== roxyboy 8934026 2018-01-22T19:25:43Z 2018-01-22T19:28:32Z NONE

I've also contributed to developing a Python package (xrft) for FFTs that stays aware of the metadata of multidimensional xarray datasets.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
359534363 https://github.com/pydata/xarray/issues/1288#issuecomment-359534363 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDM1OTUzNDM2Mw== nbren12 1386642 2018-01-22T19:19:25Z 2018-01-22T19:19:25Z CONTRIBUTOR

I would also be very interested in seeing your codes @lamorton. Overall, I think the xarray community could really benefit from some kind of centralized contrib package which has a low barrier to entry for these kinds of functions. So far, I suspect there has been a large amount of code duplication for routine tasks like the fft, since I have also written a function for that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
359525739 https://github.com/pydata/xarray/issues/1288#issuecomment-359525739 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDM1OTUyNTczOQ== lamorton 23484003 2018-01-22T18:51:34Z 2018-01-22T19:15:31Z NONE

@gajomi I can find a place to upload what I have. I foresee some difficulty making a general wrapper due to the issue of naming conventions, but I like the idea too.

Edit: Here's what I have so far ... YMMV, it's still kinda rough. https://github.com/lamorton/SciPyXarray

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
359521912 https://github.com/pydata/xarray/issues/1288#issuecomment-359521912 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDM1OTUyMTkxMg== gajomi 244887 2018-01-22T18:38:50Z 2018-01-22T18:58:03Z CONTRIBUTOR

I've written wrappers for svd, fft, psd, gradient, and specgram, for starts

@lamorton I really like the suggestion from @shoyer about submodules for throwing wrappers from other libraries, but in the meantime I think I might like very much to check out your implementation of fft and gradient in particular if these are somewhere public. I have been hacking at at least the latter and other functions in the numpy/scipy scope.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
348622691 https://github.com/pydata/xarray/issues/1288#issuecomment-348622691 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDM0ODYyMjY5MQ== yvikhlya 2567105 2017-12-01T21:45:59Z 2017-12-01T21:45:59Z NONE

Hello. I discovered xarray a few days ago, and find it very useful for my job. Integration along a coordinate is one of the few things I have found missing so far.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
311495001 https://github.com/pydata/xarray/issues/1288#issuecomment-311495001 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDMxMTQ5NTAwMQ== scollis 825351 2017-06-27T21:43:36Z 2017-06-27T21:43:36Z NONE

Adding my +1 without offering to do the work. :) This would be very welcome!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
302654511 https://github.com/pydata/xarray/issues/1288#issuecomment-302654511 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDMwMjY1NDUxMQ== marberi 2119690 2017-05-19T09:25:33Z 2017-05-19T09:25:33Z NONE

+1 for integrate. I found this thread when having the same problem.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
293982053 https://github.com/pydata/xarray/issues/1288#issuecomment-293982053 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI5Mzk4MjA1Mw== shoyer 1217238 2017-04-13T18:24:07Z 2017-04-13T18:24:07Z MEMBER

Perhaps a new package would be in order?

I would also be very happy to include many of these in a submodule inside xarray, e.g., xarray.scipy for wrappers of the scipy API. This would make it easier to use internal methods like apply_ufunc (though hopefully that will be public API soon).

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
293979667 https://github.com/pydata/xarray/issues/1288#issuecomment-293979667 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI5Mzk3OTY2Nw== lamorton 23484003 2017-04-13T18:14:53Z 2017-04-13T18:14:53Z NONE

If you give a mouse a cookie, he'll ask for a glass of milk. There are a whole slew of NumPy/SciPy functions that would really benefit from using xarray to organize input/output. I've written wrappers for svd, fft, psd, gradient, and specgram, for starters. Perhaps a new package would be in order?

{
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
287840030 https://github.com/pydata/xarray/issues/1288#issuecomment-287840030 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4Nzg0MDAzMA== shoyer 1217238 2017-03-20T17:43:12Z 2017-03-20T17:43:12Z MEMBER

By the way, the cumtrapz implementation I pasted above matches the scipy version when initial=0, which I also think would be a more sane default for integration.

Yes, I agree with both of you that we should fix initial=0. (I don't know if I would even bother with adding the option.)

As far as implementation is concerned, is there any performance downside to using xarray's shift operators versus delving deeper into dask with map_blocks, etc? I looked into using dask's cumreduction function, but am not sure it is possible to implement the trapezoid method in that way without changing dask.

From a performance perspective, it would be totally fine to implement this either in terms of high level xarray operations like shift/sum/cumsum (manipulating full xarray objects) or in terms of high level dask.array operations like dask.array.cumsum (manipulating dask arrays). I would do whatever is easiest. I'm pretty sure there is no reason why you need to get into dask's low-level API like map_blocks and cumreduction.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
287768014 https://github.com/pydata/xarray/issues/1288#issuecomment-287768014 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4Nzc2ODAxNA== nbren12 1386642 2017-03-20T14:03:10Z 2017-03-20T14:03:10Z CONTRIBUTOR

I usually agree that using too many (or any) switches within functions is not ideal. However, I think this is more important for low level or internal routines. For user facing interfaces, I think it is okay. After all, many numpy and scipy functions have convenient switches that control the return values.

By the way, the cumtrapz implementation I pasted above matches the scipy version when initial=0, which I also think would be a more sane default for integration.

As far as implementation is concerned, is there any performance downside to using xarray's shift operators versus delving deeper into dask with map_blocks, etc? I looked into using dask's cumreduction function, but am not sure it is possible to implement the trapezoid method in that way without changing dask.

On Mon, Mar 20, 2017 at 8:48 AM Fabien Maussion wrote:

An argument against a single function is that the shape of the returned array is different in each case. Also, cumtrapz https://docs.scipy.org/doc/scipy-0.10.1/reference/generated/scipy.integrate.trapz.html has an initial keyword which changes the shape of the returned array. It is currently set to None by default, but should be set to 0 by default IMO.

If this is not a problem, I would also like to have one single function for integration (simpler from a user perspective).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
287749414 https://github.com/pydata/xarray/issues/1288#issuecomment-287749414 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4Nzc0OTQxNA== fmaussion 10050469 2017-03-20T12:48:23Z 2017-03-20T12:48:23Z MEMBER

An argument against a single function is that the shape of the returned array is different in each case. Also, cumtrapz has an initial keyword which changes the shape of the returned array. It is currently set to None by default, but should be set to 0 by default IMO.

If this is not a problem, I would also like to have one single function for integration (simpler from a user perspective).
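
To make the shape argument concrete, here is a small NumPy sketch. Plain arrays stand in for a DataArray, and the helper name `cumulative_trapezoid` with its `initial` argument is illustrative only, modeled on the SciPy keyword mentioned above rather than on any xarray API.

```python
import numpy as np

def cumulative_trapezoid(y, x, initial=None):
    """Running trapezoidal integral of y over x (1-D arrays)."""
    # Area of each trapezoid between neighbouring samples.
    areas = (y[1:] + y[:-1]) / 2.0 * np.diff(x)
    running = np.cumsum(areas)
    if initial is None:
        return running                            # one element shorter than y
    return np.concatenate([[initial], running])   # same length as y

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x                                       # the integral of 2x is x**2

total = cumulative_trapezoid(y, x)[-1]            # scalar: the dimension is gone
short = cumulative_trapezoid(y, x)                # shape (3,)
full = cumulative_trapezoid(y, x, initial=0)      # shape (4,), aligned with x
```

With `initial=0` the result lines up with the original coordinate, which is what makes that default convenient for labeled data.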

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
287680627 https://github.com/pydata/xarray/issues/1288#issuecomment-287680627 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4NzY4MDYyNw== shoyer 1217238 2017-03-20T05:22:10Z 2017-03-20T05:22:10Z MEMBER

Sorry for letting this lapse.

Yes, we absolutely want this functionality in some form.

My concern is that this doesn't feel like functionality that inherently belongs as a method on a DataArray--if it doesn't need to be a method, it shouldn't be. In numpy and scipy, these are separate functions and I think they work fine that way.

This is a fair point, and I agree with you from a purist OO-programming/software-engineering perspective (TensorFlow, for example, takes this approach). But with xarray, we have been taking a different path, putting methods on objects for the convenience of method chaining (like pandas). So from a consistency perspective, I think it's fine to keep these as methods. This is somewhat similar even to NumPy, where a number of the most commonly used functions are also methods.

Perhaps allow generic extension of da.integrate by letting the method keyword of da.integrate accept a function as an argument that performs the actual integration?

I don't see a big advantage to adding such an extension point. Almost assuredly it's less text and more clear to simply write ds.pipe(my_integrate, 'x') or my_integrate(ds, 'x') rather than ds.integrate('x', my_integrate).

Maybe this could be implemented by adding an optional cumulative flag.

I normally don't like adding flags for switching functionality entirely but maybe that would make sense here if there's enough shared code (e.g., simply substituting cumsum for sum). The alternative is something like cum_integrate which sounds kind of awkward and is one more additional method.

One thing that can be useful to do before writing code is to write out a docstring with all the bells and whistles we might eventually add. So let's give that a shot here and see if integrate still makes sense:

```
integrate(dim, method='trapz', cumulative=False)

Arguments
---------
dim : str or DataArray
    DataArray or reference to an existing coordinate, labeling what to integrate over.
cumulative : bool, optional
    Whether to do a non-cumulative (default) or cumulative integral.
method : 'trapz' or 'simps', optional
    Whether to use the trapezoidal rule or Simpson's rule.
```

I could also imagine possibly adding a bounds or limits argument that specifies multiple limits for controlling multiple integrals at once (e.g., dim='x' and bounds=[0, 10, 20, 30, 40, 50] would result in an x dimension of length 5). This would certainly be useful for some of my current work. But maybe we should save this sort of addition for later...
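
As a rough illustration of the bounds idea, the sketch below works on plain NumPy arrays rather than xarray objects. The function name `integrate_between` is hypothetical, and it assumes every bound coincides with an existing sample, so no interpolation is attempted.

```python
import numpy as np

def integrate_between(y, x, bounds):
    """Trapezoidal integrals of y(x) over each interval [bounds[i], bounds[i+1]].

    Assumes every bound coincides with a sample in x (no interpolation).
    """
    # Cumulative integral starting at zero, aligned with x.
    areas = (y[1:] + y[:-1]) / 2.0 * np.diff(x)
    running = np.concatenate([[0.0], np.cumsum(areas)])
    # Positions of the bounds within x, then differences of the running integral.
    idx = np.searchsorted(x, bounds)
    return np.diff(running[idx])

x = np.arange(0.0, 51.0, 5.0)       # samples at 0, 5, ..., 50
y = np.ones_like(x)                 # constant integrand of 1
result = integrate_between(y, x, bounds=[0, 10, 20, 30, 40, 50])
```

Six bounds give five intervals, so `result` has length 5, matching the example in the comment above.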

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
287673906 https://github.com/pydata/xarray/issues/1288#issuecomment-287673906 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4NzY3MzkwNg== nbren12 1386642 2017-03-20T03:45:18Z 2017-03-20T03:45:18Z CONTRIBUTOR

I would also like to see an integrate function. I have had one monkey patched in my own xarray routines for a while now. Also wanted: cumtrapz and friends. Maybe this could be implemented by adding an optional cumulative flag. This shouldn't be too hard to do. For example, in the following cumtrapz implementation all that would need to be changed is the final cumsum call.

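```python
def cumtrapz(A, dim):
    """Cumulative trapezoidal rule (aka Tai's method)

    Notes
    -----
    The trapezoidal rule is given by
        int f(x) dx ~= sum_i (f_i + f_{i+1}) * dx_i / 2
    """
    x = A[dim]
    # Spacing between consecutive coordinate values; the first entry
    # has no predecessor, so its NaN is replaced by zero.
    dx = x - x.shift(**{dim: 1})
    dx = dx.fillna(0.0)
    # Average of neighbouring samples times the spacing, accumulated along dim.
    return ((A.shift(**{dim: 1}) + A) * dx / 2.0).fillna(0.0).cumsum(dim)
```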

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
283162736 https://github.com/pydata/xarray/issues/1288#issuecomment-283162736 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4MzE2MjczNg== jeffbparker 16630731 2017-02-28T21:15:39Z 2017-02-28T21:15:39Z NONE

The issue is that certain types of gridded data (such as output from numerical models) should actually not be integrated with the trapezoidal rule but rather should use the native finite volume discretization for their computational grid.

  • We are aiming for the 20% of functionality that covers 80% of use cases, not the long tail.
  • We don't want implementations of any complex numerical methods in xarray (like NumPy rather than SciPy).

I can see the problems down the road that @rabernat brings up. Say you have a high-order finite volume discretization and some numerical implementation of high-order integration for that gridding. What would your interface be? You could write it as new_integrate(da, dim, domain) but then it may be confusing to have da.integrate be different (and less accurate).

That might bring us back to the algorithmically descriptive name trapz, but then what about @shoyer's point that given a fixed gridding, da.integrate is the most readable choice of name? Perhaps allow generic extension of da.integrate by letting the method keyword of da.integrate accept a function as an argument that performs the actual integration?
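
One way this extension point could look is sketched below on plain NumPy arrays. Everything here is hypothetical: the `integrate` signature, the `BUILTIN_METHODS` table, and the two rule functions are illustrative names, not part of xarray.

```python
import numpy as np

def trapezoid_rule(y, x):
    """Default integrator: trapezoidal rule over 1-D samples."""
    return np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x))

def rectangle_rule(y, x):
    """Alternative integrator: left-endpoint rectangle rule."""
    return np.sum(y[:-1] * np.diff(x))

# Named methods that the hypothetical integrate() knows about.
BUILTIN_METHODS = {"trapz": trapezoid_rule}

def integrate(y, x, method="trapz"):
    """Integrate y over x; `method` is a known name or any callable f(y, x)."""
    integrator = method if callable(method) else BUILTIN_METHODS[method]
    return integrator(y, x)

x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0, 2.0])

by_name = integrate(y, x)                             # built-in trapezoid
by_function = integrate(y, x, method=rectangle_rule)  # user-supplied rule
```

A grid-aware package could then pass its own integrator while users keep calling the same entry point.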

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
283158736 https://github.com/pydata/xarray/issues/1288#issuecomment-283158736 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4MzE1ODczNg== dopplershift 221526 2017-02-28T21:00:30Z 2017-02-28T21:00:30Z CONTRIBUTOR

👍 for the functionality (both integrate and gradient) that works with DataArray. My concern is that this doesn't feel like functionality that inherently belongs as a method on a DataArray--if it doesn't need to be a method, it shouldn't be. In numpy and scipy, these are separate functions and I think they work fine that way.

Another way to look at it is that methods are there to encapsulate some kind of manipulation of internal state or to ensure that some kind of invariant is maintained. I don't see how integrate is doing any of this for DataArray--it seems like everything integrate would do can be accomplished using the public API. So really what you're buying is doing this:

da.integrate(dim='x', method='trapezoidal')

instead of

integrate(da, dim='x', method='trapezoidal')

If you want to see what the pathological case of putting everything as a method for convenience looks like, go look at all the plot methods on matplotlib's Axes class. Pay special attention to the tangled web of stuff that comes from having ready access to the class's internals.

My real preference would just be to have this work:

ds = xr.tutorial.load_dataset('rasm')
np.trapz(ds['Tair'], axis='x')

but I have no idea what that would take, so I'm perfectly fine with xarray gaining its own implementation.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
283143896 https://github.com/pydata/xarray/issues/1288#issuecomment-283143896 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4MzE0Mzg5Ng== spencerahill 6200806 2017-02-28T19:52:23Z 2017-02-28T19:52:23Z CONTRIBUTOR

I like the integrate idea. Nothing further to add not already covered nicely via the above concerns by @rabernat and responses by @shoyer.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
283109247 https://github.com/pydata/xarray/issues/1288#issuecomment-283109247 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4MzEwOTI0Nw== shoyer 1217238 2017-02-28T17:34:05Z 2017-02-28T19:00:00Z MEMBER

As usual @rabernat raises some excellent points!

I weakly prefer not to use the name integrate and instead keep the standard scipy names because they make clear the numerical algorithm that is being applied.

Yes, this is a totally valid concern, if a user might expect integrate to be calculating something different.

One point in favor of calling this integrate is that the name is highly searchable, which provides an excellent place to include documentation about how to integrate in general (including links to other packages, like pangeo's vector calculus package). But we know that nobody reads documentation ;).

But where does it end? Why not implement the rest of the scipy.ode module?

Looking at the rest of scipy.integrate, in some ways the functionality of trapz/cumtrapz/simps is uniquely well suited for xarray: they are focused on data ("given fixed samples") rather than solving a system of equations ("given a function").

numpy.gradient feels very complementary as well, so I could see that as also in scope, but there are similar concerns for the name. There might be some value in complementary names for integrals/gradients.
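
A quick NumPy sketch of how the two operations complement each other, using a made-up linear signal on an unevenly spaced coordinate (the arrays are illustrative, not taken from any dataset):

```python
import numpy as np

x = np.array([0.0, 0.5, 2.0, 3.0])   # unevenly spaced coordinate
y = 3.0 * x + 1.0                    # linear signal with slope 3

# Derivative with respect to the coordinate values, not the index.
slope = np.gradient(y, x)

# Trapezoidal integral of the derivative recovers the change in y.
recovered = np.sum((slope[1:] + slope[:-1]) / 2.0 * np.diff(x))
```

Both steps need the coordinate values, which is the part labeled arrays can supply automatically.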

As a community we need to develop a roadmap that clearly defines the scope of xarray.

I doubt we'll be able to come up with hard and fast rules, but maybe we can enumerate some principles, e.g.,

  • Features should be useful to users in multiple fields.
  • Features should be primarily about working with labeled data.
  • We are aiming for the 20% of functionality that covers 80% of use cases, not the long tail.
  • We don't want implementations of any complex numerical methods in xarray (like NumPy rather than SciPy).
  • Sometimes it's OK to include a feature in xarray because it makes logical sense with the rest of the package even if it's slightly domain specific (e.g., CF-conventions for netCDF files).
{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
283128841 https://github.com/pydata/xarray/issues/1288#issuecomment-283128841 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4MzEyODg0MQ== fmaussion 10050469 2017-02-28T18:53:41Z 2017-02-28T18:53:41Z MEMBER

I weakly prefer not to use the name integrate and instead keep the standard scipy names because they make clear the numerical algorithm that is being applied.

Yes, this is a totally valid concern, if a user might expect integrate to be calculating something different. One point in favor of calling this integrate is that the name is highly searchable, which provides an excellent place to include documentation about how to integrate in general (including links to other packages, like pangeo's vector calculus package). But we know that nobody reads documentation ;).

integrate would allow us to do things like:

da.integrate(how='rectangle')

da.integrate(how='trapezoidal')

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
283127924 https://github.com/pydata/xarray/issues/1288#issuecomment-283127924 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4MzEyNzkyNA== rabernat 1197350 2017-02-28T18:50:11Z 2017-02-28T18:50:11Z MEMBER

And I'm fine with integrate if that is the consensus here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
283062107 https://github.com/pydata/xarray/issues/1288#issuecomment-283062107 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4MzA2MjEwNw== rabernat 1197350 2017-02-28T14:59:54Z 2017-02-28T14:59:54Z MEMBER

Having an xarray wrapper on trapz or cumtrapz would definitely be useful for many users. I weakly prefer not to use the name integrate and instead keep the standard scipy names because they make clear the numerical algorithm that is being applied. The issue is that certain types of gridded data (such as output from numerical models) should actually not be integrated with the trapezoidal rule but rather should use the native finite volume discretization for their computational grid. The goal of our hypothetical pangeo vector calculus package is to implement integrals and derivatives in such a context. A built-in xarray integration function would apply in cases where the data is assumed to be continuous, and where no auxiliary information about the grid (beyond the coordinates) is available.
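
The difference can be seen in a toy NumPy example. It assumes a hypothetical model that stores exact cell averages of f(x) = x**2 on four equal cells covering [0, 1], and compares the native finite-volume sum with a trapezoidal rule that treats those averages as point samples at the cell centers.

```python
import numpy as np

# Four equal cells covering [0, 1]; the model stores one cell average each.
edges = np.linspace(0.0, 1.0, 5)
widths = np.diff(edges)
centers = (edges[:-1] + edges[1:]) / 2.0

# Exact cell averages of f(x) = x**2, as a finite-volume model would hold them.
cell_avg = (edges[1:] ** 3 - edges[:-1] ** 3) / (3.0 * widths)

# Native finite-volume integral: cell average times cell width.
fv_integral = np.sum(cell_avg * widths)

# Trapezoidal rule applied to the same values treated as point samples
# at the cell centers; it only spans the first to the last center.
trapz_integral = np.sum((cell_avg[1:] + cell_avg[:-1]) / 2.0 * np.diff(centers))
```

Here the finite-volume sum reproduces the exact integral of 1/3 up to rounding, while the trapezoidal estimate comes out noticeably smaller because it ignores the half cells at both ends.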

I will also make the same comment I always make when such feature requests are raised: yes, it always seems desirable to add new features to xarray on a function-by-function basis. But where does it end? Why not implement the rest of the scipy.ode module? And why stop there? As a community we need to develop a roadmap that clearly defines the scope of xarray. Once apply is stable, it might not be that hard to wrap a large fraction of the scipy library. But maybe that should live in a separate package.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
282970673 https://github.com/pydata/xarray/issues/1288#issuecomment-282970673 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4Mjk3MDY3Mw== jeffbparker 16630731 2017-02-28T08:07:49Z 2017-02-28T08:14:06Z NONE

An integrate method is probably better for the reason you describe---it's more obvious. I believe the name trapz came from Matlab originally.

With a general integrate, it's probably also useful to allow optional input arguments for lower_bound and upper_bound as a convenience for integrating over a subset of the data instead of the user doing that in a slice. If those arguments aren't given, they would default to all of the data.
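
A minimal sketch of such convenience arguments, written against plain NumPy arrays. The function and its `lower_bound`/`upper_bound` handling are illustrative assumptions: samples outside the range are simply dropped, with no interpolation at the bounds.

```python
import numpy as np

def integrate(y, x, lower_bound=None, upper_bound=None):
    """Trapezoidal integral of y over x, optionally restricted to a sub-range.

    Samples outside [lower_bound, upper_bound] are dropped; no interpolation
    is done at the bounds, so they should coincide with coordinate values.
    """
    lo = x[0] if lower_bound is None else lower_bound
    hi = x[-1] if upper_bound is None else upper_bound
    keep = (x >= lo) & (x <= hi)
    xs, ys = x[keep], y[keep]
    return np.sum((ys[1:] + ys[:-1]) / 2.0 * np.diff(xs))

x = np.linspace(0.0, 10.0, 11)
y = np.full_like(x, 2.0)            # constant integrand of 2

everything = integrate(y, x)        # bounds default to the whole range
subset = integrate(y, x, lower_bound=2.0, upper_bound=5.0)
```

In xarray itself the same effect is available by slicing along the coordinate before integrating, which is the manual step these arguments would save.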

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
282968309 https://github.com/pydata/xarray/issues/1288#issuecomment-282968309 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4Mjk2ODMwOQ== shoyer 1217238 2017-02-28T07:55:14Z 2017-02-28T08:09:23Z MEMBER

I agree that the API should mostly copy the mean/sum reduce methods (and in fact the implementation could probably share much of the logic). But there's still a question of whether the API should expose multiple methods like DataArray.trapz/DataArray.simps or a single method like DataArray.integrate (with method='simps'/method='trapz').

As long as there isn't something else we'd want to reserve the name for, I like the sound of integrate a little better, because it's more self-descriptive. trapz is only obvious if you know the name of the NumPy function. In contrast, integrate is the obvious way to approximate an integral. I would only hold off on using integrate if there is different functionality that comes to mind with the same name.

It looks like SciPy implements Simpson's rule with the same API (see scipy.integrate.simps), so that would be easy to support, too. Given how prevalent SciPy is these days, I would have no compunctions about making scipy required for this method and defaulting to method='simps' for DataArray.integrate.

It would be useful to have dask.array versions of these functions, too, but that's not essential for a first pass. The implementation of trapz is very simple, so this would be quite easy to add to dask.
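
For reference, the two rules under discussion can be written out as follows, for samples $f_i = f(x_i)$ with $i = 0, \dots, n$. The trapezoidal form holds for arbitrary spacing; the Simpson form shown is the textbook composite rule, which assumes uniform spacing $h$ and an even number of intervals $n$.

```latex
\int_{x_0}^{x_n} f(x)\,dx \;\approx\; \sum_{i=0}^{n-1} \frac{f_i + f_{i+1}}{2}\,\bigl(x_{i+1} - x_i\bigr)
\qquad \text{(trapezoidal rule)}

\int_{x_0}^{x_n} f(x)\,dx \;\approx\; \frac{h}{3}\Bigl[f_0 + 4\sum_{i\ \mathrm{odd}} f_i + 2\sum_{\substack{i\ \mathrm{even} \\ 0 < i < n}} f_i + f_n\Bigr]
\qquad \text{(composite Simpson's rule)}
```

The trapezoidal sum only needs slicing, elementwise arithmetic and a reduction, which is why it maps easily onto array libraries.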

CC @spencerahill @rabernat @lesommer in case any of you have opinions about this

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
282970580 https://github.com/pydata/xarray/issues/1288#issuecomment-282970580 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4Mjk3MDU4MA== fmaussion 10050469 2017-02-28T08:07:20Z 2017-02-28T08:07:20Z MEMBER

+1 for integrate

The cumulative integral is very frequently used in atmospheric sciences, too: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.integrate.cumtrapz.html

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949
282963661 https://github.com/pydata/xarray/issues/1288#issuecomment-282963661 https://api.github.com/repos/pydata/xarray/issues/1288 MDEyOklzc3VlQ29tbWVudDI4Mjk2MzY2MQ== jeffbparker 16630731 2017-02-28T07:28:54Z 2017-02-28T07:37:58Z NONE

I don't at the moment see a reason to use a different API than DataArray.mean or DataArray.sum. DataArrays assume a default spacing of 1 if coordinates are not given, which is exactly what np.trapz does. So the API for trapz might look like:

DataArray.trapz(dim=None, axis=None, skipna=None, keep_attrs=False, **kwargs)
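
As a sketch of the reduce-style behaviour described here, the NumPy function below integrates along one axis with unit spacing by default. The name `trapz` and its `axis`/`dx` arguments are illustrative; the remaining keywords of the proposed signature (`skipna`, `keep_attrs`) are not modeled.

```python
import numpy as np

def trapz(values, axis=-1, dx=1.0):
    """Trapezoidal rule along one axis, with unit spacing by default."""
    values = np.asarray(values, dtype=float)
    # Move the integration axis to the end so slicing is uniform.
    moved = np.moveaxis(values, axis, -1)
    return np.sum((moved[..., 1:] + moved[..., :-1]) / 2.0 * dx, axis=-1)

data = np.array([[0.0, 1.0, 2.0],
                 [1.0, 1.0, 1.0]])

along_columns = trapz(data, axis=1)   # reduces the last axis, like sum(axis=1)
along_rows = trapz(data, axis=0)      # reduces the first axis
```

Like mean or sum, the integrated axis disappears from the result while the other axis is preserved.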

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add trapz to DataArray for mathematical integration 210704949

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);