home / github / issues

Menu
  • Search all tables
  • GraphQL API

issues: 499477363

This data as json

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
499477363 MDU6SXNzdWU0OTk0NzczNjM= 3349 Implement polyfit? 1197350 closed 0     25 2019-09-27T14:25:14Z 2020-03-25T17:17:45Z 2020-03-25T17:17:45Z MEMBER      

Fitting a line (or curve) to data along a specified axis is a long-standing need of xarray users. There are many blog posts and SO questions about how to do it: - http://atedstone.github.io/rate-of-change-maps/ - https://gist.github.com/luke-gregor/4bb5c483b2d111e52413b260311fbe43 - https://stackoverflow.com/questions/38960903/applying-numpy-polyfit-to-xarray-dataset - https://stackoverflow.com/questions/52094320/with-xarray-how-to-parallelize-1d-operations-on-a-multidimensional-dataset - https://stackoverflow.com/questions/36275052/applying-a-function-along-an-axis-of-a-dask-array

The main use case in my domain is finding the temporal trend on a 3D variable (e.g. temperature in time, lon, lat).

Yes, you can do it with apply_ufunc, but apply_ufunc is inaccessibly complex for many users. Much of our existing API could be removed and replaced with apply_ufunc calls, but that doesn't mean we should do it.

I am proposing we add a Dataarray method called polyfit. It would work like this:

```python x_ = np.linspace(0, 1, 10) y_ = np.arange(5) a_ = np.cos(y_)

x = xr.DataArray(x_, dims=['x'], coords={'x': x_}) a = xr.DataArray(a_, dims=['y']) f = a*x p = f.polyfit(dim='x', deg=1)

equivalent numpy code

p_ = np.polyfit(x_, f.values.transpose(), 1) np.testing.assert_allclose(p_[0], a_) ```

Numpy's polyfit function is already vectorized in the sense that it accepts 1D x and 2D y, performing the fit independently over each column of y. To extend this to ND, we would just need to reshape the data going in and out of the function. We do this already in other packages. For dask, we could simply require that the dimension over which the fit is calculated be contiguous, and then call map_blocks.

Thoughts?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3349/reactions",
    "total_count": 9,
    "+1": 9,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed 13221727 issue

Links from other tables

  • 0 rows from issues_id in issues_labels
  • 25 rows from issue in issue_comments
Powered by Datasette · Queries took 0.849ms · About: xarray-datasette