html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/4849#issuecomment-810635014,https://api.github.com/repos/pydata/xarray/issues/4849,810635014,MDEyOklzc3VlQ29tbWVudDgxMDYzNTAxNA==,39069044,2021-03-30T23:04:52Z,2021-03-30T23:04:52Z,CONTRIBUTOR,I think so. I pushed a merge commit to get this up to date with the current release.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,797302408
https://github.com/pydata/xarray/pull/4849#issuecomment-782731258,https://api.github.com/repos/pydata/xarray/issues/4849,782731258,MDEyOklzc3VlQ29tbWVudDc4MjczMTI1OA==,39069044,2021-02-20T18:46:55Z,2021-02-20T18:46:55Z,CONTRIBUTOR,"Thanks for the review @dcherian! The latest commit has been refactored with a couple helper functions and associated tests, and any steps that served no purpose other than consistency with `polyfit` have been removed.
If you can think of any more specific test cases that should be included, happy to add them.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,797302408
https://github.com/pydata/xarray/pull/4849#issuecomment-777602757,https://api.github.com/repos/pydata/xarray/issues/4849,777602757,MDEyOklzc3VlQ29tbWVudDc3NzYwMjc1Nw==,39069044,2021-02-11T16:02:54Z,2021-02-11T16:02:54Z,CONTRIBUTOR,"I've been playing around with this some more, and found the performance to be much better using a process-heavy dask scheduler. For example:
```python
import xarray as xr
import numpy as np
import time
import dask
import dask.distributed


def exponential(x, a, xc):
    return np.exp((x - xc) / a)


x = np.arange(-5, 5, 0.001)
t = np.arange(-5, 5, 0.01)
X, T = np.meshgrid(x, t)
Z1 = np.random.uniform(low=-5, high=5, size=X.shape)
Z2 = exponential(Z1, 3, X) + np.random.normal(scale=0.1, size=Z1.shape)
ds = xr.Dataset(
    data_vars=dict(var1=([""t"", ""x""], Z1), var2=([""t"", ""x""], Z2)),
    coords={""t"": t, ""x"": x},
)
ds = ds.chunk({'x': 10})


def test_fit():
    start = time.time()
    fit = ds.var2.curvefit(
        coords=ds.var1,
        func=exponential,
        reduce_dim=""t"",
    ).compute()
    print(f'Fitting time: {time.time() - start:.2f}s')


with dask.config.set(scheduler='threads'):
    test_fit()

with dask.config.set(scheduler='processes'):
    test_fit()

with dask.distributed.Client() as client:
    test_fit()

with dask.distributed.Client(n_workers=8, threads_per_worker=1) as client:
    test_fit()
```
On my 8-core machine, this takes:
```
Fitting time: 8.32s
Fitting time: 2.71s
Fitting time: 4.40s
Fitting time: 3.43s
```
According to [this](https://github.com/scipy/scipy/pull/8740), the underlying scipy routines should be thread-safe.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,797302408
https://github.com/pydata/xarray/pull/4849#issuecomment-772874760,https://api.github.com/repos/pydata/xarray/issues/4849,772874760,MDEyOklzc3VlQ29tbWVudDc3Mjg3NDc2MA==,39069044,2021-02-03T22:37:50Z,2021-02-03T22:37:50Z,CONTRIBUTOR,"Added some checks that will raise errors if `inspect.signature` is not able to determine the function arguments, either for functions with variable-length args or things like numpy ufuncs that don't seem to work with `signature`. In these cases, `param_names` can be passed manually.
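For illustration, a rough stand-alone sketch of the kind of check this involves (the helper name and error messages here are mine, not the PR's actual code):

```python
import inspect

def get_fit_params(func, param_names=None):
    # Illustrative helper: recover fit parameter names from a function
    # signature, falling back to an explicit param_names list.
    if param_names is not None:
        return list(param_names)
    try:
        sig = inspect.signature(func)
    except (ValueError, TypeError):
        # e.g. numpy ufuncs expose no introspectable signature
        raise ValueError('Unable to inspect function signature; please supply param_names')
    params = list(sig.parameters.values())
    if any(p.kind == inspect.Parameter.VAR_POSITIONAL for p in params):
        raise ValueError('Variable-length *args detected; please supply param_names')
    # first argument is the fitting coordinate, the rest are fit parameters
    return [p.name for p in params[1:]]
```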
Also added minimal tests, but these should probably be expanded.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,797302408
https://github.com/pydata/xarray/pull/4849#issuecomment-771907303,https://api.github.com/repos/pydata/xarray/issues/4849,771907303,MDEyOklzc3VlQ29tbWVudDc3MTkwNzMwMw==,39069044,2021-02-02T19:19:01Z,2021-02-02T19:19:01Z,CONTRIBUTOR,"Added a couple usage examples in the docs, including one that replicates the [scipy](https://scipython.com/blog/non-linear-least-squares-fitting-of-a-two-dimensional-data/) example of fitting multiple peaks. Because of the wrapper function and variable args, this requires supplying `param_names` manually. Works nicely though.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,797302408
https://github.com/pydata/xarray/pull/4849#issuecomment-771320891,https://api.github.com/repos/pydata/xarray/issues/4849,771320891,MDEyOklzc3VlQ29tbWVudDc3MTMyMDg5MQ==,39069044,2021-02-02T03:12:41Z,2021-02-02T03:12:41Z,CONTRIBUTOR,"Some more progress here.
1. I liked the idea of being able to supply `p0` and `bounds` as a dictionary mapping between parameter names and values, which is much more in the style of xarray. So I've included the logic to translate between this and the scipy args which are just ordered lists. The logic first prioritizes any of these values supplied directly in keyword args, then looks for defaults in the function definition, and otherwise falls back to the scipy defaults.
2. There's now a working implementation of multi-dimensional fits using `ravel` to flatten the coordinate arrays. One limitation is that `func` must be specified like `f((x, y), z)` as scipy takes a single `x` argument for the fitting coordinate. Maybe there is a clever way around this, but I think it's ok.
3. Some changes to the API.
The best way to specify the fitting coordinates is a bit tricky to figure out. My original use case for this was needing to fit a relationship between two time/lat/lon dataarrays with the fit done over all time. But probably a more common use would be to just fit a curve over one or two dimensions that already exist in your data. So it would be great to handle these possibilities seamlessly.
What I've settled on for now is a `coords` argument that takes a list of coordinates that should be the same length as the input coordinates of your fitting function. Additionally, there is a `reduce_dim` arg so that, say, if you want to fit a function in time but aggregate over all lat and lon, you can supply those dimensions here. So with a 3D x/y/time array, any of the following should work:
```python
# Fit a 1d function in time, returns parameters with dims (x, y)
da.curvefit(coords='time', ...)
# Fit a 2d function in space, returns parameters with dims (time)
da.curvefit(coords=['x', 'y'], ...)
# Fit a 1d function with another 3d dataarray and aggregate over time, returns parameters with dims (x, y)
da.curvefit(coords=da1, reduce_dim='time', ...)
```
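The ravel trick from point 2 is essentially the standard scipy approach to multi-dimensional fits; a minimal stand-alone version in plain numpy/scipy terms (illustrative, not the xarray wrapper itself):

```python
import numpy as np
from scipy.optimize import curve_fit

def plane(coords, a, b, c):
    # func takes a single tuple of coordinates, as scipy requires
    x, y = coords
    return a * x + b * y + c

x = np.linspace(0, 1, 20)
y = np.linspace(0, 1, 30)
X, Y = np.meshgrid(x, y)  # broadcast coords to a common shape
Z = plane((X, Y), 2.0, -1.0, 0.5)

# flatten everything so curve_fit sees 1d arrays
popt, _ = curve_fit(plane, (X.ravel(), Y.ravel()), Z.ravel())
```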
The logic to make this work got a bit complicated, since we need to supply the right `input_core_dims` to `apply_ufunc`, and also to explicitly broadcast the coordinates to ensure the `ravel` operation works. Any thoughts on cleanup here appreciated.
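On point 1, the priority order for `p0`/`bounds` could be sketched stand-alone like this (simplified and illustrative; the helper name and exact shape are mine, not the PR code):

```python
import numpy as np

def resolve_params(param_names, p0=None, bounds=None, func_defaults=None):
    # Build ordered p0/bounds lists for scipy.optimize.curve_fit from
    # (1) user-supplied dicts, (2) defaults in the function definition,
    # (3) the scipy fallbacks (p0=1, unbounded).
    p0 = p0 or {}
    bounds = bounds or {}
    func_defaults = func_defaults or {}
    guesses, lower, upper = [], [], []
    for name in param_names:
        guesses.append(p0.get(name, func_defaults.get(name, 1.0)))
        lb, ub = bounds.get(name, (-np.inf, np.inf))
        lower.append(lb)
        upper.append(ub)
    return guesses, (lower, upper)
```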
Will eventually need to add tests and improve docs and examples. Tests especially I could use some help on.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,797302408
https://github.com/pydata/xarray/pull/4849#issuecomment-770395357,https://api.github.com/repos/pydata/xarray/issues/4849,770395357,MDEyOklzc3VlQ29tbWVudDc3MDM5NTM1Nw==,39069044,2021-01-31T14:59:12Z,2021-01-31T14:59:12Z,CONTRIBUTOR,"> 1. Different fit coefficients as differently-named variables in the output, rather than indexed with a coordinate.
I think the way I configured things now does replicate the polyfit results. For example:
```python
ds = xr.tutorial.open_dataset('air_temperature')
ds['air2'] = ds.air.copy()
ds.polyfit(dim='time', deg=2)
Dimensions: (degree: 3, lat: 25, lon: 53)
Coordinates:
* degree (degree) int64 2 1 0
* lat (lat) float64 75.0 72.5 70.0 ... 20.0 17.5 15.0
* lon (lon) float64 200.0 202.5 205.0 ... 327.5 330.0
Data variables:
air_polyfit_coefficients (degree, lat, lon) float64 -1.162e-32 ... 1.13...
air2_polyfit_coefficients (degree, lat, lon) float64 -1.162e-32 ... 1.14...
```
Compared to this:
```python
def square(x, a, b, c):
return a*np.power(x, 2) + b*x + c
ds.curvefit(x=ds.time, dim='time', func=square)
Dimensions: (lat: 25, lon: 53, param: 3)
Coordinates:
* lat (lat) float32 75.0 72.5 70.0 ... 20.0 17.5 15.0
* lon (lon) float32 200.0 202.5 205.0 ... 327.5 330.0
* param (param)
```

> 2. Initial guesses for each fit parameter.
Yeah this would be good. Should be easy to look for default values in the function itself using `inspect`.
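e.g. a quick sketch of what pulling defaults could look like (illustrative):

```python
import inspect
import numpy as np

def exponential(x, a=1.0, xc=0.0):
    return np.exp((x - xc) / a)

# collect default values from the function definition to seed p0
sig = inspect.signature(exponential)
defaults = {
    name: p.default
    for name, p in sig.parameters.items()
    if p.default is not inspect.Parameter.empty
}
```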
> 3. (Stretch goal) Ability to fit >1D functions
Looks like this could be possible with a call to `ravel` like [this](https://scipython.com/blog/non-linear-least-squares-fitting-of-a-two-dimensional-data/). I'll do some experimenting.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,797302408