issues

4 rows where repo = 13221727, type = "issue" and user = 20118130 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1674818753 I_kwDOAMm_X85j07TB 7768 Supplying multidimensional initial guess to `curvefit` mgunyho 20118130 closed 0     5 2023-04-19T12:37:53Z 2024-03-25T20:02:14Z 2023-05-31T12:43:09Z CONTRIBUTOR      

Is your feature request related to a problem?

Hi, I'm trying to use DataArray.curvefit to fit a bunch of data. Let's say the data dimensions are (x, experiment_index), and I'm trying to fit m * x + b, where m will be different for each experiment_index. I would like to supply an initial guess p0 to curvefit that depends on experiment_index, but it seems like this is not supported. Here's a minimal example:

```python
import numpy as np
import xarray as xr

x = xr.DataArray(coords=[("x", np.linspace(0, 10, 101))]).x
i = xr.DataArray(coords=[("experiment_index", [1, 2, 3])]).experiment_index

data = 2.0 * i * x + 5

m_guess = 2 * i

data.curvefit(
    "x",
    lambda x, m, b: m * x + b,
    p0={"m": m_guess},  # I would like to provide a guess for 'm' as a function of experiment_index
)
```

Describe the solution you'd like

I would like to be able to provide arrays as the values of p0, so that I can have different initial guesses for different slices of the data.

I suppose this could also be implemented for bounds.
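As a rough illustration of the requested behavior (not xarray's implementation), each value in `p0` could be either a scalar shared by every slice or a per-slice lookup, with the scalar guess for a slice resolved just before fitting. The helper name `resolve_p0` is hypothetical, and plain dicts keyed by `experiment_index` stand in for DataArrays here.

```python
from numbers import Number


def resolve_p0(p0, index):
    """Return scalar initial guesses for a single slice.

    `p0` maps parameter names to either a scalar (shared by all slices)
    or a dict keyed by slice index (a stand-in for a DataArray guess).
    This helper is hypothetical and only illustrates the idea.
    """
    resolved = {}
    for name, guess in p0.items():
        if isinstance(guess, Number):
            # Scalar guess: the same value is used for every slice.
            resolved[name] = guess
        else:
            # Per-slice guess: pick the value belonging to this slice.
            resolved[name] = guess[index]
    return resolved


# Per-experiment guess for 'm', shared scalar guess for 'b'.
p0 = {"m": {1: 2.0, 2: 4.0, 3: 6.0}, "b": 1.0}
per_slice = {i: resolve_p0(p0, i) for i in (1, 2, 3)}
print(per_slice[2])
```

The same lookup would apply to `bounds` if array-valued bounds were supported.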

Describe alternatives you've considered

I could wrap curvefit in a for-loop, for example

```python
result = []
for y in data.transpose("experiment_index", ...):
    result.append(y.curvefit(
        "x",
        lambda x, m, b: m * x + b,
        p0={"m": m_guess.sel(experiment_index=y.experiment_index).item()},
    ))
result = xr.concat(result, dim="experiment_index")
```

But this is quite cumbersome, especially for multidimensional data.

Additional context

The above example gives the error *** ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part. because curve_fit tries to do np.atleast_1d([m_guess, 1]), but it should be np.atleast_1d([m_guess[0], 1]).

~~The above example gives the error~~ ValueError: operands could not be broadcast together with shapes (3,) (101,) ~~which comes from scipy.curve_fit trying to compute m * x, where m is the DataArray m_guess, but x is a plain NumPy array, basically x.data.~~ - this applies for scipy 1.7.

This toy example of course works with just a scalar guess like p0={"m": 2}, but in my case the function is more complicated and fit might fail if the initial guess is too far off.

The initial guess is inserted into kwargs passed to curve_fit here: https://github.com/pydata/xarray/blob/c75ac8b7ab33be95f74d9e6f10b8173c68828751/xarray/core/dataset.py#L8659

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7768/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1698656265 I_kwDOAMm_X85lP3AJ 7823 DataArray.to_dataset(dim) silently drops variable if it is already a dim mgunyho 20118130 closed 0     3 2023-05-06T14:37:59Z 2023-11-14T22:28:18Z 2023-11-14T22:28:18Z CONTRIBUTOR      

What happened?

If I have a DataArray da which I split into a Dataset using da.to_dataset(dim), and one of the values of da[dim] also happens to be one of the dimensions of da, that variable is silently missing from the resulting dataset.

What did you expect to happen?

If a variable cannot be created because it is already a dimension, it should raise an exception, or possibly issue a warning and rename the variable, so that no data is lost.
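The expected behavior could look roughly like the check below. This is only a sketch in plain Python: the function name `check_name_collisions` and the error message are hypothetical and are not part of xarray.

```python
def check_name_collisions(new_variable_names, existing_names):
    """Raise if a variable to be created clashes with an existing name.

    Hypothetical helper: `new_variable_names` are the values of the
    dimension being split, `existing_names` are the dimensions and
    coordinates that remain after the split.
    """
    collisions = sorted(set(new_variable_names) & set(existing_names))
    if collisions:
        raise ValueError(
            f"cannot create variables {collisions}: "
            "names are already used by dimensions or coordinates"
        )


# Splitting along "x" would create variables 'foo', 'bar' and 'baz',
# but 'foo' is also the name of the remaining dimension.
try:
    check_name_collisions(["foo", "bar", "baz"], ["foo"])
except ValueError as err:
    message = str(err)
    print(message)
```

Raising early like this means no data is lost silently; issuing a warning and renaming the variable would be the softer alternative.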

Minimal Complete Verifiable Example

```Python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.zeros((3, 3)),
    coords={
        # note how 'foo' is one of the coordinate values, and also the name of a dimension
        "x": ["foo", "bar", "baz"],
        "foo": [1, 2, 3],
    }
)

# this produces a Dataset with the variables 'bar' and 'baz', 'foo' is missing (because it is already a coordinate)
print(da.to_dataset("x"))

# this produces a dataset with the variables 'foo', 'bar', and 'baz', as expected
print(da.rename({"foo": "qux"}).to_dataset("x"))
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```Python
# Output of first conversion
<xarray.Dataset>
Dimensions:  (foo: 3)
Coordinates:
  * foo      (foo) int64 1 2 3
Data variables:
    bar      (foo) float64 0.0 0.0 0.0
    baz      (foo) float64 0.0 0.0 0.0

# Output of second conversion
<xarray.Dataset>
Dimensions:  (qux: 3)
Coordinates:
  * qux      (qux) int64 1 2 3
Data variables:
    foo      (qux) float64 0.0 0.0 0.0
    bar      (qux) float64 0.0 0.0 0.0
    baz      (qux) float64 0.0 0.0 0.0
```

Anything else we need to know?

This came up when I did to_dataset("param") on the fit result returned by curvefit, and one of the data dimensions happened to be the same as one of the arguments of the function which I was fitting. I was initially very confused by this.

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.10 (main, Mar 01 2023, 21:10:14) [GCC]
python-bits: 64
OS: Linux
OS-release: 6.2.12-1-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: ('en_GB', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None
xarray: 2023.4.2
pandas: 2.0.1
numpy: 1.23.5
scipy: 1.10.1
netCDF4: None
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: 2.14.2
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.4.1
distributed: None
matplotlib: 3.7.1
cartopy: None
seaborn: 0.12.2
numbagg: None
fsspec: 2023.4.0
cupy: None
pint: None
sparse: 0.14.0
flox: None
numpy_groupies: None
setuptools: 65.5.0
pip: 22.3.1
conda: None
pytest: 7.3.1
mypy: None
IPython: 8.13.2
sphinx: 6.2.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7823/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1752541983 I_kwDOAMm_X85odasf 7908 `plot.scatter(hue_style="invalid")` does not raise an exception mgunyho 20118130 closed 0     0 2023-06-12T11:30:22Z 2023-07-13T23:17:50Z 2023-07-13T23:17:50Z CONTRIBUTOR      

What happened?

If I do a scatterplot with hue_style=x, where x is not "continuous" or "discrete", the result is the same as passing hue_style="continuous".

Probably related to #7907.

What did you expect to happen?

An invalid value should raise an exception.
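A minimal sketch of the kind of check that would give this behavior is shown below. The function name `validate_hue_style` and the error message are hypothetical, and treating `None` as "not specified" is an assumption; this is not xarray's code.

```python
# Accepted values; None is assumed to mean "let the library decide".
VALID_HUE_STYLES = (None, "continuous", "discrete")


def validate_hue_style(hue_style):
    """Return `hue_style` unchanged if it is valid, otherwise raise."""
    if hue_style not in VALID_HUE_STYLES:
        raise ValueError(
            f"hue_style must be 'continuous' or 'discrete', got {hue_style!r}"
        )
    return hue_style


print(validate_hue_style("discrete"))

try:
    validate_hue_style("invalid")
except ValueError as err:
    error_message = str(err)
    print(error_message)
```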

Minimal Complete Verifiable Example

```Python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

x = xr.DataArray(
    np.random.default_rng().random((10, 3)),
    coords=[
        ("idx", np.linspace(0, 1, 10)),
        ("color", [1, 2, 3]),
    ]
)

x.plot.scatter(x="idx", hue="color", hue_style="invalid", ax=plt.figure().gca())
plt.show()
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.10 (default, May 26 2023, 14:05:08) [GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.14.0-1059-oem
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2023.1.0
pandas: 1.4.3
numpy: 1.23.0
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.5.3
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 44.0.0
pip: 20.0.2
conda: None
pytest: None
mypy: None
IPython: 8.12.2
sphinx: None

I also tried this on main at 3459e6fa, the behavior is the same.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7908/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1752520008 I_kwDOAMm_X85odVVI 7907 `plot.scatter(hue_style="discrete")` does nothing mgunyho 20118130 closed 0     4 2023-06-12T11:21:33Z 2023-07-13T23:17:49Z 2023-07-13T23:17:49Z CONTRIBUTOR      

What happened?

I was trying to do a scatterplot of my data with one dimension determining the color. The dimension has only a few values so I used hue_style="discrete" to have a different color for each value. However, the resulting scatterplot has a continuous colorbar, which is the same as when I pass hue_style="continuous":

What did you expect to happen?

The colorbar should have discrete colors. I was also expecting the colors to be from the default matplotlib color palette, C0, C1, etc., when there are fewer than 10 items, like this:

Although the examples in the documentation show the discrete case also using viridis.

What I was really expecting is a plot like one would get by passing add_colorbar=False, add_legend=True:

But that may be a bit too automagical.

Minimal Complete Verifiable Example

```Python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

x = xr.DataArray(
    np.random.default_rng().random((10, 3)),
    coords=[
        ("idx", np.linspace(0, 1, 10)),
        ("color", [1, 2, 3]),
    ]
)
y = x + np.random.default_rng().random(x.shape)

ds = xr.Dataset({
    "x": x,
    "y": y,
})

# the output is the same regardless of hue_style="discrete" or "continuous" or just leaving it out
ds.plot.scatter(x="x", y="y", hue="color", hue_style="discrete", ax=plt.figure().gca())
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

This is the code for the "expected" plot:

```python
from matplotlib.colors import ListedColormap

ds.plot.scatter(
    x="x",
    y="y",
    hue="color",
    hue_style="discrete",
    ax=plt.figure().gca(),

    # these lines added in addition to the MVCE
    cmap=ListedColormap(["C0", "C1", "C2"]),
    vmin=0.5, vmax=3.5,
    cbar_kwargs=dict(ticks=ds.color.data),
)
```
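To spell out why `vmin=0.5, vmax=3.5` works with a three-color map: those limits place each of the levels 1, 2, 3 at the center of one color band. The sketch below is a simplified plain-Python model of linear normalization followed by a color lookup, not matplotlib's actual code; the helper names `centered_limits` and `color_index` are made up for illustration, and it assumes integer levels spaced by 1.

```python
def centered_limits(levels):
    """Limits half a step outside the levels (assumes a spacing of 1)."""
    return min(levels) - 0.5, max(levels) + 0.5


def color_index(value, vmin, vmax, n_colors):
    """Map a value to a palette index via linear normalization."""
    frac = (value - vmin) / (vmax - vmin)
    idx = int(frac * n_colors)
    # Clamp so values at or beyond the limits still get a valid index.
    return min(max(idx, 0), n_colors - 1)


levels = [1, 2, 3]
palette = ["C0", "C1", "C2"]
vmin, vmax = centered_limits(levels)
assigned = {
    lvl: palette[color_index(lvl, vmin, vmax, len(palette))] for lvl in levels
}
print(vmin, vmax)
print(assigned)
```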

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.10 (default, May 26 2023, 14:05:08) [GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.14.0-1059-oem
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2023.1.0
pandas: 1.4.3
numpy: 1.23.0
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.5.3
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 44.0.0
pip: 20.0.2
conda: None
pytest: None
mypy: None
IPython: 8.12.2
sphinx: None

I also tried this on main at 3459e6fa, the behavior is the same.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7907/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 60.516ms · About: xarray-datasette