
issues


12 rows where user = 20118130 sorted by updated_at descending


Facets:
  • type: pull 8, issue 4
  • state: closed 11, open 1
  • repo: xarray 12

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1674818753 I_kwDOAMm_X85j07TB 7768 Supplying multidimensional initial guess to `curvefit` mgunyho 20118130 closed 0     5 2023-04-19T12:37:53Z 2024-03-25T20:02:14Z 2023-05-31T12:43:09Z CONTRIBUTOR      

Is your feature request related to a problem?

Hi, I'm trying to use DataArray.curvefit to fit a bunch of data. Let's say the data dimensions are (x, experiment_index), and I'm trying to fit m * x + b, where m will be different for each experiment_index. I would like to supply an initial guess p0 to curvefit that depends on experiment_index, but it seems like this is not supported. Here's a minimal example:

```python
import numpy as np
import xarray as xr

x = xr.DataArray(coords=[("x", np.linspace(0, 10, 101))]).x
i = xr.DataArray(coords=[("experiment_index", [1, 2, 3])]).experiment_index

data = 2.0 * i * x + 5

m_guess = 2 * i

data.curvefit(
    "x",
    lambda x, m, b: m * x + b,
    p0={"m": m_guess},  # I would like to provide a guess for 'm' as a function of experiment_index
)
```

Describe the solution you'd like

I would like to be able to provide arrays as the values of p0, so that I can have different initial guesses for different slices of the data.

I suppose this could also be implemented for bounds.

Describe alternatives you've considered

I could wrap curvefit in a for-loop, for example

```python
result = []
for y in data.transpose("experiment_index", ...):
    result.append(
        y.curvefit(
            "x",
            lambda x, m, b: m * x + b,
            p0={"m": m_guess.sel(experiment_index=y.experiment_index).item()},
        )
    )
result = xr.concat(result, dim="experiment_index")
```

But this is quite cumbersome, especially for multidimensional data.

Additional context

The above example gives the error `ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.` because `curve_fit` tries to do `np.atleast_1d([m_guess, 1])`, but it should be `np.atleast_1d([m_guess[0], 1])`.
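For reference, the inner failure can be reproduced with plain NumPy (an illustrative snippet, assuming NumPy >= 1.24, where ragged inputs are an error rather than a deprecation warning):

```python
import numpy as np

# Illustrative reproduction of the inner failure (plain NumPy, not xarray code).
m_guess = np.array([2.0, 4.0, 6.0])  # a per-experiment guess instead of a scalar

# On NumPy >= 1.24 this raises:
# ValueError: setting an array element with a sequence. The requested array has
# an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.
np.atleast_1d([m_guess, 1])
```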

~~The above example gives the error~~ `ValueError: operands could not be broadcast together with shapes (3,) (101,)` ~~which comes from scipy.curve_fit trying to compute m * x, where m is the DataArray m_guess, but x is a plain NumPy array, basically x.data.~~ - this applies to scipy 1.7.

This toy example of course works with just a scalar guess like `p0={"m": 2}`, but in my case the function is more complicated and the fit might fail if the initial guess is too far off.

The initial guess is inserted into kwargs passed to curve_fit here: https://github.com/pydata/xarray/blob/c75ac8b7ab33be95f74d9e6f10b8173c68828751/xarray/core/dataset.py#L8659

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7768/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1698656265 I_kwDOAMm_X85lP3AJ 7823 DataArray.to_dataset(dim) silently drops variable if it is already a dim mgunyho 20118130 closed 0     3 2023-05-06T14:37:59Z 2023-11-14T22:28:18Z 2023-11-14T22:28:18Z CONTRIBUTOR      

What happened?

If I have a DataArray da which I split into a Dataset using da.to_dataset(dim), and one of the values of da[dim] also happens to be one of the dimensions of da, that variable is silently missing from the resulting dataset.

What did you expect to happen?

If a variable cannot be created because it is already a dimension, it should raise an exception, or possibly issue a warning and rename the variable, so that no data is lost.

Minimal Complete Verifiable Example

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.zeros((3, 3)),
    coords={
        # note how 'foo' is one of the coordinate values, and also the name of a dimension
        "x": ["foo", "bar", "baz"],
        "foo": [1, 2, 3],
    },
)

# this produces a Dataset with the variables 'bar' and 'baz'; 'foo' is missing
# (because it is already a coordinate)
print(da.to_dataset("x"))

# this produces a Dataset with the variables 'foo', 'bar', and 'baz', as expected
print(da.rename({"foo": "qux"}).to_dataset("x"))
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```python
# Output of first conversion
<xarray.Dataset>
Dimensions:  (foo: 3)
Coordinates:
  * foo      (foo) int64 1 2 3
Data variables:
    bar      (foo) float64 0.0 0.0 0.0
    baz      (foo) float64 0.0 0.0 0.0

# Output of second conversion
<xarray.Dataset>
Dimensions:  (qux: 3)
Coordinates:
  * qux      (qux) int64 1 2 3
Data variables:
    foo      (qux) float64 0.0 0.0 0.0
    bar      (qux) float64 0.0 0.0 0.0
    baz      (qux) float64 0.0 0.0 0.0
```

Anything else we need to know?

This came up when I did to_dataset("param") on the fit result returned by curvefit, and one of the data dimensions happened to be the same as one of the arguments of the function which I was fitting. I was initially very confused by this.

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.10 (main, Mar 01 2023, 21:10:14) [GCC]
python-bits: 64
OS: Linux
OS-release: 6.2.12-1-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: ('en_GB', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None
xarray: 2023.4.2
pandas: 2.0.1
numpy: 1.23.5
scipy: 1.10.1
netCDF4: None
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: 2.14.2
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.4.1
distributed: None
matplotlib: 3.7.1
cartopy: None
seaborn: 0.12.2
numbagg: None
fsspec: 2023.4.0
cupy: None
pint: None
sparse: 0.14.0
flox: None
numpy_groupies: None
setuptools: 65.5.0
pip: 22.3.1
conda: None
pytest: 7.3.1
mypy: None
IPython: 8.13.2
sphinx: 6.2.1
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7823/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1985010450 PR_kwDOAMm_X85fAHx- 8433 Raise exception in to_dataset if resulting variable is also the name of a coordinate mgunyho 20118130 closed 0     12 2023-11-09T07:38:20Z 2023-11-14T22:28:17Z 2023-11-14T22:28:17Z CONTRIBUTOR   0 pydata/xarray/pulls/8433

Let me know if you think the error message is unclear or too verbose or too fancy or something.

  • [x] Closes #7823
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8433/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1857713530 PR_kwDOAMm_X85YTTNH 8089 WIP: Factor out a function for checking dimension-related errors mgunyho 20118130 open 0     4 2023-08-19T13:35:29Z 2023-09-12T18:59:32Z   CONTRIBUTOR   1 pydata/xarray/pulls/8089

This is a WIP follow-up for #8079 and I think also for #7051. The pattern

```python
missing_dims = set(dims) - set(self.dims)
if missing_dims:
    raise ValueError(f"Dimensions {missing_dims} not found in data dimensions {tuple(self.dims)}")
```

occurs in many methods, with small variations in the way missing_dims is calculated, the error message, and whether it raises ValueError or KeyError. So it would make sense to factor it out. But I'm not familiar enough with the context around #7051 to know how to deal with sets vs. tuples, so this is just a sketch for now.
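For illustration, one possible shape for such a helper; this is a hypothetical sketch (the name and signature are invented here), not code from this PR:

```python
from typing import Hashable, Iterable


def check_dims_exist(
    dims: Iterable[Hashable],
    existing_dims: Iterable[Hashable],
    error_cls: type = ValueError,
) -> None:
    """Raise `error_cls` if any name in `dims` is not present in `existing_dims`.

    Hypothetical helper sketching the refactoring idea discussed above.
    """
    missing_dims = set(dims) - set(existing_dims)
    if missing_dims:
        raise error_cls(
            f"Dimensions {missing_dims} not found in data dimensions {tuple(existing_dims)}"
        )
```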

  • [ ] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8089/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1855291078 PR_kwDOAMm_X85YLGz2 8079 Consistently report all dimensions in error messages if invalid dimensions are given mgunyho 20118130 closed 0     11 2023-08-17T16:03:53Z 2023-09-09T04:55:43Z 2023-09-09T04:55:43Z CONTRIBUTOR   0 pydata/xarray/pulls/8079

Hello,

I noticed that `arr.min("nonexistent")` raises an error with a very helpful message, `ValueError: 'nonexistent' not found in array dimensions ('x', 'y', 'z')`, while `arr.idxmin("nonexistent")` raises `KeyError: 'Dimension "nonexistent" not in dimension'` [sic].

IMO, the list of dimensions should always be shown in the error message for these kinds of errors; it makes debugging much easier. With this PR, I have implemented this behavior for all such functions that I could find.

There is quite a consistent pattern which I think could be factored out into a function, but I didn't have a clear enough picture of the structure of the whole code to do it.

I didn't fix the tests yet; I'll do it if you think this can be merged.
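To make the contrast concrete, a small illustration (the array here is made up; the error texts are the ones quoted above):

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(np.zeros((2, 3, 4)), dims=("x", "y", "z"))

# arr.min already lists the available dimensions; before this PR,
# arr.idxmin gave no hint about which dimensions do exist.
for method in (arr.min, arr.idxmin):
    try:
        method("nonexistent")
    except Exception as err:
        print(type(err).__name__, err)

# arr.min:    ValueError 'nonexistent' not found in array dimensions ('x', 'y', 'z')
# arr.idxmin: KeyError 'Dimension "nonexistent" not in dimension'   (before this PR)
```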

  • [x] Searched list of issues, couldn't find one related to this
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8079/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1752541983 I_kwDOAMm_X85odasf 7908 `plot.scatter(hue_style="invalid")` does not raise an exception mgunyho 20118130 closed 0     0 2023-06-12T11:30:22Z 2023-07-13T23:17:50Z 2023-07-13T23:17:50Z CONTRIBUTOR      

What happened?

If I do a scatterplot with hue_style=x, where x is not "continuous" or "discrete", the result is the same as passing hue_style="continuous".

Probably related to #7907.

What did you expect to happen?

An invalid value should raise an exception.

Minimal Complete Verifiable Example

```python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

x = xr.DataArray(
    np.random.default_rng().random((10, 3)),
    coords=[
        ("idx", np.linspace(0, 1, 10)),
        ("color", [1, 2, 3]),
    ],
)

x.plot.scatter(x="idx", hue="color", hue_style="invalid", ax=plt.figure().gca())
plt.show()
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.10 (default, May 26 2023, 14:05:08) [GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.14.0-1059-oem
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2023.1.0
pandas: 1.4.3
numpy: 1.23.0
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.5.3
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 44.0.0
pip: 20.0.2
conda: None
pytest: None
mypy: None
IPython: 8.12.2
sphinx: None
```

I also tried this on main at 3459e6fa; the behavior is the same.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7908/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1752520008 I_kwDOAMm_X85odVVI 7907 `plot.scatter(hue_style="discrete")` does nothing mgunyho 20118130 closed 0     4 2023-06-12T11:21:33Z 2023-07-13T23:17:49Z 2023-07-13T23:17:49Z CONTRIBUTOR      

What happened?

I was trying to do a scatterplot of my data with one dimension determining the color. The dimension has only a few values, so I used hue_style="discrete" to have a different color for each value. However, the resulting scatterplot has a continuous colorbar, the same as when I pass hue_style="continuous".

What did you expect to happen?

The colorbar should have discrete colors. I was also expecting the colors to be from the default matplotlib color palette (C0, C1, etc.) when there are fewer than 10 items.

Although the examples in the documentation show the discrete case also using viridis.

What I was really expecting is a plot like one would get by passing add_colorbar=False, add_legend=True.

But that may be a bit too automagical.

Minimal Complete Verifiable Example

```python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

x = xr.DataArray(
    np.random.default_rng().random((10, 3)),
    coords=[
        ("idx", np.linspace(0, 1, 10)),
        ("color", [1, 2, 3]),
    ],
)
y = x + np.random.default_rng().random(x.shape)

ds = xr.Dataset({
    "x": x,
    "y": y,
})

# the output is the same regardless of hue_style="discrete" or "continuous" or just leaving it out
ds.plot.scatter(x="x", y="y", hue="color", hue_style="discrete", ax=plt.figure().gca())
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

This is the code for the "expected" plot:

```python
from matplotlib.colors import ListedColormap

ds.plot.scatter(
    x="x", y="y", hue="color", hue_style="discrete", ax=plt.figure().gca(),
    # these lines added in addition to the MVCE
    cmap=ListedColormap(["C0", "C1", "C2"]),
    vmin=0.5, vmax=3.5,
    cbar_kwargs=dict(ticks=ds.color.data),
)
```

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.10 (default, May 26 2023, 14:05:08) [GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.14.0-1059-oem
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2023.1.0
pandas: 1.4.3
numpy: 1.23.0
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.5.3
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 44.0.0
pip: 20.0.2
conda: None
pytest: None
mypy: None
IPython: 8.12.2
sphinx: None
```

I also tried this on main at 3459e6fa; the behavior is the same.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7907/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1740268634 PR_kwDOAMm_X85SHW1Z 7891 Add errors option to curvefit mgunyho 20118130 closed 0     3 2023-06-04T09:43:06Z 2023-06-16T03:15:07Z 2023-06-16T03:15:06Z CONTRIBUTOR   0 pydata/xarray/pulls/7891
  • [x] Closes #6317 and closes #6515
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

This is a rebased version of #6515, with the arg `errors = "raise" | "ignore"` added to Dataset and DataArray, and with tests. Let me know if the tests should be expanded further.
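As a usage sketch of the new option (the data here is invented for illustration, not taken from the PR's tests):

```python
import numpy as np
import xarray as xr

# Illustrative data: a clean linear relationship along 'x'.
xvals = np.linspace(0, 1, 50)
da = xr.DataArray(3.0 * xvals + 1.0, coords=[("x", xvals)])

# With errors="ignore", slices whose underlying scipy.optimize.curve_fit call
# fails should come back as NaN instead of aborting the whole computation;
# errors="raise" keeps the previous behavior of propagating the exception.
fit = da.curvefit("x", lambda x, m, b: m * x + b, errors="ignore")
print(fit.curvefit_coefficients)
```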

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7891/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1741050111 PR_kwDOAMm_X85SJ-xN 7893 Fix flaky doctest for curvefit mgunyho 20118130 closed 0     1 2023-06-05T06:10:30Z 2023-06-09T15:38:58Z 2023-06-09T15:38:58Z CONTRIBUTOR   0 pydata/xarray/pulls/7893

Fix flaky doctest introduced in #7821, see https://github.com/pydata/xarray/pull/7821#issuecomment-1537142237.

This uses the NUMBER option to compare the output with less decimal precision. It's not part of standard doctest but an extension from pytest: https://docs.pytest.org/en/7.1.x/how-to/doctest.html#using-doctest-options

Another option would be to use `...` and the built-in `+ELLIPSIS` option, but IMO the current version is less confusing for someone reading the example.
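For readers unfamiliar with it, here is a minimal, generic illustration of the NUMBER directive (not the actual doctest touched by this PR); it only takes effect when the doctest is collected by pytest, since NUMBER is a pytest extension:

```python
def third():
    """Return one third.

    With pytest's NUMBER option, floating-point output only has to match to
    the precision written in the expected output:

    >>> third()  # doctest: +NUMBER
    0.333
    """
    return 1 / 3
```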

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7893/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1698626185 PR_kwDOAMm_X85P6owK 7821 Implement multidimensional initial guess and bounds for `curvefit` mgunyho 20118130 closed 0     6 2023-05-06T13:09:49Z 2023-06-01T15:51:40Z 2023-05-31T12:43:07Z CONTRIBUTOR   0 pydata/xarray/pulls/7821
  • [x] Closes #7768
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

With this PR, it's possible to pass an initial guess to curvefit that is a DataArray, which will be broadcast to the data dimensions. This way, the initial guess can vary with the data coordinates.
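As a usage sketch (reusing the toy setup from the linked issue; illustrative, not taken verbatim from the new documentation examples):

```python
import numpy as np
import xarray as xr

# Same toy setup as in #7768.
x = xr.DataArray(coords=[("x", np.linspace(0, 10, 101))]).x
i = xr.DataArray(coords=[("experiment_index", [1, 2, 3])]).experiment_index
data = 2.0 * i * x + 5
m_guess = 2 * i  # initial guess for 'm' that varies along experiment_index

# With this PR, the DataArray guess is broadcast against the data, so each
# experiment_index gets its own starting value for 'm'.
fit = data.curvefit("x", lambda x, m, b: m * x + b, p0={"m": m_guess})
print(fit.curvefit_coefficients)
```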

I also added examples of using curvefit to the documentation, both a basic example and one with the multidimensional guess.

I have a couple of questions:

  • Should we change the signature to p0: dict[str, float | DataArray] | None instead of dict[str, Any] (and the same for bounds)? scipy only optimizes over scalars, so I think it would be safe to assume that the values should either be those, or arrays that can be broadcast.
  • The usage example of curvefit is only in the docstring for DataArray, so now the docs differ between DataArray and Dataset. But the example uses a DataArray only, so this should be ok, right?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7821/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1698632575 PR_kwDOAMm_X85P6qCY 7822 Fix typos in contribution guide mgunyho 20118130 closed 0     1 2023-05-06T13:29:22Z 2023-05-07T09:12:57Z 2023-05-07T07:34:56Z CONTRIBUTOR   0 pydata/xarray/pulls/7822  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7822/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1345816120 PR_kwDOAMm_X849h97w 6944 Fix step plots with hue mgunyho 20118130 closed 0     2 2022-08-22T05:00:14Z 2022-08-28T12:39:33Z 2022-08-25T15:56:11Z CONTRIBUTOR   0 pydata/xarray/pulls/6944

This PR fixes the broadcasting error when trying to plot multiple step plots, like arr.plot.step(..., hue=...) or arr.plot(..., drawstyle="steps-mid"). Previously, this raised a shape error, as mentioned in https://github.com/pydata/xarray/issues/4288#issuecomment-666485140. Some other relevant work was started (but apparently left unfinished) in #4868 and #4866; this PR doesn't implement those.
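A small illustration of the kind of call that previously failed (the data here is invented for the sketch):

```python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# One step line per value of 'variant'.
arr = xr.DataArray(
    np.random.default_rng().random((10, 3)),
    coords=[("t", np.arange(10)), ("variant", ["a", "b", "c"])],
)

# Previously this raised a broadcasting/shape error; with this PR it draws
# one step line per 'variant' value.
arr.plot.step(x="t", hue="variant")
plt.show()
```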

  • [x] Tests added
  • [x] Fixes applied
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6944/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);