home / github

Menu
  • GraphQL API
  • Search all tables

issues

Table actions
  • GraphQL API for issues

9 rows where state = "open", type = "issue" and user = 14371165 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: comments, created_at (date), updated_at (date)

type 1

  • issue · 9 ✖

state 1

  • open · 9 ✖

repo 1

  • xarray 9
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1680031454 I_kwDOAMm_X85kIz7e 7780 mypy does not understand output of binary operations Illviljan 14371165 open 0     8 2023-04-23T13:38:55Z 2024-04-28T20:07:04Z   MEMBER      

What happened?

When doing operations on numpy arrays and xarray variables mypy does not understand that the output is always a xarray variable regardless of the order. See example.

What did you expect to happen?

mypy to pass for the example code.

Minimal Complete Verifiable Example

```Python import numpy as np import xarray as xr

x = np.array([1, 2, 4]) v = xr.Variable(["x"], x)

numpy first:

xv = x * v xv.values # error: "ndarray[Any, dtype[bool_]]" has no attribute "values" [attr-defined] if isinstance(xv, xr.Variable): xv.values

variable first:

vx = v * x vx.values if isinstance(vx, xr.Variable): vx.values ```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

Seen in #7741

Environment

xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.9.16 (main, Mar 8 2023, 10:39:24) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel byteorder: little LC_ALL: None LANG: en libhdf5: 1.10.6 libnetcdf: None xarray: 2023.4.2 pandas: 2.0.0 numpy: 1.23.5 scipy: 1.10.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: None dask: 2023.4.0 distributed: 2023.4.0 matplotlib: 3.5.3 cartopy: None seaborn: 0.12.2 numbagg: None fsspec: 2023.4.0 cupy: None pint: None sparse: None flox: None numpy_groupies: None setuptools: 67.7.1 pip: 23.1.1 conda: 23.3.1 pytest: 7.3.1 mypy: 1.2.0 IPython: 8.12.0 sphinx: 6.1.3
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7780/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
2215603817 I_kwDOAMm_X86ED25p 8892 ffill's tolerance argument can be strings Illviljan 14371165 open 0     1 2024-03-29T15:49:40Z 2024-04-02T01:50:34Z   MEMBER      

What happened?

ffill, bfill reindex etc. have tolerance arguments that also supports strings. And we test for it here:

https://github.com/pydata/xarray/blob/2120808bbe45f3d4f0b6a01cd43bac4df4039092/xarray/tests/test_groupby.py#L2016-L2025

But our typing assumes it's floats only: https://github.com/pydata/xarray/blob/2120808bbe45f3d4f0b6a01cd43bac4df4039092/xarray/core/resample.py#L69-L94

What did you expect to happen?

Since our pytests pass, mypy should pass as well.

Minimal Complete Verifiable Example

```python import numpy as np import pandas as pd

import xarray as xr

https://github.com/pydata/xarray/blob/2120808bbe45f3d4f0b6a01cd43bac4df4039092/xarray/tests/test_groupby.py#L2016

Test tolerance keyword for upsample methods bfill, pad, nearest

times = pd.date_range("2000-01-01", freq="1D", periods=2) times_upsampled = pd.date_range("2000-01-01", freq="6h", periods=5) array = xr.DataArray(np.arange(2), [("time", times)])

Forward fill

actual = array.resample(time="6h").ffill(tolerance="12h") expected = xr.DataArray([0.0, 0.0, 0.0, np.nan, 1.0], [("time", times_upsampled)]) xr.testing.assert_identical(expected, actual)

```

Environment

master

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8892/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1826978659 I_kwDOAMm_X85s5Xtj 8028 Setting datarrays with non-dimension coordinates errors Illviljan 14371165 open 0     6 2023-07-28T19:20:31Z 2023-08-10T15:25:23Z   MEMBER      

What happened?

I'm not sure if this is a bug or a feature but I was expecting this example to work since the new coord is just a slight rewrite of the original dimension coordinate:

```python import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature")

Change the first time value:

ds["air_new"] = ds.air.copy() air_new_changed = ds.air_new[{"time": 0}] * 3 ds.air_new.loc[air_new_changed.coords] = air_new_changed # Works! :)

Add a another coord along time axis and change

the first time value:

ds["air_new"] = ds.air.copy().assign_coords( {"time_float": ds.time.astype(float)} ) air_new_changed = ds.air_new[{"time": 0}] * 4 ds.air_new.loc[air_new_changed.coords] = air_new_changed # Error! :(

Traceback (most recent call last):

Cell In[25], line 5 ds.air_new.loc[air_new_changed.coords] = air_new_changed

File ~\AppData\Local\mambaforge\envs\jw\lib\site-packages\xarray\core\dataarray.py:222 in setitem dim_indexers = map_index_queries(self.data_array, key).dim_indexers

File ~\AppData\Local\mambaforge\envs\jw\lib\site-packages\xarray\core\indexing.py:182 in map_index_queries grouped_indexers = group_indexers_by_index(obj, indexers, options)

File ~\AppData\Local\mambaforge\envs\jw\lib\site-packages\xarray\core\indexing.py:144 in group_indexers_by_index raise KeyError(f"no index found for coordinate {key!r}")

KeyError: "no index found for coordinate 'time_float'" ```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8028/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1537068105 I_kwDOAMm_X85bncxJ 7450 Backend array documentation typo Illviljan 14371165 open 0     0 2023-01-17T21:37:26Z 2023-01-17T21:56:12Z   MEMBER      

What happened?

https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html#indexing-examples

I believe there's a typo in the BASIC indexing support example: ```python

shall support integers

backend_array._raw_indexing_method(1, 1) ```

Should be: ```python

shall support integers

backend_array._raw_indexing_method((1, 1)) ```

Suggestion of possible fixes: * Make sure it is a typo. * Create a valid custom MyBackendArray and initialize it. So it is easier to tell if it's a typo. * Add type hinting so mypy can easier catch these errors.

What did you expect to happen?

No response

Minimal Complete Verifiable Example

No response

MVCE confirmation

  • [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7450/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1410534774 I_kwDOAMm_X85UEw12 7170 Scatter plots overlap in facetgrid in 3d Illviljan 14371165 open 0     0 2022-10-16T16:06:56Z 2022-10-16T16:08:55Z   MEMBER      

What happened?

Any matplotlib gurus have any ideas how to nicely fit 3d plots in facetgrid? python ds = xr.tutorial.scatter_example_dataset(seed=42) fg = ds.plot.scatter(x="A", y="B", z="z", hue="y", markersize="x", row="x", col="w")

2d looks fine: python fg = ds.plot.scatter(x="A", y="B", hue="y", markersize="x", row="x", col="w")

What did you expect to happen?

No plots overlapping each other, even if rotating the plots.

Minimal Complete Verifiable Example

No response

MVCE confirmation

  • [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [x] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [x] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [x] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:30:19) [MSC v.1929 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel byteorder: little LC_ALL: None LANG: en LOCALE: ('Swedish_Sweden', '1252') libhdf5: 1.12.2 libnetcdf: 4.8.1 xarray: 2022.9.1.dev266+gbd01f9cc.d20221006 pandas: 1.5.0 numpy: 1.23.3 scipy: 1.9.1 netCDF4: 1.6.1 pydap: installed h5netcdf: 1.0.2 h5py: 3.7.0 Nio: None zarr: 2.13.2 cftime: 1.6.2 nc_time_axis: 1.4.1 PseudoNetCDF: 3.2.2 rasterio: 1.3.2 cfgrib: None iris: 3.3.0 bottleneck: 1.3.5 dask: 2022.9.2 distributed: 2022.9.2 matplotlib: 3.6.0 cartopy: 0.21.0 seaborn: 0.12.0 numbagg: 0.2.1 fsspec: 2022.8.2 cupy: None pint: 0.19.2 sparse: 0.13.0 flox: 0.5.10.dev21+g91b6e19 numpy_groupies: 0.9.19 setuptools: 65.4.1 pip: 22.2.2 conda: None pytest: 7.1.3 IPython: 7.33.0 sphinx: 5.2.3 C:\Users\J.W\anaconda3\envs\xarray-tests\lib\site-packages\_distutils_hack\__init__.py:33: UserWarning: Setuptools is replacing distutils. warnings.warn("Setuptools is replacing distutils.")
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7170/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1377088142 I_kwDOAMm_X85SFLKO 7050 Type annotation guidelines Illviljan 14371165 open 0     2 2022-09-18T15:04:54Z 2022-09-23T01:55:19Z   MEMBER      

Dask has a pretty nice guideline for type hinting, see https://github.com/dask/community/issues/255.

Notable for us is to avoid adding typing in docstrings to avoid duplicating information.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7050/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
597785475 MDU6SXNzdWU1OTc3ODU0NzU= 3962 Interpolation - Support extrapolation method "clip" Illviljan 14371165 open 0     4 2020-04-10T09:07:13Z 2022-05-02T13:42:24Z   MEMBER      

Hello,

I would like an option in da.interp()that instead of returning NaNs during extrapolation returns the data corresponding to the end of the breakpoint data set range.

One way to do this is to limit the new coordinates to the array coordinates minimum and maximum value, I did a simple example with this solution down below. I think this is a rather safe way as we are just modifying the inputs to all the various interpolation classes that xarray is using at the moment. But it does look a little weird when printing the extrapolated value, the coordinates shows the limited value instead of the requested coordinates. Maybe this can be handled elegantly somewhere in the source code?

MATLAB uses this quite frequently in their interpolation functions: * https://mathworks.com/help/simulink/ug/methods-for-estimating-missing-points.html * https://mathworks.com/help/simulink/slref/2dlookuptable.html

MCVE Code Sample

```python import numpy as np import xarray as xr

def interp(da, coords, extrapolation='clip'): """ Linear interpolation that clips the inputs to the coords min and max value.

Parameters
----------
da : DataArray
    DataArray to interpolate.
coords : dict
    Coordinates for the interpolated value.
"""
if extrapolation == 'clip':
    for k, v in da.coords.items():
        coords[k] = np.maximum(coords[k], np.min(v.values))
        coords[k] = np.minimum(coords[k], np.max(v.values))

return da.interp(coords)

Create coordinates:

x = np.linspace(1000, 6000, 4) y = np.linspace(100, 1200, 3)

Create data:

X = np.meshgrid(*[x, y], indexing='ij') data = X[0] * X[1]

Create DataArray:

da = xr.DataArray(data=data, coords=[('x', x), ('y', y)], name='data')

Attempt to extrapolate:

datai = interp(da, {'x': 7000, 'y': 375}) ```

Expected Output

python print(datai) <xarray.DataArray 'data' ()> array(2250000.) Coordinates: x float64 6e+03 y float64 375.0

Versions

Output of `xr.show_versions()` INSTALLED VERSIONS ------------------ commit: None python: 3.7.7 (default, Mar 23 2020, 23:19:08) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel byteorder: little LC_ALL: None LANG: en LOCALE: None.None libhdf5: 1.10.4 libnetcdf: None xarray: 0.15.0 pandas: 1.0.3 numpy: 1.18.1 scipy: 1.4.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.13.0 distributed: 2.13.0 matplotlib: 3.1.3 cartopy: None seaborn: 0.10.0 numbagg: None setuptools: 46.1.3.post20200330 pip: 20.0.2 conda: 4.8.3 pytest: 5.4.1 IPython: 7.13.0 sphinx: 2.4.4
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3962/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
957201551 MDU6SXNzdWU5NTcyMDE1NTE= 5655 Allow .attrs to use dict-likes Illviljan 14371165 open 0     2 2021-07-31T08:31:55Z 2022-01-09T03:32:04Z   MEMBER      

Is your feature request related to a problem? Please describe. Reading attributes from h5py-files is rather slow. So instead of retrieving it immediately I wanted to create a lazy dict-class that only retrieves the attribute values when necessary. But this is difficult to achieve since xarray keeps forcing the attrs to dicts in a lot of places.

Describe the solution you'd like * Replace in https://github.com/pydata/xarray/blob/dddac11b01330791ffab4dfc72d226e71821973e/xarray/core/variable.py#L865 and https://github.com/pydata/xarray/blob/dddac11b01330791ffab4dfc72d226e71821973e/xarray/core/dataset.py#L798 with a asdict(value) function that checks if the input is a valid dict-like, if not convert to dict. Things that might be good to check: * MutableMapping * hasattr(dict_like, "copy") * isinstance(dict_like, dict) == True * Remove unneccessary conversions to dict. For example https://github.com/pydata/xarray/blob/dddac11b01330791ffab4dfc72d226e71821973e/xarray/core/merge.py#L523 should not be necessary as attrs from variables/dataarrays/datasets have already been forced to dicts when they were initialized.

Describe alternatives you've considered * One could lazify with dicts as well, for example by replacing the value with a function. This however won't look good in reprs, that's why having a convienence class is nice. * dict(LazyDict) always forces to dict, it does not let it pass through unchanged even if isinstance(LazyDict, dict) == True.

Interesting reading: https://stackoverflow.com/questions/16669367/setup-dictionary-lazily https://stackoverflow.com/questions/3387691/how-to-perfectly-override-a-dict

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5655/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
779938616 MDU6SXNzdWU3Nzk5Mzg2MTY= 4770 Interpolation always returns floats Illviljan 14371165 open 0     1 2021-01-06T03:16:43Z 2021-01-12T16:30:54Z   MEMBER      

What happened: When interpolating datasets integer arrays are forced to floats.

What you expected to happen: To retain the same dtype after interpolation.

Minimal Complete Verifiable Example:

```python import numpy as np import dask.array as da a = np.arange(0, 2) b = np.core.defchararray.add("long_variable_name", a.astype(str)) coords = dict(time=da.array([0, 1])) data_vars = dict() for v in b: data_vars[v] = xr.DataArray( name=v, data=da.array([0, 1], dtype=int), dims=["time"], coords=coords, ) ds1 = xr.Dataset(data_vars)

print(ds1) Out[35]: <xarray.Dataset> Dimensions: (time: 4) Coordinates: * time (time) float64 0.0 0.5 1.0 2.0 Data variables: long_variable_name0 (time) int32 dask.array<chunksize=(4,), meta=np.ndarray> long_variable_name1 (time) int32 dask.array<chunksize=(4,), meta=np.ndarray>

Interpolate:

ds1 = ds1.interp( time=da.array([0, 0.5, 1, 2]), assume_sorted=True, method="linear", kwargs=dict(fill_value="extrapolate"), )

dask array thinks it's an integer array:

print(ds1.long_variable_name0) Out[55]: <xarray.DataArray 'long_variable_name0' (time: 4)> dask.array<dask_aware_interpnd, shape=(4,), dtype=int32, chunksize=(4,), chunktype=numpy.ndarray> Coordinates: * time (time) float64 0.0 0.5 1.0 2.0

But once computed it turns out is a float:

print(ds1.long_variable_name0.compute()) Out[38]: <xarray.DataArray 'long_variable_name0' (time: 4)> array([0. , 0.5, 1. , 2. ]) Coordinates: * time (time) float64 0.0 0.5 1.0 2.0

```

Anything else we need to know?: An easy first step is to also force np.float_ in da.blockwise in missing.interp_func.

The more difficult way is to somehow be able to change back the dataarrays into the old dtype without affecting performance. I did a test simply adding .astype()to the returned value in missing.interp and it doubled the calculation time.

I was thinking the conversion to floats in scipy could be avoided altogether by adding a (non-)public option to ignore any dtype checks and just let the user handle the "unsafe" interpolations.

Related: https://github.com/scipy/scipy/issues/11093

Environment:

Output of <tt>xr.show_versions()</tt> xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows libhdf5: 1.10.4 libnetcdf: None xarray: 0.16.2 pandas: 1.1.5 numpy: 1.17.5 scipy: 1.4.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2020.12.0 distributed: 2020.12.0 matplotlib: 3.3.2 cartopy: None seaborn: 0.11.1 numbagg: None pint: None setuptools: 51.0.0.post20201207 pip: 20.3.3 conda: 4.9.2 pytest: 6.2.1 IPython: 7.19.0 sphinx: 3.4.0
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4770/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 482.858ms · About: xarray-datasette