issues

6 rows where state = "closed", type = "issue" and user = 3698640 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1284056895 I_kwDOAMm_X85MiSc_ 6720 readthedocs failing on main delgadom 3698640 closed 0     0 2022-06-24T18:26:39Z 2022-06-25T11:00:50Z 2022-06-25T11:00:50Z CONTRIBUTOR      

What is your issue?

I'm pretty sure my PR https://github.com/pydata/xarray/pull/6542 is the culprit. I never figured out how to get around the build timeout with these docs edits.

If you all are on top of this then no worries - feel free to close. Just wanted to point in the right direction so you don't need to go hunting.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6720/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1199122647 I_kwDOAMm_X85HeSjX 6466 drop and other shouldn't be mutually exclusive in DataWithCoords.where delgadom 3698640 closed 0     0 2022-04-10T17:42:07Z 2022-04-12T15:33:05Z 2022-04-12T15:33:05Z CONTRIBUTOR      

Is your feature request related to a problem?

xr.Dataset.where and xr.DataArray.where currently do not allow providing both other and drop=True, stating that they are mutually exclusive. Conceptually, this doesn't strike me as true. Drop does not drop all points where cond is False, only those where an entire coordinate label evaluates to False. This is most important when trying to avoid type promotion, such as in the below example, where an integer FillValue might be preferred over dtypes.NA.

```python
In [2]: da = xr.DataArray(np.arange(16).reshape(4, 4), dims=['x', 'y'])

In [3]: da
Out[3]:
<xarray.DataArray (x: 4, y: 4)>
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
Dimensions without coordinates: x, y
```

If *`other`* is not provided, the array is promoted to `float64` to accommodate the default, `dtypes.NA`:

```python
In [4]: da.where(da > 6, drop=True)
Out[4]:
<xarray.DataArray (x: 3, y: 4)>
array([[nan, nan, nan,  7.],
       [ 8.,  9., 10., 11.],
       [12., 13., 14., 15.]])
Dimensions without coordinates: x, y
```

However, the combination of *`other`* and *`drop=True`* is not allowed:

```python
In [5]: da.where(da > 6, -1, drop=True)

ValueError                                Traceback (most recent call last)
Input In [5], in <module>
----> 1 da.where(da > 6, -1, drop=True)

File ~/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/xarray/core/common.py:1268, in DataWithCoords.where(self, cond, other, drop)
   1266 if drop:
   1267     if other is not dtypes.NA:
-> 1268         raise ValueError("cannot set other if drop=True")
   1270 if not isinstance(cond, (Dataset, DataArray)):
   1271     raise TypeError(
   1272         f"cond argument is {cond!r} but must be a {Dataset!r} or {DataArray!r}"
   1273     )

ValueError: cannot set other if drop=True
```

Describe the solution you'd like

Current implementation

The current behavior is enforced within the block handling the drop argument (currently https://github.com/pydata/xarray/blob/main/xarray/core/common.py#L1266-L1268):

```python
if drop:
    if other is not dtypes.NA:
        raise ValueError("cannot set `other` if drop=True")
```

Proposed fix

I just removed the above if statement on a fork, and the example now works!

```python
>>> import xarray as xr, numpy as np
>>> da = xr.DataArray(np.arange(16).reshape(4, 4), dims=['x', 'y'])
>>> da
<xarray.DataArray (x: 4, y: 4)>
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
Dimensions without coordinates: x, y
>>> da.where(da > 6, drop=True)
<xarray.DataArray (x: 3, y: 4)>
array([[nan, nan, nan,  7.],
       [ 8.,  9., 10., 11.],
       [12., 13., 14., 15.]])
Dimensions without coordinates: x, y
>>> da.where(da > 6, -1, drop=True)
<xarray.DataArray (x: 3, y: 4)>
array([[-1, -1, -1,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
Dimensions without coordinates: x, y
```

Making this change fails a single test in https://github.com/pydata/xarray/blob/main/xarray/tests/test_dataset.py#L4548-L4549, which explicitly checks for this behavior:

```python
with pytest.raises(ValueError, match=r"cannot set"):
    ds.where(ds > 1, other=0, drop=True)
```

This could easily be reworked to check for valid handling of both arguments.

Describe alternatives you've considered

No response

Additional context

I haven't yet investigated what would happen with chunked, sparse, or other complex arrays, or if it's compatible with trees and other things on the roadmap. It's possible this breaks things I'm not imagining. Currently, where(cond, other) and where(cond, drop=True) are well-tested, flexible operations, and I don't see why allowing their union would break anything, but I'll wait to hear from the experts on that front!

I'm definitely open to creating a pull request (and have the simple implementation I've outlined here ready to go).
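To make the semantics argued for above concrete, here is a minimal NumPy-only sketch of what combining a fill value with dropping could look like for a 2-D array. It is an emulation, not xarray's implementation: the helper name `where_fill_and_drop` is hypothetical, and it assumes that "drop" means trimming only those rows and columns where the condition is False everywhere.

```python
import numpy as np


def where_fill_and_drop(arr, cond, other):
    """Emulate where(cond, other, drop=True) for a 2-D NumPy array.

    Hypothetical helper for illustration only; not part of xarray.
    """
    # Keep a row/column if the condition holds anywhere along it.
    keep_rows = cond.any(axis=1)
    keep_cols = cond.any(axis=0)
    selector = np.ix_(keep_rows, keep_cols)
    # Positions that survive trimming but fail the condition still need a fill.
    return np.where(cond[selector], arr[selector], other)


da = np.arange(16).reshape(4, 4)
filled_int = where_fill_and_drop(da, da > 6, -1)
filled_nan = where_fill_and_drop(da, da > 6, np.nan)

print(filled_int)        # integer fill keeps an integer dtype
print(filled_nan.dtype)  # NaN fill forces promotion to a float dtype
```

Only the first row is trimmed, so three positions in the remaining rows still need a fill value, which is why a fill value and dropping are not conceptually exclusive.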

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6466/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
186895655 MDU6SXNzdWUxODY4OTU2NTU= 1075 Support creating DataSet from streaming object delgadom 3698640 closed 0     16 2016-11-02T19:19:04Z 2020-06-01T06:37:08Z 2018-01-11T23:58:41Z CONTRIBUTOR      

The use case is for netCDF files stored on s3 or other generic cloud storage

```python
import requests, xarray as xr
fp = 'http://nasanex.s3.amazonaws.com/NEX-GDDP/BCSD/rcp45/day/atmos/tasmax/r1i1p1/v1.0/tasmax_day_BCSD_rcp45_r1i1p1_MPI-ESM-LR_2029.nc'

data = requests.get(fp, stream=True)
ds = xr.open_dataset(data.content)  # raises TypeError: embedded NUL character
```

Ideal would be integration with the (hopefully) soon-to-be implemented dask.distributed features discussed in #798.
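The `TypeError` above is presumably raised because the downloaded bytes are being treated as a file path. A minimal standard-library sketch of the usual alternative is to wrap the payload in a file-like buffer and check its leading magic bytes first. The helper names and the fake payload below are hypothetical, and whether `xr.open_dataset` accepts such a buffer depends on the xarray version and installed backend, so that call is left as a commented assumption.

```python
import io

# Leading bytes of the two on-disk formats used by netCDF files.
NETCDF3_MAGIC = b"CDF"
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # netCDF4 files are HDF5 containers


def sniff_netcdf_format(payload: bytes) -> str:
    """Guess the container format from the first bytes (hypothetical helper)."""
    if payload.startswith(HDF5_MAGIC):
        return "netcdf4/hdf5"
    if payload.startswith(NETCDF3_MAGIC):
        return "netcdf3"
    return "unknown"


def as_file_like(payload: bytes) -> io.BytesIO:
    """Wrap raw bytes in a seekable, file-like object."""
    return io.BytesIO(payload)


# Stand-in for `requests.get(fp).content`; not a real netCDF file.
fake_payload = HDF5_MAGIC + b"\x00" * 16
buffer = as_file_like(fake_payload)

print(sniff_netcdf_format(fake_payload))
# Assumption: a backend that supports file-like objects is installed.
# ds = xr.open_dataset(buffer)
```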

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1075/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
596115014 MDU6SXNzdWU1OTYxMTUwMTQ= 3951 series.to_xarray() fails when MultiIndex not sorted in xarray 0.15.1 delgadom 3698640 closed 0     4 2020-04-07T19:56:26Z 2020-04-08T02:19:11Z 2020-04-08T02:19:11Z CONTRIBUTOR      

series.to_xarray() fails when MultiIndex not sorted in xarray 0.15.1

Summary

It seems that series.to_xarray() fails (returns incorrect data) in xarray 0.15.1 when the levels of the series' MultiIndex are not sorted.

Demonstration

xarray should be able to handle MultiIndices with unsorted dimensions. Using a fresh conda environment with xarray 0.14.1:

```python
$ conda run -n py37xr14 python test.py

>>> df
alpha  B  A
num
0      1  4
1      2  5
2      3  6

>>> df.stack('alpha')
num  alpha
0    B        1
     A        4
1    B        2
     A        5
2    B        3
     A        6
dtype: int64

>>> df.stack('alpha').to_xarray()
<xarray.DataArray (num: 3, alpha: 2)>
array([[1, 4],
       [2, 5],
       [3, 6]])
Coordinates:
  * num      (num) int64 0 1 2
  * alpha    (alpha) object 'B' 'A'
```

This fails in xarray 0.15.1 - note the data is not merely reordered - the data in column 'B' now has the incorrect values 4, 5, 6 rather than 1, 2, 3:

```python
$ conda run -n py37xr15 python test.py

>>> df
alpha  B  A
num
0      1  4
1      2  5
2      3  6

>>> df.stack('alpha')
num  alpha
0    B        1
     A        4
1    B        2
     A        5
2    B        3
     A        6
dtype: int64

>>> df.stack('alpha').to_xarray()
<xarray.DataArray (num: 3, alpha: 2)>
array([[4, 1],
       [5, 2],
       [6, 3]])
Coordinates:
  * num      (num) int64 0 1 2
  * alpha    (alpha) object 'B' 'A'
```
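As a reference for what the correct result should be, the following pure-Python sketch places each value by looking up its labels instead of relying on the order of the index levels. The helper `dense_from_records` is hypothetical and only illustrates the expected label-to-value mapping; it says nothing about how xarray performs the conversion internally.

```python
def dense_from_records(records, row_labels, col_labels):
    """Build a dense 2-D grid by label lookup (hypothetical helper)."""
    row_pos = {label: i for i, label in enumerate(row_labels)}
    col_pos = {label: j for j, label in enumerate(col_labels)}
    grid = [[None] * len(col_labels) for _ in row_labels]
    for (row, col), value in records:
        # Each value goes where its own labels say, regardless of sort order.
        grid[row_pos[row]][col_pos[col]] = value
    return grid


# The stacked series from the demonstration, as ((num, alpha), value) pairs.
records = [
    ((0, 'B'), 1), ((0, 'A'), 4),
    ((1, 'B'), 2), ((1, 'A'), 5),
    ((2, 'B'), 3), ((2, 'A'), 6),
]

expected = dense_from_records(records, [0, 1, 2], ['B', 'A'])
sorted_layout = dense_from_records(records, [0, 1, 2], ['A', 'B'])

print(expected)       # columns in the order 'B', 'A'
print(sorted_layout)  # columns in the order 'A', 'B'
```

The 0.15.1 output is consistent with values laid out in sorted label order while the `alpha` coordinate keeps its original order, although the report itself does not identify the cause.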

Test setup & environment info

contents of test.py

```python
import pandas as pd
df = pd.DataFrame({'B': [1, 2, 3], 'A': [4, 5, 6]})
df = df.rename_axis('num').rename_axis('alpha', axis=1)
print(">>> df")
print(df)
print("\n>>> df.stack('alpha')")
print(df.stack('alpha'))
print("\n>>> df.stack('alpha').to_xarray()")
print(df.stack('alpha').to_xarray())
```

packages in py37xr14 environment

```bash
$ conda list -n py37xr14
# packages in environment at /Users/delgadom/miniconda3/envs/py37xr14:
#
# Name                    Version        Build               Channel
ca-certificates           2020.4.5.1     hecc5488_0          conda-forge
certifi                   2020.4.5.1     py37hc8dfbb8_0      conda-forge
libblas                   3.8.0          16_openblas         conda-forge
libcblas                  3.8.0          16_openblas         conda-forge
libcxx                    9.0.1          2                   conda-forge
libffi                    3.2.1          h4a8c4bd_1007       conda-forge
libgfortran               4.0.0          2                   conda-forge
liblapack                 3.8.0          16_openblas         conda-forge
libopenblas               0.3.9          h3d69b6c_0          conda-forge
llvm-openmp               9.0.1          h28b9765_2          conda-forge
ncurses                   6.1            h0a44026_1002       conda-forge
numpy                     1.18.1         py37h7687784_1      conda-forge
openssl                   1.1.1f         h0b31af3_0          conda-forge
pandas                    1.0.3          py37h94625e5_0      conda-forge
pip                       20.0.2         py_2                conda-forge
python                    3.7.6          h90870a6_5_cpython  conda-forge
python-dateutil           2.8.1          py_0                conda-forge
python_abi                3.7            1_cp37m             conda-forge
pytz                      2019.3         py_0                conda-forge
readline                  8.0            hcfe32e1_0          conda-forge
setuptools                46.1.3         py37hc8dfbb8_0      conda-forge
six                       1.14.0         py_1                conda-forge
sqlite                    3.30.1         h93121df_0          conda-forge
tk                        8.6.10         hbbe82c9_0          conda-forge
wheel                     0.34.2         py_1                conda-forge
xarray                    0.14.1         py_1                conda-forge
xz                        5.2.5          h0b31af3_0          conda-forge
zlib                      1.2.11         h0b31af3_1006       conda-forge
```

packages in py37xr15 environment

```bash
$ conda list -n py37xr15
# packages in environment at /Users/delgadom/miniconda3/envs/py37xr15:
#
# Name                    Version        Build               Channel
ca-certificates           2020.4.5.1     hecc5488_0          conda-forge
certifi                   2020.4.5.1     py37hc8dfbb8_0      conda-forge
libblas                   3.8.0          16_openblas         conda-forge
libcblas                  3.8.0          16_openblas         conda-forge
libcxx                    9.0.1          2                   conda-forge
libffi                    3.2.1          h4a8c4bd_1007       conda-forge
libgfortran               4.0.0          2                   conda-forge
liblapack                 3.8.0          16_openblas         conda-forge
libopenblas               0.3.9          h3d69b6c_0          conda-forge
llvm-openmp               9.0.1          h28b9765_2          conda-forge
ncurses                   6.1            h0a44026_1002       conda-forge
numpy                     1.18.1         py37h7687784_1      conda-forge
openssl                   1.1.1f         h0b31af3_0          conda-forge
pandas                    1.0.3          py37h94625e5_0      conda-forge
pip                       20.0.2         py_2                conda-forge
python                    3.7.6          h90870a6_5_cpython  conda-forge
python-dateutil           2.8.1          py_0                conda-forge
python_abi                3.7            1_cp37m             conda-forge
pytz                      2019.3         py_0                conda-forge
readline                  8.0            hcfe32e1_0          conda-forge
setuptools                46.1.3         py37hc8dfbb8_0      conda-forge
six                       1.14.0         py_1                conda-forge
sqlite                    3.30.1         h93121df_0          conda-forge
tk                        8.6.10         hbbe82c9_0          conda-forge
wheel                     0.34.2         py_1                conda-forge
xarray                    0.15.1         py_0                conda-forge
xz                        5.2.5          h0b31af3_0          conda-forge
zlib                      1.2.11         h0b31af3_1006       conda-forge
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3951/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
316507549 MDU6SXNzdWUzMTY1MDc1NDk= 2071 spell check: could not `bebroadcast` delgadom 3698640 closed 0     0 2018-04-21T16:59:09Z 2018-04-21T17:42:06Z 2018-04-21T17:42:06Z CONTRIBUTOR      

Spelling error in value error raised on index-based assignment with incorrect shape

Very easy one here:

```python
In [1]: import xarray as xr, pandas as pd, numpy as np

In [2]: da = xr.DataArray(np.random.random((2, 3)), dims=['x','y'])

In [3]: da
Out[3]:
<xarray.DataArray (x: 2, y: 3)>
array([[0.882927, 0.604024, 0.316146],
       [0.06342 , 0.503182, 0.297988]])
Dimensions without coordinates: x, y

In [4]: da[0, 1] = [1, 2]

ValueError                                Traceback (most recent call last)
<ipython-input-4-1fbe1d206e00> in <module>()
----> 1 da[0, 1] = [1, 2]

~/miniconda2/envs/xarray-dev/lib/python3.6/site-packages/xarray/core/dataarray.py in __setitem__(self, key, value)
    486             key = {k: v.variable if isinstance(v, DataArray) else v
    487                    for k, v in self._item_key_to_dict(key).items()}
--> 488         self.variable[key] = value
    489
    490     def __delitem__(self, key):

~/miniconda2/envs/xarray-dev/lib/python3.6/site-packages/xarray/core/variable.py in __setitem__(self, key, value)
    682                     'shape mismatch: value array of shape %s could not be'
    683                     'broadcast to indexing result with %s dimensions'
--> 684                     % (value.shape, len(dims)))
    685             if value.ndim == 0:
    686                 value = Variable((), value)

ValueError: shape mismatch: value array of shape (2,) could not bebroadcast to indexing result with 0 dimensions

In [5]: xr.show_versions()
/Users/delgadom/miniconda2/envs/xarray-dev/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters

INSTALLED VERSIONS

commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Darwin
OS-release: 17.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.2
pandas: 0.22.0
numpy: 1.14.2
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: 2.2.0
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.2
distributed: 1.21.4
matplotlib: 2.2.2
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 38.5.1
pip: 9.0.1
conda: None
pytest: 3.4.1
IPython: 6.2.1
sphinx: 1.6.6
```

Problem description

The error message in variable.py#L682 seems to be missing an end-of-line space. Happy to create a PR.
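The missing space follows from how Python joins adjacent string literals: they are concatenated with nothing in between. A short standalone sketch, reusing the two message fragments from the traceback (the variable names `broken` and `fixed` are only for illustration):

```python
# Adjacent string literals are joined with no separator between them.
broken = ('shape mismatch: value array of shape %s could not be'
          'broadcast to indexing result with %s dimensions')

# A trailing space on the first fragment restores the word boundary.
fixed = ('shape mismatch: value array of shape %s could not be '
         'broadcast to indexing result with %s dimensions')

print(broken % ((2,), 0))
print(fixed % ((2,), 0))
```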

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2071/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
241389297 MDU6SXNzdWUyNDEzODkyOTc= 1472 .sel(drop=True) fails to drop coordinate delgadom 3698640 closed 0     4 2017-07-07T21:49:35Z 2017-07-10T16:08:30Z 2017-07-10T15:54:23Z CONTRIBUTOR      

Using both xarray 0.9.6 and current (0.9.6-16-gb201ff7), da.sel(drop=True, **indexers) and da.isel(drop=True, **indexers) return unexpectedly different results.

Setup:

```python
In [1]: import xarray as xr, pandas as pd, numpy as np

In [2]: years = pd.Index(
   ...:     pd.date_range('1981-01-01', '2100-01-01', freq='A', closed='left'),
   ...:     name='time')
   ...: ages = pd.Index(['age0', 'age1', 'age2', 'age3'], name='age')

In [3]: arr = xr.DataArray(
   ...:     np.random.random((len(years), 4)), dims=('time', 'age'),
   ...:     coords={'time': years, 'age': ages})

In [4]: arr
Out[4]:
<xarray.DataArray (time: 119, age: 4)>
array([[ 0.755194,  0.1316  ,  0.283485,  0.616929],
       [ 0.01667 ,  0.907853,  0.667366,  0.146755],
       [ 0.338319,  0.782972,  0.367624,  0.390907],
       ...,
       [ 0.453521,  0.807693,  0.094811,  0.603297],
       [ 0.405114,  0.821691,  0.633314,  0.259406],
       [ 0.41722 ,  0.012957,  0.329089,  0.774966]])
Coordinates:
  * age      (age) object 'age0' 'age1' 'age2' 'age3'
  * time     (time) datetime64[ns] 1981-12-31 1982-12-31 1983-12-31 ...
```

I would expect the following operations to return identical results:

```python
In [5]: arr.sel(time='2012', drop=True)
Out[5]:
<xarray.DataArray (time: 1, age: 4)>
array([[ 0.086045,  0.467905,  0.101005,  0.503311]])
Coordinates:
  * age      (age) object 'age0' 'age1' 'age2' 'age3'
Dimensions without coordinates: time

In [6]: arr.isel(time=31, drop=True)
Out[6]:
<xarray.DataArray (age: 4)>
array([ 0.086045,  0.467905,  0.101005,  0.503311])
Coordinates:
  * age      (age) object 'age0' 'age1' 'age2' 'age3'
```

Note that `time` is still seen as a dimension in the `.sel` results.

The same behavior is seen for Dataset.sel(drop=True):

```python
In [7]: ds = xr.Dataset({'arr': arr})

In [8]: ds.sel(time='2012', drop=True)
Out[8]:
<xarray.Dataset>
Dimensions:  (age: 4, time: 1)
Coordinates:
  * age      (age) object 'age0' 'age1' 'age2' 'age3'
Dimensions without coordinates: time
Data variables:
    arr      (time, age) float64 0.08604 0.4679 0.101 0.5033

In [9]: ds.isel(time=31, drop=True)
Out[9]:
<xarray.Dataset>
Dimensions:  (age: 4)
Coordinates:
  * age      (age) object 'age0' 'age1' 'age2' 'age3'
Data variables:
    arr      (age) float64 0.08604 0.4679 0.101 0.5033
```
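One plausible reading of the difference, offered as an assumption rather than a confirmed diagnosis, is that the partial date string `'2012'` is resolved as a range of timestamps (a length-one slice), whereas `time=31` is a scalar position. The NumPy-only analogy below shows how those two kinds of indexers differ in whether they keep the axis; it does not use xarray itself.

```python
import numpy as np

# Same shape as `arr` above: 119 time steps by 4 age groups.
values = np.arange(119 * 4).reshape(119, 4)

by_position = values[31]    # scalar indexer: the time axis disappears
by_slice = values[31:32]    # length-one slice: the time axis is kept

print(by_position.shape)
print(by_slice.shape)

# Removing the length-one axis afterwards recovers the scalar-indexed result.
squeezed = by_slice.squeeze(axis=0)
print(np.array_equal(squeezed, by_position))
```

If that reading holds, the xarray counterpart of the last step would be calling `.squeeze('time', drop=True)` on the `.sel` result.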

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1472/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 55.835ms · About: xarray-datasette