
issues


134 rows where type = "issue" and user = 5635139 sorted by updated_at descending


state: closed 113, open 21 · type: issue 134 · repo: xarray 134
id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1250939008 I_kwDOAMm_X85Kj9CA 6646 `dim` vs `dims` max-sixty 5635139 closed 0     4 2022-05-27T16:15:02Z 2024-04-29T18:24:56Z 2024-04-29T18:24:56Z MEMBER      

What is your issue?

I've recently been hit by this when experimenting with xr.dot and xr.corr — xr.dot takes dims, while xr.corr takes dim. Because they each take multiple arrays as positional args, passing the dimension as a keyword argument is the convention.

Should we standardize on one of these?
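For concreteness, a minimal sketch of the mismatch (keyword names as in the releases discussed here):

```python
import numpy as np
import xarray as xr

a = xr.DataArray(np.random.rand(4, 3), dims=["x", "y"])
b = xr.DataArray(np.random.rand(4, 3), dims=["x", "y"])

xr.dot(a, b, dims="x")  # plural keyword
xr.corr(a, b, dim="x")  # singular keyword, same concept
```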

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6646/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1960332384 I_kwDOAMm_X8502Exg 8371 Writing to regions with unaligned chunks can lose data max-sixty 5635139 closed 0     20 2023-10-25T01:17:59Z 2024-03-29T14:35:51Z 2024-03-29T14:35:51Z MEMBER      

What happened?

Writing with region with chunks that aren't aligned can lose data.

I've recreated an example below. While it's unlikely that folks are passing different values to .chunk for the template vs. the regions, I had an "auto" chunk, which can then set different chunk values.

(FWIW, this was fairly painful, and I managed to lose a lot of time by not noticing this, and then not really considering this could happen as I was trying to debug. I think we should really strive to ensure that we don't lose data / incorrectly report that we've successfully written data...)

What did you expect to happen?

If there's a risk of data loss, raise an error...

Minimal Complete Verifiable Example

```Python
ds = xr.DataArray(
    np.arange(120).reshape(4, 3, -1), dims=list("abc")
).rename('var1').to_dataset().chunk(2)

ds

# <xarray.Dataset>
# Dimensions:  (a: 4, b: 3, c: 10)
# Dimensions without coordinates: a, b, c
# Data variables:
#     var1     (a, b, c) int64 dask.array<chunksize=(2, 2, 2), meta=np.ndarray>

def write(ds):
    ds.chunk(5).to_zarr('foo.zarr', compute=False, mode='w')
    for r in range(ds.sizes['a']):
        ds.chunk(3).isel(a=[r]).to_zarr('foo.zarr', region=dict(a=slice(r, r + 1)))

def read(ds):
    result = xr.open_zarr('foo.zarr')
    assert result.compute().identical(ds)
    print(result.chunksizes, ds.chunksizes)

write(ds); read(ds)

# AssertionError

xr.open_zarr('foo.zarr').compute()['var1']

# <xarray.DataArray 'var1' (a: 4, b: 3, c: 10)>
# array([[[  0,   0,   0,   3,   4,   5,   0,   0,   0,   9],
#         [  0,   0,   0,  13,  14,  15,   0,   0,   0,  19],
#         [  0,   0,   0,  23,  24,  25,   0,   0,   0,  29]],
#
#        [[ 30,  31,  32,   0,   0,  35,  36,  37,  38,   0],
#         [ 40,  41,  42,   0,   0,  45,  46,  47,  48,   0],
#         [ 50,  51,  52,   0,   0,  55,  56,  57,  58,   0]],
#
#        [[ 60,  61,  62,   0,   0,  65,   0,   0,   0,  69],
#         [ 70,  71,  72,   0,   0,  75,   0,   0,   0,  79],
#         [ 80,  81,  82,   0,   0,  85,   0,   0,   0,  89]],
#
#        [[  0,   0,   0,  93,  94,  95,  96,  97,  98,   0],
#         [  0,   0,   0, 103, 104, 105, 106, 107, 108,   0],
#         [  0,   0,   0, 113, 114, 115, 116, 117, 118,   0]]])
# Dimensions without coordinates: a, b, c
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS
------------------
commit: ccc8f9987b553809fb6a40c52fa1a8a8095c8c5f
python: 3.9.18 (main, Aug 24 2023, 21:19:58) [Clang 14.0.3 (clang-1403.0.22.14.1)]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: en_US.UTF-8
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2023.10.2.dev10+gccc8f998
pandas: 2.1.1
numpy: 1.25.2
scipy: 1.11.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.16.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: 2023.4.0
distributed: 2023.7.1
matplotlib: 3.5.1
cartopy: None
seaborn: None
numbagg: 0.2.3.dev30+gd26e29e
fsspec: 2021.11.1
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: 0.9.19
setuptools: 68.1.2
pip: 23.2.1
conda: None
pytest: 7.4.0
mypy: 1.6.0
IPython: 8.15.0
sphinx: 4.3.2
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8371/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2110888925 I_kwDOAMm_X8590Zvd 8690 Add `nbytes` to repr? max-sixty 5635139 closed 0     9 2024-01-31T20:13:59Z 2024-02-19T22:18:47Z 2024-02-07T20:47:38Z MEMBER      

Is your feature request related to a problem?

Would having the nbytes value in the Dataset repr be reasonable?

I frequently find myself logging this separately. For example:

```diff
<xarray.Dataset>
Dimensions:  (lat: 25, time: 2920, lon: 53)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
-   air      (time, lat, lon) float32 dask.array<chunksize=(2920, 25, 53), meta=np.ndarray>
+   air      (time, lat, lon) float32 15MB dask.array<chunksize=(2920, 25, 53), meta=np.ndarray>
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day). These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
```
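A sketch of what that separate logging looks like today; format_bytes from dask.utils is just one convenient humanizer, not the only option:

```python
import xarray as xr
from dask.utils import format_bytes

ds = xr.tutorial.load_dataset("air_temperature")

# Logged by hand today, since the repr doesn't show it
print(f"air: {format_bytes(ds['air'].nbytes)}")  # e.g. "air: 14.76 MiB"
```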

Describe the solution you'd like

No response

Describe alternatives you've considered

Status quo :)

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8690/reactions",
    "total_count": 6,
    "+1": 6,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2126375172 I_kwDOAMm_X85-vekE 8726 PRs requiring approval & merging main? max-sixty 5635139 closed 0     4 2024-02-09T02:35:58Z 2024-02-09T18:23:52Z 2024-02-09T18:21:59Z MEMBER      

What is your issue?

Sorry I haven't been on the calls at all recently (unfortunately the schedule is difficult for me). Maybe this was discussed there? 

PRs now seem to require a separate approval prior to merging. Is there an upside to this? Is there any difference between those who can approve and those who can merge? Otherwise it just seems like more clicking.

PRs also now seem to require merging the latest main prior to merging? I get there's some theoretical value to this, because changes can semantically conflict with each other. But it's extremely rare that this actually happens (can we point to cases?), and it limits the immediacy & throughput of PRs. If the bad outcome does ever happen, we find out quickly when main tests fail and can revert.

(FWIW, I wrote down a few principles around this a while ago here; those are much stronger than what I'm suggesting in this issue, though.)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8726/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1984961987 I_kwDOAMm_X852UB3D 8432 Writing a datetime coord ignores chunks max-sixty 5635139 closed 0     5 2023-11-09T07:00:39Z 2024-01-29T19:12:33Z 2024-01-29T19:12:33Z MEMBER      

What happened?

When writing a coord with a datetime type, the chunking on the coord is ignored, and the whole coord is written as a single chunk. (or at least it can be, I haven't done enough to confirm whether it'll always be...)

This can be quite inconvenient. Any attempt to write to that dataset from a distributed process will have errors, since each process will be attempting to write another process's data, rather than only its region. And less severely, the chunks won't be unified.

Minimal Complete Verifiable Example

```Python
ds = xr.tutorial.load_dataset('air_temperature')

(
    ds.chunk()
    .expand_dims(a=1000)
    .assign_coords(
        time2=lambda x: x.time,
        time_int=lambda x: (("time"), np.full(ds.sizes["time"], 1)),
    )
    .chunk(time=10)
    .to_zarr("foo.zarr", mode="w")
)

xr.open_zarr('foo.zarr')

# Note the chunksize=(2920,) vs chunksize=(10,)!

# <xarray.Dataset>
# Dimensions:   (a: 1000, time: 2920, lat: 25, lon: 53)
# Coordinates:
#   * lat       (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
#   * lon       (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
#   * time      (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
#     time2     (time) datetime64[ns] dask.array<chunksize=(2920,), meta=np.ndarray>  # here
#     time_int  (time) int64 dask.array<chunksize=(10,), meta=np.ndarray>             # here
# Dimensions without coordinates: a
# Data variables:
#     air       (a, time, lat, lon) float32 dask.array<chunksize=(1000, 10, 25, 53), meta=np.ndarray>
# Attributes:
#     Conventions:  COARDS
#     description:  Data is from NMC initialized reanalysis\n(4x/day). These a...
#     platform:     Model
#     references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
#     title:        4x daily NMC reanalysis (1948)

xr.open_zarr('foo.zarr').chunks

# ValueError                                Traceback (most recent call last)
# Cell In[13], line 1
# ----> 1 xr.open_zarr('foo.zarr').chunks
#
# File /opt/homebrew/lib/python3.9/site-packages/xarray/core/dataset.py:2567, in Dataset.chunks(self)
#    2552 @property
#    2553 def chunks(self) -> Mapping[Hashable, tuple[int, ...]]:
#    2554     """
#    2555     Mapping from dimension names to block lengths for this dataset's data, or None if
#    2556     the underlying data is not a dask array.
#    (...)
#    2565     xarray.unify_chunks
#    2566     """
# -> 2567     return get_chunksizes(self.variables.values())
#
# File /opt/homebrew/lib/python3.9/site-packages/xarray/core/common.py:2013, in get_chunksizes(variables)
#    2011 for dim, c in v.chunksizes.items():
#    2012     if dim in chunks and c != chunks[dim]:
# -> 2013         raise ValueError(
#    2014             f"Object has inconsistent chunks along dimension {dim}. "
#    2015             "This can be fixed by calling unify_chunks()."
#    2016         )
#    2017     chunks[dim] = c
#    2018 return Frozen(chunks)
#
# ValueError: Object has inconsistent chunks along dimension time. This can be fixed by calling unify_chunks().
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.18 (main, Nov 2 2023, 16:51:22) [Clang 14.0.3 (clang-1403.0.22.14.1)]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: en_US.UTF-8
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None

xarray: 2023.10.1
pandas: 2.1.1
numpy: 1.26.1
scipy: 1.11.1
netCDF4: None
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: 2.16.0
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.5.0
distributed: 2023.5.0
matplotlib: 3.6.0
cartopy: None
seaborn: 0.12.2
numbagg: 0.6.0
fsspec: 2022.8.2
cupy: None
pint: 0.22
sparse: 0.14.0
flox: 0.8.1
numpy_groupies: 0.9.22
setuptools: 68.2.2
pip: 23.3.1
conda: None
pytest: 7.4.0
mypy: 1.6.1
IPython: 8.14.0
sphinx: 5.2.1
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8432/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1923361961 I_kwDOAMm_X85ypCyp 8263 Surprising `.groupby` behavior with float index max-sixty 5635139 closed 0     0 2023-10-03T05:50:49Z 2024-01-08T01:05:25Z 2024-01-08T01:05:25Z MEMBER      

What is your issue?

We raise an error on grouping without supplying dims, but not for float indexes — is this intentional or an oversight?

This is without flox installed

```python
da = xr.tutorial.open_dataset("air_temperature")['air']

da.drop_vars('lat').groupby('lat').sum()
```

```
ValueError                                Traceback (most recent call last)
Cell In[8], line 1
----> 1 da.drop_vars('lat').groupby('lat').sum()
...
ValueError: cannot reduce over dimensions ['lat']. expected either '...' to reduce over all dimensions or one or more of ('time', 'lon').
```

But with a float index, we don't raise:

```python
da.groupby('lat').sum()
```

...returns the original array:

```
Out[15]:
<xarray.DataArray 'air' (time: 2920, lat: 25, lon: 53)>
array([[[296.29   , 296.79   , 297.1    , ..., 296.9    , 296.79   , 296.6    ],
        [295.9    , 296.19998, 296.79   , ..., 295.9    , 295.9    , 295.19998],
        [296.6    , 296.19998, 296.4    , ..., 295.4    , 295.1    , 294.69998],
        ...
```

And if we try this with a non-float index, we get the error again:

```python
da.groupby('time').sum()
```

```
ValueError: cannot reduce over dimensions ['time']. expected either '...' to reduce over all dimensions or one or more of ('lat', 'lon').
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8263/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1916677049 I_kwDOAMm_X85yPiu5 8245 Tools for writing distributed zarrs max-sixty 5635139 open 0     0 2023-09-28T04:25:45Z 2024-01-04T00:15:09Z   MEMBER      

What is your issue?

There seems to be a common pattern for writing zarrs from a distributed set of machines, in parallel. It's somewhat described in the prose of the io docs. Quoting:

  • Creating the template — "the first step is creating an initial Zarr store without writing all of its array data. This can be done by first creating a Dataset with dummy values stored in dask, and then calling to_zarr with compute=False to write only metadata to Zarr"
  • Writing out each region from workers — "a Zarr store with the correct variable shapes and attributes exists that can be filled out by subsequent calls to to_zarr. The region provides a mapping from dimension names to Python slice objects indicating where the data should be written (in index space, not coordinate space)"

I've been using this fairly successfully recently. It's much better than writing hundreds or thousands of data variables, since many small data variables create a huge number of files.
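For concreteness, a minimal sketch of the template-and-region pattern (toy dataset and a local "template.zarr" path assumed; on a cluster each region write would run on a different worker):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"var1": ("x", np.arange(100.0))}).chunk(10)

# 1. Write only the metadata (the "template") via compute=False
ds.to_zarr("template.zarr", compute=False, mode="w")

# 2. Each worker fills in its own region of the store
for start in range(0, 100, 10):
    ds.isel(x=slice(start, start + 10)).to_zarr(
        "template.zarr", region={"x": slice(start, start + 10)}
    )
```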

Are there some tools we can provide to make this easier? Some ideas:

  • [ ] compute=False is arguably a less-than-obvious kwarg meaning "write metadata". Maybe this should be a method, maybe it's a candidate for renaming? Or maybe make_template can be an abstraction over it. Something like xarray_beam.make_template to make the template from a Dataset?
    • Or from an array of indexes?
    • https://github.com/pydata/xarray/issues/8343
    • https://github.com/pydata/xarray/pull/8460
  • [ ] What happens if one worker's data isn't aligned on some dimensions? Will that write to the wrong location? Could we offer an option, similar to the above, to reindex on the template dimensions?

  • [ ] When writing a region, we need to drop other vars. Can we offer this as a kwarg? Occasionally I'll add a dimension with an index to a dataset, run the function to write it — and it'll fail, because I forgot to add that index to the .drop_vars call that precedes the write. When we're writing a template, all the indexes are written up front anyway. (edit: #6260)
    • https://github.com/pydata/xarray/pull/8460

More minor papercuts:

  • [ ] I've hit an issue where writing a region seemed to cause the worker to attempt to load the whole array into memory — can we offer guarantees for when (non-metadata) data will be loaded during to_zarr?
  • [ ] How about adding raise_if_dask_computes to our public API? The alternative I've been doing is watching htop and exiting if I see memory ballooning, which is less cerebral...
  • [ ] It doesn't seem easy to write coords on a DataArray. For example, writing xr.tutorial.load_dataset('air_temperature').assign_coords(lat2=da.lat + 2, a=(('lon',), ['a'] * len(da.lon))).chunk().to_zarr('foo.zarr', compute=False) will cause the non-index coords to be written as empty. But writing them separately conflicts with having a single variable. Currently I manually load each coord before writing, which is not super-friendly.

Some things that were in the list here, as they've been completed!!

  • [x] Requiring region to be specified as an int range can be inconvenient — would it be feasible to have a function that grabs the template metadata, calculates the region ints, and then calculates the implied indexes?
    • Edit: suggested at https://github.com/pydata/xarray/issues/7702

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8245/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
    xarray 13221727 issue
1975574237 I_kwDOAMm_X851wN7d 8409 Task graphs on `.map_blocks` with many chunks can be huge max-sixty 5635139 closed 0     6 2023-11-03T07:14:45Z 2024-01-03T04:10:16Z 2024-01-03T04:10:16Z MEMBER      

What happened?

I'm getting task graphs > 1GB, I think possibly because the full indexes are being included in every task?

What did you expect to happen?

Only the relevant sections of the index would be included

Minimal Complete Verifiable Example

```Python
da = xr.tutorial.load_dataset('air_temperature')

# Dropping the index doesn't generally matter that much...

len(cloudpickle.dumps(da.chunk(lat=1, lon=1)))
# 15569320

len(cloudpickle.dumps(da.chunk().drop_vars(da.indexes)))
# 15477313

# But with .map_blocks, it really matters — it's really big with the indexes,
# and the same size without:

len(cloudpickle.dumps(da.chunk(lat=1, lon=1).map_blocks(lambda x: x)))
# 79307120

len(cloudpickle.dumps(da.chunk(lat=1, lon=1).drop_vars(da.indexes).map_blocks(lambda x: x)))
# 16016173
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.18 (main, Aug 24 2023, 21:19:58) [Clang 14.0.3 (clang-1403.0.22.14.1)]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: en_US.UTF-8
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None

xarray: 2023.10.1
pandas: 2.1.1
numpy: 1.26.1
scipy: 1.11.1
netCDF4: None
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: 2.16.0
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.5.0
distributed: 2023.5.0
matplotlib: 3.6.0
cartopy: None
seaborn: 0.12.2
numbagg: 0.6.0
fsspec: 2022.8.2
cupy: None
pint: 0.22
sparse: 0.14.0
flox: 0.7.2
numpy_groupies: 0.9.22
setuptools: 68.1.2
pip: 23.2.1
conda: None
pytest: 7.4.0
mypy: 1.6.1
IPython: 8.14.0
sphinx: 5.2.1
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8409/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2052840951 I_kwDOAMm_X856W933 8566 Use `ddof=1` for `std` & `var` max-sixty 5635139 open 0     2 2023-12-21T17:47:21Z 2023-12-27T16:58:46Z   MEMBER      

What is your issue?

I've discussed this a bunch with @dcherian (though I'm not sure he necessarily agrees, I'll let him comment)

Currently xarray uses ddof=0 for std & var. This is:

  • Rarely what someone actually wants — xarray data is almost always a sample of some underlying distribution, for which ddof=1 is correct
  • Inconsistent with pandas

OTOH:

  • It is consistent with numpy
  • It wouldn't be a painless change — folks who don't read deprecation messages would see values change very slightly

Any thoughts?
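For reference, the three defaults side by side (a quick sketch):

```python
import numpy as np
import pandas as pd
import xarray as xr

data = np.array([1.0, 2.0, 3.0, 4.0])

np.std(data)                     # 1.118..., ddof=0
pd.Series(data).std()            # 1.290..., ddof=1
float(xr.DataArray(data).std())  # 1.118..., ddof=0: matches numpy, not pandas
```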

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8566/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
988158051 MDU6SXNzdWU5ODgxNTgwNTE= 5764 Implement __sizeof__ on objects? max-sixty 5635139 open 0     6 2021-09-03T23:36:53Z 2023-12-19T18:23:08Z   MEMBER      

Is your feature request related to a problem? Please describe.

Currently ds.nbytes returns the size of the data.

But sys.getsizeof(ds) returns a very small number.

Describe the solution you'd like

If we implement __sizeof__ on DataArrays & Datasets, this would work.

I think that would be something like ds.nbytes + the size of the ds container, + maybe attrs if those aren't handled by .nbytes?
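A quick sketch of the gap as it stands:

```python
import sys
import numpy as np
import xarray as xr

ds = xr.Dataset({"a": ("x", np.zeros(1_000_000))})
ds.nbytes          # 8_000_000, the size of the data
sys.getsizeof(ds)  # a small number: only the container, since __sizeof__ isn't defined
```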

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5764/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  reopened xarray 13221727 issue
1977661256 I_kwDOAMm_X8514LdI 8414 Is there any way of having `.map_blocks` be even more opaque to dask? max-sixty 5635139 closed 0     23 2023-11-05T06:56:43Z 2023-12-12T18:14:57Z 2023-12-12T18:14:57Z MEMBER      

Is your feature request related to a problem?

Currently I have a workload which does something a bit like:

```python
ds = open_zarr(source)
(
    ds.assign(
        x=ds.foo * ds.bar,
        y=ds.foo + ds.bar,
    ).to_zarr(dest)
)
```

(the actual calc is a bit more complicated! And while I don't have an MVCE of the full calc, I pasted a task graph below)

Dask — while very impressive in many ways — handles this extremely badly, because it attempts to load the whole of ds into memory before writing out any chunks. There are lots of issues on this in the dask repo; it seems like an intractable problem for dask.

Describe the solution you'd like

I was hoping to make the internals of this task opaque to dask, so it became a much dumber task runner — just map over the blocks, running the function and writing the result, block by block. I thought I had some success with .map_blocks last week — the internals of the calc are now opaque at least. But the dask cluster is falling over again, I think because the write is seen as a separate task.

Is there any way to make the write more opaque too?

Describe alternatives you've considered

I've built a homegrown thing which is really hacky which does this on a custom scheduler — just runs the functions and writes with region. I'd much prefer to use & contribute to the broader ecosystem...

Additional context

(It's also possible I'm making some basic error — and I do remember it working much better last week — so please feel free to direct me / ask me for more examples, if this doesn't ring true)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8414/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
866826033 MDU6SXNzdWU4NjY4MjYwMzM= 5215 Add an Cumulative aggregation, similar to Rolling max-sixty 5635139 closed 0     6 2021-04-24T19:59:49Z 2023-12-08T22:06:53Z 2023-12-08T22:06:53Z MEMBER      

Is your feature request related to a problem? Please describe.

Pandas has a .expanding aggregation, which is basically rolling with a full lookback. I often end up supplying rolling with the length of the dimension, and this is some nice sugar for that.

Describe the solution you'd like

Basically the same as pandas — a .expanding method that returns an Expanding class, which implements the same methods as a Rolling class.

Describe alternatives you've considered

Some options:

  • This
  • Don't add anything — the sugar isn't worth the additional API
  • Go full out and write specialized expanding algos, which will be faster since they don't have to keep track of the window. But not that much faster, and likely not worth the effort.
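For reference, a sketch of the current workaround the sugar would replace:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(5.0), dims="time")

# Rolling with a full-length window plus min_periods=1 behaves like
# pandas' .expanding()
da.rolling(time=da.sizes["time"], min_periods=1).sum()
# -> [0., 1., 3., 6., 10.], the cumulative sum
```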

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5215/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2019645081 I_kwDOAMm_X854YVaZ 8498 Allow some notion of ordering in Dataset dims max-sixty 5635139 closed 0     5 2023-11-30T22:57:23Z 2023-12-08T19:22:56Z 2023-12-08T19:22:55Z MEMBER      

What is your issue?

Currently a DataArray's dims are ordered, while a Dataset's are not.

Do we gain anything from having unordered dims in a Dataset? Could we have an ordering without enforcing it on every variable? For concreteness, a quick sketch of the asymmetry follows.
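```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.zeros((2, 3)), dims=["x", "y"])
da.dims                       # ('x', 'y'): an ordered tuple
da.to_dataset(name="v").dims  # a name-to-size mapping (Frozen dict), no defined order
```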

Here's one proposal, with fairly wide error-bars:

  • Datasets have a dim order, which is set at construction time or through .transpose
    • Currently .transpose changes the order of each variable's dims, but not the dataset's
    • If dims aren't supplied, we can just use the first variable's
  • Variables don't have to conform to that order — .assign(foo=differently_ordered) maintains the differently ordered dims. So this doesn't limit any current functionality.
  • When there are transformations which change dim ordering, Xarray is "allowed" to transpose variables to the dataset's ordering. Currently Xarray is "allowed" to change dim order arbitrarily — for example to put a core dim last. IIUC, we'd prefer to set a non-arbitrary order, but we don't have one to reference.
  • This would remove a bunch of boilerplate from methods that save the ordering, run .apply_ufunc and then reorder in the original order[^1]

What do folks think?

[^1]: though also we could do this in .apply_ufunc

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8498/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  not_planned xarray 13221727 issue
2026963757 I_kwDOAMm_X8540QMt 8522 Test failures on `main` max-sixty 5635139 closed 0     7 2023-12-05T19:22:01Z 2023-12-06T18:48:24Z 2023-12-06T17:28:13Z MEMBER      

What is your issue?

Any ideas what could be causing these? I can't immediately reproduce locally.

https://github.com/pydata/xarray/actions/runs/7105414268/job/19342564583

```
Error: TestDataArray.test_computation_objects[int64-method_groupby_bins-data]

AssertionError: Left and right DataArray objects are not close

Differing values:
L
  <Quantity([[     nan      nan 1.       1.      ]
             [2.       2.       3.       3.      ]
             [4.       4.       5.       5.      ]
             [6.       6.       7.       7.      ]
             [8.       8.       9.       9.333333]], 'meter')>
R
  <Quantity([[0.       0.       1.       1.      ]
             [2.       2.       3.       3.      ]
             [4.       4.       5.       5.      ]
             [6.       6.       7.       7.      ]
             [8.       8.       9.       9.333333]], 'meter')>
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8522/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 1,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1192478248 I_kwDOAMm_X85HE8Yo 6440 Add `eval`? max-sixty 5635139 closed 0     0 2022-04-05T00:57:00Z 2023-12-06T17:52:47Z 2023-12-06T17:52:47Z MEMBER      

Is your feature request related to a problem?

We currently have query, which runs a numexpr string using eval.

Describe the solution you'd like

Should we add an eval method itself? I find that when building something for the command line, allowing people to pass an eval-able expression can be a good interface.
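For comparison, a sketch of query today next to the proposed method (the eval call is hypothetical, illustrating the idea rather than an existing API):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"a": ("x", np.arange(5)), "b": ("x", np.arange(5) * 2)})

# Exists today: query evaluates a string predicate and filters
ds.query(x="a > 2")

# The proposal (hypothetical): evaluate an arbitrary expression directly
# ds.eval("a + b")
```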

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6440/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
907845790 MDU6SXNzdWU5MDc4NDU3OTA= 5413 Does the PyPI release job fire twice for each release? max-sixty 5635139 closed 0     2 2021-06-01T04:01:17Z 2023-12-04T19:22:32Z 2023-12-04T19:22:32Z MEMBER      

I was attempting to copy the great work here for numbagg and spotted this! Do we fire twice for each release? Maybe that's fine though?

https://github.com/pydata/xarray/actions/workflows/pypi-release.yaml

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5413/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
929840699 MDU6SXNzdWU5Mjk4NDA2OTk= 5531 Keyword only args for arguments like "drop" max-sixty 5635139 closed 0     12 2021-06-25T05:24:25Z 2023-12-04T19:22:24Z 2023-12-04T19:22:23Z MEMBER      

Is your feature request related to a problem? Please describe.

A method like .reset_index has a signature .reset_index(dims_or_levels, drop=False).

This means that passing .reset_index("x", "y") is actually like passing .reset_index("x", True), which is silent and confusing.

Describe the solution you'd like

Move to kwarg-only arguments for these; like .reset_index(dims_or_levels, *, drop=False).

But we probably need a deprecation cycle, which will require some work.

Describe alternatives you've considered

Not have a deprecation cycle? I imagine it's fairly rare to pass drop positionally rather than as a kwarg.
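A sketch of the footgun and the proposed fix, using stand-in functions rather than the real signatures:

```python
def reset_index_today(dims_or_levels, drop=False):
    return dims_or_levels, drop

def reset_index_proposed(dims_or_levels, *, drop=False):
    return dims_or_levels, drop

reset_index_today("x", "y")       # silently binds "y" to drop: the confusion above
# reset_index_proposed("x", "y")  # TypeError: takes 1 positional argument but 2 were given
```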

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5531/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1165654699 I_kwDOAMm_X85Fenqr 6349 Rolling exp correlation max-sixty 5635139 closed 0     1 2022-03-10T19:51:57Z 2023-12-04T19:13:35Z 2023-12-04T19:13:34Z MEMBER      

Is your feature request related to a problem?

I'd like an exponentially weighted moving correlation coefficient

Describe the solution you'd like

I think we could add a rolling_exp.corr method fairly easily — i.e. just in Python, no need to add anything to numbagg. Writing ewma for rolling_exp(...).mean:

  • ewma(A * B) - ewma(A) * ewma(B) for the rolling covariance
  • divided by the square root of the product of the rolling variances, i.e. sqrt((ewma(A**2) - ewma(A)**2) * (ewma(B**2) - ewma(B)**2))

We could also add a flag for cosine similarity, which wouldn't remove the mean. We could also add .var & .std & .covar as their own methods.

I think we'd need to mask the variables on their intersection, so we don't have values that are missing from B affecting A's variance without affecting its covariance.

Pandas does this in cython, possibly because it's faster to only do a single pass of the data. If anyone has correctness concerns about this simple approach of wrapping ewmas, please let me know. Or if the performance would be unacceptable such that it shouldn't go into xarray until it's a single pass.
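A direct transcription of that formula as a sketch, assuming numbagg is installed (rolling_exp requires it) and a time dimension to smooth over:

```python
import numpy as np
import xarray as xr

def rolling_exp_corr(a, b, window_size=20):
    # Mask both arrays to their common support, so values missing from one
    # can't affect the other's variance without affecting the covariance
    mask = a.notnull() & b.notnull()
    a, b = a.where(mask), b.where(mask)

    ewma = lambda x: x.rolling_exp(time=window_size).mean()
    cov = ewma(a * b) - ewma(a) * ewma(b)
    var_a = ewma(a**2) - ewma(a) ** 2
    var_b = ewma(b**2) - ewma(b) ** 2
    return cov / np.sqrt(var_a * var_b)
```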

Describe alternatives you've considered

Numbagg

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6349/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1995489227 I_kwDOAMm_X8528L_L 8455 Errors when assigning using `.from_pandas_multiindex` max-sixty 5635139 closed 0     3 2023-11-15T20:09:15Z 2023-12-04T19:10:12Z 2023-12-04T19:10:11Z MEMBER      

What happened?

Very possibly this is user-error, forgive me if so.

I'm trying to transition some code from the previous assignment of MultiIndexes, to the new world. Here's an MCVE:

What did you expect to happen?

No response

Minimal Complete Verifiable Example

```Python
da = xr.tutorial.open_dataset("air_temperature")['air']

# old code, works, but with a warning

da.expand_dims('foo').assign_coords(foo=(pd.MultiIndex.from_tuples([(1,2)])))
```

```
<ipython-input-25-f09b7f52bb42>:1: FutureWarning: the pandas.MultiIndex object(s) passed as 'foo' coordinate(s) or data variable(s) will no longer be implicitly promoted and wrapped into multiple indexed coordinates in the future (i.e., one coordinate for each multi-index level + one dimension coordinate). If you want to keep this behavior, you need to first wrap it explicitly using mindex_coords = xarray.Coordinates.from_pandas_multiindex(mindex_obj, 'dim') and pass it as coordinates, e.g., xarray.Dataset(coords=mindex_coords), dataset.assign_coords(mindex_coords) or dataarray.assign_coords(mindex_coords).
  da.expand_dims('foo').assign_coords(foo=(pd.MultiIndex.from_tuples([(1,2)])))
Out[25]:
<xarray.DataArray 'air' (foo: 1, time: 2920, lat: 25, lon: 53)>
array([[[[241.2    , 242.5    , 243.5    , ..., 232.79999, 235.5    , 238.59999],
         ...
         [297.69   , 298.09   , 298.09   , ..., 296.49   , 296.19   , 295.69   ]]]],
      dtype=float32)
Coordinates:
  * lat          (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
  * lon          (lon) float32 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
  * time         (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
  * foo          (foo) object MultiIndex
  * foo_level_0  (foo) int64 1
  * foo_level_1  (foo) int64 2
```

new code — seems to get confused between the number of values in the index — 1 — and the number of levels — 3 including the parent:

```Python
da.expand_dims('foo').assign_coords(foo=xr.Coordinates.from_pandas_multiindex(pd.MultiIndex.from_tuples([(1,2)]), dim='foo'))
```

```
ValueError                                Traceback (most recent call last)
Cell In[26], line 1
----> 1 da.expand_dims('foo').assign_coords(foo=xr.Coordinates.from_pandas_multiindex(pd.MultiIndex.from_tuples([(1,2)]), dim='foo'))

File ~/workspace/xarray/xarray/core/common.py:621, in DataWithCoords.assign_coords(self, coords, **coords_kwargs)
    618 else:
    619     results = self._calc_assign_results(coords_combined)
--> 621 data.coords.update(results)
    622 return data

File ~/workspace/xarray/xarray/core/coordinates.py:566, in Coordinates.update(self, other)
    560 # special case for PandasMultiIndex: updating only its dimension coordinate
    561 # is still allowed but depreciated.
    562 # It is the only case where we need to actually drop coordinates here (multi-index levels)
    563 # TODO: remove when removing PandasMultiIndex's dimension coordinate.
    564 self._drop_coords(self._names - coords_to_align._names)
--> 566 self._update_coords(coords, indexes)

File ~/workspace/xarray/xarray/core/coordinates.py:834, in DataArrayCoordinates._update_coords(self, coords, indexes)
    832 coords_plus_data = coords.copy()
    833 coords_plus_data[_THIS_ARRAY] = self._data.variable
--> 834 dims = calculate_dimensions(coords_plus_data)
    835 if not set(dims) <= set(self.dims):
    836     raise ValueError(
    837         "cannot add coordinates with new dimensions to a DataArray"
    838     )

File ~/workspace/xarray/xarray/core/variable.py:3014, in calculate_dimensions(variables)
   3012     last_used[dim] = k
   3013 elif dims[dim] != size:
-> 3014     raise ValueError(
   3015         f"conflicting sizes for dimension {dim!r}: "
   3016         f"length {size} on {k!r} and length {dims[dim]} on {last_used!r}"
   3017     )
   3018 return dims

ValueError: conflicting sizes for dimension 'foo': length 1 on <this-array> and length 3 on {'lat': 'lat', 'lon': 'lon', 'time': 'time', 'foo': 'foo'}
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.18 (main, Nov 2 2023, 16:51:22) [Clang 14.0.3 (clang-1403.0.22.14.1)]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: en_US.UTF-8
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2023.10.2.dev10+gccc8f998
pandas: 2.1.1
numpy: 1.25.2
scipy: 1.11.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.16.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: 2023.4.0
distributed: 2023.7.1
matplotlib: 3.5.1
cartopy: None
seaborn: None
numbagg: 0.2.3.dev30+gd26e29e
fsspec: 2021.11.1
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: 0.9.19
setuptools: 68.2.2
pip: 23.3.1
conda: None
pytest: 7.4.0
mypy: 1.6.0
IPython: 8.15.0
sphinx: 4.3.2
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8455/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  not_planned xarray 13221727 issue
1995308522 I_kwDOAMm_X8527f3q 8454 Formalize `mode` / safety guarantees for Zarr max-sixty 5635139 open 0     1 2023-11-15T18:28:38Z 2023-11-15T20:38:04Z   MEMBER      

What is your issue?

It sounds like we're coalescing on when it's safe to write concurrently:

  • mode="r+" is safe to write concurrently to different parts of a dataset
  • mode="a" isn't safe, because it changes the shape of an array, for example extending a dimension

What are the existing operations that aren't consistent with this?

  • Is concurrently writing additional variables safe? Or does it require updating the centralized consolidated metadata? Currently that requires mode="a", which is overly conservative based on the above rules, assuming it is safe — we can liberalize to allow it with mode="r+".
  • https://github.com/pydata/xarray/issues/8371, ~but that's a bug~ — edit: or possibly an artifact of writing concurrently to overlapping chunks with a single to_zarr call. We could at least restrict non-aligned writes to mode="a", so it wasn't possible to hit this mistakenly while writing to different parts of a dataset.
  • Writing the same values to the same chunks concurrently isn't safe at the moment — we'll get a "Stale file handle" error if two processes write to the same location at the same time. I'm not sure if that's possible to allow; possibly it requires work on the Zarr side. If it were possible, we wouldn't have to be as careful about ensuring that each process has mutually exclusive chunks to write. (lower priority)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8454/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1953001043 I_kwDOAMm_X850aG5T 8343 Add `metadata_only` param to `.to_zarr`? max-sixty 5635139 open 0     17 2023-10-19T20:25:11Z 2023-11-15T05:22:12Z   MEMBER      

Is your feature request related to a problem?

A leaf from https://github.com/pydata/xarray/issues/8245, which has a bullet:

compute=False is arguably a less-than-obvious kwarg meaning "write metadata". Maybe this should be a method, maybe it's a candidate for renaming? Or maybe make_template can be an abstraction over it

I've also noticed that for large arrays, running compute=False can take several minutes, despite the indexes being very small. I think this is because it's building a dask task graph — which is then discarded, since the array is written from different machines with the region pattern.

Describe the solution you'd like

Would introducing a metadata_only parameter to to_zarr help here?

  • Better name
  • No dask graph

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8343/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1980019336 I_kwDOAMm_X852BLKI 8421 `to_zarr` could transpose dims max-sixty 5635139 closed 0     0 2023-11-06T20:38:35Z 2023-11-14T19:23:08Z 2023-11-14T19:23:08Z MEMBER      

Is your feature request related to a problem?

Currently we need to know the order of dims when using region in to_zarr. Generally in xarray we don't need to care about dim order, because we have the names, so this is a bit of an aberration. It means that code needs to carry around the correct order of dims.

Here's an MCVE:

```python
ds = xr.tutorial.load_dataset('air_temperature')

ds.to_zarr('foo', mode='w')

ds.transpose(..., 'lat').to_zarr('foo', mode='r+')

# ValueError: variable 'air' already exists with different dimension names ('time', 'lat', 'lon') != ('time', 'lon', 'lat'), but changing variable dimensions is not supported by to_zarr().
```

Describe the solution you'd like

I think we should be able to transpose them based on the target?

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8421/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1986643906 I_kwDOAMm_X852acfC 8437 Restrict pint test runs max-sixty 5635139 open 0     10 2023-11-10T00:50:52Z 2023-11-13T21:57:45Z   MEMBER      

What is your issue?

Pint tests are failing on main — https://github.com/pydata/xarray/actions/runs/6817674274/job/18541677930

```
E TypeError: no implementation found for 'numpy.min' on types that implement __array_function__: [<class 'pint.util.Quantity'>]
```

If we can't fix soon, should we disable?

CC @keewis

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8437/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
874039546 MDU6SXNzdWU4NzQwMzk1NDY= 5246 test_save_mfdataset_compute_false_roundtrip fails max-sixty 5635139 open 0     1 2021-05-02T20:41:48Z 2023-11-02T04:38:05Z   MEMBER      

What happened:

test_save_mfdataset_compute_false_roundtrip consistently fails in windows-latest-3.9, e.g. https://github.com/pydata/xarray/pull/5244/checks?check_run_id=2485202784

Here's the traceback:

```python
self = <xarray.tests.test_backends.TestDask object at 0x000001FF45A9B640>

def test_save_mfdataset_compute_false_roundtrip(self):
    from dask.delayed import Delayed

    original = Dataset({"foo": ("x", np.random.randn(10))}).chunk()
    datasets = [original.isel(x=slice(5)), original.isel(x=slice(5, 10))]
    with create_tmp_file(allow_cleanup_failure=ON_WINDOWS) as tmp1:
        with create_tmp_file(allow_cleanup_failure=ON_WINDOWS) as tmp2:
            delayed_obj = save_mfdataset(
                datasets, [tmp1, tmp2], engine=self.engine, compute=False
            )
            assert isinstance(delayed_obj, Delayed)
            delayed_obj.compute()
            with open_mfdataset(
                [tmp1, tmp2], combine="nested", concat_dim="x"
            ) as actual:
              assert_identical(actual, original)

E       AssertionError: Left and right Dataset objects are not identical
E
E       Differing data variables:
E       L   foo      (x) float64 dask.array<chunksize=(5,), meta=np.ndarray>
E       R   foo      (x) float64 dask.array<chunksize=(10,), meta=np.ndarray>
```

Anything else we need to know?:

xfailed in https://github.com/pydata/xarray/pull/5245

Environment:

[Eliding since it's the test env]

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5246/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1923431725 I_kwDOAMm_X85ypT0t 8264 Improve error messages max-sixty 5635139 open 0     4 2023-10-03T06:42:57Z 2023-10-24T18:40:04Z   MEMBER      

Is your feature request related to a problem?

Coming back to xarray, and using it based on what I remember from a year ago or so, means I make lots of mistakes. I've also been using it outside of a repl, where error messages are more important, given I can't explore a dataset inline.

Some of the error messages could be much more helpful. Take one example:

xarray.core.merge.MergeError: conflicting values for variable 'date' on objects to be combined. You can skip this check by specifying compat='override'.

The second sentence is nice. But the first could give us much more information:

  • Which variables conflict? I'm merging four objects, so it would be so helpful to know which are causing the issue.
  • What is the conflict? Is one a superset and I can join=...? Are they off by 1 or are they completely different types?
  • Our testing.assert_equal produces pretty nice errors, as a comparison

Having good error messages is really useful: it lets folks stay in the flow while they're working, and it signals that we're a well-built, refined library.

Describe the solution you'd like

I'm not sure the best way to surface the issues — error messages make for less legible contributions than features or bug fixes, and the primary audience for good error messages is often the opposite of those actively developing the library. They're also more difficult to manage as GH issues — there could be scores of marginal issues which would often be out of date.

One thing we do in PRQL is have a file that snapshots bad error messages, test_bad_error_messages.rs, which makes changing those from bad to good a nice contribution. I'm not sure whether that would work here (Python doesn't seem to have a great snapshotter; pytest-regtest is the best I've found, and I wrote pytest-accept, but it requires doctests).

Any other ideas?

Describe alternatives you've considered

No response

Additional context

A couple of specific error-message issues:

  • https://github.com/pydata/xarray/issues/2078
  • https://github.com/pydata/xarray/issues/5290

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8264/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1913983402 I_kwDOAMm_X85yFRGq 8233 numbagg & flox max-sixty 5635139 closed 0     13 2023-09-26T17:33:32Z 2023-10-15T07:48:56Z 2023-10-09T15:40:29Z MEMBER      

What is your issue?

I've been doing some work recently on our old friend numbagg, improving the ewm routines & adding some more.

I'm keen to get numbagg back in shape, doing the things that it does best, and trimming anything it doesn't. I notice that it has grouped calcs. Am I correct to think that flox does this better? I haven't kept up with the latest. flox looks like it's particularly focused on dask arrays, whereas numpy_groupies, one of the inspirations for this, was applicable to numpy arrays too.

At least from the xarray perspective, are we OK to deprecate these numbagg functions, and direct folks to flox?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8233/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1920369929 I_kwDOAMm_X85ydoUJ 8259 Should `.reset_encoding` be `.drop_encoding`? max-sixty 5635139 closed 0     1 2023-09-30T19:11:46Z 2023-10-12T17:11:06Z 2023-10-12T17:11:06Z MEMBER      

What is your issue?

Not the greatest issue facing the universe — but for the cause of consistency — should .reset_encoding be .drop_encoding, since it drops all encoding attributes?

For comparison:

  • .reset_coords — "Given names of coordinates, reset them to become variables."
  • .drop_vars — "Drop variables from this dataset."

Also ref #8258

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8259/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1918061661 I_kwDOAMm_X85yU0xd 8251 `.chunk()` doesn't create chunks on 0 dim arrays max-sixty 5635139 open 0     0 2023-09-28T18:30:50Z 2023-09-30T21:31:05Z   MEMBER      

What happened?

.chunk's docstring states:

```
"""Coerce this array's data into a dask arrays with the given chunks.

    If this variable is a non-dask array, it will be converted to dask
    array. If it's a dask array, it will be rechunked to the given chunk
    sizes.
```

...but this doesn't happen for 0 dim arrays; example below.

For context, as part of #8245, I had a function that creates a template array. It created an empty DataArray, then expanded dims for each dimension. And it kept blowing up memory! ...until I realized that it was actually not a lazy array.

What did you expect to happen?

It may be that we can't have a 0-dim dask array — but then we should raise in this method, rather than return the wrong thing.

Minimal Complete Verifiable Example

```Python
[ins] In [1]: type(xr.DataArray().chunk().data)
Out[1]: numpy.ndarray

[ins] In [2]: type(xr.DataArray(1).chunk().data)
Out[2]: numpy.ndarray

[ins] In [3]: type(xr.DataArray([1]).chunk().data)
Out[3]: dask.array.core.Array
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS
------------------
commit: 0d6cd2a39f61128e023628c4352f653537585a12
python: 3.9.18 (main, Aug 24 2023, 21:19:58) [Clang 14.0.3 (clang-1403.0.22.14.1)]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: en_US.UTF-8
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2023.8.1.dev25+g8215911a.d20230914
pandas: 2.1.1
numpy: 1.25.2
scipy: 1.11.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.16.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: 2023.4.0
distributed: 2023.7.1
matplotlib: 3.5.1
cartopy: None
seaborn: None
numbagg: 0.2.3.dev30+gd26e29e
fsspec: 2021.11.1
cupy: None
pint: None
sparse: None
flox: 0.7.2
numpy_groupies: 0.9.19
setuptools: 68.1.2
pip: 23.2.1
conda: None
pytest: 7.4.0
mypy: 1.5.1
IPython: 8.15.0
sphinx: 4.3.2
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8251/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1920167070 I_kwDOAMm_X85yc2ye 8255 Allow a `lambda` for the `other` param to `where` max-sixty 5635139 closed 0     1 2023-09-30T08:05:54Z 2023-09-30T19:02:42Z 2023-09-30T19:02:42Z MEMBER      

Is your feature request related to a problem?

Currently we allow:

```python
da.where(lambda x: x.foo == 5)
```

...but we don't allow:

```python
da.where(lambda x: x.foo == 5, lambda x: x - x.shift(1))
```

...which would be nice

Describe the solution you'd like

No response

Describe alternatives you've considered

I don't think this offers many downsides — it's not like we want to fill the array with a callable object.

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8255/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
124154674 MDU6SXNzdWUxMjQxNTQ2NzQ= 688 Keep attrs & Add a 'keep_coords' argument to Dataset.apply max-sixty 5635139 closed 0     14 2015-12-29T02:42:48Z 2023-09-30T18:47:07Z 2023-09-30T18:47:07Z MEMBER      

Generally this isn't a problem, since the coords are carried over by the resulting DataArrays:

``` python
In [11]:

ds = xray.Dataset({
    'a': pd.DataFrame(pd.np.random.rand(10, 3)),
    'b': pd.Series(pd.np.random.rand(10))
})
ds.coords['c'] = pd.Series(pd.np.random.rand(10))
ds

Out[11]:
<xray.Dataset>
Dimensions:  (dim_0: 10, dim_1: 3)
Coordinates:
  * dim_0    (dim_0) int64 0 1 2 3 4 5 6 7 8 9
  * dim_1    (dim_1) int64 0 1 2
    c        (dim_0) float64 0.9318 0.2899 0.3853 0.6235 0.9436 0.7928 ...
Data variables:
    a        (dim_0, dim_1) float64 0.5707 0.9485 0.3541 0.5987 0.406 0.7992 ...
    b        (dim_0) float64 0.4106 0.2316 0.5804 0.6393 0.5715 0.6463 ...

In [12]:

ds.apply(lambda x: x*2)
Out[12]:
<xray.Dataset>
Dimensions:  (dim_0: 10, dim_1: 3)
Coordinates:
    c        (dim_0) float64 0.9318 0.2899 0.3853 0.6235 0.9436 0.7928 ...
  * dim_0    (dim_0) int64 0 1 2 3 4 5 6 7 8 9
  * dim_1    (dim_1) int64 0 1 2
Data variables:
    a        (dim_0, dim_1) float64 1.141 1.897 0.7081 1.197 0.812 1.598 ...
    b        (dim_0) float64 0.8212 0.4631 1.161 1.279 1.143 1.293 0.3507 ...
```

But if there's an operation that removes the coords from the DataArrays, the coords are not there on the result (notice c below). Should the Dataset retain them? Either always or with a keep_coords argument, similar to keep_attrs.

``` python
In [13]:

ds = xray.Dataset({
    'a': pd.DataFrame(pd.np.random.rand(10, 3)),
    'b': pd.Series(pd.np.random.rand(10))
})
ds.coords['c'] = pd.Series(pd.np.random.rand(10))
ds

Out[13]:
<xray.Dataset>
Dimensions:  (dim_0: 10, dim_1: 3)
Coordinates:
  * dim_0    (dim_0) int64 0 1 2 3 4 5 6 7 8 9
  * dim_1    (dim_1) int64 0 1 2
    c        (dim_0) float64 0.4121 0.2507 0.6326 0.4031 0.6169 0.441 0.1146 ...
Data variables:
    a        (dim_0, dim_1) float64 0.4813 0.2479 0.5158 0.2787 0.06672 ...
    b        (dim_0) float64 0.2638 0.5788 0.6591 0.7174 0.3645 0.5655 ...

In [14]:

ds.apply(lambda x: x.to_pandas()*2)
Out[14]:
<xray.Dataset>
Dimensions:  (dim_0: 10, dim_1: 3)
Coordinates:
  * dim_0    (dim_0) int64 0 1 2 3 4 5 6 7 8 9
  * dim_1    (dim_1) int64 0 1 2
Data variables:
    a        (dim_0, dim_1) float64 0.9627 0.4957 1.032 0.5574 0.1334 0.8289 ...
    b        (dim_0) float64 0.5275 1.158 1.318 1.435 0.7291 1.131 0.1903 ...
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/688/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1917820711 I_kwDOAMm_X85yT58n 8248 `write_empty_chunks` not in `DataArray.to_zarr` max-sixty 5635139 open 0     0 2023-09-28T15:48:22Z 2023-09-28T15:49:35Z   MEMBER      

What is your issue?

Our to_zarr methods on DataArray & Dataset are slightly inconsistent — Dataset.to_zarr has write_empty_chunks and chunkmanager_store_kwargs, which DataArray.to_zarr lacks. The shared parameters are also in a different order.


Up a level — not sure of the best way of enforcing consistency here; a couple of ideas:

  • We could have tests that operate on both a DataArray and Dataset, parameterized by fixtures (might also help reduce the duplication in some of our tests), though we then need to make the tests generic. We could have some general tests which just test that methods work, and then delegate to the current per-object tests for finer guarantees.
  • We could have a tool which collects the differences between DataArray & Dataset methods and snapshots them — then we'll see if they diverge, while allowing for some divergences.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8248/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
587895591 MDU6SXNzdWU1ODc4OTU1OTE= 3891 Keep attrs by default? (keep_attrs) max-sixty 5635139 open 0     14 2020-03-25T18:17:35Z 2023-09-22T02:27:50Z   MEMBER      

I've held this view with low confidence for a while and wanted to socialize it to see whether there's something to it: should we keep attrs in operations by default?

Advantages:
- I think most of the time people want to keep attrs after operations
  - Is that right? Are there cases where it wouldn't be a reasonable default? e.g. good points here for not always keeping coords around
- It's easy to remove them with a (currently unimplemented) `drop_attrs` method when people do want to remove them

Disadvantages:
- A backward-incompatible change with an expensive deprecation cycle (it would be impractical to have a deprecation warning every time someone ran a function on an object with attrs, I think? At least without adding a "once" warning filter)
- ?

Here are some existing relevant discussions:
- https://github.com/pydata/xarray/issues/3815#issuecomment-603974527
- https://github.com/pydata/xarray/issues/688
- https://github.com/pydata/xarray/pull/2482
- https://github.com/pydata/xarray/issues/3304

I think this is an easy situation to get into:
- We make an incorrect-but-insignificant design decision; e.g. some methods don't keep attrs
- We want to change that, but avoid breaking backward compatibility
- So we add kwargs and eventually a global config
- But now we have a global config that requires global context and lots of kwargs! :( (The snippet below shows the status quo.)
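For illustration, here's where that cycle has left us today (the `keep_attrs` option does already exist; the default is what this issue questions):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(3), dims="x", attrs={"units": "m"})
print((da * 2).attrs)  # {} by default: attrs are dropped

# The existing global / contextual opt-in:
with xr.set_options(keep_attrs=True):
    print((da * 2).attrs)  # {'units': 'm'}
```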

I'm up for leaning towards breaking changes if it makes the library better: I think xarray will grow immensely, and so the narrow immediate pain is worth the broader future positive impact. Clearly if the immediate pain stops xarray growing, then it's not a good tradeoff.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3891/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
729117202 MDU6SXNzdWU3MjkxMTcyMDI= 4539 Failing main branch — test_save_mfdataset_compute_false_roundtrip max-sixty 5635139 closed 0     11 2020-10-25T21:22:36Z 2023-09-21T06:48:03Z 2023-09-20T19:57:17Z MEMBER      

We had the main branch passing for a while, but unfortunately there's another test failure, now in our new Linux py38-backend-api-v2 test job, in `test_save_mfdataset_compute_false_roundtrip`:

link

```
self = <xarray.tests.test_backends.TestDask object at 0x7f821a0d6190>

def test_save_mfdataset_compute_false_roundtrip(self):
    from dask.delayed import Delayed

    original = Dataset({"foo": ("x", np.random.randn(10))}).chunk()
    datasets = [original.isel(x=slice(5)), original.isel(x=slice(5, 10))]
    with create_tmp_file(allow_cleanup_failure=ON_WINDOWS) as tmp1:
        with create_tmp_file(allow_cleanup_failure=ON_WINDOWS) as tmp2:
            delayed_obj = save_mfdataset(
                datasets, [tmp1, tmp2], engine=self.engine, compute=False
            )
            assert isinstance(delayed_obj, Delayed)
            delayed_obj.compute()
            with open_mfdataset(
                [tmp1, tmp2], combine="nested", concat_dim="x"
            ) as actual:
              assert_identical(actual, original)

E           AssertionError: Left and right Dataset objects are not identical
E
E           Differing data variables:
E           L   foo      (x) float64 dask.array<chunksize=(5,), meta=np.ndarray>
E           R   foo      (x) float64 dask.array<chunksize=(10,), meta=np.ndarray>

/home/vsts/work/1/s/xarray/tests/test_backends.py:3274: AssertionError

AssertionError: Left and right Dataset objects are not identical

Differing data variables:
    L   foo      (x) float64 dask.array<chunksize=(5,), meta=np.ndarray>
    R   foo      (x) float64 dask.array<chunksize=(10,), meta=np.ndarray>
```
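For anyone reproducing locally, a quick way to see the chunk mismatch the diff points at (a snippet of my own, reusing the test's names):

```python
# Inspect the dask chunk structure on both sides of the failing comparison.
print(actual["foo"].chunks)    # e.g. ((5, 5),) since open_mfdataset gives one chunk per file
print(original["foo"].chunks)  # e.g. ((10,),) since .chunk() made a single chunk
```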

@aurghs & @alexamici — are you familiar with this? Thanks in advance

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4539/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1905824568 I_kwDOAMm_X85xmJM4 8221 Frequent doc build timeout / OOM max-sixty 5635139 open 0     4 2023-09-20T23:02:37Z 2023-09-21T03:50:07Z   MEMBER      

What is your issue?

I'm frequently seeing `Command killed due to timeout or excessive memory consumption` in the doc build.

It happens after 1552 seconds; since that's not a round number, it might be the memory rather than the timeout?

It follows `writing output... [ 90%] generated/xarray.core.rolling.DatasetRolling.max`, which I wouldn't have thought was a particularly memory-intensive part of the build?

Here's an example: https://readthedocs.org/projects/xray/builds/21983708/

Any thoughts for what might be going on?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8221/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1902482417 I_kwDOAMm_X85xZZPx 8209 Protect `main` from mistaken pushes max-sixty 5635139 closed 0     5 2023-09-19T08:36:45Z 2023-09-19T22:00:42Z 2023-09-19T22:00:41Z MEMBER      

What is your issue?

Hi team — apologies but I mistakenly pushed to main. Less than a minute later I pushed another commit reverting the commit.

I'll check my git shortcuts tomorrow to ensure this can't happen by default — I thought I had the push remote set correctly, but possibly something doesn't use that.

Would we consider adding protection from pushes to main? (To be clear, I'm not blaming that setting...)

(Will close this issue)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8209/reactions",
    "total_count": 2,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1326238990 I_kwDOAMm_X85PDM0O 6870 `rolling_exp` loses coords max-sixty 5635139 closed 0     4 2022-08-02T18:27:44Z 2023-09-19T01:13:23Z 2023-09-19T01:13:23Z MEMBER      

What happened?

We lose the time coord here — Dimensions without coordinates: time:

```python
ds = xr.tutorial.load_dataset("air_temperature")
ds.rolling_exp(time=5).mean()

<xarray.Dataset>
Dimensions:  (lat: 25, time: 2920, lon: 53)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
Dimensions without coordinates: time
Data variables:
    air      (time, lat, lon) float32 241.2 242.5 243.5 ... 296.4 296.1 295.7
```

(I realize I wrote this; I didn't think this used to happen, but either it always did or I didn't write good enough tests... mea culpa. A possible stopgap is below.)
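Until it's fixed, re-attaching the coordinate by hand should work; an untested sketch on my part:

```python
# Re-attach the time coordinate that rolling_exp dropped.
result = ds.rolling_exp(time=5).mean().assign_coords(time=ds["time"])
```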

What did you expect to happen?

We keep the time coords, like we do for normal rolling:

```python
In [2]: ds.rolling(time=5).mean()
Out[2]:
<xarray.Dataset>
Dimensions:  (lat: 25, lon: 53, time: 2920)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
```

Minimal Complete Verifiable Example

```python
(as above)
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.13 (main, May 24 2022, 21:13:51) [Clang 13.1.6 (clang-1316.0.21.2)]
python-bits: 64
OS: Darwin
OS-release: 21.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: en_US.UTF-8
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2022.6.0
pandas: 1.4.3
numpy: 1.21.6
scipy: 1.8.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.12.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.12.0
distributed: 2021.12.0
matplotlib: 3.5.1
cartopy: None
seaborn: None
numbagg: 0.2.1
fsspec: 2021.11.1
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 62.3.2
pip: 22.1.2
conda: None
pytest: 7.1.2
IPython: 8.4.0
sphinx: 4.3.2
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6870/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1885042937 I_kwDOAMm_X85wW3j5 8157 Doc build fails on pandas docstring max-sixty 5635139 closed 0     3 2023-09-07T03:14:25Z 2023-09-15T13:26:26Z 2023-09-15T13:26:26Z MEMBER      

What is your issue?

It looks like the doc build is failing on a pandas docstring:

```
/home/docs/checkouts/readthedocs.org/user_builds/xray/conda/8156/lib/python3.10/site-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/50m_physical/ne_50m_coastline.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)
reading sources... [ 99%] user-guide/reshaping
reading sources... [ 99%] user-guide/terminology
reading sources... [ 99%] user-guide/time-series
reading sources... [ 99%] user-guide/weather-climate
reading sources... [100%] whats-new

/home/docs/checkouts/readthedocs.org/user_builds/xray/conda/8156/lib/python3.10/site-packages/pandas/core/indexes/base.py:docstring of pandas.core.indexes.base.Index.join:14: WARNING: Inline literal start-string without end-string.
/home/docs/checkouts/readthedocs.org/user_builds/xray/conda/8156/lib/python3.10/site-packages/pandas/core/indexes/base.py:docstring of pandas.core.indexes.base.Index.join:15: WARNING: Inline literal start-string without end-string.
looking for now-outdated files... none found
```

(also including the cartopy warning in case that's relevant)

Is this expected? Is anyone familiar enough with the doc build to know whether we can disable warnings from 3rd party modules?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8157/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1890982762 I_kwDOAMm_X85wthtq 8173 HTML repr with many data vars max-sixty 5635139 open 0     1 2023-09-11T17:49:32Z 2023-09-11T20:38:01Z   MEMBER      

What is your issue?

I've been working with Datasets with 1000+ data vars. The HTML repr is extremely slow.

My current solution is to set the option to use the text repr at the top of the notebook (snippet below), and then kick myself & restart when I forget.
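For reference, that option (this is the existing API):

```python
import xarray as xr

# Use the plain-text repr instead of the HTML one for the whole session.
xr.set_options(display_style="text")
```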

Would folks be OK with us falling back to the text repr automatically for, say, >100 data vars?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8173/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1874148181 I_kwDOAMm_X85vtTtV 8123 `.rolling_exp` arguments could be clearer max-sixty 5635139 open 0     6 2023-08-30T18:09:04Z 2023-09-01T00:25:08Z   MEMBER      

Is your feature request related to a problem?

Currently we call .rolling_exp like:

da.rolling_exp(date=20).mean()

20 refers to a "standard" window type: broadly, "the same average distance as a simple rolling window". That works well, and matches the `.rolling(date=20).mean()` format.

But we also have different window types, and this makes it a bit incongruent:

da.rolling_exp(date=0.5, window_type="alpha").mean()

...since the window_type is completely changing the meaning of the value we pass to the dimension argument. A bit like someone asking "how many apples would you like to buy", and replying "5", and then separately saying "when I said 5, I meant 5 tonnes".

Describe the solution you'd like

One option would be:

.rolling_exp(date={"alpha": 0.5})

We pass a dict if we want a non-standard window type — so the value is attached to its type.

We could still have the original form for da.rolling_exp(date=20).mean().
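To make that concrete, a parsing sketch (purely illustrative; `_parse_window` is a made-up helper, and I'm assuming "span" remains the default window type):

```python
def _parse_window(value, default_type: str = "span"):
    # A dict carries its own window type; a bare number keeps today's meaning.
    if isinstance(value, dict):
        ((window_type, window),) = value.items()
        return window_type, window
    return default_type, value


assert _parse_window(20) == ("span", 20)
assert _parse_window({"alpha": 0.5}) == ("alpha", 0.5)
```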

Describe alternatives you've considered

No response

Additional context

(I realize I wrote this originally, all criticism directed at me! This is based on feedback from a colleague, which on reflection I agree with.)

Unless anyone disagrees, I'll try and do this soon-ish™

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8123/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1865834162 I_kwDOAMm_X85vNl6y 8112 Local mypy issues max-sixty 5635139 closed 0     1 2023-08-24T20:25:42Z 2023-08-24T20:43:50Z 2023-08-24T20:43:50Z MEMBER      

What is your issue?

Hi team! I've been out of the flow for a bit. I'm just starting to add a couple of small changes to the library, and I'm having mypy issues I've spent 20 mins trying to figure out without much success.

When I run mypy locally, I get a lot of errors. That's even on the most recent commit that passed in CI, which suggests it's on me rather than xarray... I've downgraded mypy to 1.4.1, the version that passed there.

But I can't figure out what it is — does anyone else have this? We can close if it's just me...

```
xarray/core/utils.py:120: error: Incompatible types in assignment (expression has type "Union[Any, dtype[generic], ExtensionDtype]", variable has type "dtype[object_]") [assignment]
xarray/core/utils.py:170: error: No overload variant matches argument type "T" [call-overload]
xarray/core/utils.py:170: note: Possible overload variants:
xarray/core/utils.py:170: note: def isna(obj: DataFrame) -> DataFrame
xarray/core/utils.py:170: note: def isna(obj: Series[Any]) -> Series[bool]
xarray/core/utils.py:170: note: def isna(obj: Union[Index, list[Any], Union[ExtensionArray, ndarray[Any, Any]]]) -> ndarray[Any, dtype[bool_]]
xarray/core/utils.py:170: note: def isna(obj: Union[Union[str, bytes, date, datetime, timedelta, bool, int, float, complex, Timestamp, Timedelta], NaTType, NAType, None]) -> TypeGuard[Union[NaTType, NAType, None]]
xarray/core/pdcompat.py:104: error: "Timestamp" has no attribute "as_unit" [attr-defined]
xarray/core/indexing.py:1650: error: Argument 1 to "PandasMultiIndexingAdapter" has incompatible type "Index"; expected "MultiIndex" [arg-type]
xarray/core/indexes.py:452: error: Incompatible types in assignment (expression has type "Index", variable has type "NumericIndex") [assignment]
xarray/core/indexes.py:454: error: Incompatible types in assignment (expression has type "Index", variable has type "NumericIndex") [assignment]
xarray/core/indexes.py:456: error: Incompatible types in assignment (expression has type "Index", variable has type "NumericIndex") [assignment]
xarray/core/indexes.py:473: error: No overload variant of "Index" matches argument types "ndarray[Any, dtype[Any]]", "dict[str, str]" [call-overload]
xarray/core/indexes.py:473: note: Possible overload variants:
xarray/core/indexes.py:473: note: def __new__(cls, data: Iterable[Any], dtype: Union[type[complex], type[number[Any]], Literal['float', 'int', 'complex']], copy: bool = ..., name: Any = ..., tupleize_cols: bool = ..., **kwargs: Any) -> NumericIndex
xarray/core/indexes.py:473: note: def __new__(cls, data: Iterable[Any] = ..., dtype: Any = ..., copy: bool = ..., name: Any = ..., tupleize_cols: bool = ..., **kwargs: Any) -> Index
xarray/core/indexes.py:723: error: Invalid index type "Union[int, slice, ndarray[Any, Any], Variable]" for "Index"; expected type "Union[slice, ndarray[Any, dtype[signedinteger[_64Bit]]], Index, Series[bool], ndarray[Any, dtype[bool_]]]" [index]
xarray/core/indexes.py:885: error: "Index" has no attribute "levels"; maybe "nlevels"? [attr-defined]
xarray/core/indexes.py:887: error: "Index" has no attribute "levels"; maybe "nlevels"? [attr-defined]
xarray/core/indexes.py:889: error: "CategoricalIndex" has no attribute "remove_unused_categories" [attr-defined]
xarray/core/indexes.py:889: error: "Index" has no attribute "codes" [attr-defined]
xarray/core/indexes.py:891: error: "Index" has no attribute "codes" [attr-defined]
xarray/core/indexes.py:900: error: "CategoricalIndex" has no attribute "remove_unused_categories" [attr-defined]
xarray/core/indexes.py:916: error: "Index" has no attribute "levels"; maybe "nlevels"? [attr-defined]
xarray/core/indexes.py:927: error: "Index" has no attribute "levels"; maybe "nlevels"? [attr-defined]
xarray/core/indexes.py:1014: error: "Index" has no attribute "levels"; maybe "nlevels"?
[attr-defined] xarray/core/indexes.py:1020: error: Incompatible return value type (got "tuple[dict[Hashable, xarray.core.indexes.Index], pandas.core.indexes.base.Index]", expected "tuple[dict[Hashable, xarray.core.indexes.Index], MultiIndex]") [return-value] xarray/core/indexes.py:1048: error: "Index" has no attribute "codes" [attr-defined] xarray/core/indexes.py:1049: error: "Index" has no attribute "levels"; maybe "nlevels"? [attr-defined] xarray/core/indexes.py:1059: error: Argument 1 to "append" of "list" has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "list[int]" [arg-type] xarray/core/indexes.py:1066: error: Argument 1 to "append" of "list" has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "list[int]" [arg-type] xarray/core/indexes.py:1106: error: "Index" has no attribute "reorder_levels" [attr-defined] xarray/core/indexes.py:1119: error: No overload variant of "__add__" of "tuple" matches argument type "list[str]" [operator] xarray/core/indexes.py:1119: note: Possible overload variants: xarray/core/indexes.py:1119: note: def __add__(self, tuple[Hashable, ...], /) -> tuple[Hashable, ...] xarray/core/indexes.py:1119: note: def [_T] __add__(self, tuple[_T, ...], /) -> tuple[Union[Hashable, _T], ...] xarray/core/indexes.py:1135: error: Argument 1 to "PandasMultiIndexingAdapter" has incompatible type "Index"; expected "MultiIndex" [arg-type] xarray/core/indexes.py:1179: error: "Index" has no attribute "get_loc_level" [attr-defined] xarray/core/indexes.py:1213: error: "Index" has no attribute "get_locs"; maybe "get_loc"? [attr-defined] xarray/core/indexes.py:1218: error: "Index" has no attribute "get_loc_level" [attr-defined] xarray/core/indexes.py:1225: error: "Index" has no attribute "get_loc_level" [attr-defined] xarray/core/indexes.py:1620: error: Incompatible types in assignment (expression has type "PandasMultiIndex", variable has type "Index") [assignment] xarray/core/indexes.py:1622: error: Incompatible types in assignment (expression has type "PandasIndex", variable has type "Index") [assignment] xarray/core/indexes.py:1626: error: "Index" has no attribute "_copy"; maybe "copy"? [attr-defined] xarray/core/indexes.py:1627: error: "Index" has no attribute "create_variables" [attr-defined] xarray/core/indexes.py:1630: error: Incompatible types in assignment (expression has type "pandas.core.indexes.base.Index", variable has type "xarray.core.indexes.Index") [assignment] xarray/core/variable.py:1837: error: No overload variant of "__add__" of "tuple" matches argument type "list[str]" [operator] xarray/core/variable.py:1837: note: Possible overload variants: xarray/core/variable.py:1837: note: def __add__(self, tuple[Hashable, ...], /) -> tuple[Hashable, ...] xarray/core/variable.py:1837: note: def [_T] __add__(self, tuple[_T, ...], /) -> tuple[Union[Hashable, _T], ...] xarray/core/missing.py:284: error: No overload variant of "str" matches argument types "int", "int", "int" [call-overload] xarray/core/missing.py:284: note: Possible overload variants: xarray/core/missing.py:284: note: def str(object: object = ...) -> str xarray/core/missing.py:284: note: def str(object: Buffer, encoding: str = ..., errors: str = ...) 
-> str xarray/core/missing.py:284: error: No overload variant of "bytes" matches argument types "int", "int", "int" [call-overload] xarray/core/missing.py:284: note: def bytes(Union[Iterable[SupportsIndex], SupportsIndex, SupportsBytes, Buffer], /) -> bytes xarray/core/missing.py:284: note: def bytes(str, /, encoding: str, errors: str = ...) -> bytes xarray/core/missing.py:284: note: def bytes() -> bytes xarray/core/missing.py:284: error: No overload variant of "int" matches argument types "int", "int", "int" [call-overload] xarray/core/missing.py:284: note: def int(Union[str, Buffer, SupportsInt, SupportsIndex, SupportsTrunc] = ..., /) -> int xarray/core/missing.py:284: note: def int(Union[str, bytes, bytearray], /, base: SupportsIndex) -> int xarray/core/missing.py:284: error: Too many arguments for "float" [call-arg] xarray/core/missing.py:284: error: No overload variant of "complex" matches argument types "int", "int", "int" [call-overload] xarray/core/missing.py:284: note: def complex(real: Union[complex, SupportsComplex, SupportsFloat, SupportsIndex] = ..., imag: Union[complex, SupportsFloat, SupportsIndex] = ...) -> complex xarray/core/missing.py:284: note: def complex(real: Union[str, SupportsComplex, SupportsFloat, SupportsIndex, complex]) -> complex xarray/core/missing.py:312: error: Incompatible default for argument "max_gap" (default has type "None", argument has type "Union[int, float, str, Timedelta, timedelta64, timedelta]") [assignment] xarray/core/missing.py:312: note: PEP 484 prohibits implicit Optional. Accordingly, mypy has changed its default to no_implicit_optional=True xarray/core/missing.py:312: note: Use https://github.com/hauntsaninja/no_implicit_optional to automatically upgrade your codebase xarray/core/groupby.py:306: error: Incompatible types in assignment (expression has type "Optional[DateOffset]", variable has type "Union[str, DateOffset, timedelta, Timedelta]") [assignment] xarray/core/groupby.py:420: error: Incompatible return value type (got "tuple[Any, list[Union[int, slice, list[int]]], IndexVariable, IndexVariable]", expected "tuple[DataArray, list[Union[int, slice, list[int]]], Union[IndexVariable, _DummyGroup[Any]], Index]") [return-value] xarray/core/groupby.py:439: error: Incompatible return value type (got "tuple[DataArray, list[Union[int, slice, list[int]]], Union[Any, IndexVariable, _DummyGroup[Any]], IndexVariable]", expected "tuple[DataArray, list[Union[int, slice, list[int]]], Union[IndexVariable, _DummyGroup[Any]], Index]") [return-value] xarray/core/groupby.py:508: error: Unexpected keyword argument "closed" for "Grouper" [call-arg] /opt/homebrew/lib/python3.9/site-packages/pandas-stubs/core/groupby/grouper.pyi:24: note: "Grouper" defined here xarray/core/groupby.py:508: error: Unexpected keyword argument "label" for "Grouper" [call-arg] /opt/homebrew/lib/python3.9/site-packages/pandas-stubs/core/groupby/grouper.pyi:24: note: "Grouper" defined here xarray/core/groupby.py:508: error: Unexpected keyword argument "origin" for "Grouper" [call-arg] /opt/homebrew/lib/python3.9/site-packages/pandas-stubs/core/groupby/grouper.pyi:24: note: "Grouper" defined here xarray/core/groupby.py:508: error: Unexpected keyword argument "offset" for "Grouper" [call-arg] /opt/homebrew/lib/python3.9/site-packages/pandas-stubs/core/groupby/grouper.pyi:24: note: "Grouper" defined here xarray/core/groupby.py:508: error: Incompatible types in assignment (expression has type "Grouper", variable has type "CFTimeGrouper") [assignment] xarray/core/groupby.py:530: error: 
Item "Grouper" of "Union[CFTimeGrouper, Grouper]" has no attribute "first_items" [union-attr] xarray/core/groupby.py:533: error: Argument 1 to "groupby" of "Series" has incompatible type "Union[CFTimeGrouper, Grouper]"; expected "Union[tuple[Any, ...], list[Any], Union[ufunc, Callable[..., Any]], list[Union[ufunc, Callable[..., Any]]], Series[Any], list[Series[Any]], ndarray[Any, Any], list[ndarray[Any, Any]], Mapping[Optional[Hashable], Any], list[Mapping[Optional[Hashable], Any]], Index, list[Index], Grouper, list[Grouper]]" [arg-type] xarray/core/dataset.py:503: error: Definition of "__eq__" in base class "DatasetOpsMixin" is incompatible with definition in base class "Mapping" [misc] xarray/core/dataset.py:4057: error: Incompatible types in assignment (expression has type "Variable", target has type "Index") [assignment] xarray/core/dataset.py:4059: error: Incompatible types in assignment (expression has type "Variable", target has type "Index") [assignment] xarray/core/dataset.py:6355: error: Incompatible default for argument "max_gap" (default has type "None", argument has type "Union[int, float, str, Timedelta, timedelta64, timedelta]") [assignment] xarray/core/dataset.py:6355: note: PEP 484 prohibits implicit Optional. Accordingly, mypy has changed its default to no_implicit_optional=True xarray/core/dataset.py:6355: note: Use https://github.com/hauntsaninja/no_implicit_optional to automatically upgrade your codebase xarray/core/dataarray.py:435: error: List item 0 has incompatible type "Union[NumericIndex, IndexVariable]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/core/dataarray.py:2924: error: "MultiIndex" has no attribute "_get_level_number" [attr-defined] xarray/core/dataarray.py:3478: error: Argument "max_gap" to "interp_na" has incompatible type "Union[None, int, float, str, Timedelta, timedelta64, timedelta]"; expected "Union[int, float, str, Timedelta, timedelta64, timedelta]" [arg-type] xarray/core/dataarray.py:3750: error: "object" not callable [operator] xarray/core/coordinates.py:803: error: Incompatible types in assignment (expression has type "MultiIndex", target has type "Variable") [assignment] xarray/core/concat.py:479: error: Incompatible types in assignment (expression has type "Hashable", variable has type "Union[str, T_DataArray, Index]") [assignment] xarray/core/concat.py:499: error: Argument 1 to "pop" of "dict" has incompatible type "Union[str, DataArray, Index]"; expected "Hashable" [arg-type] xarray/core/concat.py:500: error: Argument 1 to "pop" of "dict" has incompatible type "Union[str, DataArray, Index]"; expected "Hashable" [arg-type] xarray/core/concat.py:505: error: Argument 1 to "expand_dims" of "Dataset" has incompatible type "Union[str, DataArray, Index]"; expected "Union[None, Hashable, Sequence[Hashable], Mapping[Any, Any]]" [arg-type] xarray/core/concat.py:615: error: Argument 2 to "concat" of "Index" has incompatible type "Union[str, DataArray, Index]"; expected "Hashable" [arg-type] xarray/core/concat.py:667: error: Invalid index type "Union[str, DataArray, Index]" for "dict[Hashable, Index]"; expected type "Hashable" [index] xarray/coding/times.py:230: error: Incompatible types in assignment (expression has type "Timestamp", variable has type "str") [assignment] xarray/coding/times.py:240: error: No overload variant of "to_timedelta" matches argument types "Any", "str" [call-overload] xarray/coding/times.py:240: note: Possible overload variants: xarray/coding/times.py:240: note: def to_timedelta(arg: Union[str, 
float, timedelta], unit: Optional[Literal['H', 'T', 'S', 'L', 'U', 'N', 'W', 'w', 'D', 'd', 'days', 'day', 'hours', 'hour', 'hr', 'h', 'm', 'minute', 'min', 'minutes', 't', 's', 'seconds', 'sec', 'second', 'ms', 'milliseconds', 'millisecond', 'milli', 'millis', 'l', 'us', 'microseconds', 'microsecond', 'µs', 'micro', 'micros', 'u', 'ns', 'nanoseconds', 'nano', 'nanos', 'nanosecond', 'n']] = ..., errors: Literal['ignore', 'raise', 'coerce'] = ...) -> Timedelta xarray/coding/times.py:240: note: def to_timedelta(arg: Series[Any], unit: Optional[Literal['H', 'T', 'S', 'L', 'U', 'N', 'W', 'w', 'D', 'd', 'days', 'day', 'hours', 'hour', 'hr', 'h', 'm', 'minute', 'min', 'minutes', 't', 's', 'seconds', 'sec', 'second', 'ms', 'milliseconds', 'millisecond', 'milli', 'millis', 'l', 'us', 'microseconds', 'microsecond', 'µs', 'micro', 'micros', 'u', 'ns', 'nanoseconds', 'nano', 'nanos', 'nanosecond', 'n']] = ..., errors: Literal['ignore', 'raise', 'coerce'] = ...) -> TimedeltaSeries xarray/coding/times.py:240: note: def to_timedelta(arg: Union[Sequence[Union[float, timedelta]], list[Union[str, float, timedelta]], tuple[Union[str, float, timedelta], ...], range, Union[ExtensionArray, ndarray[Any, Any]], Index], unit: Optional[Literal['H', 'T', 'S', 'L', 'U', 'N', 'W', 'w', 'D', 'd', 'days', 'day', 'hours', 'hour', 'hr', 'h', 'm', 'minute', 'min', 'minutes', 't', 's', 'seconds', 'sec', 'second', 'ms', 'milliseconds', 'millisecond', 'milli', 'millis', 'l', 'us', 'microseconds', 'microsecond', 'µs', 'micro', 'micros', 'u', 'ns', 'nanoseconds', 'nano', 'nanos', 'nanosecond', 'n']] = ..., errors: Literal['ignore', 'raise', 'coerce'] = ...) -> TimedeltaIndex xarray/coding/times.py:241: error: No overload variant of "to_timedelta" matches argument types "Any", "str" [call-overload] xarray/coding/times.py:241: note: Possible overload variants: xarray/coding/times.py:241: note: def to_timedelta(arg: Union[str, float, timedelta], unit: Optional[Literal['H', 'T', 'S', 'L', 'U', 'N', 'W', 'w', 'D', 'd', 'days', 'day', 'hours', 'hour', 'hr', 'h', 'm', 'minute', 'min', 'minutes', 't', 's', 'seconds', 'sec', 'second', 'ms', 'milliseconds', 'millisecond', 'milli', 'millis', 'l', 'us', 'microseconds', 'microsecond', 'µs', 'micro', 'micros', 'u', 'ns', 'nanoseconds', 'nano', 'nanos', 'nanosecond', 'n']] = ..., errors: Literal['ignore', 'raise', 'coerce'] = ...) -> Timedelta xarray/coding/times.py:241: note: def to_timedelta(arg: Series[Any], unit: Optional[Literal['H', 'T', 'S', 'L', 'U', 'N', 'W', 'w', 'D', 'd', 'days', 'day', 'hours', 'hour', 'hr', 'h', 'm', 'minute', 'min', 'minutes', 't', 's', 'seconds', 'sec', 'second', 'ms', 'milliseconds', 'millisecond', 'milli', 'millis', 'l', 'us', 'microseconds', 'microsecond', 'µs', 'micro', 'micros', 'u', 'ns', 'nanoseconds', 'nano', 'nanos', 'nanosecond', 'n']] = ..., errors: Literal['ignore', 'raise', 'coerce'] = ...) 
-> TimedeltaSeries xarray/coding/times.py:241: note: def to_timedelta(arg: Union[Sequence[Union[float, timedelta]], list[Union[str, float, timedelta]], tuple[Union[str, float, timedelta], ...], range, Union[ExtensionArray, ndarray[Any, Any]], Index], unit: Optional[Literal['H', 'T', 'S', 'L', 'U', 'N', 'W', 'w', 'D', 'd', 'days', 'day', 'hours', 'hour', 'hr', 'h', 'm', 'minute', 'min', 'minutes', 't', 's', 'seconds', 'sec', 'second', 'ms', 'milliseconds', 'millisecond', 'milli', 'millis', 'l', 'us', 'microseconds', 'microsecond', 'µs', 'micro', 'micros', 'u', 'ns', 'nanoseconds', 'nano', 'nanos', 'nanosecond', 'n']] = ..., errors: Literal['ignore', 'raise', 'coerce'] = ...) -> TimedeltaIndex xarray/coding/times.py:262: error: No overload variant of "__add__" of "TimedeltaIndex" matches argument type "str" [operator] xarray/coding/times.py:262: note: Possible overload variants: xarray/coding/times.py:262: note: def __add__(self, Period, /) -> PeriodIndex xarray/coding/times.py:262: note: def __add__(self, DatetimeIndex, /) -> DatetimeIndex xarray/coding/times.py:262: note: def __add__(self, Union[Timedelta, TimedeltaIndex], /) -> TimedeltaIndex xarray/coding/times.py:459: error: Incompatible types in assignment (expression has type "str", variable has type "Timestamp") [assignment] xarray/coding/cftimeindex.py:514: error: Signature of "shift" incompatible with supertype "Index" [override] xarray/coding/cftimeindex.py:514: note: Superclass: xarray/coding/cftimeindex.py:514: note: def shift(self, periods: int = ..., freq: Any = ...) -> None xarray/coding/cftimeindex.py:514: note: Subclass: xarray/coding/cftimeindex.py:514: note: def shift(self, n: Union[int, float], freq: Union[str, timedelta]) -> Any xarray/coding/cftime_offsets.py:1219: error: Argument "inclusive" to "date_range" has incompatible type "Optional[Literal['both', 'neither', 'left', 'right']]"; expected "Literal['left', 'right', 'both', 'neither']" [arg-type] xarray/core/resample_cftime.py:155: error: Argument 1 to "to_timedelta" has incompatible type "Union[str, timedelta, BaseCFTimeOffset]"; expected "Union[str, float, timedelta]" [arg-type] xarray/core/resample_cftime.py:238: error: Argument 1 to "_adjust_bin_edges" has incompatible type "CFTimeIndex"; expected "ndarray[Any, Any]" [arg-type] xarray/core/resample_cftime.py:238: error: Argument 5 to "_adjust_bin_edges" has incompatible type "CFTimeIndex"; expected "ndarray[Any, Any]" [arg-type] xarray/tests/test_rolling.py:141: error: List item 0 has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_rolling.py:165: error: Argument 1 to "assert_allclose" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[Union[_SupportsArray[dtype[Union[bool_, number[Any]]]], _NestedSequence[_SupportsArray[dtype[Union[bool_, number[Any]]]]], bool, int, float, complex, _NestedSequence[Union[bool, int, float, complex]]], Union[_SupportsArray[dtype[object_]], _NestedSequence[_SupportsArray[dtype[object_]]]]]" [arg-type] xarray/tests/test_rolling.py:167: error: Argument 1 to "assert_allclose" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[Union[_SupportsArray[dtype[Union[bool_, number[Any]]]], _NestedSequence[_SupportsArray[dtype[Union[bool_, number[Any]]]]], bool, int, float, complex, _NestedSequence[Union[bool, int, float, complex]]], Union[_SupportsArray[dtype[object_]], _NestedSequence[_SupportsArray[dtype[object_]]]]]" [arg-type] 
xarray/tests/test_rolling.py:180: error: Argument 1 to "assert_allclose" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[Union[_SupportsArray[dtype[Union[bool_, number[Any]]]], _NestedSequence[_SupportsArray[dtype[Union[bool_, number[Any]]]]], bool, int, float, complex, _NestedSequence[Union[bool, int, float, complex]]], Union[_SupportsArray[dtype[object_]], _NestedSequence[_SupportsArray[dtype[object_]]]]]" [arg-type] xarray/tests/test_rolling.py:185: error: Argument 1 to "assert_allclose" has incompatible type "Optional[ndarray[Any, Any]]"; expected "Union[Union[_SupportsArray[dtype[Union[bool_, number[Any]]]], _NestedSequence[_SupportsArray[dtype[Union[bool_, number[Any]]]]], bool, int, float, complex, _NestedSequence[Union[bool, int, float, complex]]], Union[_SupportsArray[dtype[object_]], _NestedSequence[_SupportsArray[dtype[object_]]]]]" [arg-type] xarray/tests/test_rolling.py:597: error: Argument 1 to "assert_allclose" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[Union[_SupportsArray[dtype[Union[bool_, number[Any]]]], _NestedSequence[_SupportsArray[dtype[Union[bool_, number[Any]]]]], bool, int, float, complex, _NestedSequence[Union[bool, int, float, complex]]], Union[_SupportsArray[dtype[object_]], _NestedSequence[_SupportsArray[dtype[object_]]]]]" [arg-type] xarray/tests/test_rolling.py:616: error: Argument 1 to "assert_allclose" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[Union[_SupportsArray[dtype[Union[bool_, number[Any]]]], _NestedSequence[_SupportsArray[dtype[Union[bool_, number[Any]]]]], bool, int, float, complex, _NestedSequence[Union[bool, int, float, complex]]], Union[_SupportsArray[dtype[object_]], _NestedSequence[_SupportsArray[dtype[object_]]]]]" [arg-type] xarray/tests/test_rolling.py:622: error: Argument 1 to "assert_allclose" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[Union[_SupportsArray[dtype[Union[bool_, number[Any]]]], _NestedSequence[_SupportsArray[dtype[Union[bool_, number[Any]]]]], bool, int, float, complex, _NestedSequence[Union[bool, int, float, complex]]], Union[_SupportsArray[dtype[object_]], _NestedSequence[_SupportsArray[dtype[object_]]]]]" [arg-type] xarray/tests/test_plot.py:229: error: List item 0 has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_plot.py:515: error: No overload variant of "__sub__" of "DatetimeIndex" matches argument type "timedelta64" [operator] xarray/tests/test_plot.py:515: note: Possible overload variants: xarray/tests/test_plot.py:515: note: def __sub__(self, TimedeltaSeries, /) -> TimestampSeries xarray/tests/test_plot.py:515: note: def __sub__(self, Union[Timedelta, TimedeltaIndex], /) -> DatetimeIndex xarray/tests/test_plot.py:515: note: def __sub__(self, Union[Timestamp, DatetimeIndex], /) -> TimedeltaIndex xarray/tests/test_plot.py:1003: error: "object" not callable [operator] xarray/tests/test_groupby.py:735: error: Argument 1 to "groupby" of "Dataset" has incompatible type "Index"; expected "Union[Hashable, DataArray, IndexVariable]" [arg-type] xarray/tests/test_groupby.py:735: note: Following member(s) of "Index" have conflicts: xarray/tests/test_groupby.py:735: note: __hash__: expected "Callable[[], int]", got "None" xarray/tests/test_groupby.py:1475: error: No overload variant of "cut" matches argument types "DataArray", "list[float]", "dict[Any, Any]" [call-overload] 
xarray/tests/test_groupby.py:1475: note: Possible overload variants: xarray/tests/test_groupby.py:1475: note: def cut(x: Union[Index, ndarray[Any, dtype[Any]], Sequence[int], Sequence[float]], bins: Union[int, Series[Any], Int64Index, Float64Index, Sequence[int], Sequence[float]], right: bool = ..., *, labels: Literal[False], retbins: Literal[True], precision: int = ..., include_lowest: bool = ..., duplicates: Literal['raise', 'drop'] = ..., ordered: bool = ...) -> tuple[ndarray[Any, dtype[signedinteger[Any]]], ndarray[Any, dtype[Any]]] xarray/tests/test_groupby.py:1475: note: def cut(x: Union[Index, ndarray[Any, dtype[Any]], Sequence[int], Sequence[float]], bins: IntervalIndex, right: bool = ..., *, labels: Literal[False], retbins: Literal[True], precision: int = ..., include_lowest: bool = ..., duplicates: Literal['raise', 'drop'] = ..., ordered: bool = ...) -> tuple[ndarray[Any, dtype[signedinteger[Any]]], IntervalIndex] xarray/tests/test_groupby.py:1475: note: def cut(x: Series[Any], bins: Union[int, Series[Any], Int64Index, Float64Index, Sequence[int], Sequence[float]], right: bool = ..., labels: Union[Literal[False], Sequence[Optional[Hashable]], None] = ..., *, retbins: Literal[True], precision: int = ..., include_lowest: bool = ..., duplicates: Literal['raise', 'drop'] = ..., ordered: bool = ...) -> tuple[Series[Any], ndarray[Any, dtype[Any]]] xarray/tests/test_groupby.py:1475: note: def cut(x: Series[Any], bins: IntervalIndex, right: bool = ..., labels: Optional[Sequence[Optional[Hashable]]] = ..., *, retbins: Literal[True], precision: int = ..., include_lowest: bool = ..., duplicates: Literal['raise', 'drop'] = ..., ordered: bool = ...) -> tuple[Series[Any], IntervalIndex] xarray/tests/test_groupby.py:1475: note: def cut(x: Union[Index, ndarray[Any, dtype[Any]], Sequence[int], Sequence[float]], bins: Union[int, Series[Any], Int64Index, Float64Index, Sequence[int], Sequence[float]], right: bool = ..., labels: Optional[Sequence[Optional[Hashable]]] = ..., *, retbins: Literal[True], precision: int = ..., include_lowest: bool = ..., duplicates: Literal['raise', 'drop'] = ..., ordered: bool = ...) -> tuple[Categorical, ndarray[Any, dtype[Any]]] xarray/tests/test_groupby.py:1475: note: def cut(x: Union[Index, ndarray[Any, dtype[Any]], Sequence[int], Sequence[float]], bins: IntervalIndex, right: bool = ..., labels: Optional[Sequence[Optional[Hashable]]] = ..., *, retbins: Literal[True], precision: int = ..., include_lowest: bool = ..., duplicates: Literal['raise', 'drop'] = ..., ordered: bool = ...) -> tuple[Categorical, IntervalIndex] xarray/tests/test_groupby.py:1475: note: def cut(x: Union[Index, ndarray[Any, dtype[Any]], Sequence[int], Sequence[float]], bins: Union[int, Series[Any], Int64Index, Float64Index, Sequence[int], Sequence[float], IntervalIndex], right: bool = ..., *, labels: Literal[False], retbins: Literal[False] = ..., precision: int = ..., include_lowest: bool = ..., duplicates: Literal['raise', 'drop'] = ..., ordered: bool = ...) -> ndarray[Any, dtype[signedinteger[Any]]] xarray/tests/test_groupby.py:1475: note: def cut(x: Series[Any], bins: Union[int, Series[Any], Int64Index, Float64Index, Sequence[int], Sequence[float], IntervalIndex], right: bool = ..., labels: Union[Literal[False], Sequence[Optional[Hashable]], None] = ..., retbins: Literal[False] = ..., precision: int = ..., include_lowest: bool = ..., duplicates: Literal['raise', 'drop'] = ..., ordered: bool = ...) 
-> Series[Any] xarray/tests/test_groupby.py:1475: note: def cut(x: Union[Index, ndarray[Any, dtype[Any]], Sequence[int], Sequence[float]], bins: Union[int, Series[Any], Int64Index, Float64Index, Sequence[int], Sequence[float], IntervalIndex], right: bool = ..., labels: Optional[Sequence[Optional[Hashable]]] = ..., retbins: Literal[False] = ..., precision: int = ..., include_lowest: bool = ..., duplicates: Literal['raise', 'drop'] = ..., ordered: bool = ...) -> Categorical xarray/tests/test_groupby.py:1526: error: List item 0 has incompatible type "ndarray[Any, dtype[floating[_64Bit]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_groupby.py:2035: error: Argument "origin" to "resample" of "Series" has incompatible type "str"; expected "Union[Timestamp, Literal['epoch', 'start', 'start_day', 'end', 'end_day']]" [arg-type] xarray/tests/test_formatting.py:110: error: Argument 1 to "to_timedelta" has incompatible type "list[str]"; expected "Union[Sequence[Union[float, timedelta]], list[Union[str, float, timedelta]], tuple[Union[str, float, timedelta], ...], range, Union[ExtensionArray, ndarray[Any, Any]], Index]" [arg-type] xarray/tests/test_formatting.py:110: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance xarray/tests/test_formatting.py:110: note: Consider using "Sequence" instead, which is covariant xarray/tests/test_formatting.py:112: error: Argument 1 to "to_timedelta" has incompatible type "list[str]"; expected "Union[Sequence[Union[float, timedelta]], list[Union[str, float, timedelta]], tuple[Union[str, float, timedelta], ...], range, Union[ExtensionArray, ndarray[Any, Any]], Index]" [arg-type] xarray/tests/test_formatting.py:112: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance xarray/tests/test_formatting.py:112: note: Consider using "Sequence" instead, which is covariant xarray/tests/test_coding_times.py:623: error: Argument 1 to "to_timedelta" has incompatible type "list[str]"; expected "Union[Sequence[Union[float, timedelta]], list[Union[str, float, timedelta]], tuple[Union[str, float, timedelta], ...], range, Union[ExtensionArray, ndarray[Any, Any]], Index]" [arg-type] xarray/tests/test_coding_times.py:623: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance xarray/tests/test_coding_times.py:623: note: Consider using "Sequence" instead, which is covariant xarray/tests/test_coding_times.py:624: error: Argument 1 to "to_timedelta" has incompatible type "list[str]"; expected "Union[Sequence[Union[float, timedelta]], list[Union[str, float, timedelta]], tuple[Union[str, float, timedelta], ...], range, Union[ExtensionArray, ndarray[Any, Any]], Index]" [arg-type] xarray/tests/test_coding_times.py:624: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance xarray/tests/test_coding_times.py:624: note: Consider using "Sequence" instead, which is covariant xarray/tests/test_coding_times.py:625: error: Argument 1 to "to_timedelta" has incompatible type "list[object]"; expected "Union[Sequence[Union[float, timedelta]], list[Union[str, float, timedelta]], tuple[Union[str, float, timedelta], ...], range, Union[ExtensionArray, ndarray[Any, Any]], Index]" [arg-type] xarray/tests/test_coding_times.py:626: error: Argument 1 to "to_timedelta" has incompatible type "list[str]"; expected "Union[Sequence[Union[float, timedelta]], list[Union[str, float, timedelta]], 
tuple[Union[str, float, timedelta], ...], range, Union[ExtensionArray, ndarray[Any, Any]], Index]" [arg-type] xarray/tests/test_coding_times.py:626: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance xarray/tests/test_coding_times.py:626: note: Consider using "Sequence" instead, which is covariant xarray/tests/test_indexes.py:413: error: "Index" has no attribute "codes" [attr-defined] xarray/tests/test_indexes.py:433: error: "Index" has no attribute "codes" [attr-defined] xarray/tests/test_indexes.py:435: error: "Index" has no attribute "levels"; maybe "nlevels"? [attr-defined] xarray/tests/test_indexes.py:436: error: "Index" has no attribute "levels"; maybe "nlevels"? [attr-defined] xarray/tests/test_indexes.py:588: error: Incompatible types in assignment (expression has type "type[NumericIndex]", variable has type "type[xarray.core.indexes.Index]") [assignment] xarray/tests/test_accessor_dt.py:114: error: Incompatible types in assignment (expression has type "DataArray", variable has type "Index") [assignment] xarray/tests/test_accessor_dt.py:239: error: Incompatible types in assignment (expression has type "DataArray", variable has type "DatetimeIndex") [assignment] xarray/tests/test_accessor_dt.py:258: error: "DatetimeIndex" has no attribute "dt" [attr-defined] xarray/tests/test_variable.py:2945: error: "PandasObject" has no attribute "astype" [attr-defined] xarray/tests/test_dataset.py:587: error: No overload variant of "__getitem__" of "Series" matches argument type "Hashable" [call-overload] xarray/tests/test_dataset.py:587: note: Possible overload variants: xarray/tests/test_dataset.py:587: note: def __getitem__(self, Union[list[str], Index, Series[Any], slice, Series[bool], ndarray[Any, dtype[bool_]], list[bool], tuple[Union[Any, slice], ...]], /) -> Series[Any] xarray/tests/test_dataset.py:587: note: def __getitem__(self, Union[int, str], /) -> Any xarray/tests/test_dataset.py:587: error: No overload variant of "__getitem__" of "DataFrame" matches argument type "Hashable" [call-overload] xarray/tests/test_dataset.py:587: note: def __getitem__(self, Union[str, bytes, date, datetime, timedelta, bool, int, float, complex, Timestamp, Timedelta], /) -> Series[Any] xarray/tests/test_dataset.py:587: note: def __getitem__(self, slice, /) -> DataFrame xarray/tests/test_dataset.py:587: note: def [ScalarT] __getitem__(self, Union[tuple[Any, ...], Series[bool], DataFrame, list[str], list[ScalarT], Index, ndarray[Any, dtype[str_]], ndarray[Any, dtype[bool_]], Sequence[tuple[Union[str, bytes, date, datetime, timedelta, bool, int, float, complex, Timestamp, Timedelta], ...]]], /) -> DataFrame xarray/tests/test_dataset.py:1648: error: Argument "tolerance" to "sel" of "Dataset" has incompatible type "str"; expected "Union[int, float, Iterable[Union[int, float]], None]" [arg-type] xarray/tests/test_dataset.py:1648: note: Following member(s) of "str" have conflicts: xarray/tests/test_dataset.py:1648: note: Expected: xarray/tests/test_dataset.py:1648: note: def __iter__(self) -> Iterator[Union[int, float]] xarray/tests/test_dataset.py:1648: note: Got: xarray/tests/test_dataset.py:1648: note: def __iter__(self) -> Iterator[str] xarray/tests/test_dataset.py:1997: error: Incompatible types in assignment (expression has type "DataFrame", variable has type "Series[Any]") [assignment] xarray/tests/test_dataset.py:1998: error: Argument 1 to "equals" of "NDFrame" has incompatible type "Union[Series[Any], DataFrame]"; expected "Series[Any]" [arg-type] 
xarray/tests/test_dataset.py:3476: error: "Index" has no attribute "reorder_levels" [attr-defined] xarray/tests/test_dataset.py:3935: error: "Index" has no attribute "month" [attr-defined] xarray/tests/test_dataset.py:4502: error: Incompatible types in assignment (expression has type "Dataset", variable has type "DataFrame") [assignment] xarray/tests/test_dataset.py:4503: error: Incompatible types in assignment (expression has type "Dataset", variable has type "DataFrame") [assignment] xarray/tests/test_dataset.py:4507: error: Incompatible types in assignment (expression has type "Dataset", variable has type "DataFrame") [assignment] xarray/tests/test_dataset.py:4508: error: Incompatible types in assignment (expression has type "Dataset", variable has type "DataFrame") [assignment] xarray/tests/test_dataset.py:4513: error: Incompatible types in assignment (expression has type "Dataset", variable has type "DataFrame") [assignment] xarray/tests/test_dataset.py:4514: error: Incompatible types in assignment (expression has type "Dataset", variable has type "DataFrame") [assignment] xarray/tests/test_dataset.py:4628: error: Incompatible types in assignment (expression has type "list[str]", variable has type "Index") [assignment] xarray/tests/test_dataset.py:4650: error: Argument 1 to "apply" of "DataFrame" has incompatible type overloaded function; expected "Callable[..., Series[Any]]" [arg-type] xarray/tests/test_dataset.py:5090: error: List item 0 has incompatible type "ndarray[Any, dtype[Any]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataset.py:5093: error: List item 0 has incompatible type "ndarray[Any, dtype[Any]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:330: error: Argument 2 to "DataArray" has incompatible type "list[object]"; expected "Union[Sequence[Union[Sequence[Any], Index, DataArray]], Mapping[Any, Any], None]" [arg-type] xarray/tests/test_dataarray.py:1049: error: No overload variant of "__sub__" of "DatetimeIndex" matches argument type "str" [operator] xarray/tests/test_dataarray.py:1049: note: Possible overload variants: xarray/tests/test_dataarray.py:1049: note: def __sub__(self, TimedeltaSeries, /) -> TimestampSeries xarray/tests/test_dataarray.py:1049: note: def __sub__(self, Union[Timedelta, TimedeltaIndex], /) -> DatetimeIndex xarray/tests/test_dataarray.py:1049: note: def __sub__(self, Union[Timestamp, DatetimeIndex], /) -> TimedeltaIndex xarray/tests/test_dataarray.py:1049: error: No overload variant of "__sub__" of "DatetimeIndex" matches argument type "bytes" [operator] xarray/tests/test_dataarray.py:1049: error: No overload variant of "__sub__" of "DatetimeIndex" matches argument type "date" [operator] xarray/tests/test_dataarray.py:1049: error: No overload variant of "__sub__" of "DatetimeIndex" matches argument type "datetime" [operator] xarray/tests/test_dataarray.py:1049: error: No overload variant of "__sub__" of "DatetimeIndex" matches argument type "timedelta" [operator] xarray/tests/test_dataarray.py:1049: error: No overload variant of "__sub__" of "DatetimeIndex" matches argument type "bool" [operator] xarray/tests/test_dataarray.py:1049: error: No overload variant of "__sub__" of "DatetimeIndex" matches argument type "int" [operator] xarray/tests/test_dataarray.py:1049: error: No overload variant of "__sub__" of "DatetimeIndex" matches argument type "float" [operator] xarray/tests/test_dataarray.py:1049: error: No overload variant of "__sub__" of "DatetimeIndex" 
matches argument type "complex" [operator] xarray/tests/test_dataarray.py:1049: note: Right operand is of type "Union[str, bytes, date, datetime, timedelta, bool, int, float, complex, Timestamp, Timedelta]" xarray/tests/test_dataarray.py:1225: error: Argument "coords" to "DataArray" has incompatible type "tuple[ndarray[Any, dtype[Any]]]"; expected "Union[Sequence[Union[Sequence[Any], Index, DataArray]], Mapping[Any, Any], None]" [arg-type] xarray/tests/test_dataarray.py:1399: error: Argument 2 to "DataArray" has incompatible type "list[IndexVariable]"; expected "Union[Sequence[Union[Sequence[Any], Index, DataArray]], Mapping[Any, Any], None]" [arg-type] xarray/tests/test_dataarray.py:2237: error: List item 0 has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:2237: error: List item 1 has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:2238: error: List item 0 has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:2238: error: List item 1 has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:2288: error: List item 0 has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:3218: error: Argument 1 to "assert_array_equal" has incompatible type "Union[ndarray[Any, Any], ExtensionArray]"; expected "Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]" [arg-type] xarray/tests/test_dataarray.py:3228: error: Argument 1 to "assert_array_equal" has incompatible type "Union[ndarray[Any, Any], ExtensionArray]"; expected "Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]" [arg-type] xarray/tests/test_dataarray.py:3249: error: Argument 1 to "assert_array_equal" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]" [arg-type] xarray/tests/test_dataarray.py:3249: error: Argument 2 to "assert_array_equal" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]" [arg-type] xarray/tests/test_dataarray.py:3250: error: Argument 1 to "assert_array_equal" has incompatible type "Hashable"; expected "Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]" [arg-type] xarray/tests/test_dataarray.py:3250: error: Argument 2 to "assert_array_equal" has incompatible type "Hashable"; expected "Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, 
int, float, complex, str, bytes]]]" [arg-type] xarray/tests/test_dataarray.py:3254: error: Argument 2 to "assert_array_equal" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]" [arg-type] xarray/tests/test_dataarray.py:3258: error: Incompatible types in assignment (expression has type "DataFrame", variable has type "Series[Any]") [assignment] xarray/tests/test_dataarray.py:3261: error: Incompatible types in assignment (expression has type "DataFrame", variable has type "Series[Any]") [assignment] xarray/tests/test_dataarray.py:3262: error: Argument 1 to "assert_array_equal" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]" [arg-type] xarray/tests/test_dataarray.py:3262: error: Argument 2 to "assert_array_equal" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]" [arg-type] xarray/tests/test_dataarray.py:3285: error: Argument 1 to "assert_array_equal" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]" [arg-type] xarray/tests/test_dataarray.py:3287: error: "Index" has no attribute "levels"; maybe "nlevels"? [attr-defined] xarray/tests/test_dataarray.py:3288: error: "Index" has no attribute "levels"; maybe "nlevels"? [attr-defined] xarray/tests/test_dataarray.py:3289: error: "Index" has no attribute "levels"; maybe "nlevels"? 
[attr-defined] xarray/tests/test_dataarray.py:3310: error: Argument 2 to "assert_array_equal" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]" [arg-type] xarray/tests/test_dataarray.py:3318: error: Incompatible types in assignment (expression has type "DataFrame", variable has type "Series[Any]") [assignment] xarray/tests/test_dataarray.py:3323: error: Argument 1 to "assert_array_equal" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]" [arg-type] xarray/tests/test_dataarray.py:3340: error: Argument 2 to "assert_array_equal" has incompatible type "Union[ExtensionArray, ndarray[Any, Any]]"; expected "Union[_SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]]" [arg-type] xarray/tests/test_dataarray.py:3934: error: List item 0 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:3934: error: List item 1 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:3942: error: List item 0 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:3942: error: List item 1 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:3954: error: List item 0 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:3954: error: List item 1 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:3957: error: List item 0 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:3957: error: List item 1 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:3982: error: List item 0 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:3982: error: List item 1 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:3999: error: List item 0 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:4005: error: List item 0 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:4005: error: List item 1 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] 
xarray/tests/test_dataarray.py:4011: error: List item 0 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:4011: error: List item 1 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:4020: error: List item 0 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:4020: error: List item 1 has incompatible type "ndarray[Any, dtype[floating[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:4032: error: List item 0 has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_dataarray.py:4033: error: List item 0 has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "Union[Sequence[Any], Index, DataArray]" [list-item] xarray/tests/test_concat.py:1008: error: No overload variant of "concat" matches argument types "list[DataArray]", "list[int]" [call-overload] xarray/tests/test_concat.py:1008: note: Possible overload variants: xarray/tests/test_concat.py:1008: note: def [T_Dataset <: Dataset, T_DataArray <: DataArray] concat(objs: Iterable[T_Dataset], dim: Union[Hashable, T_DataArray, Index], data_vars: Union[Literal['all', 'minimal', 'different'], Iterable[Hashable]] = ..., coords: Union[Literal['all', 'minimal', 'different'], list[Hashable]] = ..., compat: Literal['identical', 'equals', 'broadcast_equals', 'no_conflicts', 'override', 'minimal'] = ..., positions: Optional[Iterable[Iterable[int]]] = ..., fill_value: object = ..., join: Literal['outer', 'inner', 'left', 'right', 'exact', 'override'] = ..., combine_attrs: Union[Callable[..., Any], Literal['drop', 'identical', 'no_conflicts', 'drop_conflicts', 'override']] = ...) -> T_Dataset xarray/tests/test_concat.py:1008: note: def [T_DataArray <: DataArray] concat(objs: Iterable[T_DataArray], dim: Union[Hashable, T_DataArray, Index], data_vars: Union[Literal['all', 'minimal', 'different'], Iterable[Hashable]] = ..., coords: Union[Literal['all', 'minimal', 'different'], list[Hashable]] = ..., compat: Literal['identical', 'equals', 'broadcast_equals', 'no_conflicts', 'override', 'minimal'] = ..., positions: Optional[Iterable[Iterable[int]]] = ..., fill_value: object = ..., join: Literal['outer', 'inner', 'left', 'right', 'exact', 'override'] = ..., combine_attrs: Union[Callable[..., Any], Literal['drop', 'identical', 'no_conflicts', 'drop_conflicts', 'override']] = ...) 
-> T_DataArray xarray/tests/test_backends.py:570: error: Argument 1 to "to_timedelta" has incompatible type "list[str]"; expected "Union[Sequence[Union[float, timedelta]], list[Union[str, float, timedelta]], tuple[Union[str, float, timedelta], ...], range, Union[ExtensionArray, ndarray[Any, Any]], Index]" [arg-type] xarray/tests/test_backends.py:570: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance xarray/tests/test_backends.py:570: note: Consider using "Sequence" instead, which is covariant xarray/tests/test_backends.py:5244: error: Incompatible types in assignment (expression has type "Union[DataFrame, Series[Any]]", variable has type "DataFrame") [assignment] xarray/tests/test_conventions.py:90: error: Argument 1 to "to_timedelta" has incompatible type "list[str]"; expected "Union[Sequence[Union[float, timedelta]], list[Union[str, float, timedelta]], tuple[Union[str, float, timedelta], ...], range, Union[ExtensionArray, ndarray[Any, Any]], Index]" [arg-type] xarray/tests/test_conventions.py:90: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance xarray/tests/test_conventions.py:90: note: Consider using "Sequence" instead, which is covariant Found 183 errors in 28 files (checked 142 source files) ```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8112/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1452291042 I_kwDOAMm_X85WkDPi 7295 einops integration? max-sixty 5635139 closed 0     5 2022-11-16T21:23:03Z 2023-06-14T18:20:05Z 2023-06-14T18:20:05Z MEMBER      

Is your feature request related to a problem?

I've been following https://github.com/arogozhnikov/einops with interest, and thought it would be worth raising whether we could offer an xarray integration, most recently prompted by the reshape discussion at https://github.com/pydata/xarray/discussions/7217#discussioncomment-4150613. (I thought there might have been a discussion / issue already, but couldn't find one).

Einops offers a string query to do common array operations. There's a good intro at https://github.com/arogozhnikov/einops/blob/master/docs/1-einops-basics.ipynb


Describe the solution you'd like

Because our dimension names are full names, it wouldn't be quite as terse, and it has lots of overlap with our existing operations. But I find string queries are often easy to understand, particularly for those who have less experience with xarray but are familiar with einsum & friends [^1].

Applying einops to xarray

The einops example, (no xarray yet):

python rearrange(ims, 'b h w c -> h (b w) c')

This could likely be something like, noting that we don't need the string before the ->, as our names are already defined:

python rearrange(ims, '-> height (batch width) color')

...or, if we wanted to name the new dimension rather than take an implicit dimension name like batch_width, at the cost of new einops syntax:

python rearrange(ims, '-> height width=(batch width) color')

...or if we wanted a method on a DataArray:

python ims.einops.rearrange('height width=(batch width) color')

Sometimes we would want the lvalue; e.g. to unstack:

python rearrange(ims, 'batch (height height2) width color -> batch height height2 width color', height2=2)

...and the same principles apply for reductions such as mean, for example

python ims.einops.reduce('-> height width color', 'mean')

...would be equivalent to:

```python ims.mean('batch')

or

ims.groupby(['height', 'width', 'color']).mean(...) ```

[^1]: I frequently find myself using .query, or pd.eval, also string queries, and these are great with tools that take user input, since folks can just pass a string
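To make the accessor idea concrete, here's a minimal sketch of the reduce case bridging to einops as it exists today; the accessor registration is real xarray API, but the output-only pattern handling is an assumption, and grouped dims like `(batch width)` are ignored for simplicity:

```python
import einops
import xarray as xr


@xr.register_dataarray_accessor("einops")
class EinopsAccessor:
    def __init__(self, da: xr.DataArray):
        self._da = da

    def reduce(self, pattern: str, reduction: str) -> xr.DataArray:
        # The user writes only the output side, e.g. "-> height width color";
        # the input side is filled in from the DataArray's own dims.
        out_dims = pattern.split("->")[-1].split()
        full_pattern = f"{' '.join(map(str, self._da.dims))} -> {' '.join(out_dims)}"
        data = einops.reduce(self._da.data, full_pattern, reduction)
        # NB: coords are dropped in this sketch
        return xr.DataArray(data, dims=out_dims)
```

With that, `ims.einops.reduce('-> height width color', 'mean')` would line up with the `ims.mean('batch')` equivalence above.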

Describe alternatives you've considered

No response

Additional context

This probably needs a champion to drive, and realistically that's probably not me for quite a while. But if we can get consensus that this would be interesting, and someone is up for doing it, I think this could be a v cool feature.

I'll also tag @arogozhnikov for info

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7295/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1410336255 I_kwDOAMm_X85UEAX_ 7164 Error on xarray warnings in tests? max-sixty 5635139 open 0     5 2022-10-16T01:09:27Z 2022-10-18T09:51:20Z   MEMBER      

What is your issue?

We've done a superb job of cutting the number of warnings in https://github.com/pydata/xarray/issues/3266.

On another project I've been spending time with recently, we raise an error on any warnings in the test suite. It's easy mode — the dependencies are locked (it's not python...), but I wonder whether we can go some of the way towards this:

Would it be worth failing on:

- Warnings from within xarray (a sketch of the mechanics follows below)
  - There's no chance of an external change causing main to fail. When we deprecate something, we'd update calling code with it.
  - This would also ensure doctests & docs don't use old versions. Currently doctests have some warnings.
- Warnings from the min-versions test
  - It prevents us from using outdated APIs
  - min-versions are fixed dependencies, so also no chance of an external change causing main to fail
  - It would fail in a more deliberate way than the upstream tests do now
  - OTOH, possibly it would discourage us from bumping those min versions — the burden falls on the bumper — already a generous PR!
  - ...and it's not perfectly matched — really we want to update from an old API before it changes in the new version, not before it becomes deprecated in an old version
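A minimal sketch of the mechanics for the first bullet, as a hypothetical `conftest.py` fixture (the fixture name and module regex are assumptions, not existing xarray configuration):

```python
# Hypothetical conftest.py sketch: escalate warnings emitted from within
# xarray's own modules to errors, while leaving third-party warnings alone.
import warnings

import pytest


@pytest.fixture(autouse=True)
def error_on_xarray_warnings():
    with warnings.catch_warnings():
        # `module` is a regex matched against the module issuing the warning
        warnings.filterwarnings("error", module=r"xarray(\..*)?")
        yield
```

Equivalently this could live in pytest's `filterwarnings` config; the point is that only warnings originating from xarray's own modules would fail the build.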

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7164/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1203835220 I_kwDOAMm_X85HwRFU 6484 Should we raise a more informative error on no zarr dir? max-sixty 5635139 closed 0     2 2022-04-13T22:05:07Z 2022-09-20T22:38:46Z 2022-09-20T22:38:46Z MEMBER      

What happened?

Currently if someone supplies a path that doesn't exist, we get quite a long stack trace, without really saying that the path doesn't exist.

What did you expect to happen?

Possibly a FileNotFoundError
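For example, a guard along these lines, assuming a local filesystem path (a sketch, not xarray's current code):

```python
import os


def _raise_if_missing(store) -> None:
    # Hypothetical fast-fail check before handing the path to zarr
    if isinstance(store, (str, os.PathLike)) and not os.path.exists(store):
        raise FileNotFoundError(f"No Zarr store found at path {store!r}")
```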

Minimal Complete Verifiable Example

Python xr.open_zarr('x.zarr')

Relevant log output

```Python In [1]: xr.open_zarr('x.zarr') <ipython-input-1-8be4b98d9b20>:1: RuntimeWarning: Failed to open Zarr store with consolidated metadata, falling back to try reading non-consolidated metadata. This is typically much slower for opening a dataset. To silence this warning, consider:

  1. Consolidating metadata in this existing store with zarr.consolidate_metadata().
  2. Explicitly setting consolidated=False, to avoid trying to read consolidate metadata, or
  3. Explicitly setting consolidated=True, to raise an error in this case instead of falling back to try reading non-consolidated metadata. xr.open_zarr('x.zarr')

KeyError Traceback (most recent call last) ~/Library/Caches/pypoetry/virtualenvs/-x204KUJE-py3.9/lib/python3.9/site-packages/xarray/backends/zarr.py in open_group(cls, store, mode, synchronizer, group, consolidated, consolidate_on_close, chunk_store, storage_options, append_dim, write_region, safe_chunks, stacklevel) 347 try: --> 348 zarr_group = zarr.open_consolidated(store, **open_kwargs) 349 except KeyError:

~/Library/Caches/pypoetry/virtualenvs/-x204KUJE-py3.9/lib/python3.9/site-packages/zarr/convenience.py in open_consolidated(store, metadata_key, mode, **kwargs) 1186 # setup metadata store -> 1187 meta_store = ConsolidatedMetadataStore(store, metadata_key=metadata_key) 1188

~/Library/Caches/pypoetry/virtualenvs/-x204KUJE-py3.9/lib/python3.9/site-packages/zarr/storage.py in init(self, store, metadata_key) 2643 # retrieve consolidated metadata -> 2644 meta = json_loads(store[metadata_key]) 2645

~/Library/Caches/pypoetry/virtualenvs/-x204KUJE-py3.9/lib/python3.9/site-packages/zarr/storage.py in getitem(self, key) 894 else: --> 895 raise KeyError(key) 896

KeyError: '.zmetadata'

During handling of the above exception, another exception occurred:

GroupNotFoundError Traceback (most recent call last) <ipython-input-1-8be4b98d9b20> in <cell line: 1>() ----> 1 xr.open_zarr('x.zarr')

~/Library/Caches/pypoetry/virtualenvs/-x204KUJE-py3.9/lib/python3.9/site-packages/xarray/backends/zarr.py in open_zarr(store, group, synchronizer, chunks, decode_cf, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, consolidated, overwrite_encoded_chunks, chunk_store, storage_options, decode_timedelta, use_cftime, **kwargs) 750 } 751 --> 752 ds = open_dataset( 753 filename_or_obj=store, 754 group=group,

~/Library/Caches/pypoetry/virtualenvs/-x204KUJE-py3.9/lib/python3.9/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, args, *kwargs) 493 494 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None) --> 495 backend_ds = backend.open_dataset( 496 filename_or_obj, 497 drop_variables=drop_variables,

~/Library/Caches/pypoetry/virtualenvs/-x204KUJE-py3.9/lib/python3.9/site-packages/xarray/backends/zarr.py in open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, synchronizer, consolidated, chunk_store, storage_options, stacklevel) 798 799 filename_or_obj = _normalize_path(filename_or_obj) --> 800 store = ZarrStore.open_group( 801 filename_or_obj, 802 group=group,

~/Library/Caches/pypoetry/virtualenvs/-x204KUJE-py3.9/lib/python3.9/site-packages/xarray/backends/zarr.py in open_group(cls, store, mode, synchronizer, group, consolidated, consolidate_on_close, chunk_store, storage_options, append_dim, write_region, safe_chunks, stacklevel) 363 stacklevel=stacklevel, 364 ) --> 365 zarr_group = zarr.open_group(store, **open_kwargs) 366 elif consolidated: 367 # TODO: an option to pass the metadata_key keyword

~/Library/Caches/pypoetry/virtualenvs/-x204KUJE-py3.9/lib/python3.9/site-packages/zarr/hierarchy.py in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options) 1180 if contains_array(store, path=path): 1181 raise ContainsArrayError(path) -> 1182 raise GroupNotFoundError(path) 1183 1184 elif mode == 'w':

GroupNotFoundError: group not found at path '' ```

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None python: 3.9.12 (main, Mar 26 2022, 15:44:31) [Clang 13.1.6 (clang-1316.0.21.2)] python-bits: 64 OS: Darwin OS-release: 21.3.0 machine: arm64 processor: arm byteorder: little LC_ALL: en_US.UTF-8 LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None

xarray: 2022.3.0 pandas: 1.4.1 numpy: 1.22.3 scipy: None netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.11.1 cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2021.12.0 distributed: 2021.12.0 matplotlib: None cartopy: None seaborn: None numbagg: None fsspec: 2021.11.1 cupy: None pint: None sparse: None setuptools: 60.9.3 pip: 21.3.1 conda: None pytest: 6.2.5 IPython: 7.32.0 sphinx: None

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6484/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
485446209 MDU6SXNzdWU0ODU0NDYyMDk= 3266 Warnings in the test suite max-sixty 5635139 open 0     8 2019-08-26T20:52:34Z 2022-07-16T14:14:00Z   MEMBER      

If anyone is looking for any bite-size contributions, the test suite is throwing off many warnings. Most of these indicate that something will break in the future without code changes; though mostly the code changes are small (see the example after the warnings log below).

```

=============================== warnings summary =============================== /usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/heapdict.py:11 /usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/heapdict.py:11: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working class heapdict(collections.MutableMapping):

/usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/pydap/model.py:175 /usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/pydap/model.py:175: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working from collections import OrderedDict, Mapping

/usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/pydap/responses/das.py:14 /usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/pydap/responses/das.py:14: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working from collections import Iterable

xarray/tests/test_accessor_dt.py::test_cftime_strftime_access[365_day] /home/vsts/work/1/s/xarray/tests/test_accessor_dt.py:226: RuntimeWarning: Converting a CFTimeIndex with dates from a non-standard calendar, 'noleap', to a pandas.DatetimeIndex, which uses dates from the standard calendar. This may lead to subtle errors in operations that depend on the length of time between dates. xr.coding.cftimeindex.CFTimeIndex(data.time.values).to_datetimeindex(),

xarray/tests/test_accessor_dt.py::test_cftime_strftime_access[360_day] /home/vsts/work/1/s/xarray/tests/test_accessor_dt.py:226: RuntimeWarning: Converting a CFTimeIndex with dates from a non-standard calendar, '360_day', to a pandas.DatetimeIndex, which uses dates from the standard calendar. This may lead to subtle errors in operations that depend on the length of time between dates. xr.coding.cftimeindex.CFTimeIndex(data.time.values).to_datetimeindex(),

xarray/tests/test_accessor_dt.py::test_cftime_strftime_access[julian] /home/vsts/work/1/s/xarray/tests/test_accessor_dt.py:226: RuntimeWarning: Converting a CFTimeIndex with dates from a non-standard calendar, 'julian', to a pandas.DatetimeIndex, which uses dates from the standard calendar. This may lead to subtle errors in operations that depend on the length of time between dates. xr.coding.cftimeindex.CFTimeIndex(data.time.values).to_datetimeindex(),

xarray/tests/test_accessor_dt.py::test_cftime_strftime_access[all_leap] xarray/tests/test_accessor_dt.py::test_cftime_strftime_access[366_day] /home/vsts/work/1/s/xarray/tests/test_accessor_dt.py:226: RuntimeWarning: Converting a CFTimeIndex with dates from a non-standard calendar, 'all_leap', to a pandas.DatetimeIndex, which uses dates from the standard calendar. This may lead to subtle errors in operations that depend on the length of time between dates. xr.coding.cftimeindex.CFTimeIndex(data.time.values).to_datetimeindex(),

xarray/tests/test_accessor_str.py::test_empty_str_methods xarray/tests/test_accessor_str.py::test_empty_str_methods xarray/tests/test_accessor_str.py::test_empty_str_methods xarray/tests/test_accessor_str.py::test_empty_str_methods xarray/tests/test_accessor_str.py::test_empty_str_methods xarray/tests/test_accessor_str.py::test_empty_str_methods xarray/tests/test_accessor_str.py::test_empty_str_methods xarray/tests/test_accessor_str.py::test_empty_str_methods xarray/tests/test_accessor_str.py::test_empty_str_methods /home/vsts/work/1/s/xarray/core/duck_array_ops.py:202: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison flag_array = (arr1 == arr2) | (isnull(arr1) & isnull(arr2))

xarray/tests/test_backends.py::TestZarrDictStore::test_to_zarr_append_compute_false_roundtrip xarray/tests/test_backends.py::TestZarrDictStore::test_to_zarr_append_compute_false_roundtrip xarray/tests/test_backends.py::TestZarrDirectoryStore::test_to_zarr_append_compute_false_roundtrip xarray/tests/test_backends.py::TestZarrDirectoryStore::test_to_zarr_append_compute_false_roundtrip /home/vsts/work/1/s/xarray/conventions.py:184: SerializationWarning: variable None has data in the form of a dask array with dtype=object, which means it is being loaded into memory to determine a data type that can be safely stored on disk. To avoid this, coerce this variable to a fixed-size dtype with astype() before saving it. SerializationWarning,

xarray/tests/test_backends.py::TestScipyInMemoryData::test_zero_dimensional_variable /usr/share/miniconda/envs/xarray-tests/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject return f(args, *kwds)

xarray/tests/test_backends.py::TestPseudoNetCDFFormat::test_ict_format xarray/tests/test_backends.py::TestPseudoNetCDFFormat::test_ict_format_write xarray/tests/test_backends.py::TestPseudoNetCDFFormat::test_ict_format_write /usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/PseudoNetCDF/icarttfiles/ffi1001.py:80: DeprecationWarning: 'U' mode is deprecated f = openf(path, 'rU', encoding = encoding)

xarray/tests/test_backends.py::TestPseudoNetCDFFormat::test_ict_format xarray/tests/test_backends.py::TestPseudoNetCDFFormat::test_ict_format_write /usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/_pytest/python.py:170: RuntimeWarning: deallocating CachingFileManager(<function pncopen at 0x7f252e49a6a8>, '/home/vsts/work/1/s/xarray/tests/data/example.ict', kwargs={'format': 'ffi1001'}), but file is not already closed. This may indicate a bug. result = testfunction(**testargs)

xarray/tests/test_backends.py::TestPseudoNetCDFFormat::test_uamiv_format_read xarray/tests/test_backends.py::TestPseudoNetCDFFormat::test_uamiv_format_mfread xarray/tests/test_backends.py::TestPseudoNetCDFFormat::test_uamiv_format_write xarray/tests/test_backends.py::TestPseudoNetCDFFormat::test_uamiv_format_write /usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/PseudoNetCDF/camxfiles/uamiv/Memmap.py:141: UserWarning: UnboundLocalError("local variable 'dims' referenced before assignment") warn(repr(e))

xarray/tests/test_backends.py::TestPseudoNetCDFFormat::test_uamiv_format_mfread /home/vsts/work/1/s/xarray/tests/test_backends.py:103: FutureWarning: In xarray version 0.13 the default behaviour of open_mfdataset will change. To retain the existing behavior, pass combine='nested'. To use future default behavior, pass combine='by_coords'. See http://xarray.pydata.org/en/stable/combining.html#combining-multi

**kwargs

xarray/tests/test_backends.py::TestPseudoNetCDFFormat::test_uamiv_format_mfread /home/vsts/work/1/s/xarray/backends/api.py:931: FutureWarning: Also open_mfdataset will no longer accept a concat_dim argument. To get equivalent behaviour from now on please use the new combine_nested function instead (or the combine='nested' option to open_mfdataset).The datasets supplied do not have global dimension coordinates. In future, to continue concatenating without supplying dimension coordinates, please use the new combine_nested function (or the combine='nested' option to open_mfdataset. from_openmfds=True,

xarray/tests/test_coding_times.py::test_cf_datetime_nan[num_dates1-days since 2000-01-01-expected_list1] xarray/tests/test_coding_times.py::test_cf_datetime_nan[num_dates2-days since 2000-01-01-expected_list2] /usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/numpy/testing/_private/utils.py:913: FutureWarning: Converting timezone-aware DatetimeArray to timezone-naive ndarray with 'datetime64[ns]' dtype. In the future, this will return an ndarray with 'object' dtype where each element is a 'pandas.Timestamp' with the correct 'tz'. To accept the future behavior, pass 'dtype=object'. To keep the old behavior, pass 'dtype="datetime64[ns]"'. verbose=verbose, header='Arrays are not equal')

xarray/tests/test_dataarray.py::TestDataArray::test_drop_index_labels xarray/tests/test_dataarray.py::TestDataArray::test_drop_index_labels xarray/tests/test_dataarray.py::TestDataArray::test_drop_index_labels /home/vsts/work/1/s/xarray/core/dataarray.py:1842: DeprecationWarning: dropping dimensions using list-like labels is deprecated; use dict-like arguments. ds = self._to_temp_dataset().drop(labels, dim, errors=errors)

xarray/tests/test_dataset.py::TestDataset::test_drop_index_labels /home/vsts/work/1/s/xarray/tests/test_dataset.py:2066: DeprecationWarning: dropping dimensions using list-like labels is deprecated; use dict-like arguments. actual = data.drop(["a"], "x")

xarray/tests/test_dataset.py::TestDataset::test_drop_index_labels /home/vsts/work/1/s/xarray/tests/test_dataset.py:2070: DeprecationWarning: dropping dimensions using list-like labels is deprecated; use dict-like arguments. actual = data.drop(["a", "b"], "x")

xarray/tests/test_dataset.py::TestDataset::test_drop_index_labels /home/vsts/work/1/s/xarray/tests/test_dataset.py:2078: DeprecationWarning: dropping dimensions using list-like labels is deprecated; use dict-like arguments. data.drop(["c"], dim="x")

xarray/tests/test_dataset.py::TestDataset::test_drop_index_labels /home/vsts/work/1/s/xarray/tests/test_dataset.py:2080: DeprecationWarning: dropping dimensions using list-like labels is deprecated; use dict-like arguments. actual = data.drop(["c"], dim="x", errors="ignore")

xarray/tests/test_dataset.py::TestDataset::test_drop_index_labels /home/vsts/work/1/s/xarray/tests/test_dataset.py:2086: DeprecationWarning: dropping dimensions using list-like labels is deprecated; use dict-like arguments. actual = data.drop(["a", "b", "c"], "x", errors="ignore")

xarray/tests/test_dataset.py::TestDataset::test_drop_labels_by_keyword /home/vsts/work/1/s/xarray/tests/test_dataset.py:2135: DeprecationWarning: dropping dimensions using list-like labels is deprecated; use dict-like arguments. data.drop(labels=["a"], dim="x", x="a")

xarray/tests/test_dataset.py::TestDataset::test_convert_dataframe_with_many_types_and_multiindex /home/vsts/work/1/s/xarray/core/dataset.py:3959: FutureWarning: Converting timezone-aware DatetimeArray to timezone-naive ndarray with 'datetime64[ns]' dtype. In the future, this will return an ndarray with 'object' dtype where each element is a 'pandas.Timestamp' with the correct 'tz'. To accept the future behavior, pass 'dtype=object'. To keep the old behavior, pass 'dtype="datetime64[ns]"'. data = np.asarray(series).reshape(shape)

xarray/tests/test_dataset.py::TestDataset::test_convert_dataframe_with_many_types_and_multiindex /usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/pandas/core/apply.py:321: FutureWarning: Converting timezone-aware DatetimeArray to timezone-naive ndarray with 'datetime64[ns]' dtype. In the future, this will return an ndarray with 'object' dtype where each element is a 'pandas.Timestamp' with the correct 'tz'. To accept the future behavior, pass 'dtype=object'. To keep the old behavior, pass 'dtype="datetime64[ns]"'. results[i] = self.f(v)

xarray/tests/test_distributed.py::test_dask_distributed_cfgrib_integration_test /usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/tornado/gen.py:772: RuntimeWarning: deallocating CachingFileManager(<function open at 0x7f2527b49bf8>, '/tmp/tmpt4tmnjh3/temp-2044.tif', mode='r', kwargs={}), but file is not already closed. This may indicate a bug. self.future = convert_yielded(yielded)

xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-False-False-float-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-False-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-False-False-int-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-False-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-False-False-float32-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-False-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-False-False-bool_-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-False-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-False-False-str-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-False-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-True-False-float-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-True-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-True-False-int-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-True-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-True-False-float32-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-True-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-True-False-bool_-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-True-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-True-False-str-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-min-True-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-False-False-float-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-False-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-False-False-int-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-False-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-False-False-float32-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-False-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-False-False-bool_-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-False-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-False-False-str-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-False-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-True-False-float-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-True-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-True-False-int-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-True-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-True-False-float32-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-True-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-True-False-bool_-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-True-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-True-False-str-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-False-max-True-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-True-bool_-1] 
xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-True-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-True-str-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-True-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-False-float-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-False-int-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-False-float32-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-False-bool_-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-False-str-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-False-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-True-bool_-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-True-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-True-str-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-True-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-False-float-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-False-int-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-False-float32-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-False-bool_-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-False-str-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-min-True-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-True-bool_-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-True-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-True-str-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-True-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-False-float-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-False-int-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-False-float32-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-False-bool_-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-False-str-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-False-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-True-bool_-1] 
xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-True-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-True-str-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-True-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-False-float-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-False-int-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-False-float32-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-False-bool_-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-False-str-1] xarray/tests/test_duck_array_ops.py::test_argmin_max[x-True-max-True-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-min-False-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-min-False-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-min-False-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-min-False-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-min-False-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-min-True-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-min-True-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-min-True-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-min-True-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-min-True-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-max-False-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-max-False-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-max-False-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-max-False-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-max-False-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-max-True-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-max-True-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-max-True-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-max-True-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-False-max-True-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-False-True-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-False-True-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-False-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-False-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-False-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-False-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-False-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-True-True-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-True-True-str-2] 
xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-True-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-True-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-True-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-True-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-min-True-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-False-True-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-False-True-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-False-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-False-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-False-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-False-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-False-False-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-True-True-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-True-True-str-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-True-False-float-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-True-False-int-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-True-False-float32-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-True-False-bool_-2] xarray/tests/test_duck_array_ops.py::test_argmin_max[y-True-max-True-False-str-2] /home/vsts/work/1/s/xarray/core/dataarray.py:1842: FutureWarning: dropping coordinates using key values of dict-like labels is deprecated; use drop_vars or a list of coordinates. ds = self._to_temp_dataset().drop(labels, dim, errors=errors)

xarray/tests/test_plot.py::TestPlotStep::test_step /home/vsts/work/1/s/xarray/plot/plot.py:321: MatplotlibDeprecationWarning: Passing the drawstyle with the linestyle as a single string is deprecated since Matplotlib 3.1 and support will be removed in 3.3; please pass the drawstyle separately using the drawstyle keyword argument to Line2D or set_drawstyle() method (or ds/set_ds()). primitive = ax.plot(xplt_val, yplt_val, args, *kwargs)

xarray/tests/test_print_versions.py::test_show_versions /usr/share/miniconda/envs/xarray-tests/lib/python3.7/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses import imp

xarray/tests/test_sparse.py::test_dataarray_method[obj.roll((), {'x': 2})-True] /home/vsts/work/1/s/xarray/core/dataarray.py:2632: FutureWarning: roll_coords will be set to False in the future. Explicitly set roll_coords to silence warning. shifts=shifts, roll_coords=roll_coords, *shifts_kwargs

xarray/tests/test_sparse.py::TestSparseDataArrayAndDataset::test_ufuncs /home/vsts/work/1/s/xarray/tests/test_sparse.py:711: PendingDeprecationWarning: xarray.ufuncs will be deprecated when xarray no longer supports versions of numpy older than v1.17. Instead, use numpy ufuncs directly. assert_equal(np.sin(x), xu.sin(x))

xarray/tests/test_sparse.py::TestSparseDataArrayAndDataset::test_ufuncs /home/vsts/work/1/s/xarray/core/dataarray.py:2393: PendingDeprecationWarning: xarray.ufuncs will be deprecated when xarray no longer supports versions of numpy older than v1.17. Instead, use numpy ufuncs directly. return self.array_wrap(f(self.variable.data, args, *kwargs))

xarray/tests/test_sparse.py::TestSparseDataArrayAndDataset::test_groupby_bins /home/vsts/work/1/s/xarray/core/groupby.py:780: FutureWarning: Default reduction dimension will be changed to the grouped dimension in a future version of xarray. To silence this warning, pass dim=xarray.ALL_DIMS explicitly. **kwargs

-- Docs: https://docs.pytest.org/en/latest/warnings.html ```
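As an illustration of how small the fixes tend to be, here's the change the `drop` deprecation above asks for, sketched with a stand-in dataset (`drop_sel` / `drop_vars` are the current spellings; depending on the installed version the first call may warn or error):

```python
import numpy as np
import xarray as xr

data = xr.Dataset(
    {"var1": (("x",), np.arange(3))}, coords={"x": ["a", "b", "c"]}
)

# Before: emits the DeprecationWarning seen in the log above
dropped = data.drop(["a"], "x")

# After: the current spellings split the two meanings apart
dropped = data.drop_sel(x=["a"])    # drop labels along a dimension
slimmed = data.drop_vars(["var1"])  # drop a variable by name
```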

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3266/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
295959111 MDU6SXNzdWUyOTU5NTkxMTE= 1900 Representing & checking Dataset schemas max-sixty 5635139 open 0     15 2018-02-09T18:06:08Z 2022-07-14T11:28:37Z   MEMBER      

What would be the best way to canonically describe a dataset, which could be read by both humans and machines?

For example, frequently in our code we have docstrings which look something like:

```
def get_returns(security_ids):
    """
    Returns mega-dimensional dataset which gives recent returns for a set of
    securities by:
    - Date
    - Return (raw / economic / smoothed / etc)
    - Scaling (constant / risk_scaled)
    - Span
    - Hedged vs Unhedged

    Dataset keys are security ids. All dimensions have coords.
    """
```

This helps when attempting to understand what code is doing while only reading it. But this isn't consistent between docstrings and can't be read or checked by a machine. Has anyone solved this problem / have any suggestions for resources out there?
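For what it's worth, a minimal sketch of what the machine-checkable half could look like, assuming a plain-dict schema; none of these names are an existing xarray API:

```python
import xarray as xr

# Hypothetical schema for the docstring above: expected dims and dtype
RETURNS_SCHEMA = {
    "dims": {"date", "return_type", "scaling", "span", "hedged"},
    "dtype": "float64",
}


def check_schema(ds: xr.Dataset, schema=RETURNS_SCHEMA) -> None:
    for name, da in ds.data_vars.items():
        assert set(da.dims) == schema["dims"], f"{name}: unexpected dims {da.dims}"
        assert da.dtype == schema["dtype"], f"{name}: unexpected dtype {da.dtype}"
        # "All dimensions have coords"
        missing = [d for d in da.dims if d not in ds.coords]
        assert not missing, f"{name}: dims without coords: {missing}"
```

A real solution would presumably also cover per-variable dtypes, optional dims, and coordinate values, but even this much catches the common drift.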

Tangentially related to https://github.com/python/typing/issues/513 (but our issues are less about the type, dimension sizes, and more about the arrays within a dataset, their dimensions, and their names)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1900/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1221918917 I_kwDOAMm_X85I1QDF 6551 Mypy workflow failing max-sixty 5635139 closed 0     3 2022-04-30T20:53:14Z 2022-05-26T21:50:52Z 2022-05-26T21:40:01Z MEMBER      

What is your issue?

I can't work out what is causing this, and can't repro locally, though I've tried to ensure the same things are installed.

The bisect is:

- Passes: https://github.com/pydata/xarray/runs/6233389985?check_suite_focus=true
- Fails: https://github.com/pydata/xarray/runs/6237267544?check_suite_focus=true

Probably we have to skip it in the meantime, which is a shame

Is there a better way of locking the dependency versions so we can rule that out? I generally don't use conda, and poetry is great at this. Is there a conda equivalent?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6551/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1200356907 I_kwDOAMm_X85Hi_4r 6473 RTD concurrency limit max-sixty 5635139 closed 0     2 2022-04-11T18:14:05Z 2022-04-19T06:29:24Z 2022-04-19T06:29:24Z MEMBER      

What is your issue?

From https://github.com/pydata/xarray/pull/6472, and some PRs this weekend:

Is anyone familiar with what's going on with RTD? Did our concurrency limit drop?

Are there alternatives (e.g. running the tests on GHA even if the actual docs get built on RTD)? If we have to pay RTD for a subscription for a bit until we make changes then we could do that (I'm happy to, given my recently poor contribution track record!)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6473/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1207211171 I_kwDOAMm_X85H9JSj 6499 Added `automerge` max-sixty 5635139 closed 0     2 2022-04-18T16:24:35Z 2022-04-18T18:21:39Z 2022-04-18T16:24:41Z MEMBER      

What is your issue?

@pydata/xarray

Because our pipeline takes a while, it can be helpful to have an option to "merge when tests pass" — I've now set that up. So you can click here and it'll do just that.

Somewhat annoyingly / confusingly, the "required checks" need to be specified manually, in https://github.com/pydata/xarray/settings/branch_protection_rules/2465574 — there's no option for just "all checks".

So if we change the checks — e.g. add Python 3.11 — that list needs to be updated. If we remove a check from the our CI and don't update the list, it won't be possible to merge the PR without clicking the red "Admin Override" box — so we should keep it up to date.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6499/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1205154246 I_kwDOAMm_X85H1THG 6487 Add details section to issue template max-sixty 5635139 closed 0     1 2022-04-15T01:14:39Z 2022-04-15T01:21:50Z 2022-04-15T01:20:36Z MEMBER      

What happened?

I'm testing adding an issue with a <details> section; ref https://github.com/pydata/xarray/pull/6486

What did you expect to happen?

No response

Minimal Complete Verifiable Example

No response

Relevant log output

No response

Anything else we need to know?

No response

Environment

``` INSTALLED VERSIONS ------------------ commit: None python: 3.9.12 (main, Mar 26 2022, 15:44:31) [Clang 13.1.6 (clang-1316.0.21.2)] python-bits: 64 OS: Darwin OS-release: 21.3.0 machine: arm64 processor: arm byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 2022.3.0 pandas: 1.4.1 numpy: 1.22.3 scipy: 1.8.0 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.11.1 cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2021.12.0 distributed: 2021.12.0 matplotlib: None cartopy: None seaborn: None numbagg: None fsspec: 2021.11.1 cupy: None pint: None sparse: None setuptools: 60.9.3 pip: 21.3.1 conda: None pytest: 6.2.5 IPython: 7.32.0 sphinx: None ```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6487/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
929818771 MDU6SXNzdWU5Mjk4MTg3NzE= 5529 Very poor html repr performance on large multi-indexes max-sixty 5635139 closed 0     5 2021-06-25T04:31:27Z 2022-03-29T07:05:32Z 2022-03-29T07:05:32Z MEMBER      

What happened:

We have catastrophic performance on the html repr of some long multi-indexed data arrays. Here's a case of it taking 12s.

Minimal Complete Verifiable Example:

```python
import xarray as xr

ds = xr.tutorial.load_dataset("air_temperature")
da = ds["air"].stack(z=[...])

da.shape
# (3869000,)

%timeit -n 1 -r 1 da._repr_html_()
# 12.4 s !!
```

Anything else we need to know?:

I thought we'd fixed some issues here: https://github.com/pydata/xarray/pull/4846/files
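In the meantime, a possible stop-gap for interactive work is falling back to the text repr via the existing `display_style` option; whether that sidesteps all of the cost here is an assumption:

```python
import xarray as xr

# Use the plain-text repr in notebooks instead of the slow HTML one
xr.set_options(display_style="text")
```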

Environment:

Output of <tt>xr.show_versions()</tt> INSTALLED VERSIONS ------------------ commit: None python: 3.8.10 (default, May 9 2021, 13:21:55) [Clang 12.0.5 (clang-1205.0.22.9)] python-bits: 64 OS: Darwin OS-release: 20.4.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 0.18.2 pandas: 1.2.4 numpy: 1.20.3 scipy: 1.6.3 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.8.3 cftime: 1.4.1 nc_time_axis: None PseudoNetCDF: None rasterio: 1.2.3 cfgrib: None iris: None bottleneck: 1.3.2 dask: 2021.06.1 distributed: 2021.06.1 matplotlib: 3.4.2 cartopy: None seaborn: 0.11.1 numbagg: 0.2.1 pint: None setuptools: 56.0.0 pip: 21.1.2 conda: None pytest: 6.2.4 IPython: 7.24.0 sphinx: 4.0.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5529/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
730017064 MDU6SXNzdWU3MzAwMTcwNjQ= 4542 MultiIndex name (not 'names') is not set max-sixty 5635139 closed 0     1 2020-10-27T01:09:06Z 2022-03-17T17:11:41Z 2022-03-17T17:11:41Z MEMBER      

Minimal Complete Verifiable Example:

```python In [82]: da = xr.DataArray(data, dims=list("ab")).stack(c=[...])

In [83]: data = np.random.RandomState(0).randn(1000, 500)

In [84]: da = xr.DataArray(data, dims=list("ab")).stack(c=[...])

In [85]: da.indexes['c'] Out[85]: MultiIndex([( 0, 0), ( 0, 1), ( 0, 2), ( 0, 3), ( 0, 4), ( 0, 5), ( 0, 6), ( 0, 7), ( 0, 8), ( 0, 9), ... (999, 490), (999, 491), (999, 492), (999, 493), (999, 494), (999, 495), (999, 496), (999, 497), (999, 498), (999, 499)], names=['a', 'b'], length=500000)

In [89]: da.indexes['c'].name ```

What you expected to happen:

In [89]: da.indexes['c'].name Out [89]: 'c'

Environment:

Output of <tt>xr.show_versions()</tt> In [90]: xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.8.6 (default, Oct 8 2020, 14:06:32) [Clang 12.0.0 (clang-1200.0.32.2)] python-bits: 64 OS: Darwin OS-release: 19.6.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: None libnetcdf: None xarray: 0.16.1 pandas: 1.1.3 numpy: 1.19.2 scipy: 1.5.3 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.5.0 cftime: 1.2.1 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.30.0 distributed: None matplotlib: 3.3.2 cartopy: None seaborn: 0.11.0 numbagg: installed pint: None setuptools: 50.3.2 pip: 20.2.3 conda: None pytest: 6.1.1 IPython: 7.18.1 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4542/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1063049830 I_kwDOAMm_X84_XNpm 6027 Try new Issue template format? max-sixty 5635139 closed 0     1 2021-11-25T00:52:12Z 2022-02-23T12:34:48Z 2021-12-29T16:47:47Z MEMBER      

I just went to put an issue in with pre-commit, and it looks good:

Here are the docs: https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/syntax-for-githubs-form-schema

We could replace our "paste this value in this markdown code block" approach with boxes that encapsulate this

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6027/reactions",
    "total_count": 4,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1125030343 I_kwDOAMm_X85DDpnH 6243 Maintenance improvements max-sixty 5635139 open 0     0 2022-02-05T21:01:51Z 2022-02-05T21:01:51Z   MEMBER      

Is your feature request related to a problem?

At the end of the dev call, we discussed ways to do better at maintenance. I'd like to make Xarray a wonderful place to contribute, partly because it was so formative for me in becoming more involved with software engineering.

Describe the solution you'd like

We've already come far, because of the hard work of many of us!

A few ideas, in increasing order of radical-ness:

- We looked at @andersy005's dashboards for PRs & Issues. Could we expose this, both to hold ourselves accountable and signal to potential contributors that we care about turnaround time for their contributions?
- Is there a systematic way of understanding who should review something?
  - FWIW a few months ago I looked for a bot that would recommend a reviewer based on who had contributed code in the past, which I think I've seen before. But I couldn't find one generally available. This would be really helpful — we wouldn't have n people each assessing whether they're the best reviewer for each contribution. If anyone does better than me at finding something like this, that would be awesome.
- Could we add a label so people can say "now I'm waiting for a review", and track how long those stay up?
  - Ensuring the 95th percentile is < 2 days is more important than the median being in the hours. It does pain me when I see PRs get dropped for a few weeks. TBC, I'm as responsible as anyone.
- Could we have a bot that asks for feedback on the review process — i.e. "I received a prompt and helpful review", "I would recommend a friend contribute to Xarray", etc?

Describe alternatives you've considered

No response

Additional context

There's always a danger with making stats legible that Goodhart's law strikes. And sometimes stats are not joyful, and lots of people come here for joy. So probably there's a tradeoff.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6243/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1117934813 I_kwDOAMm_X85ColTd 6206 Remove stable branch? max-sixty 5635139 closed 0     3 2022-01-28T23:28:04Z 2022-01-30T22:19:08Z 2022-01-30T22:19:08Z MEMBER      

Is your feature request related to a problem?

Currently https://github.com/pydata/xarray/blob/main/HOW_TO_RELEASE.md has a few steps around the stable branch

Describe the solution you'd like

In our dev call, we discussed the possibility of using main in place of stable and removing the stable branch

IIRC there's something we can do on RTD to make that replacement. (If anyone knows to hand, comment here; otherwise I can search for it).

Is there anything else we need to do apart from RTD?

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6206/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1045038245 I_kwDOAMm_X84-SgSl 5940 Refresh on issue bots max-sixty 5635139 closed 0     5 2021-11-04T17:50:26Z 2022-01-10T15:09:21Z 2022-01-10T15:09:21Z MEMBER      

Currently we have two bots which comment on issues; I'd propose we remove them both:

• The "Unit Test Results". I don't find this that useful — do others? I thought it would be listing the tests that failed (I had thought it might even be doing duration changes etc), but it seems to just list the number of tests that failed, which isn't that useful. Or are we not using it correctly?
• pep8speaks — I think this would be dominated by pre-commit.ci

I'd propose we add:

• pre-commit.ci. This autofixes PRs! e.g. here's a test PR with a black error from pytest-accept: https://github.com/max-sixty/pytest-accept/pull/24. This could also replace the pre-commit GHA (though it's no great cost to duplicate it)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5940/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
930580130 MDU6SXNzdWU5MzA1ODAxMzA= 5538 Is to_dask_dataframe(set_index=...) correct? max-sixty 5635139 closed 0     1 2021-06-26T00:56:04Z 2021-06-26T04:41:54Z 2021-06-26T04:41:54Z MEMBER      

What happened:

Calling ds.to_dask_dataframe(set_index='lat') raises on attempting to create a MultiIndex.

What you expected to happen:

Shouldn't this create a normal index with just lat?

Minimal Complete Verifiable Example:

```python
In [1]: ds = xr.tutorial.load_dataset('air_temperature')

In [2]: ds.to_dask_dataframe(set_index='lat')
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-2-e13a093182d0> in <module>
----> 1 ds.to_dask_dataframe(set_index='lat')

~/workspace/xarray/xarray/core/dataset.py in to_dask_dataframe(self, dim_order, set_index)
   5534             # triggers an error about multi-indexes, even if only one
   5535             # dimension is passed
-> 5536             df = df.set_index(dim_order)
   5537
   5538         return df

~/.asdf/installs/python/3.8.10/lib/python3.8/site-packages/dask/dataframe/core.py in set_index(failed resolving arguments)
   4177         from .shuffle import set_index
   4178
-> 4179         return set_index(
   4180             self,
   4181             other,

~/.asdf/installs/python/3.8.10/lib/python3.8/site-packages/dask/dataframe/shuffle.py in set_index(df, index, npartitions, shuffle, compute, drop, upsample, divisions, partition_size, **kwargs)
    140             index = index[0]
    141         else:
--> 142             raise NotImplementedError(
    143                 "Dask dataframe does not yet support multi-indexes.\n"
    144                 "You tried to index with this index: %s\n"

NotImplementedError: Dask dataframe does not yet support multi-indexes.
You tried to index with this index: ['lat', 'time', 'lon']
Indexes must be single columns only.
```
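For what it's worth, dask's `set_index` is happy with a single column, so a minimal sketch of the behavior I'd expect, as a user-side workaround (assuming nothing else in `to_dask_dataframe` depends on the kwarg):

```python
# hypothetical workaround: skip the set_index kwarg entirely
df = ds.to_dask_dataframe()  # plain RangeIndex
df = df.set_index("lat")     # a single column works; only a list raises
```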

Anything else we need to know?:

Environment:

Output of <tt>xr.show_versions()</tt> INSTALLED VERSIONS ------------------ commit: 95ba539f660e696fc080f39dd0afc0e29385fabc python: 3.8.10 (default, May 9 2021, 13:21:55) [Clang 12.0.5 (clang-1205.0.22.9)] python-bits: 64 OS: Darwin OS-release: 20.4.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 0.18.2 pandas: 1.2.4 numpy: 1.20.3 scipy: 1.6.3 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.8.3 cftime: 1.4.1 nc_time_axis: None PseudoNetCDF: None rasterio: 1.2.3 cfgrib: None iris: None bottleneck: 1.3.2 dask: 2021.06.1 distributed: 2021.06.1 matplotlib: 3.4.2 cartopy: None seaborn: 0.11.1 numbagg: 0.2.1 pint: None setuptools: 56.0.0 pip: 21.1.2 conda: None pytest: 6.2.4 IPython: 7.24.0 sphinx: 4.0.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5538/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
907715257 MDU6SXNzdWU5MDc3MTUyNTc= 5409 Split up tests? max-sixty 5635139 open 0     4 2021-05-31T21:07:53Z 2021-06-16T15:51:19Z   MEMBER      

Currently a large share of our tests are in test_dataset.py and test_dataarray.py — each of which are around 7k lines.

There's a case for splitting these up:

• Many of the tests are somewhat duplicated between the files (and test_variable.py in some cases) — i.e. we're running the same test over a Dataset & DataArray, but putting them far away from each other in separate files. Should we instead have them split by "function"; e.g. test_rolling.py for all rolling tests?
• My editor takes 5-20 seconds to run the linter and save the file. This is a very narrow complaint.
• Now that we're all onto pytest, there's no need to have them in the same class.

If we do this, we could start on the margin — new tests around some specific functionality — e.g. join / rolling / reindex / stack (just a few from browsing through) — could go into a new respective test_{}.py file. Rather than some big copy and paste commit.
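For concreteness, a hedged sketch of what a function-oriented module (a hypothetical test_rolling.py) could look like, exercising both containers in one place:

```python
import numpy as np
import pytest
import xarray as xr


@pytest.fixture(params=["dataarray", "dataset"])
def obj(request):
    # one fixture covers both container types, so each test runs twice
    da = xr.DataArray(np.arange(10.0), dims="x")
    return da if request.param == "dataarray" else da.to_dataset(name="var")


def test_rolling_mean_preserves_shape(obj):
    result = obj.rolling(x=3).mean()
    assert dict(result.sizes) == dict(obj.sizes)
```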

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5409/reactions",
    "total_count": 5,
    "+1": 5,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
912881551 MDU6SXNzdWU5MTI4ODE1NTE= 5444 🟠 Test failure on master max-sixty 5635139 closed 0     1 2021-06-06T16:31:09Z 2021-06-07T21:05:24Z 2021-06-07T21:05:24Z MEMBER      

What happened:

We have a failure related to a dask release, I think. Here's a job that failed: https://github.com/pydata/xarray/pull/5365/checks?check_run_id=2757459587

It's the test: xarray/tests/test_computation.py::test_vectorize_dask_dtype_meta

```

    References
    ----------
    .. [1] https://docs.scipy.org/doc/numpy/reference/ufuncs.html
    .. [2] https://docs.scipy.org/doc/numpy/reference/c-api/generalized-ufuncs.html
    """
    # Input processing:
    ## Signature
    if not isinstance(signature, str):
        raise TypeError("`signature` has to be of type string")
    input_coredimss, output_coredimss = _parse_gufunc_signature(signature)

    ## Determine nout: nout = None for functions of one direct return; nout = int for return tuples
    nout = None if not isinstance(output_coredimss, list) else len(output_coredimss)

    ## Consolidate onto `meta`
    if meta is not None and output_dtypes is not None:
>       raise ValueError(
            "Only one of `meta` and `output_dtypes` should be given (`meta` is preferred)."
        )

E       ValueError: Only one of `meta` and `output_dtypes` should be given (`meta` is preferred).

```

Should we xfail this? Does anyone have thoughts for a quick fix?
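If nobody has a real fix to hand, a stopgap sketch (assuming the new dask behavior is intentional):

```python
import pytest

# hypothetical marker on the failing test
@pytest.mark.xfail(
    reason="newer dask raises if both `meta` and `output_dtypes` are passed"
)
def test_vectorize_dask_dtype_meta():
    ...
```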

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5444/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
909676433 MDU6SXNzdWU5MDk2NzY0MzM= 5431 Do we need to test python 3.8? max-sixty 5635139 closed 0     1 2021-06-02T16:24:35Z 2021-06-04T19:34:31Z 2021-06-04T19:34:31Z MEMBER      

Is your feature request related to a problem? Please describe.

Currently we test on python 3.7, 3.8, and 3.9; across Linux, Windows, and Mac.

Describe the solution you'd like

Is there any reason to test 3.8 on every commit? Is there any code that would work on 3.7 & 3.9 but not on 3.8?

It's no great cost, but getting the pipeline to be faster has (fairly distributed, often unseen) benefits.

Describe alternatives you've considered

  • I'm not sure whether it's possible to query CI history, and see where failures have shown up? i.e. "which tests, if you never included them, would lead to no change in whether all the tests pass"?
  • We could run these as "confirmation" tests on main only
  • We could maybe also cut out Mac & Windows 3.7? Less confident about these.
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5431/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
895918276 MDU6SXNzdWU4OTU5MTgyNzY= 5348 v0.18.2 max-sixty 5635139 closed 0     2 2021-05-19T21:21:18Z 2021-05-20T01:51:12Z 2021-05-19T21:35:47Z MEMBER      

I'm about to release this as v0.18.2: https://github.com/pydata/xarray/compare/v0.18.1...max-sixty:release-0.18.2?expand=1 given https://github.com/pydata/xarray/issues/5346

Let me know any thoughts @pydata/xarray , thanks

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5348/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
874110561 MDU6SXNzdWU4NzQxMTA1NjE= 5248 Appearance of bulleted lists in docs max-sixty 5635139 closed 0     0 2021-05-02T23:21:49Z 2021-05-03T23:23:49Z 2021-05-03T23:23:49Z MEMBER      

What happened:

The new docs are looking great! One small issue — the lists don't appear as lists; e.g.

from https://xarray.pydata.org/en/latest/generated/xarray.Dataset.query.html

Do we need to change the rst convention?

What you expected to happen:

As bullets, with linebreaks
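If it is the rst convention, my hedged guess is the usual numpydoc pitfall: a bullet list needs a blank line before it, otherwise it renders as one run-on paragraph. Something like (an illustrative docstring, not the actual `query` one):

```python
def query(self, queries=None, engine=None):
    """Return a new dataset indexed by the results of the queries.

    Parameters
    ----------
    engine : str
        The engine used to evaluate the expression. Note the blank line
        before the list, without which the bullets collapse:

        - "python"
        - "numexpr"
    """
```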

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5248/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
860038564 MDU6SXNzdWU4NjAwMzg1NjQ= 5173 Clip broadcasts min or max array input max-sixty 5635139 closed 0     1 2021-04-16T17:41:27Z 2021-04-21T19:06:48Z 2021-04-21T19:06:48Z MEMBER      

Is your feature request related to a problem? Please describe.

Currently .clip can either take a scalar or an identically dimensioned array. numpy allows for broadcasting, which we can do in xarray too:

```python
In [11]: da = xr.DataArray(np.arange(24).reshape(2,4,3), dims=['a','b','c'])
    ...: da
Out[11]:
<xarray.DataArray (a: 2, b: 4, c: 3)>
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11]],

       [[12, 13, 14],
        [15, 16, 17],
        [18, 19, 20],
        [21, 22, 23]]])
Dimensions without coordinates: a, b, c

In [12]: da.clip(da.mean('a'))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-12-c8b9c6d0e0a1> in <module>
----> 1 da.clip(da.mean('a'))

[...]

~/.asdf/installs/python/3.8.8/lib/python3.8/site-packages/xarray/core/computation.py in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, vectorize, keep_attrs, dask_gufunc_kwargs, *args)
    741             data = as_compatible_data(data)
    742             if data.ndim != len(dims):
--> 743                 raise ValueError(
    744                     "applied function returned data with unexpected "
    745                     f"number of dimensions. Received {data.ndim} dimension(s) but "

ValueError: applied function returned data with unexpected number of dimensions. Received 3 dimension(s) but expected 2 dimensions with names: ('b', 'c')
```

Adding in a broadcast_like currently allows it to work, but we can do the equivalent internally:

```python
In [20]: da.clip(da.mean('a').broadcast_like(da))
Out[20]:
<xarray.DataArray (a: 2, b: 4, c: 3)>
array([[[ 6.,  7.,  8.],
        [ 9., 10., 11.],
        [12., 13., 14.],
        [15., 16., 17.]],

       [[12., 13., 14.],
        [15., 16., 17.],
        [18., 19., 20.],
        [21., 22., 23.]]])
Dimensions without coordinates: a, b, c

Additional context

Numpy broadcasts the argument: https://numpy.org/doc/stable/reference/generated/numpy.clip.html

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5173/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
728893769 MDU6SXNzdWU3Mjg4OTM3Njk= 4535 Support operations with pandas Offset objects max-sixty 5635139 closed 0     2 2020-10-24T22:49:57Z 2021-03-06T23:02:01Z 2021-03-06T23:02:01Z MEMBER      

Is your feature request related to a problem? Please describe.

Currently xarray objects containting datetimes don't operate with pandas' offset objects:

```python
times = pd.date_range("2000-01-01", freq="6H", periods=10)
ds = xr.Dataset(
    {
        "foo": (["time", "x", "y"], np.random.randn(10, 5, 3)),
        "bar": ("time", np.random.randn(10), {"meta": "data"}),
        "time": times,
    }
)
ds.attrs["dsmeta"] = "dsdata"
ds.resample(time="24H").mean("time").time + to_offset("8H")
```

raises:
```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-29-f9de46fe6c54> in <module>
----> 1 ds.resample(time="24H").mean("time").time + to_offset("8H")

/usr/local/lib/python3.8/site-packages/xarray/core/dataarray.py in func(self, other)
   2763
   2764         variable = (
-> 2765             f(self.variable, other_variable)
   2766             if not reflexive
   2767             else f(other_variable, self.variable)

/usr/local/lib/python3.8/site-packages/xarray/core/variable.py in func(self, other)
   2128         with np.errstate(all="ignore"):
   2129             new_data = (
-> 2130                 f(self_data, other_data)
   2131                 if not reflexive
   2132                 else f(other_data, self_data)

TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'pandas._libs.tslibs.offsets.Hour'
```

This is an issue because pandas resampling has deprecated loffset — from our test suite:

```
xarray/tests/test_dataset.py::TestDataset::test_resample_loffset
  /Users/maximilian/workspace/xarray/xarray/tests/test_dataset.py:3844: FutureWarning: 'loffset' in .resample() and in Grouper() is deprecated.

  >>> df.resample(freq="3s", loffset="8H")

  becomes:

  >>> from pandas.tseries.frequencies import to_offset
  >>> df = df.resample(freq="3s").mean()
  >>> df.index = df.index.to_timestamp() + to_offset("8H")

    ds.bar.to_series().resample("24H", loffset="-12H").mean()
```

...and so we'll need to support something like this in order to maintain existing behavior.

Describe the solution you'd like

I'm not completely sure; I think probably supporting the operations between xarray objects containing datetime objects and pandas' offset objects.
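In the meantime, a workaround sketch that does the shift on the pandas side (assumes the time coordinate is a plain datetime64 index):

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

resampled = ds.resample(time="24H").mean("time")
# shift the index in pandas, then put it back on the Dataset
shifted = resampled.indexes["time"] + to_offset("8H")
resampled = resampled.assign_coords(time=shifted)
```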

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4535/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
572995385 MDU6SXNzdWU1NzI5OTUzODU= 3811 Don't warn on empty reductions max-sixty 5635139 closed 0     2 2020-02-28T20:45:38Z 2021-02-21T23:05:46Z 2021-02-21T23:05:46Z MEMBER      

Numpy warns when computing over an all-NaN slice. We handle that case reasonably and so should handle and discard the warning.

MCVE Code Sample

```python
In [1]: import xarray as xr

In [2]: import numpy as np

In [3]: da = xr.DataArray(np.asarray([np.nan]*3))

In [4]: da
Out[4]:
<xarray.DataArray (dim_0: 3)>
array([nan, nan, nan])
Dimensions without coordinates: dim_0

In [6]: da.mean()
[...]/python3.6/site-packages/xarray/core/nanops.py:142: RuntimeWarning: Mean of empty slice
  return np.nanmean(a, axis=axis, dtype=dtype)
Out[6]:
<xarray.DataArray ()>
array(nan)
```

Expected Output

No warning

Problem Description

Somewhat discussed in https://github.com/pydata/xarray/issues/1164, and https://github.com/pydata/xarray/issues/1652, but starting a separate issue as it's more important than just noise in the test suite, and not covered by the existing work on comparisons & arithmetic
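The fix is presumably just to suppress the warning around the reduction we already handle; a minimal sketch (the exact placement in xarray.core.nanops is my assumption):

```python
import warnings

import numpy as np


def nanmean(a, axis=None, dtype=None):
    with warnings.catch_warnings():
        # we return NaN for all-NaN slices anyway, so the warning is noise
        warnings.filterwarnings("ignore", r"Mean of empty slice", RuntimeWarning)
        return np.nanmean(a, axis=axis, dtype=dtype)
```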

Output of xr.show_versions()

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3811/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
782943813 MDU6SXNzdWU3ODI5NDM4MTM= 4789 Poor performance of repr of large arrays, particularly jupyter repr max-sixty 5635139 closed 0     5 2021-01-11T00:28:24Z 2021-01-29T23:05:58Z 2021-01-29T23:05:58Z MEMBER      

What happened:

The _repr_html_ method of large arrays seems very slow — 4.78s in the case of a 100m value array; and the general repr seems fairly slow — 1.87s. Here's a quick example. I haven't yet investigated how dependent it is on there being a MultiIndex.

What you expected to happen:

We should really focus on having good repr performance, given how essential it is to any REPL workflow.

Minimal Complete Verifiable Example:

```python
In [10]: import xarray as xr
    ...: import numpy as np
    ...: import pandas as pd

In [11]: idx = pd.MultiIndex.from_product([range(10_000), range(10_000)])

In [12]: df = pd.DataFrame(range(100_000_000), index=idx)

In [13]: da = xr.DataArray(df)

In [14]: da
Out[14]:
<xarray.DataArray (dim_0: 100000000, dim_1: 1)>
array([[       0],
       [       1],
       [       2],
       ...,
       [99999997],
       [99999998],
       [99999999]])
Coordinates:
  * dim_0          (dim_0) MultiIndex
  - dim_0_level_0  (dim_0) int64 0 0 0 0 0 0 0 ... 9999 9999 9999 9999 9999 9999
  - dim_0_level_1  (dim_0) int64 0 1 2 3 4 5 6 ... 9994 9995 9996 9997 9998 9999
  * dim_1          (dim_1) int64 0

In [26]: %timeit repr(da)
1.87 s ± 7.33 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [27]: %timeit da._repr_html_()
4.78 s ± 1.8 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

Environment:

Output of <tt>xr.show_versions()</tt> INSTALLED VERSIONS ------------------ commit: None python: 3.8.7 (default, Dec 30 2020, 10:13:08) [Clang 12.0.0 (clang-1200.0.32.28)] python-bits: 64 OS: Darwin OS-release: 19.6.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: None libnetcdf: None xarray: 0.16.3.dev48+gbf0fe2ca pandas: 1.1.3 numpy: 1.19.2 scipy: 1.5.3 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.5.0 cftime: 1.2.1 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.30.0 distributed: None matplotlib: 3.3.2 cartopy: None seaborn: 0.11.0 numbagg: installed pint: 0.16.1 setuptools: 51.1.1 pip: 20.3.3 conda: None pytest: 6.1.1 IPython: 7.19.0 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4789/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
576692586 MDU6SXNzdWU1NzY2OTI1ODY= 3837 Should we run tests on docstrings? max-sixty 5635139 closed 0     2 2020-03-06T04:35:16Z 2020-09-11T12:34:34Z 2020-09-11T12:34:34Z MEMBER      

Currently almost none of the docstrings pass running pytest --doctest-modules xarray/core, though mostly for easy reasons.

Should we run these in CI?

I've recently started using docstring tests in another project, and they've worked pretty well.
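For anyone who hasn't used them: a doctest is just an example in the docstring that pytest executes and compares against the printed output when run with --doctest-modules:

```python
def double(x):
    """Double a value.

    >>> double(2)
    4
    """
    return 2 * x
```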

CC @keewis

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3837/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
298421965 MDU6SXNzdWUyOTg0MjE5NjU= 1923 Local test failure in test_backends max-sixty 5635139 closed 0     6 2018-02-19T22:53:37Z 2020-09-05T20:32:17Z 2020-09-05T20:32:17Z MEMBER      

I'm happy to debug this further but before I do, is this an issue people have seen before? I'm running tests on master and hit an issue very early on.

FWIW I don't use netCDF, and don't think I've got that installed

Code Sample, a copy-pastable example if possible

```python
========================================== FAILURES ==========================================
_________________________ ScipyInMemoryDataTest.test_bytesio_pickle __________________________

self = <xarray.tests.test_backends.ScipyInMemoryDataTest testMethod=test_bytesio_pickle>

@pytest.mark.skipif(PY2, reason='cannot pickle BytesIO on Python 2')
def test_bytesio_pickle(self):
    data = Dataset({'foo': ('x', [1, 2, 3])})
    fobj = BytesIO(data.to_netcdf())
    with open_dataset(fobj, autoclose=self.autoclose) as ds:
>           unpickled = pickle.loads(pickle.dumps(ds))

E TypeError: can't pickle _thread.lock objects

xarray/tests/test_backends.py:1384: TypeError ```


Expected Output

Skip or pass backends tests

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: d00721a3560f57a1b9226c5dbf5bf3af0356619d python: 3.6.4.final.0 python-bits: 64 OS: Darwin OS-release: 17.4.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.7.0-38-g1005a9e # not sure why this is tagged so early. I'm running on latest master pandas: 0.22.0 numpy: 1.14.0 scipy: 1.0.0 netCDF4: None h5netcdf: None h5py: None Nio: None zarr: None bottleneck: 1.2.1 cyordereddict: None dask: None distributed: None matplotlib: 2.1.2 cartopy: None seaborn: 0.8.1 setuptools: 38.5.1 pip: 9.0.1 conda: None pytest: 3.4.0 IPython: 6.2.1 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1923/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
681325776 MDU6SXNzdWU2ODEzMjU3NzY= 4349 NaN in cov & corr? max-sixty 5635139 closed 0     1 2020-08-18T20:36:40Z 2020-08-30T11:36:57Z 2020-08-30T11:36:57Z MEMBER      

Is your feature request related to a problem? Please describe.

Could cov & corr ignore missing values?

Describe the solution you'd like

Currently any NaN in a dimension over which cov / corr is calculated gives a NaN result:

```python
In [1]: import xarray as xr
   ...: import numpy as np
   ...: da = xr.DataArray([[1, 2], [1, np.nan]], dims=["x", "time"])
   ...: da
Out[1]:
<xarray.DataArray (x: 2, time: 2)>
array([[ 1.,  2.],
       [ 1., nan]])
Dimensions without coordinates: x, time

In [2]: xr.cov(da,da)
Out[2]:
<xarray.DataArray ()>
array(nan)
```

That's explained here as:
```python
# 4. Compute covariance along the given dim
# N.B. `skipna=False` is required or there is a bug when computing
# auto-covariance. E.g. Try xr.cov(da,da) for
# da = xr.DataArray([[1, 2], [1, np.nan]], dims=["x", "time"])
cov = (demeaned_da_a * demeaned_da_b).sum(dim=dim, skipna=False) / (valid_count)
```

Without having thought about it for too long, I'm not sure I understand this, and couldn't find any discussion in the PR. Adding this diff seems to fail tests around NaN values but no others:

```diff
diff --git a/xarray/core/computation.py b/xarray/core/computation.py
index 1f2a8a8e..1fc95fe1 100644
--- a/xarray/core/computation.py
+++ b/xarray/core/computation.py
@@ -1256,7 +1256,8 @@ def _cov_corr(da_a, da_b, dim=None, ddof=0, method=None):
         # N.B. `skipna=False` is required or there is a bug when computing
         # auto-covariance. E.g. Try xr.cov(da,da) for
         # da = xr.DataArray([[1, 2], [1, np.nan]], dims=["x", "time"])
-        cov = (demeaned_da_a * demeaned_da_b).sum(dim=dim, skipna=False) / (valid_count)
+        cov = (demeaned_da_a * demeaned_da_b).sum(dim=dim, skipna=True, min_count=1) / (valid_count)
+        # cov = (demeaned_da_a * demeaned_da_b).sum(dim=dim, skipna=False) / (valid_count)

         if method == "cov":
             return cov
```

Does anyone know off-hand the logic here?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4349/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
326711578 MDU6SXNzdWUzMjY3MTE1Nzg= 2188 Allow all dims-as-kwargs methods to take a dict instead max-sixty 5635139 closed 0     4 2018-05-26T05:22:55Z 2020-08-24T10:21:58Z 2020-08-24T05:24:32Z MEMBER      

Follow up to https://github.com/pydata/xarray/pull/2174

Pasting from https://github.com/pydata/xarray/pull/2174#issuecomment-392111566

  • [x] stack
  • [x] shift
  • [x] roll
  • [x] set_index
  • [x] reorder_levels
  • [x] rolling
  • [ ] resample (not yet, we still support old behavior for the first positional arguments with a warning)

...potentially rename (I often trip myself up on that)?
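For reference, the two spellings this issue tracks; the dict form is the only option when a dimension name isn't a valid Python identifier (the names here are made up):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.zeros((2, 3)), dims=["x", "space time"])
da.shift(x=1)                # kwargs form
da.shift({"space time": 1})  # dict form, can't be spelled as a kwarg
```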

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2188/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
132579684 MDU6SXNzdWUxMzI1Nzk2ODQ= 755 count docstring mistakenly includes skipna max-sixty 5635139 closed 0     2 2016-02-10T00:49:34Z 2020-07-24T16:09:25Z 2020-07-24T16:09:25Z MEMBER      

Is this a mistake or am I missing something?

http://xray.readthedocs.org/en/stable/generated/xarray.DataArray.count.html?highlight=count#xarray.DataArray.count

skipna : bool, optional If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/755/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
365973662 MDU6SXNzdWUzNjU5NzM2NjI= 2459 Stack + to_array before to_xarray is much faster that a simple to_xarray max-sixty 5635139 closed 0     13 2018-10-02T16:13:26Z 2020-07-02T20:39:01Z 2020-07-02T20:39:01Z MEMBER      

I was seeing some slow performance around to_xarray() on MultiIndexed series, and found that unstacking one of the dimensions before running to_xarray(), then restacking with to_array(), was ~30x faster. The time difference persists at larger data sizes.

To reproduce:

Create a series with a MultiIndex, ensuring the MultiIndex isn't a simple product:

```python
s = pd.Series(
    np.random.rand(100000),
    index=pd.MultiIndex.from_product([
        list('abcdefhijk'),
        list('abcdefhijk'),
        pd.DatetimeIndex(start='2000-01-01', periods=1000, freq='B'),
    ]))

cropped = s[::3]
cropped.index = pd.MultiIndex.from_tuples(cropped.index, names=list('xyz'))

cropped.head()

# x  y  z
# a  a  2000-01-03    0.993989
#       2000-01-06    0.850518
#       2000-01-11    0.068944
#       2000-01-14    0.237197
#       2000-01-19    0.784254
# dtype: float64
```

Two approaches for getting this into xarray; 1 - Simple .to_xarray():

```python

current_version = cropped.to_xarray()

<xarray.DataArray (x: 10, y: 10, z: 1000)>
array([[[0.993989,      nan, ...,      nan, 0.721663],
    [     nan,      nan, ..., 0.58224 ,      nan],
    ...,
    [     nan, 0.369382, ...,      nan,      nan],
    [0.98558 ,      nan, ...,      nan, 0.403732]],

   [[     nan,      nan, ..., 0.493711,      nan],
    [     nan, 0.126761, ...,      nan,      nan],
    ...,
    [0.976758,      nan, ...,      nan, 0.816612],
    [     nan,      nan, ..., 0.982128,      nan]],

   ...,

   [[     nan, 0.971525, ...,      nan,      nan],
    [0.146774,      nan, ...,      nan, 0.419806],
    ...,
    [     nan,      nan, ..., 0.700764,      nan],
    [     nan, 0.502058, ...,      nan,      nan]],

   [[0.246768,      nan, ...,      nan, 0.079266],
    [     nan,      nan, ..., 0.802297,      nan],
    ...,
    [     nan, 0.636698, ...,      nan,      nan],
    [0.025195,      nan, ...,      nan, 0.629305]]])

Coordinates:
  * x        (x) object 'a' 'b' 'c' 'd' 'e' 'f' 'h' 'i' 'j' 'k'
  * y        (y) object 'a' 'b' 'c' 'd' 'e' 'f' 'h' 'i' 'j' 'k'
  * z        (z) datetime64[ns] 2000-01-03 2000-01-04 ... 2003-10-30 2003-10-31
```

This takes 536 ms

2 - unstack in pandas first, and then use to_array to do the equivalent of a restack:

    proposed_version = (
        cropped
        .unstack('y')
        .to_xarray()
        .to_array('y')
    )

This takes 17.3 ms

To confirm these are identical:

```
proposed_version_adj = (
    proposed_version
    .assign_coords(y=proposed_version['y'].astype(object))
    .transpose(*current_version.dims)
)

proposed_version_adj.equals(current_version)

True

```

Problem description

A default operation is much slower than a (potentially) equivalent operation that's not the default.

I need to look more at what's causing the issues. I think it's to do with the .reindex(full_idx), but I'm unclear why it's so much faster in the alternative route, and whether there's a fix that we can make to make the default path fast.

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 2.7.14.final.0 python-bits: 64 OS: Linux OS-release: 4.9.93-linuxkit-aufs machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_US.UTF-8 LANG: en_US.utf8 LOCALE: None.None xarray: 0.10.9 pandas: 0.23.4 numpy: 1.15.2 scipy: 1.1.0 netCDF4: None h5netcdf: None h5py: None Nio: None zarr: None cftime: None PseudonetCDF: None rasterio: None iris: None bottleneck: 1.2.1 cyordereddict: None dask: None distributed: None matplotlib: 2.2.3 cartopy: 0.16.0 seaborn: 0.9.0 setuptools: 40.4.3 pip: 18.0 conda: None pytest: 3.8.1 IPython: 5.8.0 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2459/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
561316080 MDU6SXNzdWU1NjEzMTYwODA= 3760 Truncate array repr based on line count max-sixty 5635139 closed 0     1 2020-02-06T22:48:58Z 2020-06-24T16:04:12Z 2020-06-24T16:04:12Z MEMBER      

MCVE Code Sample

I thought we might have had an issue (and maybe solved it?) but couldn't find it anywhere. Forgive me if I'm duplicating.

```python
xr.DataArray(np.random.rand(100,5,1))

<xarray.DataArray (dim_0: 100, dim_1: 5, dim_2: 1)>
array([[[0.71333665],
    [0.93820892],
    [0.48678056],
    [0.07299961],
    [0.63542414]],

*** Deleted 400 lines ***

   [[0.29987457],
    [0.55963998],
    [0.25976744],
    [0.80062955],
    [0.503025  ]],

   [[0.48255097],
    [0.55861315],
    [0.36059861],
    [0.96539665],
    [0.05674621]],

   [[0.81389941],
    [0.55745028],
    [0.20348983],
    [0.63390148],
    [0.94698865]],

   [[0.16792246],
    [0.9252646 ],
    [0.38596734],
    [0.17168077],
    [0.18162088]],

   [[0.04526339],
    [0.70028912],
    [0.72388995],
    [0.97481276],
    [0.66155381]],

   [[0.15058745],
    [0.57646963],
    [0.53382085],
    [0.24696459],
    [0.77601528]],

   [[0.6752243 ],
    [0.84991466],
    [0.87758404],
    [0.70828751],
    [0.04033709]]])

Dimensions without coordinates: dim_0, dim_1, dim_2
```

Expected Output

With larger arrays, it's much more reasonable:

```
<xarray.DataArray (dim_0: 500, dim_1: 6, dim_2: 1)>
array([[[0.9680447 ],
    [0.12554914],
    [0.9163406 ],
    [0.63710986],
    [0.97778361],
    [0.6419909 ]],

   [[0.48480678],
    [0.31214637],
    [0.72270997],
    [0.81523543],
    [0.34327902],
    [0.80941523]],

   [[0.92192284],
    [0.47841933],
    [0.00760903],
    [0.83886152],
    [0.88538772],
    [0.6532889 ]],

   ...,

   [[0.39558324],
    [0.42220218],
    [0.56731915],
    [0.27388751],
    [0.51097741],
    [0.62824705]],

   [[0.97379019],
    [0.0311196 ],
    [0.09790975],
    [0.65206508],
    [0.14369363],
    [0.09683937]],

   [[0.71318171],
    [0.88591664],
    [0.30032286],
    [0.97324135],
    [0.10250702],
    [0.03973667]]])

Dimensions without coordinates: dim_0, dim_1, dim_2
```

Problem Description

Something like 40 lines is probably a reasonable place to truncate?
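A minimal sketch of the idea (a hypothetical helper, not the actual formatting code): keep the first and last chunk of lines and elide the middle.

```python
def truncate_lines(text, max_lines=40):
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text
    half = max_lines // 2
    return "\n".join(lines[:half] + ["..."] + lines[-half:])
```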

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: [...] machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.utf8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.1 xarray: 0.15.0 pandas: 0.25.3 numpy: 1.17.3 scipy: 1.3.2 netCDF4: 1.5.3 pydap: None h5netcdf: 0.7.4 h5py: 2.10.0 Nio: None zarr: None cftime: 1.0.4.2 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.2.1 dask: 2.7.0 distributed: 2.7.0 matplotlib: 3.1.2 cartopy: None seaborn: 0.9.0 numbagg: None setuptools: 41.6.0.post20191101 pip: 19.3.1 conda: None pytest: 5.2.2 IPython: 7.9.0 sphinx: 2.2.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3760/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
207862981 MDU6SXNzdWUyMDc4NjI5ODE= 1270 BUG: Resample on PeriodIndex not working? max-sixty 5635139 closed 0     10 2017-02-15T16:56:21Z 2020-05-30T02:34:17Z 2020-05-30T02:34:17Z MEMBER      

```python

import xarray as xr
import pandas as pd
da = xr.DataArray(pd.Series(1, pd.period_range('2000-1', '2000-12', freq='W')).rename_axis('date'))

da.resample('B', 'date', 'ffill')


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-eb64a66a8d1f> in <module>()
      3 da = xr.DataArray(pd.Series(1, pd.period_range('2000-1', '2000-12', freq='W')).rename_axis('date'))
      4
----> 5 da.resample('B', 'date', 'ffill')

/Users/maximilian/drive/workspace/xarray/xarray/core/common.py in resample(self, freq, dim, how, skipna, closed, label, base, keep_attrs) 577 time_grouper = pd.TimeGrouper(freq=freq, how=how, closed=closed, 578 label=label, base=base) --> 579 gb = self.groupby_cls(self, group, grouper=time_grouper) 580 if isinstance(how, basestring): 581 f = getattr(gb, how)

/Users/maximilian/drive/workspace/xarray/xarray/core/groupby.py in init(self, obj, group, squeeze, grouper, bins, cut_kwargs) 242 raise ValueError('index must be monotonic for resampling') 243 s = pd.Series(np.arange(index.size), index) --> 244 first_items = s.groupby(grouper).first() 245 if first_items.isnull().any(): 246 full_index = first_items.index

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/generic.py in groupby(self, by, axis, level, as_index, sort, group_keys, squeeze, kwargs) 3989 return groupby(self, by=by, axis=axis, level=level, as_index=as_index, 3990 sort=sort, group_keys=group_keys, squeeze=squeeze, -> 3991 kwargs) 3992 3993 def asfreq(self, freq, method=None, how=None, normalize=False):

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/groupby.py in groupby(obj, by, kwds) 1509 raise TypeError('invalid type: %s' % type(obj)) 1510 -> 1511 return klass(obj, by, kwds) 1512 1513

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/groupby.py in init(self, obj, keys, axis, level, grouper, exclusions, selection, as_index, sort, group_keys, squeeze, **kwargs) 368 level=level, 369 sort=sort, --> 370 mutated=self.mutated) 371 372 self.obj = obj

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/groupby.py in _get_grouper(obj, key, axis, level, sort, mutated) 2390 # a passed-in Grouper, directly convert 2391 if isinstance(key, Grouper): -> 2392 binner, grouper, obj = key._get_grouper(obj) 2393 if key.key is None: 2394 return grouper, [], obj

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/tseries/resample.py in _get_grouper(self, obj) 1059 def _get_grouper(self, obj): 1060 # create the resampler and return our binner -> 1061 r = self._get_resampler(obj) 1062 r._set_binner() 1063 return r.binner, r.grouper, r.obj

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/tseries/resample.py in _get_resampler(self, obj, kind) 1055 raise TypeError("Only valid with DatetimeIndex, " 1056 "TimedeltaIndex or PeriodIndex, " -> 1057 "but got an instance of %r" % type(ax).name) 1058 1059 def _get_grouper(self, obj):

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'
```
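A workaround sketch in the meantime: rebuild a DatetimeIndex from the period values before resampling (assumes the values survive the round-trip through PeriodIndex; written against the newer indexer-style resample API):

```python
import pandas as pd

ts = pd.PeriodIndex(da["date"].values, freq="W").to_timestamp()
da.assign_coords(date=ts).resample(date="B").ffill()
```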

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1270/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
569162754 MDU6SXNzdWU1NjkxNjI3NTQ= 3789 Remove groupby with multi-dimensional warning soon max-sixty 5635139 closed 0     0 2020-02-21T20:15:28Z 2020-05-06T16:39:35Z 2020-05-06T16:39:35Z MEMBER      

MCVE Code Sample

We have a very verbose warning in 0.15: it prints on every groupby on an object with multidimensional coords.

So the notebook I'm currently working on has red sections like:

    /home/mroos/.local/lib/python3.7/site-packages/xarray/core/common.py:664: FutureWarning: This DataArray contains multi-dimensional coordinates. In the future, the dimension order of these coordinates will be restored as well unless you specify restore_coord_dims=False.
      self, group, squeeze=squeeze, restore_coord_dims=restore_coord_dims
    /home/mroos/.local/lib/python3.7/site-packages/xarray/core/common.py:664: FutureWarning: This DataArray contains multi-dimensional coordinates. In the future, the dimension order of these coordinates will be restored as well unless you specify restore_coord_dims=False.
      self, group, squeeze=squeeze, restore_coord_dims=restore_coord_dims
    /home/mroos/.local/lib/python3.7/site-packages/xarray/core/common.py:664: FutureWarning: This DataArray contains multi-dimensional coordinates. In the future, the dimension order of these coordinates will be restored as well unless you specify restore_coord_dims=False.
      self, group, squeeze=squeeze, restore_coord_dims=restore_coord_dims
    /home/mroos/.local/lib/python3.7/site-packages/xarray/core/common.py:664: FutureWarning: This DataArray contains multi-dimensional coordinates. In the future, the dimension order of these coordinates will be restored as well unless you specify restore_coord_dims=False.
      self, group, squeeze=squeeze, restore_coord_dims=restore_coord_dims
    /home/mroos/.local/lib/python3.7/site-packages/xarray/core/common.py:664: FutureWarning: This DataArray contains multi-dimensional coordinates. In the future, the dimension order of these coordinates will be restored as well unless you specify restore_coord_dims=False.
      self, group, squeeze=squeeze, restore_coord_dims=restore_coord_dims

Unless there's a way of reducing its verbosity (e.g. only print once per session?), let's aim to push the change through and remove the warning soon?
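On the "only print once per session" option: Python's warnings machinery can already do this with the "once" action, so a user-side mitigation (sketched) is:

```python
import warnings

# show each matching FutureWarning only once per session
warnings.filterwarnings("once", category=FutureWarning, module="xarray")
```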

```python


In [2]: import xarray as xr

In [4]: import numpy as np

In [16]: da = xr.DataArray(np.random.rand(2,3), dims=list('ab'))

In [17]: da = da.assign_coords(foo=(('a','b'),np.random.rand(2,3)))

In [18]: da.groupby('a').mean(...)
[...]/python3.6/site-packages/xarray/core/common.py:664: FutureWarning: This DataArray contains multi-dimensional coordinates. In the future, the dimension order of these coordinates will be restored as well unless you specify restore_coord_dims=False. self, group, squeeze=squeeze, restore_coord_dims=restore_coord_dims Out[18]: <xarray.DataArray (a: 2)> array([0.59216558, 0.58616892]) Dimensions without coordinates: a

```

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.8 (default, Aug 7 2019, 17:28:10) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] python-bits: 64 OS: Linux OS-release: [...] machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.utf8 LOCALE: en_US.UTF-8 libhdf5: 1.10.4 libnetcdf: None xarray: 0.15.0 pandas: 0.25.3 numpy: 1.18.1 scipy: 1.4.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: None distributed: None matplotlib: 3.1.2 cartopy: None seaborn: 0.10.0 numbagg: None setuptools: 45.0.0 pip: 20.0.2 conda: None pytest: 5.3.2 IPython: 7.12.0 sphinx: 2.3.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3789/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
587900011 MDU6SXNzdWU1ODc5MDAwMTE= 3892 Update core developer list max-sixty 5635139 closed 0     0 2020-03-25T18:24:17Z 2020-04-07T19:28:25Z 2020-04-07T19:28:25Z MEMBER      

This is out of date: http://xarray.pydata.org/en/stable/roadmap.html#current-core-developers

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3892/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
561312864 MDU6SXNzdWU1NjEzMTI4NjQ= 3759 Truncate long lines in repr of coords max-sixty 5635139 closed 0     1 2020-02-06T22:41:13Z 2020-03-29T09:58:46Z 2020-03-29T09:58:45Z MEMBER      

MCVE Code Sample

```python
xr.DataArray(coords=dict(a=' '.join(['hello world' for _ in range(100)])))

<xarray.DataArray ()> array(nan) Coordinates: a <U5999 'hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world' ```

Expected Output

    <xarray.DataArray ()>
    array(nan)
    Coordinates:
        a        <U5999 'hello world ... hello world'

Problem Description

I think mostly the same as https://github.com/pydata/xarray/issues/1319 but for coords

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: [...] machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.utf8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.1 xarray: 0.15.0 pandas: 0.25.3 numpy: 1.17.3 scipy: 1.3.2 netCDF4: 1.5.3 pydap: None h5netcdf: 0.7.4 h5py: 2.10.0 Nio: None zarr: None cftime: 1.0.4.2 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.2.1 dask: 2.7.0 distributed: 2.7.0 matplotlib: 3.1.2 cartopy: None seaborn: 0.9.0 numbagg: None setuptools: 41.6.0.post20191101 pip: 19.3.1 conda: None pytest: 5.2.2 IPython: 7.9.0 sphinx: 2.2.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3759/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
586450690 MDU6SXNzdWU1ODY0NTA2OTA= 3881 Flaky test: test_uamiv_format_write max-sixty 5635139 closed 0     2 2020-03-23T19:13:34Z 2020-03-23T20:32:15Z 2020-03-23T20:32:15Z MEMBER      

I've seen a couple of failures recently on this test. Flaky tests are really annoying; it would be great to fix this one or, if that's impossible, remove it. Does anyone have any ideas what's causing this?

```
__________________ TestPseudoNetCDFFormat.test_uamiv_format_write __________________

self = <xarray.tests.test_backends.TestPseudoNetCDFFormat object at 0x7f15352b9d00>

def test_uamiv_format_write(self):
    fmtkw = {"format": "uamiv"}

    expected = open_example_dataset(
        "example.uamiv", engine="pseudonetcdf", backend_kwargs=fmtkw
    )
    with self.roundtrip(
        expected,
        save_kwargs=fmtkw,
        open_kwargs={"backend_kwargs": fmtkw},
        allow_cleanup_failure=True,
    ) as actual:
>           assert_identical(expected, actual)

E       AssertionError: Left and right Dataset objects are not identical
E
E
E
E       Differing attributes:
E           L   WTIME: 190117
E           R   WTIME: 190118

xarray/tests/test_backends.py:3563: AssertionError
```
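My hedged guess is that WTIME is a write-time stamp recorded when the file is saved, so a roundtrip that crosses a date boundary legitimately differs. If so, the fix might be as simple as dropping it before comparing (a hypothetical patch inside the test, using its expected/actual):

```python
# drop the write-time stamps before the comparison
for stamp in ("WTIME", "WDATE"):
    expected.attrs.pop(stamp, None)
    actual.attrs.pop(stamp, None)
assert_identical(expected, actual)
```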

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3881/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
576471089 MDU6SXNzdWU1NzY0NzEwODk= 3833 html repr fails on non-str Dataset keys max-sixty 5635139 closed 0     0 2020-03-05T19:10:31Z 2020-03-23T05:39:00Z 2020-03-23T05:39:00Z MEMBER      

MCVE Code Sample

```python

# In a notebook with html repr enabled

xr.Dataset({0: (('a','b'), np.random.rand(2,3))})

gives:


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/j/office/app/research-python/conda/envs/2019.10/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
    343             method = get_real_method(obj, self.print_method)
    344             if method is not None:
--> 345                 return method()
    346             return None
    347         else:

~/.local/lib/python3.7/site-packages/xarray/core/dataset.py in _repr_html_(self)
   1632         if OPTIONS["display_style"] == "text":
   1633             return f"<pre>{escape(repr(self))}</pre>"
-> 1634         return formatting_html.dataset_repr(self)
   1635
   1636     def info(self, buf=None) -> None:

~/.local/lib/python3.7/site-packages/xarray/core/formatting_html.py in dataset_repr(ds)
    268         dim_section(ds),
    269         coord_section(ds.coords),
--> 270         datavar_section(ds.data_vars),
    271         attr_section(ds.attrs),
    272     ]

~/.local/lib/python3.7/site-packages/xarray/core/formatting_html.py in _mapping_section(mapping, name, details_func, max_items_collapse, enabled)
    165     return collapsible_section(
    166         name,
--> 167         details=details_func(mapping),
    168         n_items=n_items,
    169         enabled=enabled,

~/.local/lib/python3.7/site-packages/xarray/core/formatting_html.py in summarize_vars(variables)
    131     vars_li = "".join(
    132         f"<li class='xr-var-item'>{summarize_variable(k, v)}</li>"
--> 133         for k, v in variables.items()
    134     )
    135

~/.local/lib/python3.7/site-packages/xarray/core/formatting_html.py in <genexpr>(.0)
    131     vars_li = "".join(
    132         f"<li class='xr-var-item'>{summarize_variable(k, v)}</li>"
--> 133         for k, v in variables.items()
    134     )
    135

~/.local/lib/python3.7/site-packages/xarray/core/formatting_html.py in summarize_variable(name, var, is_index, dtype, preview)
     96     cssclass_idx = " class='xr-has-index'" if is_index else ""
     97     dims_str = f"({', '.join(escape(dim) for dim in var.dims)})"
---> 98     name = escape(name)
     99     dtype = dtype or escape(str(var.dtype))
    100

/j/office/app/research-python/conda/envs/2019.10/lib/python3.7/html/__init__.py in escape(s, quote)
     17     translated.
     18     """
---> 19     s = s.replace("&", "&amp;") # Must be done first!
     20     s = s.replace("<", "&lt;")
     21     s = s.replace(">", "&gt;")

AttributeError: 'int' object has no attribute 'replace'

<xarray.Dataset>
Dimensions:  (a: 2, b: 3)
Dimensions without coordinates: a, b
Data variables:
    0        (a, b) float64 0.5327 0.927 0.8582 0.8825 0.9478 0.09475
```

    Problem Description

    I think this may be an uncomplicated fix: coerce the keys to str
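i.e. something like this (a sketch; `summarize_name` is a hypothetical stand-in for the relevant spot in formatting_html.summarize_variable):

```python
from html import escape

def summarize_name(name):
    # coerce keys to str before html-escaping
    return escape(str(name))

summarize_name(0)  # '0', rather than AttributeError
```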

    Output of xr.show_versions()

    INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: ... machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.utf8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.1 xarray: 0.15.0 pandas: 1.0.1 numpy: 1.17.3 scipy: 1.3.2 netCDF4: 1.5.3 pydap: None h5netcdf: 0.7.4 h5py: 2.10.0 Nio: None zarr: None cftime: 1.0.4.2 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.2.1 dask: 2.7.0 distributed: 2.7.0 matplotlib: 3.1.2 cartopy: None seaborn: 0.9.0 numbagg: installed setuptools: 41.6.0.post20191101 pip: 19.3.1 conda: None pytest: 5.2.2 IPython: 7.9.0 sphinx: 2.2.1
    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3833/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    576512167 MDU6SXNzdWU1NzY1MTIxNjc= 3835 3D DataArray.to_pandas() fails with a bad message max-sixty 5635139 closed 0     1 2020-03-05T20:30:06Z 2020-03-20T17:14:41Z 2020-03-20T17:14:41Z MEMBER      

    Panel is removed from pandas (as a result of the success of xarray! :grin: ), but we're still attempting to call it from .to_pandas()

    MCVE Code Sample

    ```python


    In [4]: import numpy as np

    In [1]: import xarray as xr
    In [5]: xr.DataArray(np.random.rand(2,3,4)).to_pandas()


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-7d1b667d5cac> in <module>
----> 1 xr.DataArray(np.random.rand(2,3,4)).to_pandas()

~/workspace/corpfin/.venv/lib64/python3.6/site-packages/xarray/core/dataarray.py in to_pandas(self)
   2267         )
   2268         indexes = [self.get_index(dim) for dim in self.dims]
-> 2269         return constructor(self.values, *indexes)
   2270
   2271     def to_dataframe(self, name: Hashable = None) -> pd.DataFrame:

    TypeError: object() takes no parameters

    ```

    Expected Output

    Either a MultiIndexed DataFrame or a proper error (4D gives ValueError: cannot convert arrays with 4 dimensions into pandas objects)
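A sketch of the "proper error" option, mirroring the message the 4D case already produces (hypothetical, not the current code path):

```python
import pandas as pd

def to_pandas(da):
    constructors = {1: pd.Series, 2: pd.DataFrame}
    try:
        constructor = constructors[da.ndim]
    except KeyError:
        raise ValueError(
            f"cannot convert arrays with {da.ndim} dimensions into pandas objects"
        )
    indexes = [da.get_index(dim) for dim in da.dims]
    return constructor(da.values, *indexes)
```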

    Output of xr.show_versions()

    INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: ... machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.utf8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.1 xarray: 0.15.0 pandas: 1.0.1 numpy: 1.17.3 scipy: 1.3.2 netCDF4: 1.5.3 pydap: None h5netcdf: 0.7.4 h5py: 2.10.0 Nio: None zarr: None cftime: 1.0.4.2 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.2.1 dask: 2.7.0 distributed: 2.7.0 matplotlib: 3.1.2 cartopy: None seaborn: 0.9.0 numbagg: installed setuptools: 41.6.0.post20191101 pip: 19.3.1 conda: None pytest: 5.2.2 IPython: 7.9.0 sphinx: 2.2.1
    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3835/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    575080574 MDU6SXNzdWU1NzUwODA1NzQ= 3825 Accept lambda in methods with one obvious xarray argument max-sixty 5635139 closed 0     3 2020-03-04T01:52:46Z 2020-03-20T17:14:20Z 2020-03-20T17:14:20Z MEMBER      

    Branching from https://github.com/pydata/xarray/issues/3770

Here's the proposal: allow lambdas on methods where the primary argument is a single xarray object, and interpret lambdas as though they'd been supplied in a pipe method followed by the current method. Taking the example from the linked issue:

    ```python

    In [1]: import xarray as xr

    In [2]: import numpy as np

    In [3]: da = xr.DataArray(np.random.rand(2,3))

    In [4]: da.where(da > 0.5)
Out[4]:
<xarray.DataArray (dim_0: 2, dim_1: 3)>
array([[       nan, 0.71442406,        nan],
       [0.55748705,        nan,        nan]])
Dimensions without coordinates: dim_0, dim_1

# this should be equivalent (currently not valid)

    In [5]: da.where(lambda x: x > 0.5)

# the longer version (currently works)

    In [5]: da.pipe(lambda x: x.where(x > 0.5))

    ```

Others I miss from pandas: assign and loc. I haven't gone through the list, though I assume there are others; we don't have to agree 100% on the list before starting with the most obvious ones, assuming we're in agreement on the principle.
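The dispatch rule itself would be tiny; a sketch with a hypothetical helper:

```python
def resolve(value, obj):
    # interpret a callable argument as though it were piped
    return value(obj) if callable(value) else value

# inside e.g. `where`, `cond = resolve(cond, self)` would make
# da.where(lambda x: x > 0.5) equivalent to da.pipe(lambda x: x.where(x > 0.5))
```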

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3825/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    573052057 MDU6SXNzdWU1NzMwNTIwNTc= 3814 Allow an ellipsis in stack? max-sixty 5635139 closed 0     1 2020-02-28T22:57:58Z 2020-03-19T23:19:18Z 2020-03-19T23:19:18Z MEMBER      

    Could we add the ability to use an ellipsis to represent all dims in more places? For example, stack:

    MCVE Code Sample

    ```python

    In [14]: data = np.arange(15, 301, 15).reshape(2, 10)
    ...: da = xr.DataArray(data, dims=('y', 'x'), attrs={'test': 'test'})
    ...:

    In [15]: da.stack(z=['x','y'])
Out[15]:
<xarray.DataArray (z: 20)>
array([ 15, 165,  30, 180,  45, 195,  60, 210,  75, 225,  90, 240, 105, 255,
       120, 270, 135, 285, 150, 300])
Coordinates:
  * z        (z) MultiIndex
  - x        (z) int64 0 0 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9
  - y        (z) int64 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
Attributes:
    test:     test

    In [16]: da.stack(z=[...])

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-16-a92d0ffe931a> in <module>
----> 1 da.stack(z=[...])

    ~/workspace/./.venv/lib64/python3.6/site-packages/xarray/core/dataarray.py in stack(self, dimensions, dimensions_kwargs) 1739 DataArray.unstack 1740 """ -> 1741 ds = self._to_temp_dataset().stack(dimensions, dimensions_kwargs) 1742 return self._from_temp_dataset(ds) 1743

    ~/workspace/./.venv/lib64/python3.6/site-packages/xarray/core/dataset.py in stack(self, dimensions, **dimensions_kwargs) 3291 result = self 3292 for new_dim, dims in dimensions.items(): -> 3293 result = result._stack_once(dims, new_dim) 3294 return result 3295

    ~/workspace/./.venv/lib64/python3.6/site-packages/xarray/core/dataset.py in _stack_once(self, dims, new_dim) 3246 3247 # consider dropping levels that are unused? -> 3248 levels = [self.get_index(dim) for dim in dims] 3249 idx = utils.multiindex_from_product_levels(levels, names=dims) 3250 variables[new_dim] = IndexVariable(new_dim, idx)

    ~/workspace/./.venv/lib64/python3.6/site-packages/xarray/core/dataset.py in <listcomp>(.0) 3246 3247 # consider dropping levels that are unused? -> 3248 levels = [self.get_index(dim) for dim in dims] 3249 idx = utils.multiindex_from_product_levels(levels, names=dims) 3250 variables[new_dim] = IndexVariable(new_dim, idx)

    ~/workspace/./.venv/lib64/python3.6/site-packages/xarray/core/common.py in get_index(self, key) 378 """ 379 if key not in self.dims: --> 380 raise KeyError(key) 381 382 try:

    KeyError: Ellipsis

    ```

    Expected Output

    Identical between ... and listing all dimensions
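The expansion rule could mirror what transpose already does with ...; a sketch with a hypothetical helper:

```python
def expand_ellipsis(dims, all_dims):
    # replace `...` with the remaining dims, preserving order
    if ... not in dims:
        return list(dims)
    listed = [d for d in dims if d is not ...]
    rest = [d for d in all_dims if d not in listed]
    i = dims.index(...)
    return listed[:i] + rest + listed[i:]

expand_ellipsis([...], ("y", "x"))       # ['y', 'x']
expand_ellipsis(["x", ...], ("y", "x"))  # ['x', 'y']
```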

    Output of xr.show_versions()

    INSTALLED VERSIONS ------------------ commit: None python: 3.6.8 (default, Aug 7 2019, 17:28:10) python-bits: 64 OS: Linux machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.utf8 LOCALE: en_US.UTF-8 libhdf5: 1.10.4 libnetcdf: None xarray: 0.15.0 pandas: 0.25.3 numpy: 1.18.1 scipy: 1.4.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.11.0 distributed: None matplotlib: 3.1.2 cartopy: None seaborn: 0.10.0 numbagg: None setuptools: 45.0.0 pip: 20.0.2 conda: None pytest: 5.3.2 IPython: 7.12.0 sphinx: 2.3.1
    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3814/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    565626748 MDU6SXNzdWU1NjU2MjY3NDg= 3770 `where` ignores incorrectly typed arguments max-sixty 5635139 closed 0     5 2020-02-15T00:24:37Z 2020-03-07T04:38:12Z 2020-03-07T04:38:12Z MEMBER      

    MCVE Code Sample

    I optimistically tried passing a lambda to where* to avoid another .pipe. To my surprise it ran! But unfortunately it just ignored the lambda, as though all values were True.

    ```python

    In [1]: import xarray as xr

    In [2]: import numpy as np

    In [3]: da = xr.DataArray(np.random.rand(2,3))

    In [4]: da.where(da > 0.5)
    Out[4]:
    <xarray.DataArray (dim_0: 2, dim_1: 3)>
    array([[       nan, 0.71442406,        nan],
           [0.55748705,        nan,        nan]])
    Dimensions without coordinates: dim_0, dim_1

    # this should fail!
    In [5]: da.where(lambda x: x > 0.5)
    Out[5]:
    <xarray.DataArray (dim_0: 2, dim_1: 3)>
    array([[0.26085668, 0.71442406, 0.05816167],
           [0.55748705, 0.15293073, 0.12766453]])
    Dimensions without coordinates: dim_0, dim_1
    ```

    Expected Output

    Raise a TypeError when passed the lambda

    * (maybe that could work, but separate discussion)
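
    One form the check could take (a sketch, not xarray's implementation; the helper name is made up):

    ```python
    def _validate_cond(cond):
        # a callable is currently broadcast as a single truthy scalar, so the
        # mask silently keeps everything; rejecting it surfaces the mistake
        if callable(cond):
            raise TypeError(f"cond must be array-like, got callable {cond!r}")
        return cond
    ```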

    Output of xr.show_versions()

    INSTALLED VERSIONS ------------------ commit: None python: 3.6.8 (default, Aug 7 2019, 17:28:10) [...] python-bits: 64 OS: Linux OS-release: [...] machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.utf8 LOCALE: en_US.UTF-8 libhdf5: 1.10.4 libnetcdf: None xarray: 0.14.1 pandas: 0.25.3 numpy: 1.18.1 scipy: 1.4.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: None distributed: None matplotlib: 3.1.2 cartopy: None seaborn: 0.10.0 numbagg: None setuptools: 45.0.0 pip: 20.0.2 conda: None pytest: 5.3.2 IPython: 7.11.1 sphinx: 2.3.1
    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3770/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    305663416 MDU6SXNzdWUzMDU2NjM0MTY= 1992 Canonical approach for new vectorized functions max-sixty 5635139 closed 0     4 2018-03-15T18:09:08Z 2020-02-29T07:22:01Z 2020-02-29T07:22:00Z MEMBER      

    We are moving some code over from pandas to Xarray, and one of the biggest missing features is exponential functions, e.g. series.ewm(span=20).mean().

    It looks like we can write these as gufuncs without too much trouble in numba. But I also notice that numbagg hasn't changed in a while and that we chose bottleneck for many of the functions in Xarray.
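
    For instance, a rough sketch of an exponentially weighted mean as a numba gufunc (assuming numba is installed; the alpha = 2 / (span + 1) and adjust=True conventions follow pandas, and this is illustrative rather than a tested implementation):

    ```python
    import numba
    import numpy as np
    import xarray as xr

    @numba.guvectorize(
        [(numba.float64[:], numba.float64, numba.float64[:])],
        "(n),()->(n)", nopython=True,
    )
    def ewm_mean(values, span, out):
        # adjust=True-style weights: the newest observation gets weight 1,
        # older ones decay by (1 - alpha) per step
        alpha = 2.0 / (span + 1.0)
        numerator = 0.0
        denominator = 0.0
        for i in range(values.shape[0]):
            numerator = values[i] + (1.0 - alpha) * numerator
            denominator = 1.0 + (1.0 - alpha) * denominator
            out[i] = numerator / denominator

    # wrapped with apply_ufunc over the core dim, this gives
    # series.ewm(span=20).mean() semantics (ignoring NaN handling,
    # which the real thing would need)
    da = xr.DataArray(np.random.rand(3, 250), dims=("security", "date"))
    smoothed = xr.apply_ufunc(
        ewm_mean, da, 20.0,
        input_core_dims=[["date"], []], output_core_dims=[["date"]],
    )
    ```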

    • Is numba a good approach for these?
    • As well as our own internal use, could we add numba functions to Xarray, or are there dependency issues?
    • Tangentially, I'd be interested why we're using bottleneck rather than numbagg for the existing functions
    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/1992/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    567993968 MDU6SXNzdWU1Njc5OTM5Njg= 3782 Add groupby.pipe? max-sixty 5635139 closed 0     4 2020-02-20T01:33:31Z 2020-02-21T14:37:44Z 2020-02-21T14:37:44Z MEMBER      

    MCVE Code Sample

    ```python
    In [1]: import xarray as xr

    In [3]: import numpy as np

    In [4]: ds = xr.Dataset(
       ...:     {"foo": (("x", "y"), np.random.rand(4, 3))},
       ...:     coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
       ...: )

    In [5]: ds.groupby('letters')
    Out[5]:
    DatasetGroupBy, grouped over 'letters'
    2 groups with labels 'a', 'b'.

    In [8]: ds.groupby('letters').sum(...) / ds.groupby('letters').count(...)
    Out[8]:
    <xarray.Dataset>
    Dimensions:  (letters: 2)
    Coordinates:
      * letters  (letters) object 'a' 'b'
    Data variables:
        foo      (letters) float64 0.4182 0.4995

    In [9]: ds.groupby('letters').pipe(lambda x: x.sum() / x.count())
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-9-c9b142ea051b> in <module>
    ----> 1 ds.groupby('letters').pipe(lambda x: x.sum() / x.count())

    AttributeError: 'DatasetGroupBy' object has no attribute 'pipe'

    ```

    Expected Output

    I think we could add groupby.pipe, as a convenience?
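
    A minimal sketch, mirroring pandas' GroupBy.pipe (illustrative; in practice this would live on xarray's GroupBy base class):

    ```python
    def pipe(self, func, *args, **kwargs):
        """Apply ``func(self, *args, **kwargs)`` to this GroupBy object."""
        # lets chains like ds.groupby(...).pipe(lambda g: g.sum() / g.count())
        # read top-to-bottom without an intermediate variable
        return func(self, *args, **kwargs)
    ```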

    Output of xr.show_versions()

    In [12]: xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.6.8 (default, Aug 7 2019, 17:28:10) [...] python-bits: 64 OS: Linux OS-release: [...] machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.utf8 LOCALE: en_US.UTF-8 libhdf5: 1.10.4 libnetcdf: None xarray: 0.14.1 pandas: 0.25.3 numpy: 1.18.1 scipy: 1.4.1 netCDF4: None pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: None distributed: None matplotlib: 3.1.2 cartopy: None seaborn: 0.10.0 numbagg: None setuptools: 45.0.0 pip: 20.0.2 conda: None pytest: 5.3.2 IPython: 7.11.1 sphinx: 2.3.1
    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3782/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    521754870 MDU6SXNzdWU1MjE3NTQ4NzA= 3514 Should we cache some small properties? max-sixty 5635139 open 0     7 2019-11-12T19:28:21Z 2019-11-16T04:32:11Z   MEMBER      

    I was doing some profiling on isel, and I see there are some properties that (I think) never change but are called frequently. Should we cache these on their object?

    Pandas uses cache_readonly for these cases.

    Here's a case: we call LazilyOuterIndexedArray.shape frequently when doing a simple indexing operation. Each call takes ~150µs, while an attribute lookup on a python object takes ~50ns (i.e. 3000x faster). IIUC the result of that property should never change.

    I don't think this is the solution to performance issues, and there's some additional complexity. Could they be easy & small wins, though?
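
    A minimal sketch of such a descriptor, in the spirit of pandas' cache_readonly (illustrative; it assumes the class doesn't use __slots__):

    ```python
    import functools

    class cached_readonly:
        """Cache a property's value in the instance dict on first access."""

        def __init__(self, func):
            self.func = func
            functools.update_wrapper(self, func)

        def __get__(self, obj, objtype=None):
            if obj is None:
                return self
            value = self.func(obj)
            # As a non-data descriptor, this instance attribute now shadows
            # the descriptor, so later lookups skip __get__ entirely.
            obj.__dict__[self.func.__name__] = value
            return value
    ```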

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3514/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
        xarray 13221727 issue
    514682267 MDU6SXNzdWU1MTQ2ODIyNjc= 3467 Errors with mypy 0.740 max-sixty 5635139 closed 0     5 2019-10-30T13:46:36Z 2019-10-30T22:51:40Z 2019-10-30T14:12:58Z MEMBER      

    ```
    mypy .

    xarray/core/common.py:337: error: Too few arguments for "__init_subclass__"
    xarray/core/dataset.py:398: error: Too few arguments for "__init_subclass__"
    xarray/core/dataset.py:873: error: Incompatible default for argument "attrs" (default has type "object", argument has type "Optional[Dict[Hashable, Any]]")
    xarray/core/dataset.py:874: error: Incompatible default for argument "indexes" (default has type "object", argument has type "Optional[Dict[Any, Any]]")
    xarray/core/dataset.py:875: error: Incompatible default for argument "encoding" (default has type "object", argument has type "Optional[Dict[Any, Any]]")
    xarray/core/dataset.py:922: error: Incompatible default for argument "attrs" (default has type "object", argument has type "Optional[Dict[Hashable, Any]]")
    xarray/core/dataset.py:923: error: Incompatible default for argument "indexes" (default has type "object", argument has type "Dict[Hashable, Any]")
    xarray/core/dataset.py:937: error: Incompatible default for argument "attrs" (default has type "object", argument has type "Dict[Hashable, Any]")
    xarray/core/dataarray.py:213: error: Too few arguments for "__init_subclass__"
    Found 9 errors in 3 files (checked 122 source files)
    ```

    ```
    mypy --version
    mypy 0.740
    ```

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3467/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    508791645 MDU6SXNzdWU1MDg3OTE2NDU= 3414 Allow ellipsis in place of xr.ALL_DIMS? max-sixty 5635139 closed 0     2 2019-10-18T00:44:48Z 2019-10-28T21:14:42Z 2019-10-28T21:14:42Z MEMBER      

    @crusaderky had a good idea to allow ellipsis (...) as a placeholder for 'other dims' in transpose.

    What about using it as a placeholder for xr.ALL_DIMS in groupby etc. operations? I find it nicer than custom sentinel values, and I think it should cause fairly little confusion. Thoughts?
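
    Illustratively, the two spellings side by side (assuming the proposal were adopted):

    ```python
    ds.groupby("letters").sum(xr.ALL_DIMS)  # current sentinel
    ds.groupby("letters").sum(...)          # proposed equivalent
    ```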

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3414/reactions",
        "total_count": 2,
        "+1": 2,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    490752037 MDU6SXNzdWU0OTA3NTIwMzc= 3293 Drop 3.5 after 0.13? max-sixty 5635139 closed 0     8 2019-09-08T13:05:29Z 2019-10-08T21:23:47Z 2019-10-08T21:23:47Z MEMBER      

    I just saw https://numpy.org/neps/nep-0029-deprecation_policy.html, which suggests dropping 3.5 support earlier this year (though the NEP is dated this July)

    Of course, existing xarray versions would still work with 3.5; only upgrades would require the new version (which I know everyone knows, but it reinforces that we don't need to wait for everyone to move off 3.5)

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3293/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    484711431 MDU6SXNzdWU0ODQ3MTE0MzE= 3257 0.13.0 release max-sixty 5635139 closed 0     43 2019-08-23T21:04:21Z 2019-09-18T02:12:49Z 2019-09-18T01:33:50Z MEMBER      

    What do we think about a minor release soon?

    My colleague just hit https://github.com/pydata/xarray/pull/3145 which reminded me the current release can make filing an issue confusing for folks

    We've had a v good pipeline of changes in just a month: https://github.com/pydata/xarray/blob/master/doc/whats-new.rst#L18

    I'm happy to help with this; ref https://github.com/pydata/xarray/issues/2998#issuecomment-516218628

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3257/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    493108860 MDU6SXNzdWU0OTMxMDg4NjA= 3308 NetCDF tests failing max-sixty 5635139 closed 0     4 2019-09-13T02:29:39Z 2019-09-13T15:36:27Z 2019-09-13T15:32:46Z MEMBER      

    (edit: original failure was mistaken) Does anyone know offhand why this is failing?

    ```
    ResolvePackageNotFound:
      - pandas=0.19
      - python=3.5.0
    ```

    Worst case we could drop it... https://github.com/pydata/xarray/issues/3293

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3308/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    485437811 MDU6SXNzdWU0ODU0Mzc4MTE= 3265 Sparse tests failing on master max-sixty 5635139 closed 0     6 2019-08-26T20:34:21Z 2019-08-27T00:01:18Z 2019-08-27T00:01:07Z MEMBER      

    https://dev.azure.com/xarray/xarray/_build/results?buildId=695

    ```python

    =================================== FAILURES ===================================
    ______________________ TestSparseVariable.test_unary_op _______________________

    self = <xarray.tests.test_sparse.TestSparseVariable object at 0x7f24f0b21b70>

        def test_unary_op(self):
    >       sparse.utils.assert_eq(-self.var.data, -self.data)
    E       AttributeError: module 'sparse' has no attribute 'utils'

    xarray/tests/test_sparse.py:285: AttributeError
    ___________________ TestSparseVariable.test_univariate_ufunc __________________

    self = <xarray.tests.test_sparse.TestSparseVariable object at 0x7f24ebc2bb38>

        def test_univariate_ufunc(self):
    >       sparse.utils.assert_eq(np.sin(self.data), xu.sin(self.var).data)
    E       AttributeError: module 'sparse' has no attribute 'utils'

    xarray/tests/test_sparse.py:290: AttributeError
    ___________________ TestSparseVariable.test_bivariate_ufunc ___________________

    self = <xarray.tests.test_sparse.TestSparseVariable object at 0x7f24f02a7e10>

        def test_bivariate_ufunc(self):
    >       sparse.utils.assert_eq(np.maximum(self.data, 0), xu.maximum(self.var, 0).data)
    E       AttributeError: module 'sparse' has no attribute 'utils'

    xarray/tests/test_sparse.py:293: AttributeError
    _______________________ TestSparseVariable.test_pickle ________________________

    self = <xarray.tests.test_sparse.TestSparseVariable object at 0x7f24f04f2c50>

        def test_pickle(self):
            v1 = self.var
            v2 = pickle.loads(pickle.dumps(v1))
    >       sparse.utils.assert_eq(v1.data, v2.data)
    E       AttributeError: module 'sparse' has no attribute 'utils'

    xarray/tests/test_sparse.py:307: AttributeError
    ```

    Any ideas?

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3265/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    469983439 MDU6SXNzdWU0Njk5ODM0Mzk= 3144 h5py raising on xr.show_versions() max-sixty 5635139 closed 0     3 2019-07-18T20:51:26Z 2019-07-20T06:18:48Z 2019-07-20T06:18:48Z MEMBER      

    Any ideas why __hdf5libversion__ wouldn't be available? Shall I put a try / except around it?

    ```python
    In [4]: import xarray as xr

    In [5]: xr.show_versions()
    ---------------------------------------------------------------------------
    ModuleNotFoundError                       Traceback (most recent call last)
    /usr/local/lib/python3.7/site-packages/xarray/util/print_versions.py in netcdf_and_hdf5_versions()
         64     try:
    ---> 65         import netCDF4
         66         libhdf5_version = netCDF4.__hdf5libversion__

    ModuleNotFoundError: No module named 'netCDF4'

    During handling of the above exception, another exception occurred:

    AttributeError                            Traceback (most recent call last)
    <ipython-input-5-6f391305f2fe> in <module>
    ----> 1 xr.show_versions()

    /usr/local/lib/python3.7/site-packages/xarray/util/print_versions.py in show_versions(file)
         78     sys_info = get_sys_info()
         79
    ---> 80     sys_info.extend(netcdf_and_hdf5_versions())
         81
         82     deps = [

    /usr/local/lib/python3.7/site-packages/xarray/util/print_versions.py in netcdf_and_hdf5_versions()
         69     try:
         70         import h5py
    ---> 71         libhdf5_version = h5py.__hdf5libversion__
         72     except ImportError:
         73         pass

    AttributeError: module 'h5py' has no attribute '__hdf5libversion__'
    ```

    I checked I'm on the latest h5py:

    ```
    pip install h5py -U
    Thu Jul 18 16:47:29 2019
    Requirement already up-to-date: h5py in ./Library/Python/3.7/lib/python/site-packages (2.9.0)
    Requirement already satisfied, skipping upgrade: numpy>=1.7 in /usr/local/lib/python3.7/site-packages (from h5py) (1.16.4)
    Requirement already satisfied, skipping upgrade: six in ./Library/Python/3.7/lib/python/site-packages (from h5py) (1.12.0)
    ```
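
    A minimal sketch of the suggested guard (illustrative, not the actual patch; note that the attribute h5py actually exposes is h5py.version.hdf5_version):

    ```python
    # fall back gracefully instead of raising inside show_versions()
    try:
        import h5py
        libhdf5_version = h5py.version.hdf5_version  # the attribute that exists
    except (ImportError, AttributeError):
        libhdf5_version = None
    ```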

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3144/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    467015096 MDU6SXNzdWU0NjcwMTUwOTY= 3098 Codecov bot comments? max-sixty 5635139 closed 0     2 2019-07-11T17:21:46Z 2019-07-18T01:12:38Z 2019-07-18T01:12:38Z MEMBER      

    ref https://github.com/pydata/xarray/pull/3090#issuecomment-510323490

    Do we want the bot commenting on the PR, at least while the early checks are wrong? People can always click on Details in the Codecov check (e.g. https://codecov.io/gh/pydata/xarray/compare/8f0d9e5c9909c93a90306ed7cb5a80c1c2e1c97d...ab6960f623017afdc99c34bcbb69b402aea3f7d4/diff) to see a full report.

    Happy to PR to disable, lmk

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3098/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    461127850 MDU6SXNzdWU0NjExMjc4NTA= 3050 Pipe Operator? max-sixty 5635139 closed 0     1 2019-06-26T18:51:31Z 2019-07-05T07:12:07Z 2019-07-05T07:12:07Z MEMBER      

    I realize this is a topic primed for bike-shedding, so I've held off suggesting, but here goes...

    Is there any interest in a pipe operator to make xarray syntax easier? Our internal codebase has a lot of code like:

    ```python

    delinq_c = (
        base_ds['delinq_comp_array']
        # lots of pipe lines!
        .pipe(lambda x: x - base_ds['delinq_comp_array'].mean('fsym_id'))
        .pipe(lambda x: x / delinq_diff)
        .pipe(lambda x: x * 2)
    )

    ```

    ...with lots of new lines starting with .pipe. The fluent code is great for sequences of logic, but the requirement to have .pipe on each line adds verbosity.

    The addition of a pipe operator would allow for:

    ```python

    delinq_c = (
        base_ds['delinq_comp_array']
        >> lambda x: x - base_ds['delinq_comp_array'].mean('fsym_id')
        >> lambda x: x / delinq_diff
        >> lambda x: x * 2
    )

    ```

    This requires (ab)using an existing python operator, such as >> (the bitshift operator) or | (the or operator). Airflow uses >>, Beam (and bash) use |. xarray doesn't currently allow either to be used, but users coming from those libraries might incorrectly assume that, were one added, it would behave consistently with those definitions.
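
    A minimal sketch of how such an operator could be bolted on (illustrative only; note the lambdas in the example above would also need parentheses, since a bare lambda on the right-hand side of >> is a syntax error):

    ```python
    import numpy as np
    import xarray as xr

    class Pipeable:
        """Hypothetical wrapper adding a >> pipe operator to any object."""

        def __init__(self, obj):
            self.obj = obj

        def __rshift__(self, func):
            # apply func and stay wrapped, so >> can be chained
            return Pipeable(func(self.obj))

    da = xr.DataArray(np.random.rand(2, 3))
    result = (Pipeable(da) >> (lambda x: x - x.mean()) >> (lambda x: x * 2)).obj
    ```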

    Python has explicitly declined to add this, and hasn't reduced the character count of lambda x:, so this would be somewhat of a dissent from the language's standards, and could introduce confusion for completely new xarray users.

    I remember some discussions at pandas on similar topics. I can't find them all, but here's an issue re adding the X term https://github.com/pandas-dev/pandas/issues/13133

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/3050/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    423511704 MDU6SXNzdWU0MjM1MTE3MDQ= 2833 Integrate has undefined name 'dim' max-sixty 5635139 closed 0     2 2019-03-20T23:09:19Z 2019-07-05T07:10:37Z 2019-07-05T07:10:37Z MEMBER      

    https://github.com/pydata/xarray/blob/master/xarray/core/dataset.py#L4085

    Should that be called coord or dim? Currently there's a variable that's undefined:

    ```python
    raise ValueError('Coordinate {} does not exist.'.format(dim))
    ```

    I would have made a quick fix, but I'm not sure of the correct name.
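
    Presumably it should reference the coord argument that's actually in scope, i.e. (an assumption, not confirmed against the code):

    ```python
    raise ValueError('Coordinate {} does not exist.'.format(coord))
    ```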

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/2833/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    448340294 MDU6SXNzdWU0NDgzNDAyOTQ= 2990 Some minor errors in repo / flake8 max-sixty 5635139 closed 0     2 2019-05-24T20:24:04Z 2019-06-24T18:18:25Z 2019-06-24T18:18:24Z MEMBER      

    Currently we use pycodestyle: https://github.com/pydata/xarray/blob/ccd0b047ea8ca89c68ab6cfa942557e676e7d402/.travis.yml#L63

    I think we used to use flake8. I can't find / remember the reason we moved to pycodestyle.

    master has some non-trivial issues that flake8 would catch, including a test overwriting another and undefined variables:

    ```
    flake8 xarray --ignore=I,W503,W504,F401,E265,E402

    xarray/core/options.py:62:8: F632 use ==/!= to compare str, bytes, and int literals
    xarray/core/dataset.py:4148:69: F821 undefined name 'dim'
    xarray/backends/netCDF4_.py:177:12: F632 use ==/!= to compare str, bytes, and int literals
    xarray/tests/test_dataarray.py:1264:9: F841 local variable 'foo' is assigned to but never used
    xarray/tests/test_dataarray.py:1270:18: F821 undefined name 'x'
    xarray/tests/test_dataarray.py:1301:5: F811 redefinition of unused 'test_reindex_fill_value' from line 1262
    xarray/tests/test_dataarray.py:1647:16: F632 use ==/!= to compare str, bytes, and int literals
    xarray/tests/test_dataarray.py:1648:16: F632 use ==/!= to compare str, bytes, and int literals
    xarray/tests/test_dataset.py:4759:8: F632 use ==/!= to compare str, bytes, and int literals
    xarray/tests/test_dataset.py:4761:10: F632 use ==/!= to compare str, bytes, and int literals
    xarray/tests/test_distributed.py:62:9: F811 redefinition of unused 'loop' from line 12
    xarray/tests/test_distributed.py:92:9: F811 redefinition of unused 'loop' from line 12
    xarray/tests/test_distributed.py:117:49: F811 redefinition of unused 'loop' from line 12
    xarray/tests/test_distributed.py:141:53: F811 redefinition of unused 'loop' from line 12
    xarray/tests/test_distributed.py:152:51: F811 redefinition of unused 'loop' from line 12
    ```

    Happy to fix these in a PR. For ensuring these don't crop up again, any objection to flake8?

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/2990/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    365526259 MDU6SXNzdWUzNjU1MjYyNTk= 2451 Shift changes non-float arrays to object, even for shift=0 max-sixty 5635139 closed 0     2 2018-10-01T15:50:38Z 2019-03-04T16:31:57Z 2019-03-04T16:31:57Z MEMBER      

    ```python
    In [15]: xr.DataArray(np.random.randint(2, size=(100, 100)).astype(bool)).shift(dim_0=0)
    Out[15]:
    <xarray.DataArray (dim_0: 100, dim_1: 100)>
    array([[False, True, True, ..., True, True, False],
           [False, True, False, ..., False, True, True],
           [False, True, False, ..., False, True, False],
           ...,
           [False, True, False, ..., False, True, True],
           [True, False, True, ..., False, False, False],
           [False, True, True, ..., True, True, False]], dtype=object)  # <-- could be bool
    Dimensions without coordinates: dim_0, dim_1

    ```

    Problem description

    This causes memory bloat

    Expected Output

    As above with dtype=bool

    Output of xr.show_versions()

    In [16]: xr.show_versions() INSTALLED VERSIONS ------------------ commit: f9c4169150286fa1aac020ab965380ed21fe1148 python: 2.7.15.final.0 python-bits: 64 OS: Darwin OS-release: 18.0.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: None.None xarray: 0.10.9+12.gf9c41691 pandas: 0.22.0 numpy: 1.14.2 scipy: 1.0.0 netCDF4: None h5netcdf: None h5py: None Nio: None zarr: None cftime: None PseudonetCDF: None rasterio: None iris: None bottleneck: 1.2.1 cyordereddict: None dask: None distributed: None matplotlib: 2.1.2 cartopy: None seaborn: 0.8.1 setuptools: 39.2.0 pip: 18.0 conda: None pytest: 3.6.3 IPython: 5.8.0 sphinx: None

    The shift=0 case is mainly theoretical. To avoid casting to object in practical scenarios, we could add a fill_value argument (e.g. fill_value=False) and fill with that rather than NaN
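
    Illustrative usage, assuming the proposed fill_value argument existed:

    ```python
    import numpy as np
    import xarray as xr

    da = xr.DataArray(np.random.randint(2, size=(100, 100)).astype(bool))
    shifted = da.shift(dim_0=1, fill_value=False)  # would stay bool; no object upcast
    ```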

    CC @Ivocrnkovic

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/2451/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    170305429 MDU6SXNzdWUxNzAzMDU0Mjk= 957 BUG: Repr on inherited classes is incorrect max-sixty 5635139 closed 0     3 2016-08-10T00:58:46Z 2019-02-26T01:28:23Z 2019-02-26T01:28:23Z MEMBER      

    This is extremely minor; I generally wouldn't report it.

    We're using classes inherited from Dataset more & more - this works really well for classes with a lot of array-like properties that can be aligned, and it allows @property to lazily compute some calculations (any feedback on this approach is very welcome).

    The top of the repr is incorrect:

    ```python
    <xarray.SecurityMeasure>
    Dimensions:      (date: 6647, security: 285, sub_measure: 4)
    Coordinates:
        ...
    ```

    Could just be the qualified name.
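
    A minimal sketch of that suggestion (illustrative, not the actual repr code):

    ```python
    def __repr__(self):
        # report the subclass's real qualified name rather than assuming
        # everything lives in the top-level xarray namespace
        cls = type(self)
        return "<{}.{}>\n...".format(cls.__module__, cls.__qualname__)
    ```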

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/957/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    168901028 MDU6SXNzdWUxNjg5MDEwMjg= 934 Should indexing be possible on 1D coords, even if not dims? max-sixty 5635139 closed 0     6 2016-08-02T14:33:43Z 2019-01-27T06:49:52Z 2019-01-27T06:49:52Z MEMBER      

    ```python
    In [1]: arr = xr.DataArray(np.random.rand(4, 3),
       ...:                    [('time', pd.date_range('2000-01-01', periods=4)),
       ...:                     ('space', ['IA', 'IL', 'IN'])])

    In [17]: arr.coords['space2'] = ('space', ['A','B','C'])

    In [18]: arr
    Out[18]:
    <xarray.DataArray (time: 4, space: 3)>
    array([[ 0.05187049,  0.04743067,  0.90329666],
           [ 0.59482538,  0.71014366,  0.86588207],
           [ 0.51893157,  0.49442107,  0.10697737],
           [ 0.16068189,  0.60756757,  0.31935279]])
    Coordinates:
      * time     (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04
      * space    (space) |S2 'IA' 'IL' 'IN'
        space2   (space) |S1 'A' 'B' 'C'
    ```

    Now try to select on the space2 coord:

    ```python
    In [19]: arr.sel(space2='A')
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-19-eae5e4b64758> in <module>()
    ----> 1 arr.sel(space2='A')

    /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/xarray/core/dataarray.pyc in sel(self, method, tolerance, **indexers)
        601         """
        602         return self.isel(**indexing.remap_label_indexers(
    --> 603             self, indexers, method=method, tolerance=tolerance))
        604
        605     def isel_points(self, dim='points', **indexers):

    /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/xarray/core/dataarray.pyc in isel(self, **indexers)
        588         DataArray.sel
        589         """
    --> 590         ds = self._to_temp_dataset().isel(**indexers)
        591         return self._from_temp_dataset(ds)
        592

    /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/xarray/core/dataset.pyc in isel(self, **indexers)
        908         invalid = [k for k in indexers if k not in self.dims]
        909         if invalid:
    --> 910             raise ValueError("dimensions %r do not exist" % invalid)
        911
        912         # all indexers should be int, slice or np.ndarrays

    ValueError: dimensions ['space2'] do not exist
    ```

    Is there an easier way to do this? I couldn't think of anything...
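
    A couple of workarounds that do work with today's API (illustrative):

    ```python
    arr.where(arr.space2 == 'A', drop=True)             # mask along the dim, then drop
    arr.swap_dims({'space': 'space2'}).sel(space2='A')  # or promote space2 to the dim
    ```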

    CC @justinkuosixty

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/934/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue
    399538493 MDU6SXNzdWUzOTk1Mzg0OTM= 2681 Close stale issues? max-sixty 5635139 closed 0     1 2019-01-15T21:14:54Z 2019-01-23T02:51:25Z 2019-01-23T02:51:25Z MEMBER      

    As discussed with @jhamman , I've been attempting to manually close / follow up on stale issues & PRs, given we had 500+ issues

    Any thoughts on doing this automatically with https://github.com/probot/stale?

    This could ask for an affirmation after an issue has had no comments for a year, and then close it a month later if there's no response

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/2681/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }
      completed xarray 13221727 issue

    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [node_id] TEXT,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER REFERENCES [users]([id]),
       [state] TEXT,
       [locked] INTEGER,
       [assignee] INTEGER REFERENCES [users]([id]),
       [milestone] INTEGER REFERENCES [milestones]([id]),
       [comments] INTEGER,
       [created_at] TEXT,
       [updated_at] TEXT,
       [closed_at] TEXT,
       [author_association] TEXT,
       [active_lock_reason] TEXT,
       [draft] INTEGER,
       [pull_request] TEXT,
       [body] TEXT,
       [reactions] TEXT,
       [performed_via_github_app] TEXT,
       [state_reason] TEXT,
       [repo] INTEGER REFERENCES [repos]([id]),
       [type] TEXT
    );
    CREATE INDEX [idx_issues_repo]
        ON [issues] ([repo]);
    CREATE INDEX [idx_issues_milestone]
        ON [issues] ([milestone]);
    CREATE INDEX [idx_issues_assignee]
        ON [issues] ([assignee]);
    CREATE INDEX [idx_issues_user]
        ON [issues] ([user]);