
issues


506 rows where user = 5635139 sorted by updated_at descending




type 2

  • pull 372
  • issue 134

state 2

  • closed 482
  • open 24

repo 1

  • xarray 506
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
2272299822 PR_kwDOAMm_X85uL82a 8989 Skip flaky `test_open_mfdataset_manyfiles` test max-sixty 5635139 closed 0     0 2024-04-30T19:24:41Z 2024-04-30T20:27:04Z 2024-04-30T19:46:34Z MEMBER   0 pydata/xarray/pulls/8989

Don't just xfail, and not only on Windows, since it can crash the worker

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8989/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2271670475 PR_kwDOAMm_X85uJ5Er 8988 Remove `.drop` warning allow max-sixty 5635139 closed 0     0 2024-04-30T14:39:35Z 2024-04-30T19:26:17Z 2024-04-30T19:26:16Z MEMBER   0 pydata/xarray/pulls/8988  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8988/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2271652603 PR_kwDOAMm_X85uJ122 8987 Add notes on when to add ignores to warnings max-sixty 5635139 closed 0     0 2024-04-30T14:34:52Z 2024-04-30T14:56:47Z 2024-04-30T14:56:46Z MEMBER   0 pydata/xarray/pulls/8987  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8987/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1250939008 I_kwDOAMm_X85Kj9CA 6646 `dim` vs `dims` max-sixty 5635139 closed 0     4 2022-05-27T16:15:02Z 2024-04-29T18:24:56Z 2024-04-29T18:24:56Z MEMBER      

What is your issue?

I've recently been hit with this when experimenting with xr.dot and xr.corr — xr.dot takes dims, and xr.corr takes dim. Because they each take multiple arrays as positional args, kwargs are more conventional.

Should we standardize on one of these?
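One common route to standardizing is a deprecation shim that accepts the old kwarg with a warning; a minimal sketch, where the helper `_deprecate_dims` and the toy `dot` are hypothetical stand-ins rather than xarray's actual code:

```python
import warnings

def _deprecate_dims(dims, dim):
    # Hypothetical shim: accept the legacy `dims` kwarg, but warn and
    # funnel its value into `dim` so callers can migrate gradually.
    if dims is not None:
        warnings.warn(
            "`dims` is deprecated; use `dim` instead", FutureWarning, stacklevel=3
        )
        return dims
    return dim

def dot(*arrays, dim=None, dims=None):
    dim = _deprecate_dims(dims, dim)
    return dim  # stand-in for the real reduction over `dim`

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    assert dot(dim="a") == "a" and not caught      # new spelling: silent
    assert dot(dims="a") == "a"                    # old spelling: still works...
    assert issubclass(caught[-1].category, FutureWarning)  # ...but warns
```

The shim keeps both spellings working for a release or two, so downstream code breaks only after a visible warning period.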

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6646/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2268058661 PR_kwDOAMm_X85t9f5f 8982 Switch all methods to `dim` max-sixty 5635139 closed 0     0 2024-04-29T03:42:34Z 2024-04-29T18:24:56Z 2024-04-29T18:24:55Z MEMBER   0 pydata/xarray/pulls/8982

I think this is the final set of methods

  • [x] Closes #6646
  • [ ] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8982/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2267810980 PR_kwDOAMm_X85t8q4s 8981 Enable ffill for datetimes max-sixty 5635139 closed 0     5 2024-04-28T20:53:18Z 2024-04-29T18:09:48Z 2024-04-28T23:02:11Z MEMBER   0 pydata/xarray/pulls/8981

Notes inline. Would fix #4587

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8981/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2262478932 PR_kwDOAMm_X85tqpUi 8974 Raise errors on new warnings from within xarray max-sixty 5635139 closed 0     2 2024-04-25T01:50:48Z 2024-04-29T12:18:42Z 2024-04-29T02:50:21Z MEMBER   0 pydata/xarray/pulls/8974

Notes are inline.

  • [x] Closes https://github.com/pydata/xarray/issues/8494
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst

Done with some help from an LLM — quite good for doing tedious tasks that we otherwise wouldn't want to do — can paste in all the warnings output and get a decent start on rules for exclusions
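For reference, the usual mechanism for this kind of change is pytest's `filterwarnings` setting, which uses the stdlib `action:message:category:module` filter syntax; a hedged sketch of what such a configuration can look like (the specific patterns below are illustrative, not the ones this PR added):

```toml
[tool.pytest.ini_options]
filterwarnings = [
    # Escalate any warning emitted from within xarray's own modules to an
    # error, so new warnings fail CI:
    "error:::xarray.*",
    # ...while known-noisy warnings can still be ignored case by case
    # (illustrative pattern, not a real exclusion from this PR):
    "ignore:example noisy message:UserWarning",
]
```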

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8974/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1997537503 PR_kwDOAMm_X85fqp3A 8459 Check for aligned chunks when writing to existing variables max-sixty 5635139 closed 0     5 2023-11-16T18:56:06Z 2024-04-29T03:05:36Z 2024-03-29T14:35:50Z MEMBER   0 pydata/xarray/pulls/8459

While I don't feel super confident that this is designed to protect against any bugs, it does solve the immediate problem in #8371, by hoisting the encoding check above the code that runs only for new variables. The encoding check is somewhat implicit, so it was easy to miss previously.

  • [x] Closes #8371,
  • [x] Closes #8882
  • [x] Closes #8876
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8459/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2244681150 PR_kwDOAMm_X85suxIl 8947 Add mypy to dev dependencies max-sixty 5635139 closed 0     0 2024-04-15T21:39:19Z 2024-04-17T16:39:23Z 2024-04-17T16:39:22Z MEMBER   0 pydata/xarray/pulls/8947  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8947/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1960332384 I_kwDOAMm_X8502Exg 8371 Writing to regions with unaligned chunks can lose data max-sixty 5635139 closed 0     20 2023-10-25T01:17:59Z 2024-03-29T14:35:51Z 2024-03-29T14:35:51Z MEMBER      

What happened?

Writing with region with chunks that aren't aligned can lose data.

I've recreated an example below. While it's unlikely that folks are passing different values to .chunk for the template vs. the regions, I had an "auto" chunk, which can then set different chunk values.

(FWIW, this was fairly painful, and I managed to lose a lot of time by not noticing this, and then not really considering this could happen as I was trying to debug. I think we should really strive to ensure that we don't lose data / incorrectly report that we've successfully written data...)

What did you expect to happen?

If there's a risk of data loss, raise an error...

Minimal Complete Verifiable Example

```python
ds = xr.DataArray(
    np.arange(120).reshape(4, 3, -1), dims=list("abc")
).rename("var1").to_dataset().chunk(2)

ds
# <xarray.Dataset>
# Dimensions:  (a: 4, b: 3, c: 10)
# Dimensions without coordinates: a, b, c
# Data variables:
#     var1     (a, b, c) int64 dask.array<chunksize=(2, 2, 2), meta=np.ndarray>

def write(ds):
    ds.chunk(5).to_zarr('foo.zarr', compute=False, mode='w')
    for r in range(ds.sizes['a']):
        ds.chunk(3).isel(a=[r]).to_zarr('foo.zarr', region=dict(a=slice(r, r + 1)))

def read(ds):
    result = xr.open_zarr('foo.zarr')
    assert result.compute().identical(ds)
    print(result.chunksizes, ds.chunksizes)

write(ds); read(ds)
# AssertionError

xr.open_zarr('foo.zarr').compute()['var1']
# <xarray.DataArray 'var1' (a: 4, b: 3, c: 10)>
# array([[[  0,   0,   0,   3,   4,   5,   0,   0,   0,   9],
#         [  0,   0,   0,  13,  14,  15,   0,   0,   0,  19],
#         [  0,   0,   0,  23,  24,  25,   0,   0,   0,  29]],
#
#        [[ 30,  31,  32,   0,   0,  35,  36,  37,  38,   0],
#         [ 40,  41,  42,   0,   0,  45,  46,  47,  48,   0],
#         [ 50,  51,  52,   0,   0,  55,  56,  57,  58,   0]],
#
#        [[ 60,  61,  62,   0,   0,  65,   0,   0,   0,  69],
#         [ 70,  71,  72,   0,   0,  75,   0,   0,   0,  79],
#         [ 80,  81,  82,   0,   0,  85,   0,   0,   0,  89]],
#
#        [[  0,   0,   0,  93,  94,  95,  96,  97,  98,   0],
#         [  0,   0,   0, 103, 104, 105, 106, 107, 108,   0],
#         [  0,   0,   0, 113, 114, 115, 116, 117, 118,   0]]])
# Dimensions without coordinates: a, b, c
```
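The failure mode in the example can be reproduced with nothing but plain lists, assuming a store that is only writable in whole chunks and two uncoordinated workers that each read-modify-write from a stale snapshot (a stand-in for the zarr write path, not xarray's actual code):

```python
CHUNK = 4
store = [0] * 8            # the "on-disk" array, written in chunks of 4
data = list(range(1, 9))   # the values the two workers intend to write

def write_region(start, stop, snapshot):
    """Write data[start:stop] at whole-chunk granularity, reading the
    untouched parts of each chunk from a possibly stale snapshot, as a
    parallel worker with no coordination would."""
    written = {}
    for c in range(start // CHUNK, (stop - 1) // CHUNK + 1):
        lo = c * CHUNK
        chunk = snapshot[lo:lo + CHUNK]     # stale read of the whole chunk
        for i in range(max(lo, start), min(lo + CHUNK, stop)):
            chunk[i - lo] = data[i]
        written[c] = chunk
    return written

# Both workers snapshot the empty store; the region boundary at index 6
# falls inside chunk 1 (indices 4..7), so regions and chunks are unaligned.
snap = store[:]
w1 = write_region(0, 6, snap)   # touches chunks 0 and 1
w2 = write_region(6, 8, snap)   # touches chunk 1 too

# Whichever write of chunk 1 lands last clobbers the other's values:
for c, chunk in {**w1, **w2}.items():
    store[c * CHUNK:(c + 1) * CHUNK] = chunk

print(store)   # [1, 2, 3, 4, 0, 0, 7, 8]: values at indices 4 and 5 are lost
```

With regions aligned to chunk boundaries, no two workers ever rewrite the same chunk, and the race disappears.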

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS ------------------ commit: ccc8f9987b553809fb6a40c52fa1a8a8095c8c5f python: 3.9.18 (main, Aug 24 2023, 21:19:58) [Clang 14.0.3 (clang-1403.0.22.14.1)] python-bits: 64 OS: Darwin OS-release: 22.6.0 machine: arm64 processor: arm byteorder: little LC_ALL: en_US.UTF-8 LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 2023.10.2.dev10+gccc8f998 pandas: 2.1.1 numpy: 1.25.2 scipy: 1.11.1 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.16.0 cftime: None nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: None dask: 2023.4.0 distributed: 2023.7.1 matplotlib: 3.5.1 cartopy: None seaborn: None numbagg: 0.2.3.dev30+gd26e29e fsspec: 2021.11.1 cupy: None pint: None sparse: None flox: None numpy_groupies: 0.9.19 setuptools: 68.1.2 pip: 23.2.1 conda: None pytest: 7.4.0 mypy: 1.6.0 IPython: 8.15.0 sphinx: 4.3.2
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8371/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2110888925 I_kwDOAMm_X8590Zvd 8690 Add `nbytes` to repr? max-sixty 5635139 closed 0     9 2024-01-31T20:13:59Z 2024-02-19T22:18:47Z 2024-02-07T20:47:38Z MEMBER      

Is your feature request related to a problem?

Would having the nbytes value in the Dataset repr be reasonable?

I frequently find myself logging this separately. For example:

```diff
 <xarray.Dataset>
 Dimensions:  (lat: 25, time: 2920, lon: 53)
 Coordinates:
   * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
   * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
   * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
 Data variables:
-    air      (time, lat, lon) float32 dask.array<chunksize=(2920, 25, 53), meta=np.ndarray>
+    air      (time, lat, lon) float32 15MB dask.array<chunksize=(2920, 25, 53), meta=np.ndarray>
 Attributes:
     Conventions:  COARDS
     title:        4x daily NMC reanalysis (1948)
     description:  Data is from NMC initialized reanalysis\n(4x/day). These a...
     platform:     Model
     references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
```
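The `15MB` annotation above is a human-readable byte count; a small formatter along these lines is enough to produce it (a hypothetical sketch, not the formatting xarray actually adopted):

```python
def format_bytes(nbytes: int) -> str:
    # Hypothetical sketch: decimal units, short suffixes, one decimal
    # place for small magnitudes. Not xarray's actual implementation.
    size = float(nbytes)
    for unit in ("B", "kB", "MB", "GB", "TB"):
        if size < 1000 or unit == "TB":
            break
        size /= 1000
    return f"{size:.0f}{unit}" if size >= 10 or unit == "B" else f"{size:.1f}{unit}"

# The `air` variable above: 2920 * 25 * 53 float32 values, 4 bytes each
print(format_bytes(2920 * 25 * 53 * 4))  # 15MB
```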

Describe the solution you'd like

No response

Describe alternatives you've considered

Status quo :)

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8690/reactions",
    "total_count": 6,
    "+1": 6,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2128692061 PR_kwDOAMm_X85mkDqu 8735 Remove fsspec exclusion from 2021 max-sixty 5635139 closed 0     1 2024-02-10T19:43:14Z 2024-02-11T00:19:30Z 2024-02-11T00:19:29Z MEMBER   0 pydata/xarray/pulls/8735

Presumably no longer needed

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8735/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2128687154 PR_kwDOAMm_X85mkCum 8734 Silence dask doctest warning max-sixty 5635139 closed 0     0 2024-02-10T19:25:47Z 2024-02-10T23:44:24Z 2024-02-10T23:44:24Z MEMBER   0 pydata/xarray/pulls/8734

Closes #8732. Not the most elegant implementation but it's only temporary

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8734/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1920361792 PR_kwDOAMm_X85bl988 8258 Add a `.drop_attrs` method max-sixty 5635139 open 0     9 2023-09-30T18:42:12Z 2024-02-09T18:49:22Z   MEMBER   0 pydata/xarray/pulls/8258

Part of #3891

~Do we think this is a good idea? I'll add docs & tests if so...~

Ready to go, just needs agreement on whether it's good

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8258/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2126375172 I_kwDOAMm_X85-vekE 8726 PRs requiring approval & merging main? max-sixty 5635139 closed 0     4 2024-02-09T02:35:58Z 2024-02-09T18:23:52Z 2024-02-09T18:21:59Z MEMBER      

What is your issue?

Sorry I haven't been on the calls at all recently (unfortunately the schedule is difficult for me). Maybe this was discussed there? 

PRs now seem to require a separate approval prior to merging. Is there an upside to this? Is there any difference between those who can approve and those who can merge? Otherwise it just seems like more clicking.

PRs also now seem to require merging the latest main prior to merging? I get there's some theoretical value to this, because changes can semantically conflict with each other. But it's extremely rare that this actually happens (can we point to cases?), and it limits the immediacy & throughput of PRs. If the bad outcome does ever happen, we find out quickly when main tests fail and can revert.

(fwiw I wrote a few principles around this down a while ago here; those are much stronger than what I'm suggesting in this issue though)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8726/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2126095122 PR_kwDOAMm_X85mbRG7 8724 Switch `.dt` to raise an `AttributeError` max-sixty 5635139 closed 0     0 2024-02-08T21:26:06Z 2024-02-09T02:21:47Z 2024-02-09T02:21:46Z MEMBER   0 pydata/xarray/pulls/8724

Discussion at #8718

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8724/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1984961987 I_kwDOAMm_X852UB3D 8432 Writing a datetime coord ignores chunks max-sixty 5635139 closed 0     5 2023-11-09T07:00:39Z 2024-01-29T19:12:33Z 2024-01-29T19:12:33Z MEMBER      

What happened?

When writing a coord with a datetime type, the chunking on the coord is ignored, and the whole coord is written as a single chunk. (or at least it can be, I haven't done enough to confirm whether it'll always be...)

This can be quite inconvenient. Any attempt to write to that dataset from a distributed process will have errors, since each process will be attempting to write another process's data, rather than only its region. And less severely, the chunks won't be unified.

Minimal Complete Verifiable Example

```python
ds = xr.tutorial.load_dataset('air_temperature')

(
    ds.chunk()
    .expand_dims(a=1000)
    .assign_coords(
        time2=lambda x: x.time,
        time_int=lambda x: (("time"), np.full(ds.sizes["time"], 1)),
    )
    .chunk(time=10)
    .to_zarr("foo.zarr", mode="w")
)

xr.open_zarr('foo.zarr')
# Note the chunksize=(2920,) vs chunksize=(10,)!
# <xarray.Dataset>
# Dimensions:   (a: 1000, time: 2920, lat: 25, lon: 53)
# Coordinates:
#   * lat       (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
#   * lon       (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
#   * time      (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
#     time2     (time) datetime64[ns] dask.array<chunksize=(2920,), meta=np.ndarray>  # here
#     time_int  (time) int64 dask.array<chunksize=(10,), meta=np.ndarray>  # here
# Dimensions without coordinates: a
# Data variables:
#     air       (a, time, lat, lon) float32 dask.array<chunksize=(1000, 10, 25, 53), meta=np.ndarray>
# Attributes:
#     Conventions:  COARDS
#     description:  Data is from NMC initialized reanalysis\n(4x/day). These a...
#     platform:     Model
#     references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
#     title:        4x daily NMC reanalysis (1948)

xr.open_zarr('foo.zarr').chunks
# ValueError                                Traceback (most recent call last)
# Cell In[13], line 1
# ----> 1 xr.open_zarr('foo.zarr').chunks
#
# File /opt/homebrew/lib/python3.9/site-packages/xarray/core/dataset.py:2567, in Dataset.chunks(self)
#    2552 @property
#    2553 def chunks(self) -> Mapping[Hashable, tuple[int, ...]]:
#    2554     """
#    2555     Mapping from dimension names to block lengths for this dataset's data, or None if
#    2556     the underlying data is not a dask array.
#    (...)
#    2565     xarray.unify_chunks
#    2566     """
# -> 2567     return get_chunksizes(self.variables.values())
#
# File /opt/homebrew/lib/python3.9/site-packages/xarray/core/common.py:2013, in get_chunksizes(variables)
#    2011 for dim, c in v.chunksizes.items():
#    2012     if dim in chunks and c != chunks[dim]:
# -> 2013         raise ValueError(
#    2014             f"Object has inconsistent chunks along dimension {dim}. "
#    2015             "This can be fixed by calling unify_chunks()."
#    2016         )
#    2017     chunks[dim] = c
#    2018 return Frozen(chunks)
#
# ValueError: Object has inconsistent chunks along dimension time. This can be fixed by calling unify_chunks().
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS ------------------ commit: None python: 3.9.18 (main, Nov 2 2023, 16:51:22) [Clang 14.0.3 (clang-1403.0.22.14.1)] python-bits: 64 OS: Darwin OS-release: 22.6.0 machine: arm64 processor: arm byteorder: little LC_ALL: en_US.UTF-8 LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.2 libnetcdf: None xarray: 2023.10.1 pandas: 2.1.1 numpy: 1.26.1 scipy: 1.11.1 netCDF4: None pydap: None h5netcdf: 1.1.0 h5py: 3.8.0 Nio: None zarr: 2.16.0 cftime: 1.6.2 nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: 1.3.7 dask: 2023.5.0 distributed: 2023.5.0 matplotlib: 3.6.0 cartopy: None seaborn: 0.12.2 numbagg: 0.6.0 fsspec: 2022.8.2 cupy: None pint: 0.22 sparse: 0.14.0 flox: 0.8.1 numpy_groupies: 0.9.22 setuptools: 68.2.2 pip: 23.3.1 conda: None pytest: 7.4.0 mypy: 1.6.1 IPython: 8.14.0 sphinx: 5.2.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8432/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2099077744 PR_kwDOAMm_X85k_vqU 8661 Add `dev` dependencies to `pyproject.toml` max-sixty 5635139 closed 0     1 2024-01-24T20:48:55Z 2024-01-25T06:24:37Z 2024-01-25T06:24:36Z MEMBER   0 pydata/xarray/pulls/8661  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8661/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2097231358 PR_kwDOAMm_X85k5dSd 8648 xfail another test on windows max-sixty 5635139 closed 0     0 2024-01-24T01:04:01Z 2024-01-24T01:23:26Z 2024-01-24T01:23:26Z MEMBER   0 pydata/xarray/pulls/8648

As ever, very open to approaches to fix these. But unless we can fix them, xfailing them seems like the most reasonable solution

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8648/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2089331658 PR_kwDOAMm_X85keyUs 8624 Use ddof in `numbagg>=0.7.0` for aggregations max-sixty 5635139 closed 0     0 2024-01-19T00:23:15Z 2024-01-23T02:25:39Z 2024-01-23T02:25:38Z MEMBER   0 pydata/xarray/pulls/8624  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8624/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2094956413 PR_kwDOAMm_X85kxwAk 8643 xfail zarr test on Windows max-sixty 5635139 closed 0     0 2024-01-22T23:24:12Z 2024-01-23T00:40:29Z 2024-01-23T00:40:28Z MEMBER   0 pydata/xarray/pulls/8643

I see this failing quite a lot of the time...

Ofc open to a proper solution but in the meantime setting this to xfail

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8643/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092299525 PR_kwDOAMm_X85kozmg 8630 Use `T_DataArray` in `Weighted` max-sixty 5635139 closed 0     0 2024-01-21T01:18:14Z 2024-01-22T04:28:07Z 2024-01-22T04:28:07Z MEMBER   0 pydata/xarray/pulls/8630

Allows subtypes.

(I had this in my git stash, so committing it...)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8630/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092855603 PR_kwDOAMm_X85kqlH4 8639 Silence deprecation warning from `.dims` in tests max-sixty 5635139 closed 0     1 2024-01-22T00:25:07Z 2024-01-22T02:04:54Z 2024-01-22T02:04:53Z MEMBER   0 pydata/xarray/pulls/8639  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8639/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092790802 PR_kwDOAMm_X85kqX8y 8637 xfail a cftime test max-sixty 5635139 closed 0     0 2024-01-21T21:43:59Z 2024-01-21T22:00:59Z 2024-01-21T22:00:58Z MEMBER   0 pydata/xarray/pulls/8637

https://github.com/pydata/xarray/pull/8636#issuecomment-1902775153

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8637/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092777417 PR_kwDOAMm_X85kqVIH 8636 xfail another dask/pyarrow test max-sixty 5635139 closed 0     1 2024-01-21T21:26:19Z 2024-01-21T21:42:22Z 2024-01-21T21:42:21Z MEMBER   0 pydata/xarray/pulls/8636

Unsure why this wasn't showing prior -- having tests fail in the good state does make it much more difficult to ensure everything is fixed before merging.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8636/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2089351473 PR_kwDOAMm_X85ke2qd 8625 Don't show stdlib paths for `user_level_warnings` max-sixty 5635139 closed 0     0 2024-01-19T00:45:14Z 2024-01-21T21:08:40Z 2024-01-21T21:08:39Z MEMBER   0 pydata/xarray/pulls/8625

Was previously seeing:

<frozen _collections_abc>:801: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.

Now:

/Users/maximilian/workspace/xarray/xarray/tests/test_dataset.py:701: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
    assert ds.dims == ds.sizes

It's a heuristic, so not perfect, but I think very likely to be accurate. Any contrary cases very welcome...
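The heuristic can be sketched in isolation: given the chain of frame filenames, innermost first, pick the first one that is not stdlib or frozen code (a simplified stand-in for the actual `user_level_warnings` logic, not a copy of it):

```python
import sysconfig

def first_user_frame(filenames):
    """Return the 1-based position (i.e. a candidate `stacklevel`) of the
    first frame, innermost first, that is not stdlib/frozen code.
    Simplified stand-in for the heuristic described above."""
    stdlib = sysconfig.get_paths()["stdlib"]
    for level, fname in enumerate(filenames, start=1):
        internal = fname.startswith(stdlib) or fname.startswith("<frozen")
        if not internal:
            return level
    return 1  # everything looked internal; fall back to the caller

frames = [
    "<frozen _collections_abc>",                    # stdlib frame, skipped
    sysconfig.get_paths()["stdlib"] + "/abc.py",    # stdlib frame, skipped
    "/Users/maximilian/workspace/xarray/xarray/tests/test_dataset.py",  # user code
]
print(first_user_frame(frames))  # 3
```

This is exactly the kind of heuristic that can misfire (e.g. user code living under the stdlib prefix), which is why the PR invites contrary cases.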

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8625/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092762468 PR_kwDOAMm_X85kqSLW 8635 xfail pyarrow test max-sixty 5635139 closed 0     0 2024-01-21T20:42:50Z 2024-01-21T21:03:35Z 2024-01-21T21:03:34Z MEMBER   0 pydata/xarray/pulls/8635

Sorry for the repeated PR -- some tests passed but some failed without pyarrow installed. So this xfails the test for the moment

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8635/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092747686 PR_kwDOAMm_X85kqPTB 8634 Workaround broken test from pyarrow max-sixty 5635139 closed 0     0 2024-01-21T20:01:51Z 2024-01-21T20:18:23Z 2024-01-21T20:18:22Z MEMBER   0 pydata/xarray/pulls/8634

While fixing the previous issue, I introduced another (but didn't see it because of the errors from the test suite, probably should have looked closer...)

This doesn't fix the behavior, but I think it's minor so fine to push off. I do prioritize getting the tests back to a state where pass vs. failure is meaningful.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8634/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092300888 PR_kwDOAMm_X85koz3r 8631 Partially fix doctests max-sixty 5635139 closed 0     1 2024-01-21T01:25:02Z 2024-01-21T01:33:43Z 2024-01-21T01:31:46Z MEMBER   0 pydata/xarray/pulls/8631

Currently getting a error without pyarrow in CI: https://github.com/pydata/xarray/actions/runs/7577666145/job/20693665924

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8631/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1923361961 I_kwDOAMm_X85ypCyp 8263 Surprising `.groupby` behavior with float index max-sixty 5635139 closed 0     0 2023-10-03T05:50:49Z 2024-01-08T01:05:25Z 2024-01-08T01:05:25Z MEMBER      

What is your issue?

We raise an error on grouping without supplying dims, but not for float indexes — is this intentional or an oversight?

This is without flox installed

```python
da = xr.tutorial.open_dataset("air_temperature")['air']

da.drop_vars('lat').groupby('lat').sum()
```

```
ValueError                                Traceback (most recent call last)
Cell In[8], line 1
----> 1 da.drop_vars('lat').groupby('lat').sum()
...
ValueError: cannot reduce over dimensions ['lat']. expected either '...' to reduce over all dimensions or one or more of ('time', 'lon').
```

But with a float index, we don't raise:

```python
da.groupby('lat').sum()
```

...returns the original array:

```
Out[15]:
<xarray.DataArray 'air' (time: 2920, lat: 25, lon: 53)>
array([[[296.29   , 296.79   , 297.1    , ..., 296.9    , 296.79   , 296.6    ],
        [295.9    , 296.19998, 296.79   , ..., 295.9    , 295.9    , 295.19998],
        [296.6    , 296.19998, 296.4    , ..., 295.4    , 295.1    , 294.69998],
        ...
```

And if we try this with a non-float index, we get the error again:

```python
da.groupby('time').sum()
```

```
ValueError: cannot reduce over dimensions ['time']. expected either '...' to reduce over all dimensions or one or more of ('lat', 'lon').
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8263/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1916677049 I_kwDOAMm_X85yPiu5 8245 Tools for writing distributed zarrs max-sixty 5635139 open 0     0 2023-09-28T04:25:45Z 2024-01-04T00:15:09Z   MEMBER      

What is your issue?

There seems to be a common pattern for writing zarrs from a distributed set of machines, in parallel. It's somewhat described in the prose of the io docs. Quoting:

  • Creating the template — "the first step is creating an initial Zarr store without writing all of its array data. This can be done by first creating a Dataset with dummy values stored in dask, and then calling to_zarr with compute=False to write only metadata to Zarr"
  • Writing out each region from workers — "a Zarr store with the correct variable shapes and attributes exists that can be filled out by subsequent calls to to_zarr. The region provides a mapping from dimension names to Python slice objects indicating where the data should be written (in index space, not coordinate space)"

I've been using this fairly successfully recently. It's much better than writing hundreds or thousands of data variables, since many small data variables create a huge number of files.

Are there some tools we can provide to make this easier? Some ideas:

  • [ ] compute=False is arguably a less-than-obvious kwarg meaning "write metadata". Maybe this should be a method, maybe it's a candidate for renaming? Or maybe make_template can be an abstraction over it. Something like xarray_beam.make_template to make the template from a Dataset?
      • Or from an array of indexes?
      • https://github.com/pydata/xarray/issues/8343
      • https://github.com/pydata/xarray/pull/8460
  • [ ] What happens if one worker's data isn't aligned on some dimensions? Will that write to the wrong location? Could we offer an option, similar to the above, to reindex on the template dimensions?
  • [ ] When writing a region, we need to drop other vars. Can we offer this as a kwarg? Occasionally I'll add a dimension with an index to a dataset, run the function to write it — and it'll fail, because I forgot to add that index to the .drop_vars call that precedes the write. When we're writing a template, all the indexes are written up front anyway. (edit: #6260)
      • https://github.com/pydata/xarray/pull/8460

More minor papercuts:

  • [ ] I've hit an issue where writing a region seemed to cause the worker to attempt to load the whole array into memory — can we offer guarantees for when (non-metadata) data will be loaded during to_zarr?
  • [ ] How about adding raise_if_dask_computes to our public API? The alternative I've been doing is watching htop and exiting if I see memory ballooning, which is less cerebral...
  • [ ] It doesn't seem easy to write coords on a DataArray. For example, writing xr.tutorial.load_dataset('air_temperature').assign_coords(lat2=da.lat + 2, a=(('lon',), ['a'] * len(da.lon))).chunk().to_zarr('foo.zarr', compute=False) will cause the non-index coords to be written as empty. But writing them separately conflicts with having a single variable. Currently I manually load each coord before writing, which is not super-friendly.

Some things that were in the list here, as they've been completed!!

  • [x] Requiring region to be specified as an int range can be inconvenient — would it be feasible to have a function that grabs the template metadata, calculates the region ints, and then calculates the implied indexes?
      • Edit: suggested at https://github.com/pydata/xarray/issues/7702
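The completed item, computing the integer region from coordinate values, reduces to an index lookup; a toy sketch over plain lists (the helper name `region_slice` is hypothetical, and xarray ultimately grew `region="auto"` for this):

```python
def region_slice(template_coord, part_coord):
    """Toy version of the idea above: map a worker's coordinate values to
    the integer `region` slice that to_zarr expects. Assumes part_coord
    is a contiguous run within template_coord; raises otherwise rather
    than silently writing to the wrong location."""
    start = template_coord.index(part_coord[0])
    stop = start + len(part_coord)
    if template_coord[start:stop] != part_coord:
        raise ValueError("part is not a contiguous run of the template coordinate")
    return slice(start, stop)

time = list(range(100))                          # the template's coordinate
print(region_slice(time, list(range(40, 50))))   # slice(40, 50, None)
```

The contiguity check matters: it is the guard against the "one worker's data isn't aligned" hazard raised in the list above.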

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8245/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
    xarray 13221727 issue
1975574237 I_kwDOAMm_X851wN7d 8409 Task graphs on `.map_blocks` with many chunks can be huge max-sixty 5635139 closed 0     6 2023-11-03T07:14:45Z 2024-01-03T04:10:16Z 2024-01-03T04:10:16Z MEMBER      

What happened?

I'm getting task graphs > 1GB, I think possibly because the full indexes are being included in every task?

What did you expect to happen?

Only the relevant sections of the index would be included

Minimal Complete Verifiable Example

```Python
da = xr.tutorial.load_dataset('air_temperature')

# Dropping the index doesn't generally matter that much...
len(cloudpickle.dumps(da.chunk(lat=1, lon=1)))
# 15569320

len(cloudpickle.dumps(da.chunk().drop_vars(da.indexes)))
# 15477313

# But with .map_blocks, it really matters — it's really big with the
# indexes, and about the same size without:
len(cloudpickle.dumps(da.chunk(lat=1, lon=1).map_blocks(lambda x: x)))
# 79307120

len(cloudpickle.dumps(da.chunk(lat=1, lon=1).drop_vars(da.indexes).map_blocks(lambda x: x)))
# 16016173
```
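The scaling can be illustrated with plain pickle: serializing each task separately (as a scheduler would) duplicates the index once per task. A toy illustration, not xarray code:

```python
import pickle

# toy illustration: if every task's payload contains the full index,
# serialized graph size scales with n_tasks * index_size
index = list(range(100_000))
n_tasks = 10

# each task pickled separately, closing over the full index
with_full_index = sum(
    len(pickle.dumps((i, index))) for i in range(n_tasks)
)
# each task only carries its own slice of the index
with_sliced_index = sum(
    len(pickle.dumps((i, index[i : i + 1]))) for i in range(n_tasks)
)
```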

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS ------------------ commit: None python: 3.9.18 (main, Aug 24 2023, 21:19:58) [Clang 14.0.3 (clang-1403.0.22.14.1)] python-bits: 64 OS: Darwin OS-release: 22.6.0 machine: arm64 processor: arm byteorder: little LC_ALL: en_US.UTF-8 LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.2 libnetcdf: None xarray: 2023.10.1 pandas: 2.1.1 numpy: 1.26.1 scipy: 1.11.1 netCDF4: None pydap: None h5netcdf: 1.1.0 h5py: 3.8.0 Nio: None zarr: 2.16.0 cftime: 1.6.2 nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: 1.3.7 dask: 2023.5.0 distributed: 2023.5.0 matplotlib: 3.6.0 cartopy: None seaborn: 0.12.2 numbagg: 0.6.0 fsspec: 2022.8.2 cupy: None pint: 0.22 sparse: 0.14.0 flox: 0.7.2 numpy_groupies: 0.9.22 setuptools: 68.1.2 pip: 23.2.1 conda: None pytest: 7.4.0 mypy: 1.6.1 IPython: 8.14.0 sphinx: 5.2.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8409/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2052840951 I_kwDOAMm_X856W933 8566 Use `ddof=1` for `std` & `var` max-sixty 5635139 open 0     2 2023-12-21T17:47:21Z 2023-12-27T16:58:46Z   MEMBER      

What is your issue?

I've discussed this a bunch with @dcherian (though I'm not sure he necessarily agrees, I'll let him comment)

Currently xarray uses ddof=0 for std & var. This is:
  • Rarely what someone actually wants — xarray data is almost always a sample of some underlying distribution, for which ddof=1 is correct
  • Inconsistent with pandas

OTOH:
  • It is consistent with numpy
  • It wouldn't be a painless change — folks who don't read deprecation messages would see values change very slightly

Any thoughts?
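The difference is easy to see with the two libraries' defaults (plain numpy and pandas, no xarray involved):

```python
import numpy as np
import pandas as pd

data = [1.0, 2.0, 3.0]

# numpy defaults to ddof=0 (population std), as xarray does today
np_std = np.std(data)           # sqrt(2/3)
# pandas defaults to ddof=1 (sample std)
pd_std = pd.Series(data).std()  # 1.0
```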

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8566/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
988158051 MDU6SXNzdWU5ODgxNTgwNTE= 5764 Implement __sizeof__ on objects? max-sixty 5635139 open 0     6 2021-09-03T23:36:53Z 2023-12-19T18:23:08Z   MEMBER      

Is your feature request related to a problem? Please describe. Currently ds.nbytes returns the size of the data.

But sys.getsizeof(ds) returns a very small number.

Describe the solution you'd like If we implement __sizeof__ on DataArrays & Datasets, this would work.

I think that would be something like ds.nbytes, plus the size of the ds container, plus maybe attrs if those aren't handled by .nbytes?
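A minimal sketch of the idea, using a hypothetical stand-in class rather than xarray's actual Dataset:

```python
import sys

class FakeDataset:
    """Stand-in for a Dataset holding ``nbytes`` of array data."""

    def __init__(self, nbytes: int):
        self.nbytes = nbytes

    def __sizeof__(self) -> int:
        # container overhead plus the underlying data;
        # sys.getsizeof then adds the GC header on top of this
        return object.__sizeof__(self) + self.nbytes

ds = FakeDataset(10_000)
```

With `__sizeof__` defined, `sys.getsizeof(ds)` reflects the data rather than just the tiny Python object header.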

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5764/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  reopened xarray 13221727 issue
2033367994 PR_kwDOAMm_X85hj9np 8533 Offer a fixture for unifying DataArray & Dataset tests max-sixty 5635139 closed 0     2 2023-12-08T22:06:28Z 2023-12-18T21:30:41Z 2023-12-18T21:30:40Z MEMBER   0 pydata/xarray/pulls/8533

Some tests are literally copy & pasted between DataArray & Dataset tests. This change allows them to use a single test. Not everything will be able to use this — sometimes we want to check specifics — but some will; I've changed the .cumulative tests to use this fixture.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8533/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1977661256 I_kwDOAMm_X8514LdI 8414 Is there any way of having `.map_blocks` be even more opaque to dask? max-sixty 5635139 closed 0     23 2023-11-05T06:56:43Z 2023-12-12T18:14:57Z 2023-12-12T18:14:57Z MEMBER      

Is your feature request related to a problem?

Currently I have a workload which does something a bit like:

```python
ds = open_zarr(source)
(
    ds.assign(
        x=ds.foo * ds.bar,
        y=ds.foo + ds.bar,
    ).to_zarr(dest)
)
```

(the actual calc is a bit more complicated! And while I don't have a MVCE of the full calc, I pasted a task graph below)

Dask — while very impressive in many ways — handles this extremely badly, because it attempts to load the whole of ds into memory before writing out any chunks. There are lots of issues on this in the dask repo; it seems like an intractable problem for dask.

Describe the solution you'd like

I was hoping to make the internals of this task opaque to dask, so it became a much dumber task runner — just map over the blocks, running the function and writing the result, block by block. I thought I had some success with .map_blocks last week — the internals of the calc are now opaque at least. But the dask cluster is falling over again, I think because the write is seen as a separate task.

Is there any way to make the write more opaque too?

Describe alternatives you've considered

I've built a homegrown thing which is really hacky which does this on a custom scheduler — just runs the functions and writes with region. I'd much prefer to use & contribute to the broader ecosystem...
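That homegrown "dumb task runner" idea can be sketched without dask at all. Here is a toy stand-in where a numpy array plays the role of the zarr store, and each block is computed and written immediately so nothing is retained past its write:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def process_and_write(out, src, sl):
    # compute one block and write it straight to the destination;
    # the result is never held beyond this call
    out[sl] = src[sl] * 2 + 1

src = np.arange(100.0)
out = np.empty_like(src)
blocks = [slice(i, i + 10) for i in range(0, 100, 10)]

with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(lambda sl: process_and_write(out, src, sl), blocks))
```

In the real workload, `out[sl] = ...` would be a `to_zarr(..., region=...)` call per block.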

Additional context

(It's also possible I'm making some basic error — and I do remember it working much better last week — so please feel free to direct me / ask me for more examples, if this doesn't ring true)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8414/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2034575163 PR_kwDOAMm_X85hn4Pn 8539 Filter out doctest warning max-sixty 5635139 closed 0     11 2023-12-10T23:11:36Z 2023-12-12T06:37:54Z 2023-12-11T21:00:01Z MEMBER   0 pydata/xarray/pulls/8539

Trying to fix #8537. Not sure it'll work and can't test locally so seeing if it passes CI

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8539/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2036491126 PR_kwDOAMm_X85hud-m 8543 Fix incorrect indent max-sixty 5635139 closed 0     0 2023-12-11T20:41:32Z 2023-12-11T20:43:26Z 2023-12-11T20:43:09Z MEMBER   0 pydata/xarray/pulls/8543

edit: my mistake, this is intended

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8543/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
866826033 MDU6SXNzdWU4NjY4MjYwMzM= 5215 Add an Cumulative aggregation, similar to Rolling max-sixty 5635139 closed 0     6 2021-04-24T19:59:49Z 2023-12-08T22:06:53Z 2023-12-08T22:06:53Z MEMBER      

Is your feature request related to a problem? Please describe.

Pandas has a .expanding aggregation, which is basically rolling with a full lookback. I often end up supplying rolling with the length of the dimension, and this is some nice sugar for that.

Describe the solution you'd like Basically the same as pandas — a .expanding method that returns an Expanding class, which implements the same methods as a Rolling class.

Describe alternatives you've considered Some options:
  • This
  • Don't add anything — the sugar isn't worth the additional API.
  • Go full out and write specialized expanding algos — which will be faster since they don't have to keep track of the window. But not that much faster, likely not worth the effort.
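The proposed sugar reduces to an identity that already holds in pandas; a quick check of the equivalence the issue describes:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0])

# .expanding() is a rolling window spanning the whole series with
# min_periods=1 — exactly the sugar proposed here
expanding_mean = s.expanding().mean()
rolling_mean = s.rolling(len(s), min_periods=1).mean()
```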

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5215/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2022202767 PR_kwDOAMm_X85g97hj 8512 Add Cumulative aggregation max-sixty 5635139 closed 0     1 2023-12-02T21:03:13Z 2023-12-08T22:06:53Z 2023-12-08T22:06:52Z MEMBER   0 pydata/xarray/pulls/8512

Closes #5215

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8512/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2019645081 I_kwDOAMm_X854YVaZ 8498 Allow some notion of ordering in Dataset dims max-sixty 5635139 closed 0     5 2023-11-30T22:57:23Z 2023-12-08T19:22:56Z 2023-12-08T19:22:55Z MEMBER      

What is your issue?

Currently a DataArray's dims are ordered, while a Dataset's are not.

Do we gain anything from having unordered dims in a Dataset? Could we have an ordering without enforcing it on every variable?

Here's one proposal, with fairly wide error-bars:
  • Datasets have a dim order, which is set at construction time or through .transpose
    • Currently .transpose changes the order of each variable's dims, but not the dataset's
    • If dims aren't supplied, we can just use the first variable's
  • Variables don't have to conform to that order — .assign(foo=differently_ordered) maintains the differently ordered dims. So this doesn't limit any current functionality.
  • When there are transformations which change dim ordering, Xarray is "allowed" to transpose variables to the dataset's ordering. Currently Xarray is "allowed" to change dim order arbitrarily — for example to put a core dim last. IIUC, we'd prefer to set a non-arbitrary order, but we don't have one to reference.
  • This would remove a bunch of boilerplate from methods that save the ordering, run .apply_ufunc and then reorder in the original order[^1]

What do folks think?

[^1]: though also we could do this in .apply_ufunc

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8498/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  not_planned xarray 13221727 issue
2026963757 I_kwDOAMm_X8540QMt 8522 Test failures on `main` max-sixty 5635139 closed 0     7 2023-12-05T19:22:01Z 2023-12-06T18:48:24Z 2023-12-06T17:28:13Z MEMBER      

What is your issue?

Any ideas what could be causing these? I can't immediately reproduce locally.

https://github.com/pydata/xarray/actions/runs/7105414268/job/19342564583

``` Error: TestDataArray.test_computation_objects[int64-method_groupby_bins-data]

AssertionError: Left and right DataArray objects are not close

Differing values: L <Quantity([[ nan nan 1. 1. ] [2. 2. 3. 3. ] [4. 4. 5. 5. ] [6. 6. 7. 7. ] [8. 8. 9. 9.333333]], 'meter')> R <Quantity([[0. 0. 1. 1. ] [2. 2. 3. 3. ] [4. 4. 5. 5. ] [6. 6. 7. 7. ] [8. 8. 9. 9.333333]], 'meter')> ```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8522/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 1,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1192478248 I_kwDOAMm_X85HE8Yo 6440 Add `eval`? max-sixty 5635139 closed 0     0 2022-04-05T00:57:00Z 2023-12-06T17:52:47Z 2023-12-06T17:52:47Z MEMBER      

Is your feature request related to a problem?

We currently have query, which can run a numexpr string using eval.

Describe the solution you'd like

Should we add an eval method itself? I find that when building something for the command line, allowing people to pass an eval-able expression can be a good interface.

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6440/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1410303926 PR_kwDOAMm_X85A3Xqk 7163 Add `eval` method to Dataset max-sixty 5635139 closed 0     3 2022-10-15T22:12:23Z 2023-12-06T17:52:47Z 2023-12-06T17:52:46Z MEMBER   0 pydata/xarray/pulls/7163

This needs proper tests & docs, but would this be a good idea?

A couple of examples are in the docstring. It's mostly just deferring to pandas' excellent eval method.

  • [x] Closes #6440 (edit)
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7163/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
    xarray 13221727 pull
2019309352 PR_kwDOAMm_X85g0KvI 8493 Use numbagg for `rolling` methods max-sixty 5635139 closed 0     3 2023-11-30T18:52:08Z 2023-12-05T19:08:32Z 2023-12-05T19:08:31Z MEMBER   0 pydata/xarray/pulls/8493

A couple of tests are failing for the multi-dimensional case, which I'll fix before merge.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8493/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
907845790 MDU6SXNzdWU5MDc4NDU3OTA= 5413 Does the PyPI release job fire twice for each release? max-sixty 5635139 closed 0     2 2021-06-01T04:01:17Z 2023-12-04T19:22:32Z 2023-12-04T19:22:32Z MEMBER      

I was attempting to copy the great work here for numbagg and spotted this! Do we fire twice for each release? Maybe that's fine though?

https://github.com/pydata/xarray/actions/workflows/pypi-release.yaml

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5413/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
929840699 MDU6SXNzdWU5Mjk4NDA2OTk= 5531 Keyword only args for arguments like "drop" max-sixty 5635139 closed 0     12 2021-06-25T05:24:25Z 2023-12-04T19:22:24Z 2023-12-04T19:22:23Z MEMBER      

Is your feature request related to a problem? Please describe.

A method like .reset_index has a signature .reset_index(dims_or_levels, drop=False).

This means that passing .reset_index("x", "y") is actually like passing .reset_index("x", True), which is silent and confusing.

Describe the solution you'd like Move to kwarg-only arguments for these; like .reset_index(dims_or_levels, *, drop=False).

But we probably need a deprecation cycle, which will require some work.

Describe alternatives you've considered Not having a deprecation cycle? I imagine it's fairly rare to not pass the kwarg.
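A toy pair of signatures showing the failure mode and the fix (hypothetical functions, not xarray's real reset_index):

```python
def reset_index_old(dims_or_levels, drop=False):
    return dims_or_levels, drop

def reset_index_new(dims_or_levels, *, drop=False):
    return dims_or_levels, drop

# the old signature silently interprets "y" as drop=True (truthy)
silently_wrong = reset_index_old("x", "y")

# the keyword-only signature refuses positional drop with a TypeError
try:
    reset_index_new("x", "y")
    raised = False
except TypeError:
    raised = True
```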

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5531/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1165654699 I_kwDOAMm_X85Fenqr 6349 Rolling exp correlation max-sixty 5635139 closed 0     1 2022-03-10T19:51:57Z 2023-12-04T19:13:35Z 2023-12-04T19:13:34Z MEMBER      

Is your feature request related to a problem?

I'd like an exponentially moving correlation coefficient

Describe the solution you'd like

I think we could add a rolling_exp.corr method fairly easily — i.e. just in python, no need to add anything to numbagg. With ewma here meaning rolling_exp(...).mean:
  • ewma(A * B) - ewma(A) * ewma(B) for the rolling covariance
  • divided by sqrt((ewma(A**2) - ewma(A)**2) * (ewma(B**2) - ewma(B)**2)), the square root of the product of the variances

We could also add a flag for cosine similarity, which wouldn't remove the mean. We could also add .var & .std & .covar as their own methods.

I think we'd need to mask the variables on their intersection, so we don't have values that are missing from B affecting A's variance without affecting its covariance.

Pandas does this in cython, possibly because it's faster to only do a single pass of the data. If anyone has correctness concerns about this simple approach of wrapping ewmas, please let me know. Or if the performance would be unacceptable such that it shouldn't go into xarray until it's a single pass.
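A minimal numpy sketch of the formula above (just the arithmetic, not a proposed xarray API); for two perfectly linearly related series the result is 1.0 after the first point:

```python
import numpy as np

def ewma(x, alpha):
    """Simple exponential moving average (adjust=False style)."""
    out = np.empty(len(x))
    acc = x[0]
    out[0] = acc
    for i in range(1, len(x)):
        acc = alpha * x[i] + (1 - alpha) * acc
        out[i] = acc
    return out

def rolling_exp_corr(a, b, alpha):
    cov = ewma(a * b, alpha) - ewma(a, alpha) * ewma(b, alpha)
    var_a = ewma(a * a, alpha) - ewma(a, alpha) ** 2
    var_b = ewma(b * b, alpha) - ewma(b, alpha) ** 2
    return cov / np.sqrt(var_a * var_b)

a = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
b = 2 * a + 3  # perfectly correlated with a
with np.errstate(invalid="ignore"):  # first point has zero variance
    corr = rolling_exp_corr(a, b, alpha=0.5)
```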

Describe alternatives you've considered

Numbagg

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6349/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2019577432 PR_kwDOAMm_X85g1F3A 8495 Fix type of `.assign_coords` max-sixty 5635139 closed 0     1 2023-11-30T21:57:58Z 2023-12-04T19:11:57Z 2023-12-04T19:11:55Z MEMBER   0 pydata/xarray/pulls/8495

As discussed in #8455

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8495/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1995489227 I_kwDOAMm_X8528L_L 8455 Errors when assigning using `.from_pandas_multiindex` max-sixty 5635139 closed 0     3 2023-11-15T20:09:15Z 2023-12-04T19:10:12Z 2023-12-04T19:10:11Z MEMBER      

What happened?

Very possibly this is user-error, forgive me if so.

I'm trying to transition some code from the previous assignment of MultiIndexes, to the new world. Here's an MCVE:

What did you expect to happen?

No response

Minimal Complete Verifiable Example

```Python da = xr.tutorial.open_dataset("air_temperature")['air']

old code, works, but with a warning

da.expand_dims('foo').assign_coords(foo=(pd.MultiIndex.from_tuples([(1,2)])))

<ipython-input-25-f09b7f52bb42>:1: FutureWarning: the pandas.MultiIndex object(s) passed as 'foo' coordinate(s) or data variable(s) will no longer be implicitly promoted and wrapped into multiple indexed coordinates in the future (i.e., one coordinate for each multi-index level + one dimension coordinate). If you want to keep this behavior, you need to first wrap it explicitly using mindex_coords = xarray.Coordinates.from_pandas_multiindex(mindex_obj, 'dim') and pass it as coordinates, e.g., xarray.Dataset(coords=mindex_coords), dataset.assign_coords(mindex_coords) or dataarray.assign_coords(mindex_coords). da.expand_dims('foo').assign_coords(foo=(pd.MultiIndex.from_tuples([(1,2)]))) Out[25]: <xarray.DataArray 'air' (foo: 1, time: 2920, lat: 25, lon: 53)> array([[[[241.2 , 242.5 , 243.5 , ..., 232.79999, 235.5 , 238.59999], ... [297.69 , 298.09 , 298.09 , ..., 296.49 , 296.19 , 295.69 ]]]], dtype=float32) Coordinates: * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0 * lon (lon) float32 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0 * time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00 * foo (foo) object MultiIndex * foo_level_0 (foo) int64 1 * foo_level_1 (foo) int64 2

new code — seems to get confused between the number of values in the index — 1 — and the number of levels — 3 including the parent:

da.expand_dims('foo').assign_coords(foo=xr.Coordinates.from_pandas_multiindex(pd.MultiIndex.from_tuples([(1,2)]), dim='foo'))

ValueError Traceback (most recent call last) Cell In[26], line 1 ----> 1 da.expand_dims('foo').assign_coords(foo=xr.Coordinates.from_pandas_multiindex(pd.MultiIndex.from_tuples([(1,2)]), dim='foo'))

File ~/workspace/xarray/xarray/core/common.py:621, in DataWithCoords.assign_coords(self, coords, **coords_kwargs) 618 else: 619 results = self._calc_assign_results(coords_combined) --> 621 data.coords.update(results) 622 return data

File ~/workspace/xarray/xarray/core/coordinates.py:566, in Coordinates.update(self, other) 560 # special case for PandasMultiIndex: updating only its dimension coordinate 561 # is still allowed but depreciated. 562 # It is the only case where we need to actually drop coordinates here (multi-index levels) 563 # TODO: remove when removing PandasMultiIndex's dimension coordinate. 564 self._drop_coords(self._names - coords_to_align._names) --> 566 self._update_coords(coords, indexes)

File ~/workspace/xarray/xarray/core/coordinates.py:834, in DataArrayCoordinates._update_coords(self, coords, indexes) 832 coords_plus_data = coords.copy() 833 coords_plus_data[_THIS_ARRAY] = self._data.variable --> 834 dims = calculate_dimensions(coords_plus_data) 835 if not set(dims) <= set(self.dims): 836 raise ValueError( 837 "cannot add coordinates with new dimensions to a DataArray" 838 )

File ~/workspace/xarray/xarray/core/variable.py:3014, in calculate_dimensions(variables) 3012 last_used[dim] = k 3013 elif dims[dim] != size: -> 3014 raise ValueError( 3015 f"conflicting sizes for dimension {dim!r}: " 3016 f"length {size} on {k!r} and length {dims[dim]} on {last_used!r}" 3017 ) 3018 return dims

ValueError: conflicting sizes for dimension 'foo': length 1 on <this-array> and length 3 on {'lat': 'lat', 'lon': 'lon', 'time': 'time', 'foo': 'foo'} ```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS ------------------ commit: None python: 3.9.18 (main, Nov 2 2023, 16:51:22) [Clang 14.0.3 (clang-1403.0.22.14.1)] python-bits: 64 OS: Darwin OS-release: 22.6.0 machine: arm64 processor: arm byteorder: little LC_ALL: en_US.UTF-8 LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 2023.10.2.dev10+gccc8f998 pandas: 2.1.1 numpy: 1.25.2 scipy: 1.11.1 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.16.0 cftime: None nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: None dask: 2023.4.0 distributed: 2023.7.1 matplotlib: 3.5.1 cartopy: None seaborn: None numbagg: 0.2.3.dev30+gd26e29e fsspec: 2021.11.1 cupy: None pint: None sparse: None flox: None numpy_groupies: 0.9.19 setuptools: 68.2.2 pip: 23.3.1 conda: None pytest: 7.4.0 mypy: 1.6.0 IPython: 8.15.0 sphinx: 4.3.2
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8455/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  not_planned xarray 13221727 issue
2022178394 PR_kwDOAMm_X85g92vo 8511 Allow callables to `.drop_vars` max-sixty 5635139 closed 0     0 2023-12-02T19:39:53Z 2023-12-03T22:04:53Z 2023-12-03T22:04:52Z MEMBER   0 pydata/xarray/pulls/8511

This can be used as a nice more general alternative to .drop_indexes or .reset_coords(drop=True)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8511/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2021810083 PR_kwDOAMm_X85g8r6c 8508 Implement `np.clip` as `__array_function__` max-sixty 5635139 closed 0     2 2023-12-02T02:20:11Z 2023-12-03T05:27:38Z 2023-12-03T05:27:33Z MEMBER   0 pydata/xarray/pulls/8508

Would close https://github.com/pydata/xarray/issues/2570

Because of https://numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api, no option is ideal:
  • Don't do anything — don't implement __array_function__. Any numpy function that's not a ufunc — such as np.clip — will materialize the array into memory.
  • Implement __array_function__ and lose the ability to call any non-ufunc numpy function that we don't explicitly configure here. So np.lexsort(da) wouldn't work, for example; users would have to run np.lexsort(da.values).
  • Implement __array_function__, and attempt to handle the functions we don't explicitly configure by coercing to numpy arrays. This requires writing code to walk a tree of objects looking for arrays to coerce. It seems to go against the original numpy proposal.
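The second option can be sketched with NEP 18's suggested decorator pattern; `Wrapped` and `implements` are illustrative names, not xarray's implementation:

```python
import numpy as np

HANDLED = {}

def implements(np_func):
    """Register a __array_function__ implementation for np_func."""
    def decorator(func):
        HANDLED[np_func] = func
        return func
    return decorator

class Wrapped:
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        if func not in HANDLED:
            # unconfigured non-ufunc functions fail loudly with TypeError
            return NotImplemented
        return HANDLED[func](*args, **kwargs)

@implements(np.clip)
def _clip(a, a_min, a_max, **kwargs):
    return Wrapped(np.clip(a.data, a_min, a_max, **kwargs))

result = np.clip(Wrapped([1, 5, 10]), 2, 8)
```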

@shoyer is this summary accurate?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8508/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2019642778 PR_kwDOAMm_X85g1URY 8497 Fully deprecate `.drop` max-sixty 5635139 closed 0     0 2023-11-30T22:54:57Z 2023-12-02T05:52:50Z 2023-12-02T05:52:49Z MEMBER   0 pydata/xarray/pulls/8497

I think it's time...

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8497/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2013544848 PR_kwDOAMm_X85ggbU0 8487 Start renaming `dims` to `dim` max-sixty 5635139 closed 0     1 2023-11-28T03:25:40Z 2023-11-28T21:04:49Z 2023-11-28T21:04:48Z MEMBER   0 pydata/xarray/pulls/8487

Begins the process of #6646. I don't think it's feasible / enjoyable to do this for everything at once, so I would suggest we do it gradually, while keeping the warnings quite quiet, so by the time we convert to louder warnings, users can do a find/replace easily.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8487/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2010795504 PR_kwDOAMm_X85gXOqo 8484 Fix Zarr region transpose max-sixty 5635139 closed 0     3 2023-11-25T21:01:28Z 2023-11-27T20:56:57Z 2023-11-27T20:56:56Z MEMBER   0 pydata/xarray/pulls/8484

This wasn't working on an unregion-ed write; I think because new_var was being lost.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8484/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2010797682 PR_kwDOAMm_X85gXPEM 8485 Refine rolling_exp error messages max-sixty 5635139 closed 0     0 2023-11-25T21:09:52Z 2023-11-25T21:55:20Z 2023-11-25T21:55:20Z MEMBER   0 pydata/xarray/pulls/8485

(Sorry, copy & pasted too liberally!)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8485/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1966733834 PR_kwDOAMm_X85eCSac 8389 Use numbagg for `ffill` by default max-sixty 5635139 closed 0     5 2023-10-28T20:40:13Z 2023-11-25T21:06:10Z 2023-11-25T21:06:09Z MEMBER   0 pydata/xarray/pulls/8389

The main perf advantage here is the array doesn't need to be unstacked & stacked, which is a huge win for large multi-dimensional arrays... (I actually was hitting a memory issue running an ffill on my own, and so thought I'd get this done!)

We could move these methods to DataWithCoords, since they're almost the same implementation between a DataArray & Dataset, and exactly the same for numbagg's implementation


For transparency — I wouldn't rate the "check for numbagg, check for bottleneck" logic at my most confident. But I'm more confident that just installing numbagg will work. And if that works well enough, we could consider only supporting numbagg for some of these in the future.

I also haven't done the benchmarks here — though the functions are relatively well benchmarked at numbagg. I'm somewhat trading off getting through these (rolling functions are coming up too) vs. doing fewer slower, and leaning towards the former, but feedback welcome...
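For the 1-D case, the kind of single-pass ffill that avoids unstacking can be sketched in plain numpy (illustration only, not numbagg's actual kernel):

```python
import numpy as np

def ffill_1d(a):
    # carry forward the index of the most recent non-NaN position,
    # then gather — a leading NaN stays NaN since index 0 points at it
    idx = np.where(~np.isnan(a), np.arange(len(a)), 0)
    np.maximum.accumulate(idx, out=idx)
    return a[idx]

filled = ffill_1d(np.array([np.nan, 1.0, np.nan, 3.0]))
```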

  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8389/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1964877168 PR_kwDOAMm_X85d8EmN 8381 Allow writing to zarr with differently ordered dims max-sixty 5635139 closed 0     2 2023-10-27T06:47:59Z 2023-11-25T21:02:20Z 2023-11-15T18:09:08Z MEMBER   0 pydata/xarray/pulls/8381

Is this reasonable?

  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8381/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2005419839 PR_kwDOAMm_X85gFPfF 8474 Improve "variable not found" error message max-sixty 5635139 closed 0     0 2023-11-22T01:52:47Z 2023-11-24T18:49:39Z 2023-11-24T18:49:38Z MEMBER   0 pydata/xarray/pulls/8474

One very small step as part of https://github.com/pydata/xarray/issues/8264.

The existing error is just KeyError: 'foo', which is annoyingly terse. Future improvements include searching for similar variable names, or even rewriting the user's calling code if there's a close variable name.
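Searching for similar variable names could lean on the standard library; a hedged sketch with a hypothetical helper name:

```python
import difflib

def missing_variable_error(name, available):
    """Hypothetical helper for a friendlier variable-not-found message."""
    msg = f"No variable named {name!r}."
    suggestions = difflib.get_close_matches(name, available, n=3)
    if suggestions:
        msg += f" Did you mean one of {suggestions}?"
    return msg

message = missing_variable_error("temprature", ["temperature", "lat", "lon"])
```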

This PR creates a new test file. I don't love the format here — it's difficult to snapshot an error message, so it requires copying & pasting things, which doesn't scale well, and the traceback contains environment-specific lines such that it wouldn't be feasible to paste tracebacks.

(here's what we do in PRQL, which is (immodestly) great)

An alternative is just to put these in the mix of all the other tests; am open to that (and not difficult to change later)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8474/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2006891782 PR_kwDOAMm_X85gKSKW 8478 Add whatsnew for #8475 max-sixty 5635139 closed 0     0 2023-11-22T18:22:19Z 2023-11-22T18:45:23Z 2023-11-22T18:45:22Z MEMBER   0 pydata/xarray/pulls/8478

Sorry, forgot in the original PR

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8478/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2005656379 PR_kwDOAMm_X85gGCSj 8475 Allow `rank` to run on dask arrays max-sixty 5635139 closed 0     0 2023-11-22T06:22:44Z 2023-11-22T16:45:03Z 2023-11-22T16:45:02Z MEMBER   0 pydata/xarray/pulls/8475
  • [x] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8475/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2005744975 PR_kwDOAMm_X85gGVaY 8476 Fix mypy tests max-sixty 5635139 closed 0     0 2023-11-22T07:36:43Z 2023-11-22T08:01:13Z 2023-11-22T08:01:12Z MEMBER   0 pydata/xarray/pulls/8476

I was seeing an error in #8475

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8476/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2000139267 PR_kwDOAMm_X85fzghA 8464 Fix `map_blocks` docs' formatting max-sixty 5635139 closed 0     1 2023-11-18T01:18:02Z 2023-11-21T18:25:16Z 2023-11-21T18:25:15Z MEMBER   0 pydata/xarray/pulls/8464

Was looking funky. Not 100% sure this is correct but seems consistent with the others

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8464/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2000154383 PR_kwDOAMm_X85fzju6 8466 Move Sphinx directives out of `See also` max-sixty 5635139 open 0     2 2023-11-18T01:57:17Z 2023-11-21T18:25:05Z   MEMBER   0 pydata/xarray/pulls/8466

This is potentially causing the See also to not render the links? (Does anyone know this better? It doesn't seem easy to build the docs locally...)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8466/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2000146978 PR_kwDOAMm_X85fziKs 8465 Consolidate `_get_alpha` func max-sixty 5635139 closed 0     0 2023-11-18T01:37:25Z 2023-11-21T18:24:52Z 2023-11-21T18:24:51Z MEMBER   0 pydata/xarray/pulls/8465

Am changing this a bit so starting with consolidating it rather than converting twice

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8465/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
400444797 MDExOlB1bGxSZXF1ZXN0MjQ1NjMwOTUx 2687 Enable resampling on PeriodIndex max-sixty 5635139 closed 0     2 2019-01-17T20:13:25Z 2023-11-17T20:38:44Z 2023-11-17T20:38:44Z MEMBER   0 pydata/xarray/pulls/2687

This allows resampling with PeriodIndex objects by keeping the group as an index rather than coercing to a DataArray (which coerces any non-native types to objects)

I'm still getting one failure around the name of the IndexVariable still being __resample_dim__ after resample, but wanted to socialize the approach of allowing a name argument to IndexVariable - is this reasonable?

  • [x] Closes https://github.com/pydata/xarray/issues/1270
  • [x] Tests added
  • [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2687/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1995308522 I_kwDOAMm_X8527f3q 8454 Formalize `mode` / safety guarantees for Zarr max-sixty 5635139 open 0     1 2023-11-15T18:28:38Z 2023-11-15T20:38:04Z   MEMBER      

What is your issue?

It sounds like we're coalescing on when it's safe to write concurrently:

  • mode="r+" is safe to write concurrently to different parts of a dataset
  • mode="a" isn't safe, because it changes the shape of an array, for example extending a dimension

What are the existing operations that aren't consistent with this?

  • Is concurrently writing additional variables safe? Or does it require updating the centralized consolidated metadata? Currently that requires mode="a", which is overly conservative based on the above rules assuming it is safe — we can liberalize to allow it with mode="r+".
  • https://github.com/pydata/xarray/issues/8371, ~but that's a bug~ — edit: or possibly an artifact of writing concurrently to overlapping chunks with a single to_zarr call. We could at least restrict non-aligned writes to mode="a", so it wasn't possible to hit this mistakenly while writing to different parts of a dataset.
  • Writing the same values to the same chunks concurrently isn't safe at the moment — we'll get a "Stale file handle" error if two processes write to the same location at the same time. I'm not sure if that's possible to allow; possibly it requires work on the Zarr side. If it were possible, we wouldn't have to be as careful about ensuring that each process has mutually exclusive chunks to write. (lower priority)
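The proposed rule can be restated as a one-line sketch (the function and its argument are hypothetical, purely to pin down the invariant):

```python
def safe_zarr_mode(changes_array_shape: bool) -> str:
    # Hedged restatement of the rule above: writes that can change an
    # array's shape (e.g. extending a dimension) need mode="a"; writes
    # into existing regions can use the concurrency-safe mode="r+".
    return "a" if changes_array_shape else "r+"
```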

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8454/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1953001043 I_kwDOAMm_X850aG5T 8343 Add `metadata_only` param to `.to_zarr`? max-sixty 5635139 open 0     17 2023-10-19T20:25:11Z 2023-11-15T05:22:12Z   MEMBER      

Is your feature request related to a problem?

A leaf from https://github.com/pydata/xarray/issues/8245, which has a bullet:

compute=False is arguably a less-than-obvious kwarg meaning "write metadata". Maybe this should be a method, maybe it's a candidate for renaming? Or maybe make_template can be an abstraction over it

I've also noticed that for large arrays, running compute=False can take several minutes, despite the indexes being very small. I think this is because it's building a dask task graph — which is then discarded, since the array is written from different machines with the region pattern.

Describe the solution you'd like

Would introducing a metadata_only parameter to to_zarr help here:

  • Better name
  • No dask graph

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8343/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1980019336 I_kwDOAMm_X852BLKI 8421 `to_zarr` could transpose dims max-sixty 5635139 closed 0     0 2023-11-06T20:38:35Z 2023-11-14T19:23:08Z 2023-11-14T19:23:08Z MEMBER      

Is your feature request related to a problem?

Currently we need to know the order of dims when using region in to_zarr. Generally in xarray we're fine with the order, because we have the names, so this is a bit of an aberration. It means that code needs to carry around the correct order of dims.

Here's an MCVE:

```python
ds = xr.tutorial.load_dataset('air_temperature')

ds.to_zarr('foo', mode='w')

ds.transpose(..., 'lat').to_zarr('foo', mode='r+')
```

```
ValueError: variable 'air' already exists with different dimension names ('time', 'lat', 'lon') != ('time', 'lon', 'lat'), but changing variable dimensions is not supported by to_zarr().
```

Describe the solution you'd like

I think we should be able to transpose them based on the target?
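The fix could be sketched in plain Python (the helper name here is hypothetical, not xarray's internals): given the dim order already in the target store, compute the axes permutation to apply before writing.

```python
def transpose_order(source_dims, target_dims):
    # Hypothetical helper: the axes permutation that reorders
    # source_dims to match the dim order in the target store.
    if sorted(source_dims) != sorted(target_dims):
        raise ValueError(f"dims {source_dims} and {target_dims} don't match")
    return tuple(source_dims.index(d) for d in target_dims)

# The failing example above: the store has ('time', 'lat', 'lon'),
# the dataset being written has ('time', 'lon', 'lat').
print(transpose_order(('time', 'lon', 'lat'), ('time', 'lat', 'lon')))  # → (0, 2, 1)
```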

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8421/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1986643906 I_kwDOAMm_X852acfC 8437 Restrict pint test runs max-sixty 5635139 open 0     10 2023-11-10T00:50:52Z 2023-11-13T21:57:45Z   MEMBER      

What is your issue?

Pint tests are failing on main — https://github.com/pydata/xarray/actions/runs/6817674274/job/18541677930

E TypeError: no implementation found for 'numpy.min' on types that implement __array_function__: [<class 'pint.util.Quantity'>]

If we can't fix soon, should we disable?

CC @keewis

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8437/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1986758555 PR_kwDOAMm_X85fGE95 8438 Rename `to_array` to `to_dataarray` max-sixty 5635139 closed 0     2 2023-11-10T02:58:21Z 2023-11-10T06:15:03Z 2023-11-10T06:15:02Z MEMBER   0 pydata/xarray/pulls/8438

This is a very minor nit, so I'm not sure it's worth changing.

What do others think?

(I would have opened an issue but it's just as quick to just do the PR)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8438/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
874039546 MDU6SXNzdWU4NzQwMzk1NDY= 5246 test_save_mfdataset_compute_false_roundtrip fails max-sixty 5635139 open 0     1 2021-05-02T20:41:48Z 2023-11-02T04:38:05Z   MEMBER      

What happened:

test_save_mfdataset_compute_false_roundtrip consistently fails in windows-latest-3.9, e.g. https://github.com/pydata/xarray/pull/5244/checks?check_run_id=2485202784

Here's the traceback:

```python
self = <xarray.tests.test_backends.TestDask object at 0x000001FF45A9B640>

def test_save_mfdataset_compute_false_roundtrip(self):
    from dask.delayed import Delayed

    original = Dataset({"foo": ("x", np.random.randn(10))}).chunk()
    datasets = [original.isel(x=slice(5)), original.isel(x=slice(5, 10))]
    with create_tmp_file(allow_cleanup_failure=ON_WINDOWS) as tmp1:
        with create_tmp_file(allow_cleanup_failure=ON_WINDOWS) as tmp2:
            delayed_obj = save_mfdataset(
                datasets, [tmp1, tmp2], engine=self.engine, compute=False
            )
            assert isinstance(delayed_obj, Delayed)
            delayed_obj.compute()
            with open_mfdataset(
                [tmp1, tmp2], combine="nested", concat_dim="x"
            ) as actual:
              assert_identical(actual, original)

E       AssertionError: Left and right Dataset objects are not identical
E
E       Differing data variables:
E       L   foo      (x) float64 dask.array<chunksize=(5,), meta=np.ndarray>
E       R   foo      (x) float64 dask.array<chunksize=(10,), meta=np.ndarray>
```

Anything else we need to know?:

xfailed in https://github.com/pydata/xarray/pull/5245

Environment:

[Eliding since it's the test env]

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5246/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1923431725 I_kwDOAMm_X85ypT0t 8264 Improve error messages max-sixty 5635139 open 0     4 2023-10-03T06:42:57Z 2023-10-24T18:40:04Z   MEMBER      

Is your feature request related to a problem?

Coming back to xarray, and using it based on what I remember from a year ago or so, means I make lots of mistakes. I've also been using it outside of a repl, where error messages are more important, given I can't explore a dataset inline.

Some of the error messages could be much more helpful. Take one example:

xarray.core.merge.MergeError: conflicting values for variable 'date' on objects to be combined. You can skip this check by specifying compat='override'.

The second sentence is nice. But the first could give us much more information:

  • Which variables conflict? I'm merging four objects, so it would be so helpful to know which are causing the issue.
  • What is the conflict? Is one a superset and I can join=...? Are they off by 1 or are they completely different types?
  • Our testing.assert_equal produces pretty nice errors, as a comparison
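As a sketch of the kind of detail that would help (pure Python with a hypothetical helper — not xarray's merge internals, which operate on variables and indexes rather than plain dicts):

```python
def describe_conflicts(objs):
    # Hypothetical helper: report *which* names conflict across the
    # objects being merged, and between which pairs of objects, rather
    # than a bare "conflicting values" message. Each obj is a dict of
    # name -> value standing in for a dataset's variables.
    conflicts = {}
    for i, left in enumerate(objs):
        for j, right in enumerate(objs[i + 1:], start=i + 1):
            for name in left.keys() & right.keys():
                if left[name] != right[name]:
                    conflicts.setdefault(name, []).append((i, j))
    return conflicts

objs = [{"date": 1, "x": 0}, {"date": 2, "x": 0}, {"y": 5}]
print(describe_conflicts(objs))  # → {'date': [(0, 1)]}
```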

Having these good is really useful, lets folks stay in the flow while they're working, and it signals that we're a well-built, refined library.

Describe the solution you'd like

I'm not sure the best way to surface the issues — error messages make for less legible contributions than features or bug fixes, and the primary audience for good error messages is often the opposite of those actively developing the library. They're also more difficult to manage as GH issues — there could be scores of marginal issues which would often be out of date.

One thing we do in PRQL is have a file that snapshots error messages test_bad_error_messages.rs, which can then be a nice contribution to change those from bad to good. I'm not sure whether that would work here (python doesn't seem to have a great snapshotter, pytest-regtest is the best I've found; I wrote pytest-accept but requires doctests).

Any other ideas?

Describe alternatives you've considered

No response

Additional context

A couple of specific error-message issues:

  • https://github.com/pydata/xarray/issues/2078
  • https://github.com/pydata/xarray/issues/5290

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8264/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1952859208 PR_kwDOAMm_X85dTmUR 8341 Deprecate tuples of chunks? max-sixty 5635139 closed 0     1 2023-10-19T18:44:25Z 2023-10-21T01:45:28Z 2023-10-21T00:49:19Z MEMBER   0 pydata/xarray/pulls/8341

(I was planning on putting an issue in, but then thought it wasn't much more difficult to make the PR. But it's totally fine if we don't think this is a good idea...)

Allowing a tuple of dims means we're reliant on dimension order, which we really try and not be reliant on. It also makes the type signature even more complicated.

So are we OK to encourage a dict of dim: chunksize, rather than a tuple of chunksizes?
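To illustrate why the dict form is safer (a hypothetical normalizer, not xarray's implementation): a tuple spec is only meaningful relative to a dim order, while a dict of dim: chunksize is order-independent.

```python
def normalize_chunks(chunks, dims):
    # Hypothetical sketch: turn a positional tuple of chunk sizes into
    # the order-independent dict form, given the array's dim order.
    if isinstance(chunks, dict):
        return chunks
    if len(chunks) != len(dims):
        raise ValueError("one chunk size per dimension is required")
    return dict(zip(dims, chunks))

print(normalize_chunks((100, 50), ("time", "lat")))  # → {'time': 100, 'lat': 50}
```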

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8341/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1953143391 PR_kwDOAMm_X85dUk-m 8347 2023.10.1 release notes max-sixty 5635139 closed 0     0 2023-10-19T22:19:43Z 2023-10-19T22:42:48Z 2023-10-19T22:42:47Z MEMBER   0 pydata/xarray/pulls/8347  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8347/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1948037836 PR_kwDOAMm_X85dDNka 8325 internal: Improve version handling for numbagg max-sixty 5635139 closed 0     1 2023-10-17T18:45:43Z 2023-10-19T15:59:15Z 2023-10-19T15:59:14Z MEMBER   0 pydata/xarray/pulls/8325

Uses the approach in #8316, a bit nicer. Only internal.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8325/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1948548087 PR_kwDOAMm_X85dE9ga 8329 Request to adjust pyright config max-sixty 5635139 closed 0     3 2023-10-18T01:04:00Z 2023-10-18T20:10:42Z 2023-10-18T20:10:41Z MEMBER   0 pydata/xarray/pulls/8329

Would it be possible to not have this config? It overrides the local VS Code config, which means VS Code is constantly reporting errors for me.

Totally open to other approaches ofc. Or that we decide that the tradeoff is worthwhile

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8329/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1948529004 PR_kwDOAMm_X85dE5aA 8327 Add docs to `reindex_like` re broadcasting max-sixty 5635139 closed 0     0 2023-10-18T00:46:52Z 2023-10-18T18:16:43Z 2023-10-18T16:51:12Z MEMBER   0 pydata/xarray/pulls/8327

This wasn't clear to me so I added some examples & a reference to broadcast_like

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8327/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1943054301 PR_kwDOAMm_X85cyrdc 8307 Add `corr`, `cov`, `std` & `var` to `.rolling_exp` max-sixty 5635139 closed 0     0 2023-10-14T07:25:31Z 2023-10-18T17:35:35Z 2023-10-18T16:55:35Z MEMBER   0 pydata/xarray/pulls/8307

From the new routines in numbagg.

Maybe needs better tests (though these are quite heavily tested in numbagg), docs, and potentially need to think about types (maybe existing binary ops can help here?)

(will fail while the build is cached on an old version of numbagg)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8307/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1948537810 PR_kwDOAMm_X85dE7Te 8328 Refine curvefit doctest max-sixty 5635139 closed 0     0 2023-10-18T00:55:16Z 2023-10-18T01:19:27Z 2023-10-18T01:19:26Z MEMBER   0 pydata/xarray/pulls/8328

A very small change

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8328/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1946081841 PR_kwDOAMm_X85c8kKB 8321 Remove a couple of trailing commas in tests max-sixty 5635139 closed 0     0 2023-10-16T20:57:04Z 2023-10-16T21:26:50Z 2023-10-16T21:26:49Z MEMBER   0 pydata/xarray/pulls/8321  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8321/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1913983402 I_kwDOAMm_X85yFRGq 8233 numbagg & flox max-sixty 5635139 closed 0     13 2023-09-26T17:33:32Z 2023-10-15T07:48:56Z 2023-10-09T15:40:29Z MEMBER      

What is your issue?

I've been doing some work recently on our old friend numbagg, improving the ewm routines & adding some more.

I'm keen to get numbagg back in shape, doing the things that it does best, and trimming anything it doesn't. I notice that it has grouped calcs. Am I correct to think that flox does this better? I haven't been up with the latest. flox looks like it's particularly focused on dask arrays, whereas numpy_groupies, one of the inspirations for this, was applicable to numpy arrays too.

At least from the xarray perspective, are we OK to deprecate these numbagg functions, and direct folks to flox?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8233/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1920172346 PR_kwDOAMm_X85blZOk 8256 Accept `lambda` for `other` param max-sixty 5635139 closed 0     0 2023-09-30T08:24:36Z 2023-10-14T07:26:28Z 2023-09-30T18:50:33Z MEMBER   0 pydata/xarray/pulls/8256  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8256/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1931467868 PR_kwDOAMm_X85cLSzK 8283 Ask bug reporters to confirm they're using a recent version of xarray max-sixty 5635139 closed 0     0 2023-10-07T19:07:17Z 2023-10-14T07:26:28Z 2023-10-09T13:30:03Z MEMBER   0 pydata/xarray/pulls/8283  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8283/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1931584082 PR_kwDOAMm_X85cLpuZ 8286 Fix `GroupBy` import max-sixty 5635139 closed 0     0 2023-10-08T01:15:37Z 2023-10-14T07:26:28Z 2023-10-09T13:38:44Z MEMBER   0 pydata/xarray/pulls/8286

Not sure why this only breaks tests for me, vs. in CI, but hopefully no downside to this change...

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8286/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1931581491 PR_kwDOAMm_X85cLpMS 8284 Enable `.rolling_exp` to work on dask arrays max-sixty 5635139 closed 0     0 2023-10-08T01:06:04Z 2023-10-14T07:26:27Z 2023-10-10T06:37:20Z MEMBER   0 pydata/xarray/pulls/8284

Another benefit of the move to .apply_ufunc...

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8284/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1931582554 PR_kwDOAMm_X85cLpap 8285 Add `min_weight` param to `rolling_exp` functions max-sixty 5635139 closed 0     2 2023-10-08T01:09:59Z 2023-10-14T07:24:48Z 2023-10-14T07:24:48Z MEMBER   0 pydata/xarray/pulls/8285  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8285/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1939241220 PR_kwDOAMm_X85cmBPP 8296 mypy 1.6.0 passing max-sixty 5635139 closed 0     4 2023-10-12T06:04:46Z 2023-10-12T22:13:18Z 2023-10-12T19:06:13Z MEMBER   0 pydata/xarray/pulls/8296

I did the easy things, but will need help for the final couple on _typed_ops.py

Because we don't pin mypy (should we?), this blocks other PRs if we gate them on mypy passing

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8296/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1940614908 PR_kwDOAMm_X85cqvBb 8299 xfail flaky test max-sixty 5635139 closed 0     0 2023-10-12T19:03:59Z 2023-10-12T22:00:51Z 2023-10-12T22:00:47Z MEMBER   0 pydata/xarray/pulls/8299

Would be better to fix it, but in lieu of fixing, better to skip it

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8299/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1920359276 PR_kwDOAMm_X85bl9er 8257 Mandate kwargs on `to_zarr` max-sixty 5635139 closed 0     0 2023-09-30T18:33:13Z 2023-10-12T18:33:15Z 2023-10-04T19:05:02Z MEMBER   0 pydata/xarray/pulls/8257

This alleviates some of the dangers of having these in a different order between da & ds.

Technically it's a breaking change, but only very technically, given that I would wager literally no one has a dozen positional arguments to this method. So I think it's OK.
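The mechanism can be sketched with Python's bare * marker (the signature below is abbreviated and hypothetical — see the PR diff for the real one):

```python
def to_zarr(store=None, *, mode=None, region=None):
    # Everything after the bare "*" must be passed by keyword, so
    # positional-order mismatches between the DataArray and Dataset
    # variants can no longer bite.
    return {"store": store, "mode": mode, "region": region}

to_zarr("out.zarr", mode="w")      # fine
try:
    to_zarr("out.zarr", "w")       # positional mode is now rejected
except TypeError as e:
    print(type(e).__name__)        # → TypeError
```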

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8257/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1926810300 PR_kwDOAMm_X85b7rlX 8273 Allow a function in `.sortby` method max-sixty 5635139 closed 0     0 2023-10-04T19:04:03Z 2023-10-12T18:33:14Z 2023-10-06T03:35:22Z MEMBER   0 pydata/xarray/pulls/8273  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8273/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1931585098 PR_kwDOAMm_X85cLp7r 8287 Rename `reset_encoding` to `drop_encoding` max-sixty 5635139 closed 0     1 2023-10-08T01:19:25Z 2023-10-12T17:11:07Z 2023-10-12T17:11:03Z MEMBER   0 pydata/xarray/pulls/8287

Closes #8259

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8287/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1920369929 I_kwDOAMm_X85ydoUJ 8259 Should `.reset_encoding` be `.drop_encoding`? max-sixty 5635139 closed 0     1 2023-09-30T19:11:46Z 2023-10-12T17:11:06Z 2023-10-12T17:11:06Z MEMBER      

What is your issue?

Not the greatest issue facing the universe — but for the cause of consistency — should .reset_encoding be .drop_encoding, since it drops all encoding attributes?

For comparison:

  • .reset_coords — "Given names of coordinates, reset them to become variables."
  • .drop_vars — "Drop variables from this dataset."

Also ref #8258

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8259/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1917929597 PR_kwDOAMm_X85bd2nm 8249 Refine `chunks=None` handling max-sixty 5635139 closed 0     0 2023-09-28T16:54:59Z 2023-10-04T18:34:27Z 2023-09-28T20:01:13Z MEMBER   0 pydata/xarray/pulls/8249

Based on comment in https://github.com/pydata/xarray/pull/8247. This doesn't make it perfect, but allows the warning to get hit and clarifies the type comment, as a stop-gap

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8249/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1216647336 PR_kwDOAMm_X8421oXV 6521 Move license from readme to LICENSE max-sixty 5635139 open 0     3 2022-04-27T00:59:03Z 2023-10-01T09:31:37Z   MEMBER   0 pydata/xarray/pulls/6521  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6521/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1918061661 I_kwDOAMm_X85yU0xd 8251 `.chunk()` doesn't create chunks on 0 dim arrays max-sixty 5635139 open 0     0 2023-09-28T18:30:50Z 2023-09-30T21:31:05Z   MEMBER      

What happened?

.chunk's docstring states:

```
"""Coerce this array's data into a dask arrays with the given chunks.

    If this variable is a non-dask array, it will be converted to dask
    array. If it's a dask array, it will be rechunked to the given chunk
    sizes.
```

...but this doesn't happen for 0 dim arrays; example below.

For context, as part of #8245, I had a function that creates a template array. It created an empty DataArray, then expanded dims for each dimension. And it kept blowing up memory! ...until I realized that it was actually not a lazy array.

What did you expect to happen?

It may be that we can't have a 0-dim dask array — but then we should raise in this method, rather than return the wrong thing.

Minimal Complete Verifiable Example

```python
[ins] In [1]: type(xr.DataArray().chunk().data)
Out[1]: numpy.ndarray

[ins] In [2]: type(xr.DataArray(1).chunk().data)
Out[2]: numpy.ndarray

[ins] In [3]: type(xr.DataArray([1]).chunk().data)
Out[3]: dask.array.core.Array
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS ------------------ commit: 0d6cd2a39f61128e023628c4352f653537585a12 python: 3.9.18 (main, Aug 24 2023, 21:19:58) [Clang 14.0.3 (clang-1403.0.22.14.1)] python-bits: 64 OS: Darwin OS-release: 22.6.0 machine: arm64 processor: arm byteorder: little LC_ALL: en_US.UTF-8 LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 2023.8.1.dev25+g8215911a.d20230914 pandas: 2.1.1 numpy: 1.25.2 scipy: 1.11.1 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.16.0 cftime: None nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: None dask: 2023.4.0 distributed: 2023.7.1 matplotlib: 3.5.1 cartopy: None seaborn: None numbagg: 0.2.3.dev30+gd26e29e fsspec: 2021.11.1 cupy: None pint: None sparse: None flox: 0.7.2 numpy_groupies: 0.9.19 setuptools: 68.1.2 pip: 23.2.1 conda: None pytest: 7.4.0 mypy: 1.5.1 IPython: 8.15.0 sphinx: 4.3.2
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8251/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1920167070 I_kwDOAMm_X85yc2ye 8255 Allow a `lambda` for the `other` param to `where` max-sixty 5635139 closed 0     1 2023-09-30T08:05:54Z 2023-09-30T19:02:42Z 2023-09-30T19:02:42Z MEMBER      

Is your feature request related to a problem?

Currently we allow:

```python
da.where(lambda x: x.foo == 5)
```

...but we don't allow:

```python
da.where(lambda x: x.foo == 5, lambda x: x - x.shift(1))
```

...which would be nice
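The mechanics are simple to sketch (a hypothetical helper mirroring what where would do with its arguments, not xarray's actual code):

```python
def resolve(arg, obj):
    # If the argument is a callable, apply it to the object being
    # operated on; otherwise use it as a plain value. where() could
    # apply this to both its cond and other parameters.
    return arg(obj) if callable(arg) else arg

data = [1, 2, 3]
print(resolve(lambda x: [v * 2 for v in x], data))  # → [2, 4, 6]
print(resolve(0, data))                             # → 0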

Describe the solution you'd like

No response

Describe alternatives you've considered

I don't think this offers many downsides — it's not like we want to fill the array with a callable object.

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8255/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
124154674 MDU6SXNzdWUxMjQxNTQ2NzQ= 688 Keep attrs & Add a 'keep_coords' argument to Dataset.apply max-sixty 5635139 closed 0     14 2015-12-29T02:42:48Z 2023-09-30T18:47:07Z 2023-09-30T18:47:07Z MEMBER      

Generally this isn't a problem, since the coords are carried over by the resulting DataArrays:

```python
In [11]:

ds = xray.Dataset({
    'a': pd.DataFrame(pd.np.random.rand(10, 3)),
    'b': pd.Series(pd.np.random.rand(10))
})
ds.coords['c'] = pd.Series(pd.np.random.rand(10))
ds

Out[11]:
<xray.Dataset>
Dimensions:  (dim_0: 10, dim_1: 3)
Coordinates:
  * dim_0    (dim_0) int64 0 1 2 3 4 5 6 7 8 9
  * dim_1    (dim_1) int64 0 1 2
    c        (dim_0) float64 0.9318 0.2899 0.3853 0.6235 0.9436 0.7928 ...
Data variables:
    a        (dim_0, dim_1) float64 0.5707 0.9485 0.3541 0.5987 0.406 0.7992 ...
    b        (dim_0) float64 0.4106 0.2316 0.5804 0.6393 0.5715 0.6463 ...

In [12]:

ds.apply(lambda x: x*2)
Out[12]:
<xray.Dataset>
Dimensions:  (dim_0: 10, dim_1: 3)
Coordinates:
    c        (dim_0) float64 0.9318 0.2899 0.3853 0.6235 0.9436 0.7928 ...
  * dim_0    (dim_0) int64 0 1 2 3 4 5 6 7 8 9
  * dim_1    (dim_1) int64 0 1 2
Data variables:
    a        (dim_0, dim_1) float64 1.141 1.897 0.7081 1.197 0.812 1.598 ...
    b        (dim_0) float64 0.8212 0.4631 1.161 1.279 1.143 1.293 0.3507 ...
```

But if an operation removes the coords from the DataArrays, they are missing from the result as well (notice `c` below). Should the Dataset retain them — either always, or via a `keep_coords` argument, similar to `keep_attrs`?

```python
In [13]: ds = xray.Dataset({
    ...:     'a': pd.DataFrame(pd.np.random.rand(10, 3)),
    ...:     'b': pd.Series(pd.np.random.rand(10))
    ...: })
    ...: ds.coords['c'] = pd.Series(pd.np.random.rand(10))
    ...: ds
Out[13]:
<xray.Dataset>
Dimensions:  (dim_0: 10, dim_1: 3)
Coordinates:
  * dim_0    (dim_0) int64 0 1 2 3 4 5 6 7 8 9
  * dim_1    (dim_1) int64 0 1 2
    c        (dim_0) float64 0.4121 0.2507 0.6326 0.4031 0.6169 0.441 0.1146 ...
Data variables:
    a        (dim_0, dim_1) float64 0.4813 0.2479 0.5158 0.2787 0.06672 ...
    b        (dim_0) float64 0.2638 0.5788 0.6591 0.7174 0.3645 0.5655 ...

In [14]: ds.apply(lambda x: x.to_pandas() * 2)
Out[14]:
<xray.Dataset>
Dimensions:  (dim_0: 10, dim_1: 3)
Coordinates:
  * dim_0    (dim_0) int64 0 1 2 3 4 5 6 7 8 9
  * dim_1    (dim_1) int64 0 1 2
Data variables:
    a        (dim_0, dim_1) float64 0.9627 0.4957 1.032 0.5574 0.1334 0.8289 ...
    b        (dim_0) float64 0.5275 1.158 1.318 1.435 0.7291 1.131 0.1903 ...
```
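
The requested `keep_coords` behaviour can be sketched with plain dicts of numpy arrays rather than real xray objects (the helper and names are illustrative, not the actual API):

```python
import numpy as np

def apply_keep_coords(data_vars, coords, func):
    """Hypothetical sketch of Dataset.apply(..., keep_coords=True):
    apply `func` to each data variable, then re-attach the original
    coords to the result even if `func` stripped them."""
    result = {name: func(var) for name, var in data_vars.items()}
    return result, dict(coords)

data_vars = {"a": np.random.rand(10, 3), "b": np.random.rand(10)}
coords = {"c": np.random.rand(10)}

# Even though `func` returns bare arrays (analogous to `.to_pandas()`
# dropping xray metadata), the coords survive on the result:
doubled, kept = apply_keep_coords(data_vars, coords, lambda x: x * 2)
```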

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/688/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1916391948 PR_kwDOAMm_X85bYlaM 8242 Add modules to `check-untyped` max-sixty 5635139 closed 0     2 2023-09-27T21:56:45Z 2023-09-29T17:43:07Z 2023-09-29T16:39:34Z MEMBER   0 pydata/xarray/pulls/8242

In reviewing https://github.com/pydata/xarray/pull/8241, I realized that we actually want check-untyped-defs, which is a bit less strict but lets us add more modules. I did have to add a couple of ignores; I think that's a reasonable tradeoff for covering big modules like computation.

Errors with this enabled are actual type errors, not just mypy pedantry, so it would be good to get as many modules as possible onto this list...
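
For reference, a per-module override of this sort might look like the following pyproject.toml fragment (the module name is illustrative, not the actual list from this PR; `check_untyped_defs` is the mypy option in question):

```toml
[[tool.mypy.overrides]]
module = ["xarray.core.computation"]
check_untyped_defs = true
```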

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8242/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1878288525 PR_kwDOAMm_X85ZYos5 8139 Fix pandas' `interpolate(fill_value=)` error max-sixty 5635139 closed 0     6 2023-09-02T02:41:45Z 2023-09-28T16:48:51Z 2023-09-04T18:05:14Z MEMBER   0 pydata/xarray/pulls/8139

Pandas no longer has a `fill_value` parameter for `interpolate`.

Weirdly I wasn't getting this locally, on pandas 2.1.0, only in CI on https://github.com/pydata/xarray/actions/runs/6054400455/job/16431747966?pr=8138.

Removing it passes locally; let's see whether this works in CI.
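
A minimal sketch of the fix described above — per the PR, newer pandas rejects an explicit `fill_value=` argument to `interpolate`, so the call simply omits it:

```python
import pandas as pd

s = pd.Series([1.0, None, 3.0])

# Previously the call passed fill_value=...; dropping it keeps the
# default linear interpolation behaviour:
result = s.interpolate(method="linear")
# → [1.0, 2.0, 3.0]
```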

Would close #8125

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8139/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull

```sql
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
```