
issues


506 rows where user = 5635139 sorted by updated_at descending




type 2

  • pull 372
  • issue 134

state 2

  • closed 482
  • open 24

repo 1

  • xarray 506
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
2272299822 PR_kwDOAMm_X85uL82a 8989 Skip flaky `test_open_mfdataset_manyfiles` test max-sixty 5635139 closed 0     0 2024-04-30T19:24:41Z 2024-04-30T20:27:04Z 2024-04-30T19:46:34Z MEMBER   0 pydata/xarray/pulls/8989

Don't just xfail, and not only on Windows, since it can crash the worker

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8989/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2271670475 PR_kwDOAMm_X85uJ5Er 8988 Remove `.drop` warning allow max-sixty 5635139 closed 0     0 2024-04-30T14:39:35Z 2024-04-30T19:26:17Z 2024-04-30T19:26:16Z MEMBER   0 pydata/xarray/pulls/8988  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8988/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2271652603 PR_kwDOAMm_X85uJ122 8987 Add notes on when to add ignores to warnings max-sixty 5635139 closed 0     0 2024-04-30T14:34:52Z 2024-04-30T14:56:47Z 2024-04-30T14:56:46Z MEMBER   0 pydata/xarray/pulls/8987  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8987/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1250939008 I_kwDOAMm_X85Kj9CA 6646 `dim` vs `dims` max-sixty 5635139 closed 0     4 2022-05-27T16:15:02Z 2024-04-29T18:24:56Z 2024-04-29T18:24:56Z MEMBER      

What is your issue?

I've recently been hit with this when experimenting with xr.dot and xr.corr — xr.dot takes dims, and xr.corr takes dim. Because they each take multiple arrays as positional args, kwargs are more conventional.

Should we standardize on one of these?
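One common route to standardizing is a deprecation shim that accepts the old kwarg with a warning; a minimal sketch, where the helper `_deprecate_dims` and the toy `dot` are hypothetical stand-ins rather than xarray's actual code:

```python
import warnings

def _deprecate_dims(dims, dim):
    # Hypothetical shim: accept the legacy `dims` kwarg, but warn and
    # funnel its value into `dim` so callers can migrate gradually.
    if dims is not None:
        warnings.warn(
            "`dims` is deprecated; use `dim` instead", FutureWarning, stacklevel=3
        )
        return dims
    return dim

def dot(*arrays, dim=None, dims=None):
    dim = _deprecate_dims(dims, dim)
    return dim  # stand-in for the real reduction over `dim`

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    assert dot(dim="a") == "a" and not caught      # new spelling: silent
    assert dot(dims="a") == "a"                    # old spelling: still works...
    assert issubclass(caught[-1].category, FutureWarning)  # ...but warns
```

The shim keeps both spellings working for a release or two, so downstream code breaks only after a visible warning period.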

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6646/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2268058661 PR_kwDOAMm_X85t9f5f 8982 Switch all methods to `dim` max-sixty 5635139 closed 0     0 2024-04-29T03:42:34Z 2024-04-29T18:24:56Z 2024-04-29T18:24:55Z MEMBER   0 pydata/xarray/pulls/8982

I think this is the final set of methods

  • [x] Closes #6646
  • [ ] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8982/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2267810980 PR_kwDOAMm_X85t8q4s 8981 Enable ffill for datetimes max-sixty 5635139 closed 0     5 2024-04-28T20:53:18Z 2024-04-29T18:09:48Z 2024-04-28T23:02:11Z MEMBER   0 pydata/xarray/pulls/8981

Notes inline. Would fix #4587

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8981/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2262478932 PR_kwDOAMm_X85tqpUi 8974 Raise errors on new warnings from within xarray max-sixty 5635139 closed 0     2 2024-04-25T01:50:48Z 2024-04-29T12:18:42Z 2024-04-29T02:50:21Z MEMBER   0 pydata/xarray/pulls/8974

Notes are inline.

  • [x] Closes https://github.com/pydata/xarray/issues/8494
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst

Done with some help from an LLM — quite good for doing tedious tasks that we otherwise wouldn't want to do — can paste in all the warnings output and get a decent start on rules for exclusions
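For reference, the usual mechanism for this kind of change is pytest's `filterwarnings` setting, which uses the stdlib `action:message:category:module` filter syntax; a hedged sketch of what such a configuration can look like (the specific patterns below are illustrative, not the ones this PR added):

```toml
[tool.pytest.ini_options]
filterwarnings = [
    # Escalate any warning emitted from within xarray's own modules to an
    # error, so new warnings fail CI:
    "error:::xarray.*",
    # ...while known-noisy warnings can still be ignored case by case
    # (illustrative pattern, not a real exclusion from this PR):
    "ignore:example noisy message:UserWarning",
]
```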

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8974/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1997537503 PR_kwDOAMm_X85fqp3A 8459 Check for aligned chunks when writing to existing variables max-sixty 5635139 closed 0     5 2023-11-16T18:56:06Z 2024-04-29T03:05:36Z 2024-03-29T14:35:50Z MEMBER   0 pydata/xarray/pulls/8459

While I don't feel super confident that this is designed to protect against any bugs, it does solve the immediate problem in #8371, by hoisting the encoding check above the code that runs only for new variables. The encoding check is somewhat implicit, so it was easy to miss previously.

  • [x] Closes #8371,
  • [x] Closes #8882
  • [x] Closes #8876
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8459/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2244681150 PR_kwDOAMm_X85suxIl 8947 Add mypy to dev dependencies max-sixty 5635139 closed 0     0 2024-04-15T21:39:19Z 2024-04-17T16:39:23Z 2024-04-17T16:39:22Z MEMBER   0 pydata/xarray/pulls/8947  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8947/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1960332384 I_kwDOAMm_X8502Exg 8371 Writing to regions with unaligned chunks can lose data max-sixty 5635139 closed 0     20 2023-10-25T01:17:59Z 2024-03-29T14:35:51Z 2024-03-29T14:35:51Z MEMBER      

What happened?

Writing with region with chunks that aren't aligned can lose data.

I've recreated an example below. While it's unlikely that folks are passing different values to .chunk for the template vs. the regions, I had an "auto" chunk, which can then set different chunk values.

(FWIW, this was fairly painful, and I managed to lose a lot of time by not noticing this, and then not really considering this could happen as I was trying to debug. I think we should really strive to ensure that we don't lose data / incorrectly report that we've successfully written data...)

What did you expect to happen?

If there's a risk of data loss, raise an error...

Minimal Complete Verifiable Example

```python
ds = xr.DataArray(
    np.arange(120).reshape(4, 3, -1), dims=list("abc")
).rename("var1").to_dataset().chunk(2)

ds
# <xarray.Dataset>
# Dimensions:  (a: 4, b: 3, c: 10)
# Dimensions without coordinates: a, b, c
# Data variables:
#     var1     (a, b, c) int64 dask.array<chunksize=(2, 2, 2), meta=np.ndarray>

def write(ds):
    ds.chunk(5).to_zarr('foo.zarr', compute=False, mode='w')
    for r in range(ds.sizes['a']):
        ds.chunk(3).isel(a=[r]).to_zarr('foo.zarr', region=dict(a=slice(r, r + 1)))

def read(ds):
    result = xr.open_zarr('foo.zarr')
    assert result.compute().identical(ds)
    print(result.chunksizes, ds.chunksizes)

write(ds); read(ds)
# AssertionError

xr.open_zarr('foo.zarr').compute()['var1']
# <xarray.DataArray 'var1' (a: 4, b: 3, c: 10)>
# array([[[  0,   0,   0,   3,   4,   5,   0,   0,   0,   9],
#         [  0,   0,   0,  13,  14,  15,   0,   0,   0,  19],
#         [  0,   0,   0,  23,  24,  25,   0,   0,   0,  29]],
#
#        [[ 30,  31,  32,   0,   0,  35,  36,  37,  38,   0],
#         [ 40,  41,  42,   0,   0,  45,  46,  47,  48,   0],
#         [ 50,  51,  52,   0,   0,  55,  56,  57,  58,   0]],
#
#        [[ 60,  61,  62,   0,   0,  65,   0,   0,   0,  69],
#         [ 70,  71,  72,   0,   0,  75,   0,   0,   0,  79],
#         [ 80,  81,  82,   0,   0,  85,   0,   0,   0,  89]],
#
#        [[  0,   0,   0,  93,  94,  95,  96,  97,  98,   0],
#         [  0,   0,   0, 103, 104, 105, 106, 107, 108,   0],
#         [  0,   0,   0, 113, 114, 115, 116, 117, 118,   0]]])
# Dimensions without coordinates: a, b, c
```
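The failure mode in the example can be reproduced with nothing but plain lists, assuming a store that is only writable in whole chunks and two uncoordinated workers that each read-modify-write from a stale snapshot (a stand-in for the zarr write path, not xarray's actual code):

```python
CHUNK = 4
store = [0] * 8            # the "on-disk" array, written in chunks of 4
data = list(range(1, 9))   # the values the two workers intend to write

def write_region(start, stop, snapshot):
    """Write data[start:stop] at whole-chunk granularity, reading the
    untouched parts of each chunk from a possibly stale snapshot, as a
    parallel worker with no coordination would."""
    written = {}
    for c in range(start // CHUNK, (stop - 1) // CHUNK + 1):
        lo = c * CHUNK
        chunk = snapshot[lo:lo + CHUNK]     # stale read of the whole chunk
        for i in range(max(lo, start), min(lo + CHUNK, stop)):
            chunk[i - lo] = data[i]
        written[c] = chunk
    return written

# Both workers snapshot the empty store; the region boundary at index 6
# falls inside chunk 1 (indices 4..7), so regions and chunks are unaligned.
snap = store[:]
w1 = write_region(0, 6, snap)   # touches chunks 0 and 1
w2 = write_region(6, 8, snap)   # touches chunk 1 too

# Whichever write of chunk 1 lands last clobbers the other's values:
for c, chunk in {**w1, **w2}.items():
    store[c * CHUNK:(c + 1) * CHUNK] = chunk

print(store)   # [1, 2, 3, 4, 0, 0, 7, 8]: values at indices 4 and 5 are lost
```

With regions aligned to chunk boundaries, no two workers ever rewrite the same chunk, and the race disappears.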

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS ------------------ commit: ccc8f9987b553809fb6a40c52fa1a8a8095c8c5f python: 3.9.18 (main, Aug 24 2023, 21:19:58) [Clang 14.0.3 (clang-1403.0.22.14.1)] python-bits: 64 OS: Darwin OS-release: 22.6.0 machine: arm64 processor: arm byteorder: little LC_ALL: en_US.UTF-8 LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 2023.10.2.dev10+gccc8f998 pandas: 2.1.1 numpy: 1.25.2 scipy: 1.11.1 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.16.0 cftime: None nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: None dask: 2023.4.0 distributed: 2023.7.1 matplotlib: 3.5.1 cartopy: None seaborn: None numbagg: 0.2.3.dev30+gd26e29e fsspec: 2021.11.1 cupy: None pint: None sparse: None flox: None numpy_groupies: 0.9.19 setuptools: 68.1.2 pip: 23.2.1 conda: None pytest: 7.4.0 mypy: 1.6.0 IPython: 8.15.0 sphinx: 4.3.2
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8371/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2110888925 I_kwDOAMm_X8590Zvd 8690 Add `nbytes` to repr? max-sixty 5635139 closed 0     9 2024-01-31T20:13:59Z 2024-02-19T22:18:47Z 2024-02-07T20:47:38Z MEMBER      

Is your feature request related to a problem?

Would having the nbytes value in the Dataset repr be reasonable?

I frequently find myself logging this separately. For example:

```diff
 <xarray.Dataset>
 Dimensions:  (lat: 25, time: 2920, lon: 53)
 Coordinates:
   * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
   * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
   * time     (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
 Data variables:
-    air      (time, lat, lon) float32 dask.array<chunksize=(2920, 25, 53), meta=np.ndarray>
+    air      (time, lat, lon) float32 15MB dask.array<chunksize=(2920, 25, 53), meta=np.ndarray>
 Attributes:
     Conventions:  COARDS
     title:        4x daily NMC reanalysis (1948)
     description:  Data is from NMC initialized reanalysis\n(4x/day). These a...
     platform:     Model
     references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
```
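The `15MB` annotation above is a human-readable byte count; a small formatter along these lines is enough to produce it (a hypothetical sketch, not the formatting xarray actually adopted):

```python
def format_bytes(nbytes: int) -> str:
    # Hypothetical sketch: decimal units, short suffixes, one decimal
    # place for small magnitudes. Not xarray's actual implementation.
    size = float(nbytes)
    for unit in ("B", "kB", "MB", "GB", "TB"):
        if size < 1000 or unit == "TB":
            break
        size /= 1000
    return f"{size:.0f}{unit}" if size >= 10 or unit == "B" else f"{size:.1f}{unit}"

# The `air` variable above: 2920 * 25 * 53 float32 values, 4 bytes each
print(format_bytes(2920 * 25 * 53 * 4))  # 15MB
```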

Describe the solution you'd like

No response

Describe alternatives you've considered

Status quo :)

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8690/reactions",
    "total_count": 6,
    "+1": 6,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2128692061 PR_kwDOAMm_X85mkDqu 8735 Remove fsspec exclusion from 2021 max-sixty 5635139 closed 0     1 2024-02-10T19:43:14Z 2024-02-11T00:19:30Z 2024-02-11T00:19:29Z MEMBER   0 pydata/xarray/pulls/8735

Presumably no longer needed

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8735/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2128687154 PR_kwDOAMm_X85mkCum 8734 Silence dask doctest warning max-sixty 5635139 closed 0     0 2024-02-10T19:25:47Z 2024-02-10T23:44:24Z 2024-02-10T23:44:24Z MEMBER   0 pydata/xarray/pulls/8734

Closes #8732. Not the most elegant implementation but it's only temporary

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8734/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1920361792 PR_kwDOAMm_X85bl988 8258 Add a `.drop_attrs` method max-sixty 5635139 open 0     9 2023-09-30T18:42:12Z 2024-02-09T18:49:22Z   MEMBER   0 pydata/xarray/pulls/8258

Part of #3891

~Do we think this is a good idea? I'll add docs & tests if so...~

Ready to go, just needs agreement on whether it's good

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8258/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2126375172 I_kwDOAMm_X85-vekE 8726 PRs requiring approval & merging main? max-sixty 5635139 closed 0     4 2024-02-09T02:35:58Z 2024-02-09T18:23:52Z 2024-02-09T18:21:59Z MEMBER      

What is your issue?

Sorry I haven't been on the calls at all recently (unfortunately the schedule is difficult for me). Maybe this was discussed there? 

PRs now seem to require a separate approval prior to merging. Is there an upside to this? Is there any difference between those who can approve and those who can merge? Otherwise it just seems like more clicking.

PRs also now seem to require merging the latest main prior to merging? I get there's some theoretical value to this, because changes can semantically conflict with each other. But it's extremely rare that this actually happens (can we point to cases?), and it limits the immediacy & throughput of PRs. If the bad outcome does ever happen, we find out quickly when main tests fail and can revert.

(fwiw I wrote a few principles around this down a while ago here; those are much stronger than what I'm suggesting in this issue though)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8726/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2126095122 PR_kwDOAMm_X85mbRG7 8724 Switch `.dt` to raise an `AttributeError` max-sixty 5635139 closed 0     0 2024-02-08T21:26:06Z 2024-02-09T02:21:47Z 2024-02-09T02:21:46Z MEMBER   0 pydata/xarray/pulls/8724

Discussion at #8718

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8724/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1984961987 I_kwDOAMm_X852UB3D 8432 Writing a datetime coord ignores chunks max-sixty 5635139 closed 0     5 2023-11-09T07:00:39Z 2024-01-29T19:12:33Z 2024-01-29T19:12:33Z MEMBER      

What happened?

When writing a coord with a datetime type, the chunking on the coord is ignored, and the whole coord is written as a single chunk. (or at least it can be, I haven't done enough to confirm whether it'll always be...)

This can be quite inconvenient. Any attempt to write to that dataset from a distributed process will have errors, since each process will be attempting to write another process's data, rather than only its region. And less severely, the chunks won't be unified.

Minimal Complete Verifiable Example

```python
ds = xr.tutorial.load_dataset('air_temperature')

(
    ds.chunk()
    .expand_dims(a=1000)
    .assign_coords(
        time2=lambda x: x.time,
        time_int=lambda x: (("time"), np.full(ds.sizes["time"], 1)),
    )
    .chunk(time=10)
    .to_zarr("foo.zarr", mode="w")
)

xr.open_zarr('foo.zarr')
# Note the chunksize=(2920,) vs chunksize=(10,)!
# <xarray.Dataset>
# Dimensions:   (a: 1000, time: 2920, lat: 25, lon: 53)
# Coordinates:
#   * lat       (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
#   * lon       (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
#   * time      (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
#     time2     (time) datetime64[ns] dask.array<chunksize=(2920,), meta=np.ndarray>  # here
#     time_int  (time) int64 dask.array<chunksize=(10,), meta=np.ndarray>  # here
# Dimensions without coordinates: a
# Data variables:
#     air       (a, time, lat, lon) float32 dask.array<chunksize=(1000, 10, 25, 53), meta=np.ndarray>
# Attributes:
#     Conventions:  COARDS
#     description:  Data is from NMC initialized reanalysis\n(4x/day). These a...
#     platform:     Model
#     references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
#     title:        4x daily NMC reanalysis (1948)

xr.open_zarr('foo.zarr').chunks
# ValueError                                Traceback (most recent call last)
# Cell In[13], line 1
# ----> 1 xr.open_zarr('foo.zarr').chunks
#
# File /opt/homebrew/lib/python3.9/site-packages/xarray/core/dataset.py:2567, in Dataset.chunks(self)
#    2552 @property
#    2553 def chunks(self) -> Mapping[Hashable, tuple[int, ...]]:
#    2554     """
#    2555     Mapping from dimension names to block lengths for this dataset's data, or None if
#    2556     the underlying data is not a dask array.
#    (...)
#    2565     xarray.unify_chunks
#    2566     """
# -> 2567     return get_chunksizes(self.variables.values())
#
# File /opt/homebrew/lib/python3.9/site-packages/xarray/core/common.py:2013, in get_chunksizes(variables)
#    2011 for dim, c in v.chunksizes.items():
#    2012     if dim in chunks and c != chunks[dim]:
# -> 2013         raise ValueError(
#    2014             f"Object has inconsistent chunks along dimension {dim}. "
#    2015             "This can be fixed by calling unify_chunks()."
#    2016         )
#    2017     chunks[dim] = c
#    2018 return Frozen(chunks)
#
# ValueError: Object has inconsistent chunks along dimension time. This can be fixed by calling unify_chunks().
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS ------------------ commit: None python: 3.9.18 (main, Nov 2 2023, 16:51:22) [Clang 14.0.3 (clang-1403.0.22.14.1)] python-bits: 64 OS: Darwin OS-release: 22.6.0 machine: arm64 processor: arm byteorder: little LC_ALL: en_US.UTF-8 LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.2 libnetcdf: None xarray: 2023.10.1 pandas: 2.1.1 numpy: 1.26.1 scipy: 1.11.1 netCDF4: None pydap: None h5netcdf: 1.1.0 h5py: 3.8.0 Nio: None zarr: 2.16.0 cftime: 1.6.2 nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: 1.3.7 dask: 2023.5.0 distributed: 2023.5.0 matplotlib: 3.6.0 cartopy: None seaborn: 0.12.2 numbagg: 0.6.0 fsspec: 2022.8.2 cupy: None pint: 0.22 sparse: 0.14.0 flox: 0.8.1 numpy_groupies: 0.9.22 setuptools: 68.2.2 pip: 23.3.1 conda: None pytest: 7.4.0 mypy: 1.6.1 IPython: 8.14.0 sphinx: 5.2.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8432/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2099077744 PR_kwDOAMm_X85k_vqU 8661 Add `dev` dependencies to `pyproject.toml` max-sixty 5635139 closed 0     1 2024-01-24T20:48:55Z 2024-01-25T06:24:37Z 2024-01-25T06:24:36Z MEMBER   0 pydata/xarray/pulls/8661  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8661/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2097231358 PR_kwDOAMm_X85k5dSd 8648 xfail another test on windows max-sixty 5635139 closed 0     0 2024-01-24T01:04:01Z 2024-01-24T01:23:26Z 2024-01-24T01:23:26Z MEMBER   0 pydata/xarray/pulls/8648

As ever, very open to approaches to fix these. But unless we can fix them, xfailing them seems like the most reasonable solution

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8648/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2089331658 PR_kwDOAMm_X85keyUs 8624 Use ddof in `numbagg>=0.7.0` for aggregations max-sixty 5635139 closed 0     0 2024-01-19T00:23:15Z 2024-01-23T02:25:39Z 2024-01-23T02:25:38Z MEMBER   0 pydata/xarray/pulls/8624  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8624/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2094956413 PR_kwDOAMm_X85kxwAk 8643 xfail zarr test on Windows max-sixty 5635139 closed 0     0 2024-01-22T23:24:12Z 2024-01-23T00:40:29Z 2024-01-23T00:40:28Z MEMBER   0 pydata/xarray/pulls/8643

I see this failing quite a lot of the time...

Ofc open to a proper solution but in the meantime setting this to xfail

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8643/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092299525 PR_kwDOAMm_X85kozmg 8630 Use `T_DataArray` in `Weighted` max-sixty 5635139 closed 0     0 2024-01-21T01:18:14Z 2024-01-22T04:28:07Z 2024-01-22T04:28:07Z MEMBER   0 pydata/xarray/pulls/8630

Allows subtypes.

(I had this in my git stash, so committing it...)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8630/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092855603 PR_kwDOAMm_X85kqlH4 8639 Silence deprecation warning from `.dims` in tests max-sixty 5635139 closed 0     1 2024-01-22T00:25:07Z 2024-01-22T02:04:54Z 2024-01-22T02:04:53Z MEMBER   0 pydata/xarray/pulls/8639  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8639/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092790802 PR_kwDOAMm_X85kqX8y 8637 xfail a cftime test max-sixty 5635139 closed 0     0 2024-01-21T21:43:59Z 2024-01-21T22:00:59Z 2024-01-21T22:00:58Z MEMBER   0 pydata/xarray/pulls/8637

https://github.com/pydata/xarray/pull/8636#issuecomment-1902775153

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8637/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092777417 PR_kwDOAMm_X85kqVIH 8636 xfail another dask/pyarrow test max-sixty 5635139 closed 0     1 2024-01-21T21:26:19Z 2024-01-21T21:42:22Z 2024-01-21T21:42:21Z MEMBER   0 pydata/xarray/pulls/8636

Unsure why this wasn't showing prior -- having tests fail in the good state does make it much more difficult to ensure everything is fixed before merging.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8636/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2089351473 PR_kwDOAMm_X85ke2qd 8625 Don't show stdlib paths for `user_level_warnings` max-sixty 5635139 closed 0     0 2024-01-19T00:45:14Z 2024-01-21T21:08:40Z 2024-01-21T21:08:39Z MEMBER   0 pydata/xarray/pulls/8625

Was previously seeing:

<frozen _collections_abc>:801: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.

Now:

/Users/maximilian/workspace/xarray/xarray/tests/test_dataset.py:701: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
    assert ds.dims == ds.sizes

It's a heuristic, so not perfect, but I think very likely to be accurate. Any contrary cases very welcome...
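The heuristic can be sketched in isolation: given the chain of frame filenames, innermost first, pick the first one that is not stdlib or frozen code (a simplified stand-in for the actual `user_level_warnings` logic, not a copy of it):

```python
import sysconfig

def first_user_frame(filenames):
    """Return the 1-based position (i.e. a candidate `stacklevel`) of the
    first frame, innermost first, that is not stdlib/frozen code.
    Simplified stand-in for the heuristic described above."""
    stdlib = sysconfig.get_paths()["stdlib"]
    for level, fname in enumerate(filenames, start=1):
        internal = fname.startswith(stdlib) or fname.startswith("<frozen")
        if not internal:
            return level
    return 1  # everything looked internal; fall back to the caller

frames = [
    "<frozen _collections_abc>",                    # stdlib frame, skipped
    sysconfig.get_paths()["stdlib"] + "/abc.py",    # stdlib frame, skipped
    "/Users/maximilian/workspace/xarray/xarray/tests/test_dataset.py",  # user code
]
print(first_user_frame(frames))  # 3
```

This is exactly the kind of heuristic that can misfire (e.g. user code living under the stdlib prefix), which is why the PR invites contrary cases.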

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8625/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092762468 PR_kwDOAMm_X85kqSLW 8635 xfail pyarrow test max-sixty 5635139 closed 0     0 2024-01-21T20:42:50Z 2024-01-21T21:03:35Z 2024-01-21T21:03:34Z MEMBER   0 pydata/xarray/pulls/8635

Sorry for the repeated PR -- some tests passed but some failed without pyarrow installed. So this xfails the test for the moment

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8635/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092747686 PR_kwDOAMm_X85kqPTB 8634 Workaround broken test from pyarrow max-sixty 5635139 closed 0     0 2024-01-21T20:01:51Z 2024-01-21T20:18:23Z 2024-01-21T20:18:22Z MEMBER   0 pydata/xarray/pulls/8634

While fixing the previous issue, I introduced another (but didn't see it because of the errors from the test suite, probably should have looked closer...)

This doesn't fix the behavior, but I think it's minor so fine to push off. I do prioritize getting the tests back to a state where pass vs. failure is meaningful.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8634/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2092300888 PR_kwDOAMm_X85koz3r 8631 Partially fix doctests max-sixty 5635139 closed 0     1 2024-01-21T01:25:02Z 2024-01-21T01:33:43Z 2024-01-21T01:31:46Z MEMBER   0 pydata/xarray/pulls/8631

Currently getting a error without pyarrow in CI: https://github.com/pydata/xarray/actions/runs/7577666145/job/20693665924

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8631/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1923361961 I_kwDOAMm_X85ypCyp 8263 Surprising `.groupby` behavior with float index max-sixty 5635139 closed 0     0 2023-10-03T05:50:49Z 2024-01-08T01:05:25Z 2024-01-08T01:05:25Z MEMBER      

What is your issue?

We raise an error on grouping without supplying dims, but not for float indexes — is this intentional or an oversight?

This is without flox installed

```python
da = xr.tutorial.open_dataset("air_temperature")['air']

da.drop_vars('lat').groupby('lat').sum()
```

```
ValueError                                Traceback (most recent call last)
Cell In[8], line 1
----> 1 da.drop_vars('lat').groupby('lat').sum()
...
ValueError: cannot reduce over dimensions ['lat']. expected either '...' to reduce over all dimensions or one or more of ('time', 'lon').
```

But with a float index, we don't raise:

```python
da.groupby('lat').sum()
```

...returns the original array:

```
Out[15]:
<xarray.DataArray 'air' (time: 2920, lat: 25, lon: 53)>
array([[[296.29   , 296.79   , 297.1    , ..., 296.9    , 296.79   , 296.6    ],
        [295.9    , 296.19998, 296.79   , ..., 295.9    , 295.9    , 295.19998],
        [296.6    , 296.19998, 296.4    , ..., 295.4    , 295.1    , 294.69998],
        ...
```

And if we try this with a non-float index, we get the error again:

```python
da.groupby('time').sum()
```

```
ValueError: cannot reduce over dimensions ['time']. expected either '...' to reduce over all dimensions or one or more of ('lat', 'lon').
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8263/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1916677049 I_kwDOAMm_X85yPiu5 8245 Tools for writing distributed zarrs max-sixty 5635139 open 0     0 2023-09-28T04:25:45Z 2024-01-04T00:15:09Z   MEMBER      

What is your issue?

There seems to be a common pattern for writing zarrs from a distributed set of machines, in parallel. It's somewhat described in the prose of the io docs. Quoting:

  • Creating the template — "the first step is creating an initial Zarr store without writing all of its array data. This can be done by first creating a Dataset with dummy values stored in dask, and then calling to_zarr with compute=False to write only metadata to Zarr"
  • Writing out each region from workers — "a Zarr store with the correct variable shapes and attributes exists that can be filled out by subsequent calls to to_zarr. The region provides a mapping from dimension names to Python slice objects indicating where the data should be written (in index space, not coordinate space)"

I've been using this fairly successfully recently. It's much better than writing hundreds or thousands of data variables, since many small data variables create a huge number of files.

Are there some tools we can provide to make this easier? Some ideas:

  • [ ] compute=False is arguably a less-than-obvious kwarg meaning "write metadata". Maybe this should be a method, maybe it's a candidate for renaming? Or maybe make_template can be an abstraction over it. Something like xarray_beam.make_template to make the template from a Dataset?
      • Or from an array of indexes?
      • https://github.com/pydata/xarray/issues/8343
      • https://github.com/pydata/xarray/pull/8460
  • [ ] What happens if one worker's data isn't aligned on some dimensions? Will that write to the wrong location? Could we offer an option, similar to the above, to reindex on the template dimensions?
  • [ ] When writing a region, we need to drop other vars. Can we offer this as a kwarg? Occasionally I'll add a dimension with an index to a dataset, run the function to write it — and it'll fail, because I forgot to add that index to the .drop_vars call that precedes the write. When we're writing a template, all the indexes are written up front anyway. (edit: #6260)
      • https://github.com/pydata/xarray/pull/8460

More minor papercuts:

  • [ ] I've hit an issue where writing a region seemed to cause the worker to attempt to load the whole array into memory — can we offer guarantees for when (non-metadata) data will be loaded during to_zarr?
  • [ ] How about adding raise_if_dask_computes to our public API? The alternative I've been doing is watching htop and exiting if I see memory ballooning, which is less cerebral...
  • [ ] It doesn't seem easy to write coords on a DataArray. For example, writing xr.tutorial.load_dataset('air_temperature').assign_coords(lat2=da.lat + 2, a=(('lon',), ['a'] * len(da.lon))).chunk().to_zarr('foo.zarr', compute=False) will cause the non-index coords to be written as empty. But writing them separately conflicts with having a single variable. Currently I manually load each coord before writing, which is not super-friendly.

Some things that were in the list here, as they've been completed!!

  • [x] Requiring region to be specified as an int range can be inconvenient — would it be feasible to have a function that grabs the template metadata, calculates the region ints, and then calculates the implied indexes?
      • Edit: suggested at https://github.com/pydata/xarray/issues/7702
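The completed item, computing the integer region from coordinate values, reduces to an index lookup; a toy sketch over plain lists (the helper name `region_slice` is hypothetical, and xarray ultimately grew `region="auto"` for this):

```python
def region_slice(template_coord, part_coord):
    """Toy version of the idea above: map a worker's coordinate values to
    the integer `region` slice that to_zarr expects. Assumes part_coord
    is a contiguous run within template_coord; raises otherwise rather
    than silently writing to the wrong location."""
    start = template_coord.index(part_coord[0])
    stop = start + len(part_coord)
    if template_coord[start:stop] != part_coord:
        raise ValueError("part is not a contiguous run of the template coordinate")
    return slice(start, stop)

time = list(range(100))                          # the template's coordinate
print(region_slice(time, list(range(40, 50))))   # slice(40, 50, None)
```

The contiguity check matters: it is the guard against the "one worker's data isn't aligned" hazard raised in the list above.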

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8245/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
    xarray 13221727 issue
1975574237 I_kwDOAMm_X851wN7d 8409 Task graphs on `.map_blocks` with many chunks can be huge max-sixty 5635139 closed 0     6 2023-11-03T07:14:45Z 2024-01-03T04:10:16Z 2024-01-03T04:10:16Z MEMBER      

What happened?

I'm getting task graphs > 1GB, I think possibly because the full indexes are being included in every task?

What did you expect to happen?

Only the relevant sections of the index would be included

Minimal Complete Verifiable Example

```Python
da = xr.tutorial.load_dataset('air_temperature')

# Dropping the index doesn't generally matter that much...
len(cloudpickle.dumps(da.chunk(lat=1, lon=1)))
# 15569320

len(cloudpickle.dumps(da.chunk().drop_vars(da.indexes)))
# 15477313

# But with .map_blocks, it really matters — it's really big with the
# indexes, and about the same size without:
len(cloudpickle.dumps(da.chunk(lat=1, lon=1).map_blocks(lambda x: x)))
# 79307120

len(cloudpickle.dumps(da.chunk(lat=1, lon=1).drop_vars(da.indexes).map_blocks(lambda x: x)))
# 16016173
```
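The scaling can be illustrated with plain pickle: serializing each task separately (as a scheduler would) duplicates the index once per task. A toy illustration, not xarray code:

```python
import pickle

# toy illustration: if every task's payload contains the full index,
# serialized graph size scales with n_tasks * index_size
index = list(range(100_000))
n_tasks = 10

# each task pickled separately, closing over the full index
with_full_index = sum(
    len(pickle.dumps((i, index))) for i in range(n_tasks)
)
# each task only carries its own slice of the index
with_sliced_index = sum(
    len(pickle.dumps((i, index[i : i + 1]))) for i in range(n_tasks)
)
```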

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS ------------------ commit: None python: 3.9.18 (main, Aug 24 2023, 21:19:58) [Clang 14.0.3 (clang-1403.0.22.14.1)] python-bits: 64 OS: Darwin OS-release: 22.6.0 machine: arm64 processor: arm byteorder: little LC_ALL: en_US.UTF-8 LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.2 libnetcdf: None xarray: 2023.10.1 pandas: 2.1.1 numpy: 1.26.1 scipy: 1.11.1 netCDF4: None pydap: None h5netcdf: 1.1.0 h5py: 3.8.0 Nio: None zarr: 2.16.0 cftime: 1.6.2 nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: 1.3.7 dask: 2023.5.0 distributed: 2023.5.0 matplotlib: 3.6.0 cartopy: None seaborn: 0.12.2 numbagg: 0.6.0 fsspec: 2022.8.2 cupy: None pint: 0.22 sparse: 0.14.0 flox: 0.7.2 numpy_groupies: 0.9.22 setuptools: 68.1.2 pip: 23.2.1 conda: None pytest: 7.4.0 mypy: 1.6.1 IPython: 8.14.0 sphinx: 5.2.1
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8409/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2052840951 I_kwDOAMm_X856W933 8566 Use `ddof=1` for `std` & `var` max-sixty 5635139 open 0     2 2023-12-21T17:47:21Z 2023-12-27T16:58:46Z   MEMBER      

What is your issue?

I've discussed this a bunch with @dcherian (though I'm not sure he necessarily agrees, I'll let him comment)

Currently xarray uses ddof=0 for std & var. This is:
  • Rarely what someone actually wants — xarray data is almost always a sample of some underlying distribution, for which ddof=1 is correct
  • Inconsistent with pandas

OTOH:
  • It is consistent with numpy
  • It wouldn't be a painless change — folks who don't read deprecation messages would see values change very slightly

Any thoughts?
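The difference is easy to see with the two libraries' defaults (plain numpy and pandas, no xarray involved):

```python
import numpy as np
import pandas as pd

data = [1.0, 2.0, 3.0]

# numpy defaults to ddof=0 (population std), as xarray does today
np_std = np.std(data)           # sqrt(2/3)
# pandas defaults to ddof=1 (sample std)
pd_std = pd.Series(data).std()  # 1.0
```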

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8566/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
988158051 MDU6SXNzdWU5ODgxNTgwNTE= 5764 Implement __sizeof__ on objects? max-sixty 5635139 open 0     6 2021-09-03T23:36:53Z 2023-12-19T18:23:08Z   MEMBER      

Is your feature request related to a problem? Please describe. Currently ds.nbytes returns the size of the data.

But sys.getsizeof(ds) returns a very small number.

Describe the solution you'd like If we implement __sizeof__ on DataArrays & Datasets, this would work.

I think that would be something like ds.nbytes, plus the size of the ds container, plus maybe attrs if those aren't handled by .nbytes?
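A minimal sketch of the idea, using a hypothetical stand-in class rather than xarray's actual Dataset:

```python
import sys

class FakeDataset:
    """Stand-in for a Dataset holding ``nbytes`` of array data."""

    def __init__(self, nbytes: int):
        self.nbytes = nbytes

    def __sizeof__(self) -> int:
        # container overhead plus the underlying data;
        # sys.getsizeof then adds the GC header on top of this
        return object.__sizeof__(self) + self.nbytes

ds = FakeDataset(10_000)
```

With `__sizeof__` defined, `sys.getsizeof(ds)` reflects the data rather than just the tiny Python object header.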

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5764/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  reopened xarray 13221727 issue
2033367994 PR_kwDOAMm_X85hj9np 8533 Offer a fixture for unifying DataArray & Dataset tests max-sixty 5635139 closed 0     2 2023-12-08T22:06:28Z 2023-12-18T21:30:41Z 2023-12-18T21:30:40Z MEMBER   0 pydata/xarray/pulls/8533

Some tests are literally copy & pasted between DataArray & Dataset tests. This change allows them to use a single test. Not everything will be able to use this — sometimes we want to check specifics — but some will; I've changed the .cumulative tests to use this fixture.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8533/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1977661256 I_kwDOAMm_X8514LdI 8414 Is there any way of having `.map_blocks` be even more opaque to dask? max-sixty 5635139 closed 0     23 2023-11-05T06:56:43Z 2023-12-12T18:14:57Z 2023-12-12T18:14:57Z MEMBER      

Is your feature request related to a problem?

Currently I have a workload which does something a bit like:

```python
ds = open_zarr(source)
(
    ds.assign(
        x=ds.foo * ds.bar,
        y=ds.foo + ds.bar,
    ).to_zarr(dest)
)
```

(the actual calc is a bit more complicated! And while I don't have a MVCE of the full calc, I pasted a task graph below)

Dask — while very impressive in many ways — handles this extremely badly, because it attempts to load the whole of ds into memory before writing out any chunks. There are lots of issues on this in the dask repo; it seems like an intractable problem for dask.

Describe the solution you'd like

I was hoping to make the internals of this task opaque to dask, so it became a much dumber task runner — just map over the blocks, running the function and writing the result, block by block. I thought I had some success with .map_blocks last week — the internals of the calc are now opaque at least. But the dask cluster is falling over again, I think because the write is seen as a separate task.

Is there any way to make the write more opaque too?

Describe alternatives you've considered

I've built a homegrown thing which is really hacky which does this on a custom scheduler — just runs the functions and writes with region. I'd much prefer to use & contribute to the broader ecosystem...
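That homegrown "dumb task runner" idea can be sketched without dask at all. Here is a toy stand-in where a numpy array plays the role of the zarr store, and each block is computed and written immediately so nothing is retained past its write:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def process_and_write(out, src, sl):
    # compute one block and write it straight to the destination;
    # the result is never held beyond this call
    out[sl] = src[sl] * 2 + 1

src = np.arange(100.0)
out = np.empty_like(src)
blocks = [slice(i, i + 10) for i in range(0, 100, 10)]

with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(lambda sl: process_and_write(out, src, sl), blocks))
```

In the real workload, `out[sl] = ...` would be a `to_zarr(..., region=...)` call per block.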

Additional context

(It's also possible I'm making some basic error — and I do remember it working much better last week — so please feel free to direct me / ask me for more examples, if this doesn't ring true)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8414/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2034575163 PR_kwDOAMm_X85hn4Pn 8539 Filter out doctest warning max-sixty 5635139 closed 0     11 2023-12-10T23:11:36Z 2023-12-12T06:37:54Z 2023-12-11T21:00:01Z MEMBER   0 pydata/xarray/pulls/8539

Trying to fix #8537. Not sure it'll work and can't test locally so seeing if it passes CI

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8539/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2036491126 PR_kwDOAMm_X85hud-m 8543 Fix incorrect indent max-sixty 5635139 closed 0     0 2023-12-11T20:41:32Z 2023-12-11T20:43:26Z 2023-12-11T20:43:09Z MEMBER   0 pydata/xarray/pulls/8543

edit: my mistake, this is intended

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8543/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
866826033 MDU6SXNzdWU4NjY4MjYwMzM= 5215 Add an Cumulative aggregation, similar to Rolling max-sixty 5635139 closed 0     6 2021-04-24T19:59:49Z 2023-12-08T22:06:53Z 2023-12-08T22:06:53Z MEMBER      

Is your feature request related to a problem? Please describe.

Pandas has a .expanding aggregation, which is basically rolling with a full lookback. I often end up supplying rolling with the length of the dimension, and this is some nice sugar for that.

Describe the solution you'd like Basically the same as pandas — a .expanding method that returns an Expanding class, which implements the same methods as a Rolling class.

Describe alternatives you've considered Some options:
  • This
  • Don't add anything — the sugar isn't worth the additional API.
  • Go full out and write specialized expanding algos — which will be faster since they don't have to keep track of the window. But not that much faster, likely not worth the effort.
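The proposed sugar reduces to an identity that already holds in pandas; a quick check of the equivalence the issue describes:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0])

# .expanding() is a rolling window spanning the whole series with
# min_periods=1 — exactly the sugar proposed here
expanding_mean = s.expanding().mean()
rolling_mean = s.rolling(len(s), min_periods=1).mean()
```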

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5215/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2022202767 PR_kwDOAMm_X85g97hj 8512 Add Cumulative aggregation max-sixty 5635139 closed 0     1 2023-12-02T21:03:13Z 2023-12-08T22:06:53Z 2023-12-08T22:06:52Z MEMBER   0 pydata/xarray/pulls/8512

Closes #5215

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8512/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2019645081 I_kwDOAMm_X854YVaZ 8498 Allow some notion of ordering in Dataset dims max-sixty 5635139 closed 0     5 2023-11-30T22:57:23Z 2023-12-08T19:22:56Z 2023-12-08T19:22:55Z MEMBER      

What is your issue?

Currently a DataArray's dims are ordered, while a Dataset's are not.

Do we gain anything from having unordered dims in a Dataset? Could we have an ordering without enforcing it on every variable?

Here's one proposal, with fairly wide error-bars:
  • Datasets have a dim order, which is set at construction time or through .transpose
    • Currently .transpose changes the order of each variable's dims, but not the dataset's
    • If dims aren't supplied, we can just use the first variable's
  • Variables don't have to conform to that order — .assign(foo=differently_ordered) maintains the differently ordered dims. So this doesn't limit any current functionality.
  • When there are transformations which change dim ordering, Xarray is "allowed" to transpose variables to the dataset's ordering. Currently Xarray is "allowed" to change dim order arbitrarily — for example to put a core dim last. IIUC, we'd prefer to set a non-arbitrary order, but we don't have one to reference.
  • This would remove a bunch of boilerplate from methods that save the ordering, run .apply_ufunc and then reorder in the original order[^1]

What do folks think?

[^1]: though also we could do this in .apply_ufunc

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8498/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  not_planned xarray 13221727 issue
2026963757 I_kwDOAMm_X8540QMt 8522 Test failures on `main` max-sixty 5635139 closed 0     7 2023-12-05T19:22:01Z 2023-12-06T18:48:24Z 2023-12-06T17:28:13Z MEMBER      

What is your issue?

Any ideas what could be causing these? I can't immediately reproduce locally.

https://github.com/pydata/xarray/actions/runs/7105414268/job/19342564583

``` Error: TestDataArray.test_computation_objects[int64-method_groupby_bins-data]

AssertionError: Left and right DataArray objects are not close

Differing values: L <Quantity([[ nan nan 1. 1. ] [2. 2. 3. 3. ] [4. 4. 5. 5. ] [6. 6. 7. 7. ] [8. 8. 9. 9.333333]], 'meter')> R <Quantity([[0. 0. 1. 1. ] [2. 2. 3. 3. ] [4. 4. 5. 5. ] [6. 6. 7. 7. ] [8. 8. 9. 9.333333]], 'meter')> ```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8522/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 1,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1192478248 I_kwDOAMm_X85HE8Yo 6440 Add `eval`? max-sixty 5635139 closed 0     0 2022-04-05T00:57:00Z 2023-12-06T17:52:47Z 2023-12-06T17:52:47Z MEMBER      

Is your feature request related to a problem?

We currently have query, which can run a numexpr string using eval.

Describe the solution you'd like

Should we add an eval method itself? I find that when building something for the command line, allowing people to pass an eval-able expression can be a good interface.

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6440/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1410303926 PR_kwDOAMm_X85A3Xqk 7163 Add `eval` method to Dataset max-sixty 5635139 closed 0     3 2022-10-15T22:12:23Z 2023-12-06T17:52:47Z 2023-12-06T17:52:46Z MEMBER   0 pydata/xarray/pulls/7163

This needs proper tests & docs, but would this be a good idea?

A couple of examples are in the docstring. It's mostly just deferring to pandas' excellent eval method.

  • [x] Closes #6440 (edit)
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7163/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
    xarray 13221727 pull
2019309352 PR_kwDOAMm_X85g0KvI 8493 Use numbagg for `rolling` methods max-sixty 5635139 closed 0     3 2023-11-30T18:52:08Z 2023-12-05T19:08:32Z 2023-12-05T19:08:31Z MEMBER   0 pydata/xarray/pulls/8493

A couple of tests are failing for the multi-dimensional case, which I'll fix before merge.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8493/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
907845790 MDU6SXNzdWU5MDc4NDU3OTA= 5413 Does the PyPI release job fire twice for each release? max-sixty 5635139 closed 0     2 2021-06-01T04:01:17Z 2023-12-04T19:22:32Z 2023-12-04T19:22:32Z MEMBER      

I was attempting to copy the great work here for numbagg and spotted this! Do we fire twice for each release? Maybe that's fine though?

https://github.com/pydata/xarray/actions/workflows/pypi-release.yaml

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5413/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
929840699 MDU6SXNzdWU5Mjk4NDA2OTk= 5531 Keyword only args for arguments like "drop" max-sixty 5635139 closed 0     12 2021-06-25T05:24:25Z 2023-12-04T19:22:24Z 2023-12-04T19:22:23Z MEMBER      

Is your feature request related to a problem? Please describe.

A method like .reset_index has a signature .reset_index(dims_or_levels, drop=False).

This means that passing .reset_index("x", "y") is actually like passing .reset_index("x", True), which is silent and confusing.

Describe the solution you'd like Move to kwarg-only arguments for these; like .reset_index(dims_or_levels, *, drop=False).

But we probably need a deprecation cycle, which will require some work.

Describe alternatives you've considered Not having a deprecation cycle? I imagine it's fairly rare to not pass the kwarg.
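A toy pair of signatures showing the failure mode and the fix (hypothetical functions, not xarray's real reset_index):

```python
def reset_index_old(dims_or_levels, drop=False):
    return dims_or_levels, drop

def reset_index_new(dims_or_levels, *, drop=False):
    return dims_or_levels, drop

# the old signature silently interprets "y" as drop=True (truthy)
silently_wrong = reset_index_old("x", "y")

# the keyword-only signature refuses positional drop with a TypeError
try:
    reset_index_new("x", "y")
    raised = False
except TypeError:
    raised = True
```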

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5531/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1165654699 I_kwDOAMm_X85Fenqr 6349 Rolling exp correlation max-sixty 5635139 closed 0     1 2022-03-10T19:51:57Z 2023-12-04T19:13:35Z 2023-12-04T19:13:34Z MEMBER      

Is your feature request related to a problem?

I'd like an exponentially moving correlation coefficient

Describe the solution you'd like

I think we could add a rolling_exp.corr method fairly easily — i.e. just in python, no need to add anything to numbagg. With ewma here meaning rolling_exp(...).mean:
  • ewma(A * B) - ewma(A) * ewma(B) for the rolling covariance
  • divided by sqrt((ewma(A**2) - ewma(A)**2) * (ewma(B**2) - ewma(B)**2)), the square root of the product of the variances

We could also add a flag for cosine similarity, which wouldn't remove the mean. We could also add .var & .std & .covar as their own methods.

I think we'd need to mask the variables on their intersection, so we don't have values that are missing from B affecting A's variance without affecting its covariance.

Pandas does this in cython, possibly because it's faster to only do a single pass of the data. If anyone has correctness concerns about this simple approach of wrapping ewmas, please let me know. Or if the performance would be unacceptable such that it shouldn't go into xarray until it's a single pass.
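A minimal numpy sketch of the formula above (just the arithmetic, not a proposed xarray API); for two perfectly linearly related series the result is 1.0 after the first point:

```python
import numpy as np

def ewma(x, alpha):
    """Simple exponential moving average (adjust=False style)."""
    out = np.empty(len(x))
    acc = x[0]
    out[0] = acc
    for i in range(1, len(x)):
        acc = alpha * x[i] + (1 - alpha) * acc
        out[i] = acc
    return out

def rolling_exp_corr(a, b, alpha):
    cov = ewma(a * b, alpha) - ewma(a, alpha) * ewma(b, alpha)
    var_a = ewma(a * a, alpha) - ewma(a, alpha) ** 2
    var_b = ewma(b * b, alpha) - ewma(b, alpha) ** 2
    return cov / np.sqrt(var_a * var_b)

a = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
b = 2 * a + 3  # perfectly correlated with a
with np.errstate(invalid="ignore"):  # first point has zero variance
    corr = rolling_exp_corr(a, b, alpha=0.5)
```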

Describe alternatives you've considered

Numbagg

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6349/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
2019577432 PR_kwDOAMm_X85g1F3A 8495 Fix type of `.assign_coords` max-sixty 5635139 closed 0     1 2023-11-30T21:57:58Z 2023-12-04T19:11:57Z 2023-12-04T19:11:55Z MEMBER   0 pydata/xarray/pulls/8495

As discussed in #8455

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8495/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1995489227 I_kwDOAMm_X8528L_L 8455 Errors when assigning using `.from_pandas_multiindex` max-sixty 5635139 closed 0     3 2023-11-15T20:09:15Z 2023-12-04T19:10:12Z 2023-12-04T19:10:11Z MEMBER      

What happened?

Very possibly this is user-error, forgive me if so.

I'm trying to transition some code from the previous assignment of MultiIndexes, to the new world. Here's an MCVE:

What did you expect to happen?

No response

Minimal Complete Verifiable Example

```Python da = xr.tutorial.open_dataset("air_temperature")['air']

old code, works, but with a warning

da.expand_dims('foo').assign_coords(foo=(pd.MultiIndex.from_tuples([(1,2)])))

<ipython-input-25-f09b7f52bb42>:1: FutureWarning: the pandas.MultiIndex object(s) passed as 'foo' coordinate(s) or data variable(s) will no longer be implicitly promoted and wrapped into multiple indexed coordinates in the future (i.e., one coordinate for each multi-index level + one dimension coordinate). If you want to keep this behavior, you need to first wrap it explicitly using mindex_coords = xarray.Coordinates.from_pandas_multiindex(mindex_obj, 'dim') and pass it as coordinates, e.g., xarray.Dataset(coords=mindex_coords), dataset.assign_coords(mindex_coords) or dataarray.assign_coords(mindex_coords). da.expand_dims('foo').assign_coords(foo=(pd.MultiIndex.from_tuples([(1,2)]))) Out[25]: <xarray.DataArray 'air' (foo: 1, time: 2920, lat: 25, lon: 53)> array([[[[241.2 , 242.5 , 243.5 , ..., 232.79999, 235.5 , 238.59999], ... [297.69 , 298.09 , 298.09 , ..., 296.49 , 296.19 , 295.69 ]]]], dtype=float32) Coordinates: * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0 * lon (lon) float32 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0 * time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00 * foo (foo) object MultiIndex * foo_level_0 (foo) int64 1 * foo_level_1 (foo) int64 2

new code — seems to get confused between the number of values in the index — 1 — and the number of levels — 3 including the parent:

da.expand_dims('foo').assign_coords(foo=xr.Coordinates.from_pandas_multiindex(pd.MultiIndex.from_tuples([(1,2)]), dim='foo'))

ValueError Traceback (most recent call last) Cell In[26], line 1 ----> 1 da.expand_dims('foo').assign_coords(foo=xr.Coordinates.from_pandas_multiindex(pd.MultiIndex.from_tuples([(1,2)]), dim='foo'))

File ~/workspace/xarray/xarray/core/common.py:621, in DataWithCoords.assign_coords(self, coords, **coords_kwargs) 618 else: 619 results = self._calc_assign_results(coords_combined) --> 621 data.coords.update(results) 622 return data

File ~/workspace/xarray/xarray/core/coordinates.py:566, in Coordinates.update(self, other) 560 # special case for PandasMultiIndex: updating only its dimension coordinate 561 # is still allowed but depreciated. 562 # It is the only case where we need to actually drop coordinates here (multi-index levels) 563 # TODO: remove when removing PandasMultiIndex's dimension coordinate. 564 self._drop_coords(self._names - coords_to_align._names) --> 566 self._update_coords(coords, indexes)

File ~/workspace/xarray/xarray/core/coordinates.py:834, in DataArrayCoordinates._update_coords(self, coords, indexes) 832 coords_plus_data = coords.copy() 833 coords_plus_data[_THIS_ARRAY] = self._data.variable --> 834 dims = calculate_dimensions(coords_plus_data) 835 if not set(dims) <= set(self.dims): 836 raise ValueError( 837 "cannot add coordinates with new dimensions to a DataArray" 838 )

File ~/workspace/xarray/xarray/core/variable.py:3014, in calculate_dimensions(variables) 3012 last_used[dim] = k 3013 elif dims[dim] != size: -> 3014 raise ValueError( 3015 f"conflicting sizes for dimension {dim!r}: " 3016 f"length {size} on {k!r} and length {dims[dim]} on {last_used!r}" 3017 ) 3018 return dims

ValueError: conflicting sizes for dimension 'foo': length 1 on <this-array> and length 3 on {'lat': 'lat', 'lon': 'lon', 'time': 'time', 'foo': 'foo'} ```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS ------------------ commit: None python: 3.9.18 (main, Nov 2 2023, 16:51:22) [Clang 14.0.3 (clang-1403.0.22.14.1)] python-bits: 64 OS: Darwin OS-release: 22.6.0 machine: arm64 processor: arm byteorder: little LC_ALL: en_US.UTF-8 LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 2023.10.2.dev10+gccc8f998 pandas: 2.1.1 numpy: 1.25.2 scipy: 1.11.1 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.16.0 cftime: None nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: None dask: 2023.4.0 distributed: 2023.7.1 matplotlib: 3.5.1 cartopy: None seaborn: None numbagg: 0.2.3.dev30+gd26e29e fsspec: 2021.11.1 cupy: None pint: None sparse: None flox: None numpy_groupies: 0.9.19 setuptools: 68.2.2 pip: 23.3.1 conda: None pytest: 7.4.0 mypy: 1.6.0 IPython: 8.15.0 sphinx: 4.3.2
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8455/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  not_planned xarray 13221727 issue
2022178394 PR_kwDOAMm_X85g92vo 8511 Allow callables to `.drop_vars` max-sixty 5635139 closed 0     0 2023-12-02T19:39:53Z 2023-12-03T22:04:53Z 2023-12-03T22:04:52Z MEMBER   0 pydata/xarray/pulls/8511

This can be used as a nice more general alternative to .drop_indexes or .reset_coords(drop=True)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8511/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2021810083 PR_kwDOAMm_X85g8r6c 8508 Implement `np.clip` as `__array_function__` max-sixty 5635139 closed 0     2 2023-12-02T02:20:11Z 2023-12-03T05:27:38Z 2023-12-03T05:27:33Z MEMBER   0 pydata/xarray/pulls/8508

Would close https://github.com/pydata/xarray/issues/2570

Because of https://numpy.org/neps/nep-0018-array-function-protocol.html#partial-implementation-of-numpy-s-api, no option is ideal:
  • Don't do anything — don't implement __array_function__. Any numpy function that's not a ufunc — such as np.clip — will materialize the array into memory.
  • Implement __array_function__ and lose the ability to call any non-ufunc numpy function that we don't explicitly configure here. So np.lexsort(da) wouldn't work, for example; users would have to run np.lexsort(da.values).
  • Implement __array_function__, and attempt to handle the functions we don't explicitly configure by coercing to numpy arrays. This requires writing code to walk a tree of objects looking for arrays to coerce. It seems to go against the original numpy proposal.
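The second option can be sketched with NEP 18's suggested decorator pattern; `Wrapped` and `implements` are illustrative names, not xarray's implementation:

```python
import numpy as np

HANDLED = {}

def implements(np_func):
    """Register a __array_function__ implementation for np_func."""
    def decorator(func):
        HANDLED[np_func] = func
        return func
    return decorator

class Wrapped:
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        if func not in HANDLED:
            # unconfigured non-ufunc functions fail loudly with TypeError
            return NotImplemented
        return HANDLED[func](*args, **kwargs)

@implements(np.clip)
def _clip(a, a_min, a_max, **kwargs):
    return Wrapped(np.clip(a.data, a_min, a_max, **kwargs))

result = np.clip(Wrapped([1, 5, 10]), 2, 8)
```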

@shoyer is this summary accurate?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8508/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2019642778 PR_kwDOAMm_X85g1URY 8497 Fully deprecate `.drop` max-sixty 5635139 closed 0     0 2023-11-30T22:54:57Z 2023-12-02T05:52:50Z 2023-12-02T05:52:49Z MEMBER   0 pydata/xarray/pulls/8497

I think it's time...

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8497/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2013544848 PR_kwDOAMm_X85ggbU0 8487 Start renaming `dims` to `dim` max-sixty 5635139 closed 0     1 2023-11-28T03:25:40Z 2023-11-28T21:04:49Z 2023-11-28T21:04:48Z MEMBER   0 pydata/xarray/pulls/8487

Begins the process of #6646. I don't think it's feasible / enjoyable to do this for everything at once, so I would suggest we do it gradually, while keeping the warnings quite quiet, so by the time we convert to louder warnings, users can do a find/replace easily.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8487/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2010795504 PR_kwDOAMm_X85gXOqo 8484 Fix Zarr region transpose max-sixty 5635139 closed 0     3 2023-11-25T21:01:28Z 2023-11-27T20:56:57Z 2023-11-27T20:56:56Z MEMBER   0 pydata/xarray/pulls/8484

This wasn't working on an unregion-ed write; I think because new_var was being lost.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8484/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2010797682 PR_kwDOAMm_X85gXPEM 8485 Refine rolling_exp error messages max-sixty 5635139 closed 0     0 2023-11-25T21:09:52Z 2023-11-25T21:55:20Z 2023-11-25T21:55:20Z MEMBER   0 pydata/xarray/pulls/8485

(Sorry, copy & pasted too liberally!)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8485/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1966733834 PR_kwDOAMm_X85eCSac 8389 Use numbagg for `ffill` by default max-sixty 5635139 closed 0     5 2023-10-28T20:40:13Z 2023-11-25T21:06:10Z 2023-11-25T21:06:09Z MEMBER   0 pydata/xarray/pulls/8389

The main perf advantage here is the array doesn't need to be unstacked & stacked, which is a huge win for large multi-dimensional arrays... (I actually was hitting a memory issue running an ffill on my own, and so thought I'd get this done!)

We could move these methods to DataWithCoords, since they're almost the same implementation between a DataArray & Dataset, and exactly the same for numbagg's implementation


For transparency — I wouldn't rate the "check for numbagg, check for bottleneck" logic at my most confident. But I'm more confident that just installing numbagg will work. And if that works well enough, we could consider only supporting numbagg for some of these in the future.

I also haven't done the benchmarks here — though the functions are relatively well benchmarked at numbagg. I'm somewhat trading off getting through these (rolling functions are coming up too) vs. doing fewer slower, and leaning towards the former, but feedback welcome...
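For the 1-D case, the kind of single-pass ffill that avoids unstacking can be sketched in plain numpy (illustration only, not numbagg's actual kernel):

```python
import numpy as np

def ffill_1d(a):
    # carry forward the index of the most recent non-NaN position,
    # then gather — a leading NaN stays NaN since index 0 points at it
    idx = np.where(~np.isnan(a), np.arange(len(a)), 0)
    np.maximum.accumulate(idx, out=idx)
    return a[idx]

filled = ffill_1d(np.array([np.nan, 1.0, np.nan, 3.0]))
```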

  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8389/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1964877168 PR_kwDOAMm_X85d8EmN 8381 Allow writing to zarr with differently ordered dims max-sixty 5635139 closed 0     2 2023-10-27T06:47:59Z 2023-11-25T21:02:20Z 2023-11-15T18:09:08Z MEMBER   0 pydata/xarray/pulls/8381

Is this reasonable?

  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8381/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2005419839 PR_kwDOAMm_X85gFPfF 8474 Improve "variable not found" error message max-sixty 5635139 closed 0     0 2023-11-22T01:52:47Z 2023-11-24T18:49:39Z 2023-11-24T18:49:38Z MEMBER   0 pydata/xarray/pulls/8474

One very small step as part of https://github.com/pydata/xarray/issues/8264.

The existing error is just KeyError: 'foo', which is annoyingly terse. Future improvements include searching for similar variable names, or even rewriting the user's calling code if there's a close variable name.
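Searching for similar variable names could lean on the standard library; a hedged sketch with a hypothetical helper name:

```python
import difflib

def missing_variable_error(name, available):
    """Hypothetical helper for a friendlier variable-not-found message."""
    msg = f"No variable named {name!r}."
    suggestions = difflib.get_close_matches(name, available, n=3)
    if suggestions:
        msg += f" Did you mean one of {suggestions}?"
    return msg

message = missing_variable_error("temprature", ["temperature", "lat", "lon"])
```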

This PR creates a new test file. I don't love the format here — it's difficult to snapshot an error message, so it requires copying & pasting things, which doesn't scale well, and the traceback contains environment-specific lines such that it wouldn't be feasible to paste tracebacks.

(here's what we do in PRQL, which is (immodestly) great)

An alternative is just to put these in the mix of all the other tests; am open to that (and not difficult to change later)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8474/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2006891782 PR_kwDOAMm_X85gKSKW 8478 Add whatsnew for #8475 max-sixty 5635139 closed 0     0 2023-11-22T18:22:19Z 2023-11-22T18:45:23Z 2023-11-22T18:45:22Z MEMBER   0 pydata/xarray/pulls/8478

Sorry, forgot in the original PR

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8478/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2005656379 PR_kwDOAMm_X85gGCSj 8475 Allow `rank` to run on dask arrays max-sixty 5635139 closed 0     0 2023-11-22T06:22:44Z 2023-11-22T16:45:03Z 2023-11-22T16:45:02Z MEMBER   0 pydata/xarray/pulls/8475
  • [x] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8475/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2005744975 PR_kwDOAMm_X85gGVaY 8476 Fix mypy tests max-sixty 5635139 closed 0     0 2023-11-22T07:36:43Z 2023-11-22T08:01:13Z 2023-11-22T08:01:12Z MEMBER   0 pydata/xarray/pulls/8476

I was seeing an error in #8475

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8476/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2000139267 PR_kwDOAMm_X85fzghA 8464 Fix `map_blocks` docs' formatting max-sixty 5635139 closed 0     1 2023-11-18T01:18:02Z 2023-11-21T18:25:16Z 2023-11-21T18:25:15Z MEMBER   0 pydata/xarray/pulls/8464

Was looking funky. Not 100% sure this is correct but seems consistent with the others

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8464/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2000154383 PR_kwDOAMm_X85fzju6 8466 Move Sphinx directives out of `See also` max-sixty 5635139 open 0     2 2023-11-18T01:57:17Z 2023-11-21T18:25:05Z   MEMBER   0 pydata/xarray/pulls/8466

This is potentially causing the See also to not render the links? (Does anyone know this better? It doesn't seem easy to build the docs locally...)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8466/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2000146978 PR_kwDOAMm_X85fziKs 8465 Consolidate `_get_alpha` func max-sixty 5635139 closed 0     0 2023-11-18T01:37:25Z 2023-11-21T18:24:52Z 2023-11-21T18:24:51Z MEMBER   0 pydata/xarray/pulls/8465

Am changing this a bit so starting with consolidating it rather than converting twice

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8465/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
400444797 MDExOlB1bGxSZXF1ZXN0MjQ1NjMwOTUx 2687 Enable resampling on PeriodIndex max-sixty 5635139 closed 0     2 2019-01-17T20:13:25Z 2023-11-17T20:38:44Z 2023-11-17T20:38:44Z MEMBER   0 pydata/xarray/pulls/2687

This allows resampling with PeriodIndex objects by keeping the group as an index rather than coercing to a DataArray (which coerces any non-native types to objects)

I'm still getting one failure around the name of the IndexVariable still being __resample_dim__ after resample, but wanted to socialize the approach of allowing a name argument to IndexVariable - is this reasonable?

  • [x] Closes https://github.com/pydata/xarray/issues/1270
  • [x] Tests added
  • [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2687/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1995308522 I_kwDOAMm_X8527f3q 8454 Formalize `mode` / safety guarantees for Zarr max-sixty 5635139 open 0     1 2023-11-15T18:28:38Z 2023-11-15T20:38:04Z   MEMBER      

What is your issue?

It sounds like we're coalescing on when it's safe to write concurrently:

  • mode="r+" is safe to write concurrently to different parts of a dataset
  • mode="a" isn't safe, because it changes the shape of an array, for example extending a dimension

What are the existing operations that aren't consistent with this?

  • Is concurrently writing additional variables safe? Or does it require updating the centralized consolidated metadata? Currently that requires mode="a", which is overly conservative based on the above rules assuming it is safe — we can liberalize to allow it with mode="r+".
  • https://github.com/pydata/xarray/issues/8371, ~but that's a bug~ — edit: or possibly an artifact of writing concurrently to overlapping chunks with a single to_zarr call. We could at least restrict non-aligned writes to mode="a", so it wasn't possible to hit this mistakenly while writing to different parts of a dataset.
  • Writing the same values to the same chunks concurrently isn't safe at the moment — we'll get a "Stale file handle" error if two processes write to the same location at the same time. I'm not sure if that's possible to allow; possibly it requires work on the Zarr side. If it were possible, we wouldn't have to be as careful about ensuring that each process has mutually exclusive chunks to write. (lower priority)
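The proposed rule can be restated as a one-line sketch (the function and its argument are hypothetical, purely to pin down the invariant):

```python
def safe_zarr_mode(changes_array_shape: bool) -> str:
    # Hedged restatement of the rule above: writes that can change an
    # array's shape (e.g. extending a dimension) need mode="a"; writes
    # into existing regions can use the concurrency-safe mode="r+".
    return "a" if changes_array_shape else "r+"
```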

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8454/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1953001043 I_kwDOAMm_X850aG5T 8343 Add `metadata_only` param to `.to_zarr`? max-sixty 5635139 open 0     17 2023-10-19T20:25:11Z 2023-11-15T05:22:12Z   MEMBER      

Is your feature request related to a problem?

A leaf from https://github.com/pydata/xarray/issues/8245, which has a bullet:

compute=False is arguably a less-than-obvious kwarg meaning "write metadata". Maybe this should be a method, maybe it's a candidate for renaming? Or maybe make_template can be an abstraction over it

I've also noticed that for large arrays, running compute=False can take several minutes, despite the indexes being very small. I think this is because it's building a dask task graph — which is then discarded, since the array is written from different machines with the region pattern.

Describe the solution you'd like

Would introducing a metadata_only parameter to to_zarr help here:

  • Better name
  • No dask graph

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8343/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1980019336 I_kwDOAMm_X852BLKI 8421 `to_zarr` could transpose dims max-sixty 5635139 closed 0     0 2023-11-06T20:38:35Z 2023-11-14T19:23:08Z 2023-11-14T19:23:08Z MEMBER      

Is your feature request related to a problem?

Currently we need to know the order of dims when using region in to_zarr. Generally in xarray we're fine with the order, because we have the names, so this is a bit of an aberration. It means that code needs to carry around the correct order of dims.

Here's an MCVE:

```python
ds = xr.tutorial.load_dataset('air_temperature')

ds.to_zarr('foo', mode='w')

ds.transpose(..., 'lat').to_zarr('foo', mode='r+')
```

```
ValueError: variable 'air' already exists with different dimension names ('time', 'lat', 'lon') != ('time', 'lon', 'lat'), but changing variable dimensions is not supported by to_zarr().
```

Describe the solution you'd like

I think we should be able to transpose them based on the target?
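The fix could be sketched in plain Python (the helper name here is hypothetical, not xarray's internals): given the dim order already in the target store, compute the axes permutation to apply before writing.

```python
def transpose_order(source_dims, target_dims):
    # Hypothetical helper: the axes permutation that reorders
    # source_dims to match the dim order in the target store.
    if sorted(source_dims) != sorted(target_dims):
        raise ValueError(f"dims {source_dims} and {target_dims} don't match")
    return tuple(source_dims.index(d) for d in target_dims)

# The failing example above: the store has ('time', 'lat', 'lon'),
# the dataset being written has ('time', 'lon', 'lat').
print(transpose_order(('time', 'lon', 'lat'), ('time', 'lat', 'lon')))  # → (0, 2, 1)
```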

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8421/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1986643906 I_kwDOAMm_X852acfC 8437 Restrict pint test runs max-sixty 5635139 open 0     10 2023-11-10T00:50:52Z 2023-11-13T21:57:45Z   MEMBER      

What is your issue?

Pint tests are failing on main — https://github.com/pydata/xarray/actions/runs/6817674274/job/18541677930

E TypeError: no implementation found for 'numpy.min' on types that implement __array_function__: [<class 'pint.util.Quantity'>]

If we can't fix soon, should we disable?

CC @keewis

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8437/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1986758555 PR_kwDOAMm_X85fGE95 8438 Rename `to_array` to `to_dataarray` max-sixty 5635139 closed 0     2 2023-11-10T02:58:21Z 2023-11-10T06:15:03Z 2023-11-10T06:15:02Z MEMBER   0 pydata/xarray/pulls/8438

This is a very minor nit, so I'm not sure it's worth changing.

What do others think?

(I would have opened an issue but it's just as quick to just do the PR)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8438/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
874039546 MDU6SXNzdWU4NzQwMzk1NDY= 5246 test_save_mfdataset_compute_false_roundtrip fails max-sixty 5635139 open 0     1 2021-05-02T20:41:48Z 2023-11-02T04:38:05Z   MEMBER      

What happened:

test_save_mfdataset_compute_false_roundtrip consistently fails in windows-latest-3.9, e.g. https://github.com/pydata/xarray/pull/5244/checks?check_run_id=2485202784

Here's the traceback:

```python
self = <xarray.tests.test_backends.TestDask object at 0x000001FF45A9B640>

def test_save_mfdataset_compute_false_roundtrip(self):
    from dask.delayed import Delayed

    original = Dataset({"foo": ("x", np.random.randn(10))}).chunk()
    datasets = [original.isel(x=slice(5)), original.isel(x=slice(5, 10))]
    with create_tmp_file(allow_cleanup_failure=ON_WINDOWS) as tmp1:
        with create_tmp_file(allow_cleanup_failure=ON_WINDOWS) as tmp2:
            delayed_obj = save_mfdataset(
                datasets, [tmp1, tmp2], engine=self.engine, compute=False
            )
            assert isinstance(delayed_obj, Delayed)
            delayed_obj.compute()
            with open_mfdataset(
                [tmp1, tmp2], combine="nested", concat_dim="x"
            ) as actual:
              assert_identical(actual, original)

E       AssertionError: Left and right Dataset objects are not identical
E
E       Differing data variables:
E       L   foo      (x) float64 dask.array<chunksize=(5,), meta=np.ndarray>
E       R   foo      (x) float64 dask.array<chunksize=(10,), meta=np.ndarray>
```

Anything else we need to know?:

xfailed in https://github.com/pydata/xarray/pull/5245

Environment:

[Eliding since it's the test env]

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5246/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1923431725 I_kwDOAMm_X85ypT0t 8264 Improve error messages max-sixty 5635139 open 0     4 2023-10-03T06:42:57Z 2023-10-24T18:40:04Z   MEMBER      

Is your feature request related to a problem?

Coming back to xarray, and using it based on what I remember from a year ago or so, means I make lots of mistakes. I've also been using it outside of a repl, where error messages are more important, given I can't explore a dataset inline.

Some of the error messages could be much more helpful. Take one example:

xarray.core.merge.MergeError: conflicting values for variable 'date' on objects to be combined. You can skip this check by specifying compat='override'.

The second sentence is nice. But the first could give us much more information:

  • Which variables conflict? I'm merging four objects, so it would be so helpful to know which are causing the issue.
  • What is the conflict? Is one a superset and I can join=...? Are they off by 1 or are they completely different types?
  • Our testing.assert_equal produces pretty nice errors, as a comparison
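As a sketch of the kind of detail that would help (pure Python with a hypothetical helper — not xarray's merge internals, which operate on variables and indexes rather than plain dicts):

```python
def describe_conflicts(objs):
    # Hypothetical helper: report *which* names conflict across the
    # objects being merged, and between which pairs of objects, rather
    # than a bare "conflicting values" message. Each obj is a dict of
    # name -> value standing in for a dataset's variables.
    conflicts = {}
    for i, left in enumerate(objs):
        for j, right in enumerate(objs[i + 1:], start=i + 1):
            for name in left.keys() & right.keys():
                if left[name] != right[name]:
                    conflicts.setdefault(name, []).append((i, j))
    return conflicts

objs = [{"date": 1, "x": 0}, {"date": 2, "x": 0}, {"y": 5}]
print(describe_conflicts(objs))  # → {'date': [(0, 1)]}
```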

Having these good is really useful, lets folks stay in the flow while they're working, and it signals that we're a well-built, refined library.

Describe the solution you'd like

I'm not sure the best way to surface the issues — error messages make for less legible contributions than features or bug fixes, and the primary audience for good error messages is often the opposite of those actively developing the library. They're also more difficult to manage as GH issues — there could be scores of marginal issues which would often be out of date.

One thing we do in PRQL is have a file that snapshots error messages test_bad_error_messages.rs, which can then be a nice contribution to change those from bad to good. I'm not sure whether that would work here (python doesn't seem to have a great snapshotter, pytest-regtest is the best I've found; I wrote pytest-accept but requires doctests).

Any other ideas?

Describe alternatives you've considered

No response

Additional context

A couple of specific error-message issues:

  • https://github.com/pydata/xarray/issues/2078
  • https://github.com/pydata/xarray/issues/5290

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8264/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1952859208 PR_kwDOAMm_X85dTmUR 8341 Deprecate tuples of chunks? max-sixty 5635139 closed 0     1 2023-10-19T18:44:25Z 2023-10-21T01:45:28Z 2023-10-21T00:49:19Z MEMBER   0 pydata/xarray/pulls/8341

(I was planning on putting an issue in, but then thought it wasn't much more difficult to make the PR. But it's totally fine if we don't think this is a good idea...)

Allowing a tuple of dims means we're reliant on dimension order, which we really try and not be reliant on. It also makes the type signature even more complicated.

So are we OK to encourage a dict of dim: chunksize, rather than a tuple of chunksizes?
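To illustrate why the dict form is safer (a hypothetical normalizer, not xarray's implementation): a tuple spec is only meaningful relative to a dim order, while a dict of dim: chunksize is order-independent.

```python
def normalize_chunks(chunks, dims):
    # Hypothetical sketch: turn a positional tuple of chunk sizes into
    # the order-independent dict form, given the array's dim order.
    if isinstance(chunks, dict):
        return chunks
    if len(chunks) != len(dims):
        raise ValueError("one chunk size per dimension is required")
    return dict(zip(dims, chunks))

print(normalize_chunks((100, 50), ("time", "lat")))  # → {'time': 100, 'lat': 50}
```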

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8341/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1953143391 PR_kwDOAMm_X85dUk-m 8347 2023.10.1 release notes max-sixty 5635139 closed 0     0 2023-10-19T22:19:43Z 2023-10-19T22:42:48Z 2023-10-19T22:42:47Z MEMBER   0 pydata/xarray/pulls/8347  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8347/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1948037836 PR_kwDOAMm_X85dDNka 8325 internal: Improve version handling for numbagg max-sixty 5635139 closed 0     1 2023-10-17T18:45:43Z 2023-10-19T15:59:15Z 2023-10-19T15:59:14Z MEMBER   0 pydata/xarray/pulls/8325

Uses the approach in #8316, a bit nicer. Only internal.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8325/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1948548087 PR_kwDOAMm_X85dE9ga 8329 Request to adjust pyright config max-sixty 5635139 closed 0     3 2023-10-18T01:04:00Z 2023-10-18T20:10:42Z 2023-10-18T20:10:41Z MEMBER   0 pydata/xarray/pulls/8329

Would it be possible to not have this config? It overrides the local VS Code config, which means VS Code is constantly reporting errors for me.

Totally open to other approaches ofc. Or that we decide that the tradeoff is worthwhile

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8329/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1948529004 PR_kwDOAMm_X85dE5aA 8327 Add docs to `reindex_like` re broadcasting max-sixty 5635139 closed 0     0 2023-10-18T00:46:52Z 2023-10-18T18:16:43Z 2023-10-18T16:51:12Z MEMBER   0 pydata/xarray/pulls/8327

This wasn't clear to me so I added some examples & a reference to broadcast_like

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8327/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1943054301 PR_kwDOAMm_X85cyrdc 8307 Add `corr`, `cov`, `std` & `var` to `.rolling_exp` max-sixty 5635139 closed 0     0 2023-10-14T07:25:31Z 2023-10-18T17:35:35Z 2023-10-18T16:55:35Z MEMBER   0 pydata/xarray/pulls/8307

From the new routines in numbagg.

Maybe needs better tests (though these are quite heavily tested in numbagg), docs, and potentially need to think about types (maybe existing binary ops can help here?)

(will fail while the build is cached on an old version of numbagg)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8307/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1948537810 PR_kwDOAMm_X85dE7Te 8328 Refine curvefit doctest max-sixty 5635139 closed 0     0 2023-10-18T00:55:16Z 2023-10-18T01:19:27Z 2023-10-18T01:19:26Z MEMBER   0 pydata/xarray/pulls/8328

A very small change

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8328/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1946081841 PR_kwDOAMm_X85c8kKB 8321 Remove a couple of trailing commas in tests max-sixty 5635139 closed 0     0 2023-10-16T20:57:04Z 2023-10-16T21:26:50Z 2023-10-16T21:26:49Z MEMBER   0 pydata/xarray/pulls/8321  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8321/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1913983402 I_kwDOAMm_X85yFRGq 8233 numbagg & flox max-sixty 5635139 closed 0     13 2023-09-26T17:33:32Z 2023-10-15T07:48:56Z 2023-10-09T15:40:29Z MEMBER      

What is your issue?

I've been doing some work recently on our old friend numbagg, improving the ewm routines & adding some more.

I'm keen to get numbagg back in shape, doing the things that it does best, and trimming anything it doesn't. I notice that it has grouped calcs. Am I correct to think that flox does this better? I haven't been up with the latest. flox looks like it's particularly focused on dask arrays, whereas numpy_groupies, one of the inspirations for this, was applicable to numpy arrays too.

At least from the xarray perspective, are we OK to deprecate these numbagg functions, and direct folks to flox?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8233/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1920172346 PR_kwDOAMm_X85blZOk 8256 Accept `lambda` for `other` param max-sixty 5635139 closed 0     0 2023-09-30T08:24:36Z 2023-10-14T07:26:28Z 2023-09-30T18:50:33Z MEMBER   0 pydata/xarray/pulls/8256  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8256/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1931467868 PR_kwDOAMm_X85cLSzK 8283 Ask bug reporters to confirm they're using a recent version of xarray max-sixty 5635139 closed 0     0 2023-10-07T19:07:17Z 2023-10-14T07:26:28Z 2023-10-09T13:30:03Z MEMBER   0 pydata/xarray/pulls/8283  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8283/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1931584082 PR_kwDOAMm_X85cLpuZ 8286 Fix `GroupBy` import max-sixty 5635139 closed 0     0 2023-10-08T01:15:37Z 2023-10-14T07:26:28Z 2023-10-09T13:38:44Z MEMBER   0 pydata/xarray/pulls/8286

Not sure why this only breaks tests for me, vs. in CI, but hopefully no downside to this change...

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8286/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1931581491 PR_kwDOAMm_X85cLpMS 8284 Enable `.rolling_exp` to work on dask arrays max-sixty 5635139 closed 0     0 2023-10-08T01:06:04Z 2023-10-14T07:26:27Z 2023-10-10T06:37:20Z MEMBER   0 pydata/xarray/pulls/8284

Another benefit of the move to .apply_ufunc...

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8284/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1931582554 PR_kwDOAMm_X85cLpap 8285 Add `min_weight` param to `rolling_exp` functions max-sixty 5635139 closed 0     2 2023-10-08T01:09:59Z 2023-10-14T07:24:48Z 2023-10-14T07:24:48Z MEMBER   0 pydata/xarray/pulls/8285  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8285/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1939241220 PR_kwDOAMm_X85cmBPP 8296 mypy 1.6.0 passing max-sixty 5635139 closed 0     4 2023-10-12T06:04:46Z 2023-10-12T22:13:18Z 2023-10-12T19:06:13Z MEMBER   0 pydata/xarray/pulls/8296

I did the easy things, but will need help for the final couple on _typed_ops.py

Because we don't pin mypy (should we?), this blocks other PRs if we gate them on mypy passing

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8296/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1940614908 PR_kwDOAMm_X85cqvBb 8299 xfail flaky test max-sixty 5635139 closed 0     0 2023-10-12T19:03:59Z 2023-10-12T22:00:51Z 2023-10-12T22:00:47Z MEMBER   0 pydata/xarray/pulls/8299

Would be better to fix it, but in lieu of fixing, better to skip it

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8299/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1920359276 PR_kwDOAMm_X85bl9er 8257 Mandate kwargs on `to_zarr` max-sixty 5635139 closed 0     0 2023-09-30T18:33:13Z 2023-10-12T18:33:15Z 2023-10-04T19:05:02Z MEMBER   0 pydata/xarray/pulls/8257

This alleviates some of the dangers of having these in a different order between da & ds.

Technically it's a breaking change, but only very technically, given that I would wager literally no one has a dozen positional arguments to this method. So I think it's OK.
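The mechanism can be sketched with Python's bare * marker (the signature below is abbreviated and hypothetical — see the PR diff for the real one):

```python
def to_zarr(store=None, *, mode=None, region=None):
    # Everything after the bare "*" must be passed by keyword, so
    # positional-order mismatches between the DataArray and Dataset
    # variants can no longer bite.
    return {"store": store, "mode": mode, "region": region}

to_zarr("out.zarr", mode="w")      # fine
try:
    to_zarr("out.zarr", "w")       # positional mode is now rejected
except TypeError as e:
    print(type(e).__name__)        # → TypeError
```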

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8257/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1926810300 PR_kwDOAMm_X85b7rlX 8273 Allow a function in `.sortby` method max-sixty 5635139 closed 0     0 2023-10-04T19:04:03Z 2023-10-12T18:33:14Z 2023-10-06T03:35:22Z MEMBER   0 pydata/xarray/pulls/8273  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8273/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1931585098 PR_kwDOAMm_X85cLp7r 8287 Rename `reset_encoding` to `drop_encoding` max-sixty 5635139 closed 0     1 2023-10-08T01:19:25Z 2023-10-12T17:11:07Z 2023-10-12T17:11:03Z MEMBER   0 pydata/xarray/pulls/8287

Closes #8259

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8287/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1920369929 I_kwDOAMm_X85ydoUJ 8259 Should `.reset_encoding` be `.drop_encoding`? max-sixty 5635139 closed 0     1 2023-09-30T19:11:46Z 2023-10-12T17:11:06Z 2023-10-12T17:11:06Z MEMBER      

What is your issue?

Not the greatest issue facing the universe — but for the cause of consistency — should .reset_encoding be .drop_encoding, since it drops all encoding attributes?

For comparison:

  • .reset_coords — "Given names of coordinates, reset them to become variables."
  • .drop_vars — "Drop variables from this dataset."

Also ref #8258

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8259/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1917929597 PR_kwDOAMm_X85bd2nm 8249 Refine `chunks=None` handling max-sixty 5635139 closed 0     0 2023-09-28T16:54:59Z 2023-10-04T18:34:27Z 2023-09-28T20:01:13Z MEMBER   0 pydata/xarray/pulls/8249

Based on comment in https://github.com/pydata/xarray/pull/8247. This doesn't make it perfect, but allows the warning to get hit and clarifies the type comment, as a stop-gap

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8249/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1216647336 PR_kwDOAMm_X8421oXV 6521 Move license from readme to LICENSE max-sixty 5635139 open 0     3 2022-04-27T00:59:03Z 2023-10-01T09:31:37Z   MEMBER   0 pydata/xarray/pulls/6521  
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6521/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1918061661 I_kwDOAMm_X85yU0xd 8251 `.chunk()` doesn't create chunks on 0 dim arrays max-sixty 5635139 open 0     0 2023-09-28T18:30:50Z 2023-09-30T21:31:05Z   MEMBER      

What happened?

.chunk's docstring states:

```
"""Coerce this array's data into a dask arrays with the given chunks.

    If this variable is a non-dask array, it will be converted to dask
    array. If it's a dask array, it will be rechunked to the given chunk
    sizes.
```

...but this doesn't happen for 0 dim arrays; example below.

For context, as part of #8245, I had a function that creates a template array. It created an empty DataArray, then expanded dims for each dimension. And it kept blowing up memory! ...until I realized that it was actually not a lazy array.

What did you expect to happen?

It may be that we can't have a 0-dim dask array — but then we should raise in this method, rather than return the wrong thing.

Minimal Complete Verifiable Example

```python
[ins] In [1]: type(xr.DataArray().chunk().data)
Out[1]: numpy.ndarray

[ins] In [2]: type(xr.DataArray(1).chunk().data)
Out[2]: numpy.ndarray

[ins] In [3]: type(xr.DataArray([1]).chunk().data)
Out[3]: dask.array.core.Array
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS ------------------ commit: 0d6cd2a39f61128e023628c4352f653537585a12 python: 3.9.18 (main, Aug 24 2023, 21:19:58) [Clang 14.0.3 (clang-1403.0.22.14.1)] python-bits: 64 OS: Darwin OS-release: 22.6.0 machine: arm64 processor: arm byteorder: little LC_ALL: en_US.UTF-8 LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 2023.8.1.dev25+g8215911a.d20230914 pandas: 2.1.1 numpy: 1.25.2 scipy: 1.11.1 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.16.0 cftime: None nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: None dask: 2023.4.0 distributed: 2023.7.1 matplotlib: 3.5.1 cartopy: None seaborn: None numbagg: 0.2.3.dev30+gd26e29e fsspec: 2021.11.1 cupy: None pint: None sparse: None flox: 0.7.2 numpy_groupies: 0.9.19 setuptools: 68.1.2 pip: 23.2.1 conda: None pytest: 7.4.0 mypy: 1.5.1 IPython: 8.15.0 sphinx: 4.3.2
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8251/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1920167070 I_kwDOAMm_X85yc2ye 8255 Allow a `lambda` for the `other` param to `where` max-sixty 5635139 closed 0     1 2023-09-30T08:05:54Z 2023-09-30T19:02:42Z 2023-09-30T19:02:42Z MEMBER      

Is your feature request related to a problem?

Currently we allow:

```python
da.where(lambda x: x.foo == 5)
```

...but we don't allow:

```python
da.where(lambda x: x.foo == 5, lambda x: x - x.shift(1))
```

...which would be nice
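The mechanics are simple to sketch (a hypothetical helper mirroring what where would do with its arguments, not xarray's actual code):

```python
def resolve(arg, obj):
    # If the argument is a callable, apply it to the object being
    # operated on; otherwise use it as a plain value. where() could
    # apply this to both its cond and other parameters.
    return arg(obj) if callable(arg) else arg

data = [1, 2, 3]
print(resolve(lambda x: [v * 2 for v in x], data))  # → [2, 4, 6]
print(resolve(0, data))                             # → 0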

Describe the solution you'd like

No response

Describe alternatives you've considered

I don't think this offers many downsides — it's not like we want to fill the array with a callable object.

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8255/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
124154674 MDU6SXNzdWUxMjQxNTQ2NzQ= 688 Keep attrs & Add a 'keep_coords' argument to Dataset.apply max-sixty 5635139 closed 0     14 2015-12-29T02:42:48Z 2023-09-30T18:47:07Z 2023-09-30T18:47:07Z MEMBER      

Generally this isn't a problem, since the coords are carried over by the resulting DataArrays:

```python
In [11]:

ds = xray.Dataset({
    'a': pd.DataFrame(pd.np.random.rand(10, 3)),
    'b': pd.Series(pd.np.random.rand(10))
})
ds.coords['c'] = pd.Series(pd.np.random.rand(10))
ds

Out[11]:
<xray.Dataset>
Dimensions:  (dim_0: 10, dim_1: 3)
Coordinates:
  * dim_0    (dim_0) int64 0 1 2 3 4 5 6 7 8 9
  * dim_1    (dim_1) int64 0 1 2
    c        (dim_0) float64 0.9318 0.2899 0.3853 0.6235 0.9436 0.7928 ...
Data variables:
    a        (dim_0, dim_1) float64 0.5707 0.9485 0.3541 0.5987 0.406 0.7992 ...
    b        (dim_0) float64 0.4106 0.2316 0.5804 0.6393 0.5715 0.6463 ...

In [12]:

ds.apply(lambda x: x*2)
Out[12]:
<xray.Dataset>
Dimensions:  (dim_0: 10, dim_1: 3)
Coordinates:
    c        (dim_0) float64 0.9318 0.2899 0.3853 0.6235 0.9436 0.7928 ...
  * dim_0    (dim_0) int64 0 1 2 3 4 5 6 7 8 9
  * dim_1    (dim_1) int64 0 1 2
Data variables:
    a        (dim_0, dim_1) float64 1.141 1.897 0.7081 1.197 0.812 1.598 ...
    b        (dim_0) float64 0.8212 0.4631 1.161 1.279 1.143 1.293 0.3507 ...
```

But if an operation removes the coords from the DataArrays, they are missing from the result as well (notice `c` below). Should the Dataset retain them — either always, or via a `keep_coords` argument, similar to `keep_attrs`?

```python
In [13]: ds = xray.Dataset({
    ...:     'a': pd.DataFrame(pd.np.random.rand(10, 3)),
    ...:     'b': pd.Series(pd.np.random.rand(10))
    ...: })
    ...: ds.coords['c'] = pd.Series(pd.np.random.rand(10))
    ...: ds
Out[13]:
<xray.Dataset>
Dimensions:  (dim_0: 10, dim_1: 3)
Coordinates:
  * dim_0    (dim_0) int64 0 1 2 3 4 5 6 7 8 9
  * dim_1    (dim_1) int64 0 1 2
    c        (dim_0) float64 0.4121 0.2507 0.6326 0.4031 0.6169 0.441 0.1146 ...
Data variables:
    a        (dim_0, dim_1) float64 0.4813 0.2479 0.5158 0.2787 0.06672 ...
    b        (dim_0) float64 0.2638 0.5788 0.6591 0.7174 0.3645 0.5655 ...

In [14]: ds.apply(lambda x: x.to_pandas() * 2)
Out[14]:
<xray.Dataset>
Dimensions:  (dim_0: 10, dim_1: 3)
Coordinates:
  * dim_0    (dim_0) int64 0 1 2 3 4 5 6 7 8 9
  * dim_1    (dim_1) int64 0 1 2
Data variables:
    a        (dim_0, dim_1) float64 0.9627 0.4957 1.032 0.5574 0.1334 0.8289 ...
    b        (dim_0) float64 0.5275 1.158 1.318 1.435 0.7291 1.131 0.1903 ...
```
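
The requested `keep_coords` behaviour can be sketched with plain dicts of numpy arrays rather than real xray objects (the helper and names are illustrative, not the actual API):

```python
import numpy as np

def apply_keep_coords(data_vars, coords, func):
    """Hypothetical sketch of Dataset.apply(..., keep_coords=True):
    apply `func` to each data variable, then re-attach the original
    coords to the result even if `func` stripped them."""
    result = {name: func(var) for name, var in data_vars.items()}
    return result, dict(coords)

data_vars = {"a": np.random.rand(10, 3), "b": np.random.rand(10)}
coords = {"c": np.random.rand(10)}

# Even though `func` returns bare arrays (analogous to `.to_pandas()`
# dropping xray metadata), the coords survive on the result:
doubled, kept = apply_keep_coords(data_vars, coords, lambda x: x * 2)
```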

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/688/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1916391948 PR_kwDOAMm_X85bYlaM 8242 Add modules to `check-untyped` max-sixty 5635139 closed 0     2 2023-09-27T21:56:45Z 2023-09-29T17:43:07Z 2023-09-29T16:39:34Z MEMBER   0 pydata/xarray/pulls/8242

In reviewing https://github.com/pydata/xarray/pull/8241, I realized that we actually want check-untyped-defs, which is a bit less strict but lets us add more modules. I did have to add a couple of ignores; I think that's a reasonable tradeoff for covering big modules like computation.

Errors with this enabled are actual type errors, not just mypy pedantry, so it would be good to get as many modules as possible onto this list...
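
For reference, a per-module override of this sort might look like the following pyproject.toml fragment (the module name is illustrative, not the actual list from this PR; `check_untyped_defs` is the mypy option in question):

```toml
[[tool.mypy.overrides]]
module = ["xarray.core.computation"]
check_untyped_defs = true
```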

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8242/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1878288525 PR_kwDOAMm_X85ZYos5 8139 Fix pandas' `interpolate(fill_value=)` error max-sixty 5635139 closed 0     6 2023-09-02T02:41:45Z 2023-09-28T16:48:51Z 2023-09-04T18:05:14Z MEMBER   0 pydata/xarray/pulls/8139

Pandas no longer has a `fill_value` parameter for `interpolate`.

Weirdly I wasn't getting this locally, on pandas 2.1.0, only in CI on https://github.com/pydata/xarray/actions/runs/6054400455/job/16431747966?pr=8138.

Removing it passes locally; let's see whether this works in CI.
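
A minimal sketch of the fix described above — per the PR, newer pandas rejects an explicit `fill_value=` argument to `interpolate`, so the call simply omits it:

```python
import pandas as pd

s = pd.Series([1.0, None, 3.0])

# Previously the call passed fill_value=...; dropping it keeps the
# default linear interpolation behaviour:
result = s.interpolate(method="linear")
# → [1.0, 2.0, 3.0]
```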

Would close #8125

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8139/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull

```sql
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
```