issues

25 rows where state = "open" and user = 14808389 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
2194953062 PR_kwDOAMm_X85qFqp1 8854 array api-related upstream-dev failures keewis 14808389 open 0     15 2024-03-19T13:17:09Z 2024-05-03T22:46:41Z   MEMBER   0 pydata/xarray/pulls/8854
  • [x] towards #8844

This "fixes" the upstream-dev failures related to the removal of numpy.array_api. There are a couple of open questions, though: - array-api-strict is not installed by default, so namedarray would get a new dependency. Not sure how to deal with that – as far as I can tell, numpy.array_api was not supposed to be used that way, so maybe we need to use array-api-compat instead? What do you think, @andersy005, @Illviljan? - array-api-strict does not define Array.nbytes (causing a funny exception that wrongly claims DataArray does not define nbytes) - array-api-strict has a different DType class, which makes it tricky to work with both numpy dtypes and said dtype class in the same code. In particular, if I understand correctly we're supposed to check dtypes using isdtype, but numpy.isdtype will only exist in numpy>=2, array-api-strict's version does not define datetime / string / object dtypes, and numpy.issubdtype does not work with the non-numpy dtype class). So maybe we need to use array-api-compat internally?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8854/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2269295936 PR_kwDOAMm_X85uBwtv 8983 fixes for the `pint` tests keewis 14808389 open 0     0 2024-04-29T15:09:28Z 2024-05-03T18:30:06Z   MEMBER   0 pydata/xarray/pulls/8983

This removes the use of the deprecated numpy.core._exceptions.UFuncError (and multiplication as a way to attach units), and makes sure we run the pint tests in the upstream-dev CI again.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8983/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2234142680 PR_kwDOAMm_X85sK0g8 8923 `"source"` encoding for datasets opened from `fsspec` objects keewis 14808389 open 0     5 2024-04-09T19:12:45Z 2024-04-23T16:54:09Z   MEMBER   0 pydata/xarray/pulls/8923

When opening files from path-like objects (str, pathlib.Path), the backend machinery (_dataset_from_backend_dataset) sets the "source" encoding. This is useful if we need the original path for additional processing, like writing to a similarly named file or extracting additional metadata. It would be useful to have the same when using fsspec to open remote files.

In this PR, I'm extracting the path attribute that most fsspec objects have and using it to set that value. I've considered using isinstance checks instead of the getattr-with-default, but the list of potential classes is too big to be practical (at least 4 classes just within fsspec itself).
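For illustration, a minimal sketch of that getattr-with-default idea (the helper name and the plain-path handling are assumptions, not the PR's actual code):

```python
import pathlib

def infer_source(filename_or_obj):
    # hypothetical helper: best-effort value for the "source" encoding
    if isinstance(filename_or_obj, (str, pathlib.Path)):
        return str(filename_or_obj)
    # most fsspec file objects carry the original URL / path in a `path` attribute
    return getattr(filename_or_obj, "path", None)

print(infer_source("data/example.nc"))  # 'data/example.nc'
print(infer_source(object()))           # None (no `path` attribute)
```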

If this sounds like a good idea, I'll update the documentation of the "source" encoding to mention this feature.

  • [x] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8923/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2241492018 PR_kwDOAMm_X85skF_A 8937 drop support for `python=3.9` keewis 14808389 open 0     3 2024-04-13T10:18:04Z 2024-04-15T15:07:39Z   MEMBER   0 pydata/xarray/pulls/8937

According to our policy (and NEP-29), we have been able to drop support for python=3.9 since about a week ago. Interestingly, SPEC0 says we could have started doing this about half a year ago (Q4 2023).

We could delay this until we have a release that is compatible with numpy>=2.0, though (numpy>=2.1 will drop support for python=3.9).

  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8937/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
2079089277 I_kwDOAMm_X8577GJ9 8607 allow computing just a small number of variables keewis 14808389 open 0     4 2024-01-12T15:21:27Z 2024-01-12T20:20:29Z   MEMBER      

Is your feature request related to a problem?

I frequently find myself computing a handful of variables of a dataset (typically coordinates) and assigning them back to the dataset, and wishing we had a method / function that allowed that.

Describe the solution you'd like

I'd imagine something like

```python
ds.compute(variables=variable_names)
```

but I'm undecided on whether that's a good idea (it might make .compute more complex?)

Describe alternatives you've considered

So far I've been using something like

```python
ds.assign_coords({k: lambda ds: ds[k].compute() for k in variable_names})
ds.pipe(lambda ds: ds.merge(ds[variable_names].compute()))
```

but both are not easy to type / understand (though having .merge take a callable would make this much easier). Also, the first option computes variables separately, which may not be ideal?
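For reference, a rough sketch of a helper doing this in one step (compute_variables is a made-up name, and this assumes dask is installed):

```python
import numpy as np
import xarray as xr

def compute_variables(ds, names):
    # compute only the named variables and merge them back, leaving the rest lazy
    names = list(names)
    return ds.merge(ds[names].compute(), overwrite_vars=names)

ds = xr.Dataset({"a": ("x", np.arange(4)), "b": ("x", np.ones(4))}).chunk()
computed = compute_variables(ds, ["a"])  # "a" is in memory, "b" stays chunked
```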

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8607/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1655290694 I_kwDOAMm_X85iqbtG 7721 `as_shared_dtype` converts scalars to 0d `numpy` arrays if chunked `cupy` is involved keewis 14808389 open 0     7 2023-04-05T09:48:34Z 2023-12-04T10:45:43Z   MEMBER      

I tried to run where with chunked cupy arrays:

```python
In [1]: import xarray as xr
   ...: import cupy
   ...: import dask.array as da
   ...:
   ...: arr = xr.DataArray(cupy.arange(4), dims="x")
   ...: mask = xr.DataArray(cupy.array([False, True, True, False]), dims="x")
```

this works:

```python
In [2]: arr.where(mask)
Out[2]:
<xarray.DataArray (x: 4)>
array([nan,  1.,  2., nan])
Dimensions without coordinates: x
```

this fails:

```pytb
In [4]: arr.chunk().where(mask).compute()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 arr.chunk().where(mask).compute()

File ~/repos/xarray/xarray/core/dataarray.py:1095, in DataArray.compute(self, **kwargs)
   1076 """Manually trigger loading of this array's data from disk or a
   1077 remote source into memory and return a new array. The original is
   1078 left unaltered.
   (...)
   1092 dask.compute
   1093 """
   1094 new = self.copy(deep=False)
-> 1095 return new.load(**kwargs)

File ~/repos/xarray/xarray/core/dataarray.py:1069, in DataArray.load(self, **kwargs)
   1051 def load(self: T_DataArray, **kwargs) -> T_DataArray:
   1052     """Manually trigger loading of this array's data from disk or a
   1053     remote source into memory and return this array.
   1054     (...)
   1067     dask.compute
   1068     """
-> 1069 ds = self._to_temp_dataset().load(**kwargs)
   1070 new = self._from_temp_dataset(ds)
   1071 self._variable = new._variable

File ~/repos/xarray/xarray/core/dataset.py:752, in Dataset.load(self, **kwargs)
    749 import dask.array as da
    751 # evaluate all the dask arrays simultaneously
--> 752 evaluated_data = da.compute(*lazy_data.values(), **kwargs)
    754 for k, data in zip(lazy_data, evaluated_data):
    755     self.variables[k].data = data

File ~/.local/opt/mambaforge/envs/xarray/lib/python3.10/site-packages/dask/base.py:600, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
    597 keys.append(x.__dask_keys__())
    598 postcomputes.append(x.__dask_postcompute__())
--> 600 results = schedule(dsk, keys, **kwargs)
    601 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])

File ~/.local/opt/mambaforge/envs/xarray/lib/python3.10/site-packages/dask/threaded.py:89, in get(dsk, keys, cache, num_workers, pool, **kwargs)
     86 elif isinstance(pool, multiprocessing.pool.Pool):
     87     pool = MultiprocessingPoolExecutor(pool)
---> 89 results = get_async(
     90     pool.submit,
     91     pool._max_workers,
     92     dsk,
     93     keys,
     94     cache=cache,
     95     get_id=_thread_get_id,
     96     pack_exception=pack_exception,
     97     **kwargs,
     98 )
    100 # Cleanup pools associated to dead threads
    101 with pools_lock:

File ~/.local/opt/mambaforge/envs/xarray/lib/python3.10/site-packages/dask/local.py:511, in get_async(submit, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, chunksize, **kwargs)
    509 _execute_task(task, data)  # Re-execute locally
    510 else:
--> 511 raise_exception(exc, tb)
    512 res, worker_id = loads(res_info)
    513 state["cache"][key] = res

File ~/.local/opt/mambaforge/envs/xarray/lib/python3.10/site-packages/dask/local.py:319, in reraise(exc, tb)
    317 if exc.__traceback__ is not tb:
    318     raise exc.with_traceback(tb)
--> 319 raise exc

File ~/.local/opt/mambaforge/envs/xarray/lib/python3.10/site-packages/dask/local.py:224, in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
    222 try:
    223     task, data = loads(task_info)
--> 224 result = _execute_task(task, data)
    225 id = get_id()
    226 result = dumps((result, id))

File ~/.local/opt/mambaforge/envs/xarray/lib/python3.10/site-packages/dask/core.py:119, in _execute_task(arg, cache, dsk)
    115 func, args = arg[0], arg[1:]
    116 # Note: Don't assign the subtask results to a variable. numpy detects
    117 # temporaries by their reference count and can execute certain
    118 # operations in-place.
--> 119 return func(*(_execute_task(a, cache) for a in args))
    120 elif not ishashable(arg):
    121     return arg

File ~/.local/opt/mambaforge/envs/xarray/lib/python3.10/site-packages/dask/optimization.py:990, in SubgraphCallable.__call__(self, *args)
    988 if not len(args) == len(self.inkeys):
    989     raise ValueError("Expected %d args, got %d" % (len(self.inkeys), len(args)))
--> 990 return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))

File ~/.local/opt/mambaforge/envs/xarray/lib/python3.10/site-packages/dask/core.py:149, in get(dsk, out, cache)
    147 for key in toposort(dsk):
    148     task = dsk[key]
--> 149 result = _execute_task(task, cache)
    150 cache[key] = result
    151 result = _execute_task(out, cache)

File ~/.local/opt/mambaforge/envs/xarray/lib/python3.10/site-packages/dask/core.py:119, in _execute_task(arg, cache, dsk)
    115 func, args = arg[0], arg[1:]
    116 # Note: Don't assign the subtask results to a variable. numpy detects
    117 # temporaries by their reference count and can execute certain
    118 # operations in-place.
--> 119 return func(*(_execute_task(a, cache) for a in args))
    120 elif not ishashable(arg):
    121     return arg

File <__array_function__ internals>:180, in where(*args, **kwargs)

File cupy/_core/core.pyx:1723, in cupy._core.core._ndarray_base.__array_function__()

File ~/.local/opt/mambaforge/envs/xarray/lib/python3.10/site-packages/cupy/_sorting/search.py:211, in where(condition, x, y)
    209 if fusion._is_fusing():
    210     return fusion._call_ufunc(_where_ufunc, condition, x, y)
--> 211 return _where_ufunc(condition.astype('?'), x, y)

File cupy/_core/_kernel.pyx:1287, in cupy._core._kernel.ufunc.__call__()

File cupy/_core/_kernel.pyx:160, in cupy._core._kernel._preprocess_args()

File cupy/_core/_kernel.pyx:146, in cupy._core._kernel._preprocess_arg()

TypeError: Unsupported type <class 'numpy.ndarray'>
```

this works again:

```python
In [7]: arr.chunk().where(mask.chunk(), cupy.array(cupy.nan)).compute()
Out[7]:
<xarray.DataArray (x: 4)>
array([nan,  1.,  2., nan])
Dimensions without coordinates: x
```

And other methods like fillna show similar behavior.

I think the reason is that this: https://github.com/pydata/xarray/blob/d4db16699f30ad1dc3e6861601247abf4ac96567/xarray/core/duck_array_ops.py#L195 is not sufficient to detect cupy beneath other layers of duckarrays (most commonly dask, pint, or both). In this specific case we could extend the condition to also match chunked cupy arrays (like arr.cupy.is_cupy does, but using is_duck_dask_array), but this will still break for other duckarray layers or if dask is not involved, and we're also in the process of moving away from special-casing dask. So short of asking cupy to treat 0d arrays like scalars I'm not sure how to fix this.
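For illustration only, the "extend the condition" idea could look roughly like this (not the actual duck_array_ops code; the function name is made up):

```python
def is_chunked_cupy(arr):
    # look through a dask layer via `_meta` to find cupy underneath;
    # returns False when cupy is not installed
    try:
        import cupy
    except ImportError:
        return False
    inner = getattr(arr, "_meta", arr)  # dask arrays expose their chunk type here
    return isinstance(inner, cupy.ndarray)
```

As noted above, this still would not help with other duckarray layers (e.g. pint around dask around cupy).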

cc @jacobtomlinson

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7721/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1158378382 I_kwDOAMm_X85FC3OO 6323 propagation of `encoding` keewis 14808389 open 0     8 2022-03-03T12:57:29Z 2023-10-25T23:20:31Z   MEMBER      

What is your issue?

We frequently get bug reports related to encoding that can usually be fixed by clearing it or by overriding it using the encoding parameter of the to_* methods, e.g.:
  • #4224
  • #4380
  • #4655
  • #5427
  • #5490
  • fsspec/kerchunk#130

There are also a few discussions with more background:
  • https://github.com/pydata/xarray/pull/5065#issuecomment-806154872
  • https://github.com/pydata/xarray/issues/1614
  • #5082
  • #5336

We discussed this in the meeting yesterday and as far as I can remember agreed that the current default behavior is not ideal and decided to investigate #5336: a keep_encoding option, similar to keep_attrs, that would be True (propagate encoding) by default but will be changed to False (drop encoding on any operation) in the future.

cc @rabernat, @shoyer

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6323/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
683142059 MDU6SXNzdWU2ODMxNDIwNTk= 4361 restructure the contributing guide keewis 14808389 open 0     5 2020-08-20T22:51:39Z 2023-03-31T17:39:00Z   MEMBER      

From #4355

@max-sixty:

Stepping back on the contributing doc — I admit I haven't looked at it in a while — I wonder whether we can slim it down a bit, for example by linking to other docs for generic tooling — I imagine we're unlikely to have the best docs on working with GH, for example. Or referencing our PR template rather than the (now out-of-date) PR checklist.

We could also add a docstring guide since the numpydoc guide does not cover every little detail (for example, default notation, type spec vs. type hint, space before the colon separating parameter names from types, no colon for parameters without types, etc.)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4361/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1306795760 I_kwDOAMm_X85N5B7w 6793 improve docstrings with examples and links keewis 14808389 open 0     10 2022-07-16T12:30:33Z 2023-03-24T16:33:28Z   MEMBER      

This is an (incomplete) checklist for #5816 to make it easier to find methods that are in need of examples and links to the narrative docs with further information (of course, changes to the docstrings of all other methods / functions part of the public API are also appreciated).

Good examples explicitly construct small xarray objects to make it easier to follow (e.g. use np.{ones,full,zeros} or the np.array constructor instead of np.random / loading from files) and show both input and output of the function.

Use

```sh
pytest --doctest-modules xarray --ignore xarray/tests/
```

to verify the examples, or push to a PR to have the CI do it for you (note that you will have much quicker feedback locally though).

To easily generate the expected output install pytest-accept (docs) in your dev environment and then run

```sh
pytest --doctest-modules FILE_NAME --accept || true
```

To link to other documentation pages we can use

```rst
:doc:`project:label` Description of the linked page
```

where we can leave out project if we link to somewhere within xarray's documentation. To figure out the label, we can either look at the source, search the output of

```sh
python -m sphinx.ext.intersphinx https://docs.xarray.dev/en/latest/objects.inv
```

or use sphobjinv (install from PyPI):

```sh
sphobjinv search -su https://docs.xarray.dev/en/latest/ missing
```
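As an illustration of the style described above, a docstring for a made-up helper could look like this (example_function is not part of xarray):

```python
def example_function(da):
    """Return the input unchanged (made-up helper used to illustrate the docstring style).

    Examples
    --------
    >>> import numpy as np
    >>> import xarray as xr
    >>> da = xr.DataArray(np.ones((2, 3)), dims=("x", "y"))
    >>> example_function(da).shape
    (2, 3)
    """
    return da
```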

Top-level functions:
  • [ ] get_options
  • [ ] decode_cf
  • [ ] polyval
  • [ ] unify_chunks
  • [ ] infer_freq
  • [ ] date_range

I/O:
  • [ ] load_dataarray
  • [ ] load_dataset
  • [ ] open_dataarray
  • [ ] open_dataset
  • [ ] open_mfdataset

Contents:
  • [ ] DataArray.assign_attrs, Dataset.assign_attrs
  • [ ] DataArray.expand_dims, Dataset.expand_dims
  • [ ] DataArray.drop_duplicates, Dataset.drop_duplicates
  • [ ] DataArray.drop_vars, Dataset.drop_vars
  • [ ] Dataset.drop_dims
  • [ ] DataArray.convert_calendar, Dataset.convert_calendar
  • [ ] DataArray.set_coords, Dataset.set_coords
  • [ ] DataArray.reset_coords, Dataset.reset_coords

Comparisons:
  • [ ] DataArray.equals, Dataset.equals
  • [ ] DataArray.identical, Dataset.identical
  • [ ] DataArray.broadcast_equals, Dataset.broadcast_equals

Dask:
  • [ ] DataArray.compute, Dataset.compute
  • [ ] DataArray.chunk, Dataset.chunk
  • [ ] DataArray.persist, Dataset.persist

Missing values:
  • [ ] DataArray.bfill, Dataset.bfill
  • [ ] DataArray.ffill, Dataset.ffill
  • [ ] DataArray.fillna, Dataset.fillna
  • [ ] DataArray.dropna, Dataset.dropna

Indexing:
  • [ ] DataArray.loc (no docstring at all - came up in https://github.com/pydata/xarray/discussions/7528#discussion-4858556)
  • [ ] DataArray.drop_isel
  • [ ] DataArray.drop_sel
  • [ ] DataArray.head, Dataset.head
  • [ ] DataArray.tail, Dataset.tail
  • [ ] DataArray.interp_like, Dataset.interp_like
  • [ ] DataArray.reindex_like, Dataset.reindex_like
  • [ ] Dataset.isel

Aggregations:
  • [ ] Dataset.argmax
  • [ ] Dataset.argmin
  • [ ] DataArray.cumsum, Dataset.cumsum (intermediate to advanced)
  • [ ] DataArray.cumprod, Dataset.cumprod (intermediate to advanced)
  • [ ] DataArray.reduce, Dataset.reduce

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6793/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
818059250 MDExOlB1bGxSZXF1ZXN0NTgxNDIzNTIx 4972 Automatic duck array testing - reductions keewis 14808389 open 0     23 2021-02-27T23:57:23Z 2022-08-16T13:47:05Z   MEMBER   1 pydata/xarray/pulls/4972

This is the first of a series of PRs to add a framework to make testing the integration of duck arrays as simple as possible. It uses hypothesis for increased coverage and maintainability.
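As a sketch of what such a test can look like (the test itself is illustrative, not the PR's actual API):

```python
import hypothesis.strategies as st
import numpy as np
import xarray as xr
from hypothesis import given

# hypothesis generates the input lists; the reduction on the xarray object
# should match the plain numpy result
@given(st.lists(st.floats(-1e6, 1e6), min_size=1))
def test_mean_matches_numpy(values):
    da = xr.DataArray(np.array(values), dims="x")
    np.testing.assert_allclose(da.mean().item(), np.mean(values))
```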

  • [x] Tests added
  • [x] Passes pre-commit run --all-files
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4972/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
532696790 MDU6SXNzdWU1MzI2OTY3OTA= 3594 support for units with pint keewis 14808389 open 0     7 2019-12-04T13:49:28Z 2022-08-03T11:44:05Z   MEMBER      

pint's implementation of NEP-18 (see hgrecco/pint#905) is close enough so we can finally start working on the pint support (i.e. make the integration tests pass). This would be the list of tasks to get there:

integration tests:
  • [x] implement integration tests for DataArray, Dataset and top-level functions (#3238, #3447, #3493)
  • [x] add tests for Variable as discussed in #3493 (#3654)
  • [x] clean up the current tests (#3600)
  • [x] use the standard assert_identical and assert_allclose functions (#3611, #3643, #3654, #3706, #3975)
  • [x] clean up the TestVariable.test_pad tests

actually get xarray to support units:
  • [x] top-level functions (#3611)
  • [x] Variable (#3706)
      + rolling_window and identical need larger modifications
  • [x] DataArray (#3643)
  • [x] Dataset
  • [x] silence all the UnitStrippedWarnings in the testsuite (#4163)
  • [ ] try to get nanprod to work with quantities
  • [x] add support for per variable fill values (#4165)
  • [x] repr with units (#2773)
  • [ ] type hierarchy (e.g. for np.maximum(data_array, quantity) vs np.maximum(quantity, data_array)) (#3950)

update the documentation:
  • [x] point to pint-xarray (see #4530)
  • [x] mention the requirement for UnitRegistry(force_ndarray=True) or UnitRegistry(force_ndarray_like=True) (see https://pint-xarray.readthedocs.io/en/stable/creation.html#attaching-units)
  • [x] list the known issues (see https://github.com/pydata/xarray/pull/3643#issue-354872657 and https://github.com/pydata/xarray/pull/3643#issuecomment-602225731) (#4530):
      + pandas (indexing)
      + bottleneck (bfill, ffill)
      + scipy (interp)
      + numbagg (rolling_exp)
      + numpy.lib.stride_tricks.as_strided: rolling
      + numpy.vectorize: interpolate_na
  • [x] ~update the install instructions (we can use standard conda / pip now)~ this should be done by pint-xarray
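For context, a minimal sketch of what "xarray supporting units" means in practice, assuming a registry created with force_ndarray_like=True as required above:

```python
import numpy as np
import pint
import xarray as xr

ureg = pint.UnitRegistry(force_ndarray_like=True)

# wrap a pint quantity in a DataArray; operations keep the units attached
da = xr.DataArray(ureg.Quantity(np.arange(4.0), "m"), dims="x")
print(da.data.units)  # meter
print((da * 2).data)  # still a pint quantity
```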

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3594/reactions",
    "total_count": 14,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 14,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
597566530 MDExOlB1bGxSZXF1ZXN0NDAxNjU2MTc1 3960 examples for special methods on accessors keewis 14808389 open 0     6 2020-04-09T21:34:30Z 2022-06-09T14:50:17Z   MEMBER   0 pydata/xarray/pulls/3960

This starts adding the parametrized accessor examples from #3829 to the accessor documentation as suggested by @jhamman. Since then the weighted methods have been added, though, so I'd like to use a different example instead (ideas welcome).

Also, this feature can be abused to add functions to the main DataArray / Dataset namespace (by registering a function with the register_*_accessor decorators, see the second example). Is this something we want to explicitly discourage?

(~When trying to build the docs locally, sphinx keeps complaining about a code block without code. Not sure what that is about~ seems the ipython directive does not allow more than one expression, so I used code instead)
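A minimal sketch of that pattern (the accessor name is made up): because the registered object is called with the Dataset and the result is cached, registering a plain function effectively adds a derived attribute to the namespace.

```python
import numpy as np
import xarray as xr

@xr.register_dataset_accessor("doubled")
def doubled(ds):
    # called with the dataset on first attribute access; the result is cached
    return ds * 2

ds = xr.Dataset({"a": ("x", np.arange(3))})
print(ds.doubled["a"].values)  # [0 2 4]
```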

  • [x] Closes #3829
  • [x] Passes isort -rc . && black . && mypy . && flake8
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3960/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
801728730 MDExOlB1bGxSZXF1ZXN0NTY3OTkzOTI3 4863 apply to dataset keewis 14808389 open 0     14 2021-02-05T00:05:22Z 2022-06-09T14:50:17Z   MEMBER   0 pydata/xarray/pulls/4863

As discussed in #4837, this adds a method that applies a function to a DataArray by first converting it to a temporary dataset using _to_temp_dataset, applying the function, and converting it back. I'm not really happy with the name but I can't find a better one.

This function is really similar to pipe, so I guess a keyword argument to pipe would work, too. The disadvantage of that is that pipe passes all kwargs to the passed function, which means we would shadow a specific kwarg.
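Roughly, the round-trip described above looks like this (a sketch only; the method name and signature in the PR may differ):

```python
def apply_to_dataset(da, func, *args, **kwargs):
    # wrap the DataArray in a temporary single-variable Dataset, apply the
    # Dataset -> Dataset function, then unwrap again (uses private helpers)
    ds = da._to_temp_dataset()
    return da._from_temp_dataset(func(ds, *args, **kwargs))
```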

  • [x] Closes #4837
  • [x] Tests added
  • [x] Passes pre-commit run --all-files
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] New functions/methods are listed in api.rst
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4863/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
959063390 MDExOlB1bGxSZXF1ZXN0NzAyMjM0ODc1 5668 create the context objects passed to custom `combine_attrs` functions keewis 14808389 open 0     1 2021-08-03T12:24:50Z 2022-06-09T14:50:16Z   MEMBER   0 pydata/xarray/pulls/5668

Follow-up to #4896: this creates the context object in reduce methods and passes it to merge_attrs, with more planned.

  • [ ] might help with xarray-contrib/cf-xarray#228
  • [ ] Tests added
  • [x] Passes pre-commit run --all-files
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst

Note that for now this is a bit inconvenient to use for provenance tracking (as discussed in the cf-xarray issue) because functions implementing that would still have to deal with merging the attrs.
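For reference, a custom combine_attrs callable receives the list of attrs dicts plus the context object this PR constructs; a minimal sketch (the merge strategy here is arbitrary):

```python
def combine_attrs(variable_attrs, context=None):
    # `context` is the object this PR starts passing in; a provenance-tracking
    # implementation could inspect it, but here we just merge left-to-right
    merged = {}
    for attrs in variable_attrs:
        merged.update(attrs)
    return merged
```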

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5668/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1265366275 I_kwDOAMm_X85La_UD 6678 exception groups keewis 14808389 open 0     1 2022-06-08T22:09:37Z 2022-06-08T23:38:28Z   MEMBER      

What is your issue?

As I mentioned in the meeting today, we have a lot of features where the exception group support from PEP 654 (which is scheduled for python 3.11 and consists of the ExceptionGroup class and a syntax change) might be useful. For example, we might want to collect all errors raised by rename in an exception group instead of raising them one-by-one.

For python<=3.10 there's a backport that contains the class and a workaround for the new syntax.
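A sketch of the rename example, using the backport (requires the exceptiongroup package on python<=3.10; the validation logic is made up):

```python
from exceptiongroup import ExceptionGroup  # builtin on python>=3.11

existing = {"x", "y"}
errors = [
    KeyError(f"cannot rename {name!r}: not found in this dataset")
    for name in ("a", "y", "b")
    if name not in existing
]
if errors:
    raise ExceptionGroup("invalid rename arguments", errors)
```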

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6678/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
624778130 MDU6SXNzdWU2MjQ3NzgxMzA= 4095 merging non-dimension coordinates with the Dataset constructor keewis 14808389 open 0     1 2020-05-26T10:30:37Z 2022-04-19T13:54:43Z   MEMBER      

When adding two DataArray objects with different coordinates to a Dataset, a MergeError is raised even though one of the conflicting coords is a subset of the other. Merging dimension coordinates works so I'd expect associated non-dimension coordinates to work, too.

This fails:

```python
In [1]: import xarray as xr
   ...: import numpy as np

In [2]: a = np.linspace(0, 1, 10)
   ...: b = np.linspace(-1, 0, 12)
   ...:
   ...: x_a = np.arange(10)
   ...: x_b = np.arange(12)
   ...:
   ...: y_a = x_a * 1000
   ...: y_b = x_b * 1000
   ...:
   ...: arr1 = xr.DataArray(data=a, coords={"x": x_a, "y": ("x", y_a)}, dims="x")
   ...: arr2 = xr.DataArray(data=b, coords={"x": x_b, "y": ("x", y_b)}, dims="x")
   ...:
   ...: xr.Dataset({"a": arr1, "b": arr2})
...
MergeError: conflicting values for variable 'y' on objects to be combined. You can skip this check by specifying compat='override'.
```

While this works:

```python
In [3]: a = np.linspace(0, 1, 10)
   ...: b = np.linspace(-1, 0, 12)
   ...:
   ...: x_a = np.arange(10)
   ...: x_b = np.arange(12)
   ...:
   ...: y_a = x_a * 1000
   ...: y_b = x_b * 1000
   ...:
   ...: xr.Dataset({
   ...:     "a": xr.DataArray(data=a, coords={"x": x_a}, dims="x"),
   ...:     "b": xr.DataArray(data=b, coords={"x": x_b}, dims="x"),
   ...: })
Out[3]:
<xarray.Dataset>
Dimensions:  (x: 12)
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9 10 11
Data variables:
    a        (x) float64 0.0 0.1111 0.2222 0.3333 0.4444 ... 0.8889 1.0 nan nan
    b        (x) float64 -1.0 -0.9091 -0.8182 -0.7273 ... -0.1818 -0.09091 0.0
```

I can work around this by calling:

```python
In [4]: xr.merge([arr1.rename("a").to_dataset(), arr2.rename("b").to_dataset()])
Out[4]:
<xarray.Dataset>
Dimensions:  (x: 12)
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9 10 11
    y        (x) float64 0.0 1e+03 2e+03 3e+03 ... 8e+03 9e+03 1e+04 1.1e+04
Data variables:
    a        (x) float64 0.0 0.1111 0.2222 0.3333 0.4444 ... 0.8889 1.0 nan nan
    b        (x) float64 -1.0 -0.9091 -0.8182 -0.7273 ... -0.1818 -0.09091 0.0
```

but I think the Dataset constructor should be capable of that, too.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4095/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
539181896 MDU6SXNzdWU1MzkxODE4OTY= 3638 load_store and dump_to_store keewis 14808389 open 0     1 2019-12-17T16:37:53Z 2021-11-08T21:11:26Z   MEMBER      

Continuing from #3602, load_store and dump_to_store look like they are old and unmaintained functions:
  • load_store is referenced once in api.rst (I assume the reference to from_store was to load_store), but never tested, used or mentioned anywhere else
  • dump_to_store is tested (and probably used), but never mentioned except from the section on backends

What should we do with these? Are they obsolete and should be removed, or just unmaintained (in which case we should properly document and test them)?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3638/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
789106802 MDU6SXNzdWU3ODkxMDY4MDI= 4825 clean up the API for renaming and changing dimensions / coordinates keewis 14808389 open 0     5 2021-01-19T15:11:55Z 2021-09-10T15:04:14Z   MEMBER      

From #4108:

I wonder if it would be better to first "reorganize" all of the existing functions: we currently have rename (and Dataset.rename_dims / Dataset.rename_vars), set_coords, reset_coords, set_index, reset_index and swap_dims, which overlap partially. For example, the code sample from #4417 works if instead of

```python
ds = ds.rename(b='x')
ds = ds.set_coords('x')
```

we use

```python
ds = ds.set_index(x="b")
```

and something similar for the code sample in #4107.

I believe we currently have these use cases (not sure if that list is complete, though):
  • rename a DataArray → rename
  • rename an existing variable to a name that is not yet in the object → rename / Dataset.rename_vars / Dataset.rename_dims
  • convert a data variable to a coordinate (not a dimension coordinate) → set_coords
  • convert a coordinate (not a dimension coordinate) to a data variable → reset_coords
  • swap an existing dimension coordinate with a coordinate (which may not exist) and rename the dimension → swap_dims
  • use an existing coordinate / data variable as a dimension coordinate (do not rename the dimension) → set_index
  • stop using a coordinate as dimension coordinate and append _ to its name (do not rename the dimension) → reset_index
  • use two existing coordinates / data variables as a MultiIndex → set_index
  • stop using a MultiIndex as a dimension coordinate and use its levels as coordinates → reset_index

Sometimes, some of these can be emulated by combinations of others, for example:

```python
# x is a dimension without coordinates
assert_identical(ds.set_index({"x": "b"}), ds.swap_dims({"x": "b"}).rename({"b": "x"}))
assert_identical(ds.swap_dims({"x": "b"}), ds.set_index({"x": "b"}).rename({"x": "b"}))
```

and, with this PR:

```python
assert_identical(ds.set_index({"x": "b"}), ds.set_coords("b").rename({"b": "x"}))
assert_identical(ds.swap_dims({"x": "b"}), ds.rename({"b": "x"}))
```

which means that it would increase the overlap of rename, set_index, and swap_dims.

In any case I think we should add a guide which explains which method to pick in which situation (or extend howdoi).

Originally posted by @keewis in https://github.com/pydata/xarray/issues/4108#issuecomment-761907785

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4825/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
935531700 MDU6SXNzdWU5MzU1MzE3MDA= 5562 hooks to "prepare" xarray objects for plotting keewis 14808389 open 0     6 2021-07-02T08:14:02Z 2021-07-04T08:46:34Z   MEMBER      

From https://github.com/xarray-contrib/pint-xarray/pull/61#discussion_r662485351

matplotlib has a module called matplotlib.units which manages a mapping of types to hooks. This is then used to convert custom types to something matplotlib can work with, and to optionally add axis labels. For example, with pint:

```python
In [9]: ureg = pint.UnitRegistry()
   ...: ureg.setup_matplotlib()
   ...:
   ...: t = ureg.Quantity(np.arange(10), "s")
   ...: v = ureg.Quantity(5, "m / s")
   ...: x = v * t
   ...:
   ...: fig, ax = plt.subplots(1, 1)
   ...: ax.plot(t, x)
   ...:
   ...: plt.show()
```

this will plot the data without UnitStrippedWarnings and even attach the units as labels to the axis (the format is hard-coded in pint right now).

While this is pretty neat there are some issues:
  • xarray's plotting code converts to masked_array, dropping metadata on the duck array (which means matplotlib won't see the duck arrays)
  • we will end up overwriting the axis labels once the variable names are added (not sure if there's a way to specify a label format?)
  • it is matplotlib specific, which means we have to reimplement once we go through with the plotting entrypoints discussed in #3553 and #3640

All of this makes me wonder: should we try to maintain our own mapping of hooks which "prepare" the object based on the data's type? My initial idea would be that the hook function receives a Dataset or DataArray object and modifies it to convert the data to numpy arrays and optionally modifies the attrs.

For example for pint the hook would return the result of .pint.dequantify() but it could also be used to explicitly call .get on cupy arrays or .todense on sparse arrays.
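A rough sketch of such a registry (all names are made up; a real version would need to handle nested duck arrays and Dataset objects):

```python
_PLOT_HOOKS = {}

def register_plot_hook(array_type, hook):
    _PLOT_HOOKS[array_type] = hook

def prepare_for_plotting(da):
    # call the first hook whose type matches the underlying data
    for array_type, hook in _PLOT_HOOKS.items():
        if isinstance(da.variable.data, array_type):
            return hook(da)
    return da

# e.g. for pint: register_plot_hook(pint.Quantity, lambda da: da.pint.dequantify())
```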

xref #5561

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5562/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
589850951 MDU6SXNzdWU1ODk4NTA5NTE= 3917 running numpy functions on xarray objects keewis 14808389 open 0     1 2020-03-29T18:17:29Z 2021-07-04T02:00:22Z   MEMBER      

In the pint integration tests I tried to also test calling numpy functions on xarray objects (we provide methods for all of them).

Some of these functions, like numpy.median, numpy.searchsorted and numpy.clip, depend on __array_function__ (i.e. not __array_ufunc__) to dispatch. However, neither Dataset nor DataArray (nor Variable, I think?) define these protocols (see #3643).

Should we define __array_function__ on xarray objects?
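For context, this is roughly what the protocol looks like for a duck array (illustrative wrapper, not a proposed xarray implementation):

```python
import numpy as np

class Wrapper:
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # unwrap any Wrapper arguments, dispatch to numpy, re-wrap the result
        unwrapped = tuple(a.data if isinstance(a, Wrapper) else a for a in args)
        return Wrapper(func(*unwrapped, **kwargs))

print(np.median(Wrapper([1, 2, 3])).data)  # 2.0
```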

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3917/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
674445594 MDU6SXNzdWU2NzQ0NDU1OTQ= 4321 push inline formatting functions upstream keewis 14808389 open 0     0 2020-08-06T16:35:04Z 2021-04-19T03:20:11Z   MEMBER      

#4248 added a _repr_inline_ method duck arrays can use to customize their collapsed variable repr.

We currently also have inline_dask_repr and inline_sparse_repr which remove redundant information like dtype and shape from dask and sparse arrays.

In order to reduce the complexity of inline_variable_array_repr, we could try to push these functions upstream.
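For reference, the _repr_inline_ hook a duck-array library implements looks roughly like this (illustrative class, not dask or sparse code):

```python
class MyDuckArray:
    def __init__(self, data):
        self.data = data

    def _repr_inline_(self, max_width):
        # return a short, width-limited summary without dtype / shape noise
        summary = f"MyDuckArray({self.data!r})"
        if len(summary) > max_width:
            summary = "MyDuckArray(...)"
        return summary
```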

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4321/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
675342733 MDU6SXNzdWU2NzUzNDI3MzM= 4324 constructing nested inline reprs keewis 14808389 open 0     9 2020-08-07T23:25:31Z 2021-04-19T03:20:01Z   MEMBER      

While implementing the new _repr_inline_ in xarray-contrib/pint-xarray#22, I realized that I designed that method with a single level of nesting in mind, e.g. xarray(pint(x)) or xarray(dask(x)).

From that PR: @keewis

thinking about this some more, this doesn't work for anything other than numpy.ndarray objects. For now I guess we could use the magnitude's _repr_inline_ (falling back to __repr__ if that doesn't exist) and only use format_array_flat if the magnitude is a ndarray.

However, as we nest deeper (e.g. xarray(pint(uncertainties(dask(sparse(cupy))))) – for argument's sake, let's assume that this actually makes sense) this might break or become really complicated. Does anyone have any ideas how to deal with that?

If I'm simply missing something, we can have that discussion here; otherwise I guess we should open an issue on xarray's issue tracker.

@jthielen

Yes, I agree that format_array_flat should probably just apply to magnitude being an ndarray.

I think a cascading series of _repr_inline_ should work for nested arrays, so long as

* the metadata of the higher nested objects is considered the priority (if not, then we're back to a fully managed solution to the likes of [dask/dask#5329](https://github.com/dask/dask/issues/5329))

* small max lengths are handled gracefully (i.e., a minimum where it is just like `Dask.Array(...)`, then `...`, then nothing)

* we're okay with the lowest arrays in large nesting chains not having any information show up in the inline repr (situation where there is not enough characters to even describe the full nesting has to be accounted for somehow)

* it can be adopted without too much complaint across the ecosystem

Assuming all this, then each layer of the nesting will reduce the max length of the inline repr string available to the layers below it, until a layer reaches a reasonable minimum where it "gives up". At least that's the natural design that I inferred from the simple _repr_inline_(max_width) API.

All that being said, it might still be good to bring up on xarray's end since this is a more general issue with inline reprs of nested duck arrays, with nothing pint-specific other than it being the motivating use-case.

How should we deal with this?
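A minimal sketch of the cascading scheme described above, where each layer formats itself and hands the remaining width to the layer below (hypothetical classes):

```python
class Layer:
    def __init__(self, name, inner=None):
        self.name = name
        self.inner = inner

    def _repr_inline_(self, max_width):
        prefix, suffix = f"{self.name}(", ")"
        budget = max_width - len(prefix) - len(suffix)
        if budget <= 0:
            return "..."  # not enough room: give up gracefully
        inner = self.inner._repr_inline_(budget) if self.inner is not None else ""
        return prefix + inner + suffix

print(Layer("pint", Layer("dask", Layer("sparse")))._repr_inline_(30))  # pint(dask(sparse()))
```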

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4324/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
791277757 MDU6SXNzdWU3OTEyNzc3NTc= 4837 expose _to_temp_dataset / _from_temp_dataset as semi-public API? keewis 14808389 open 0     5 2021-01-21T16:11:32Z 2021-01-22T02:07:08Z   MEMBER      

When writing accessors which behave the same for both Dataset and DataArray, it would be incredibly useful to be able to use DataArray._to_temp_dataset / DataArray._from_temp_dataset to deduplicate code. Is it safe to use those in external packages (like pint-xarray)?

Otherwise I guess it would be possible to use

```python
name = da.name if da.name is not None else "__temp"
temp_ds = da.to_dataset(name=name)
new_da = temp_ds[name]
if da.name is None:
    new_da = new_da.rename(da.name)
assert_identical(da, new_da)
```

but that seems less efficient.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4837/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
552896124 MDU6SXNzdWU1NTI4OTYxMjQ= 3711 PseudoNetCDF tests failing randomly keewis 14808389 open 0     6 2020-01-21T14:01:49Z 2020-03-23T20:32:32Z   MEMBER      

The py37-windows CI seems to fail for newer PRs:

```pytb
____________________ TestPseudoNetCDFFormat.test_uamiv_format_write ____________________

self = <xarray.tests.test_backends.TestPseudoNetCDFFormat object at 0x000002E11FF2DC08>

    def test_uamiv_format_write(self):
        fmtkw = {"format": "uamiv"}

        expected = open_example_dataset(
            "example.uamiv", engine="pseudonetcdf", backend_kwargs=fmtkw
        )
        with self.roundtrip(
            expected,
            save_kwargs=fmtkw,
            open_kwargs={"backend_kwargs": fmtkw},
            allow_cleanup_failure=True,
        ) as actual:
>           assert_identical(expected, actual)

xarray\tests\test_backends.py:3532:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

xarray\core\formatting.py:628: in diff_dataset_repr
    summary.append(diff_attrs_repr(a.attrs, b.attrs, compat))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

a_mapping = {'CPROJ': 0, 'FILEDESC': 'CAMx ', 'FTYPE': 1, 'GDNAM': 'CAMx ', ...}
b_mapping = {'CPROJ': 0, 'FILEDESC': 'CAMx ', 'FTYPE': 1, 'GDNAM': 'CAMx ', ...}
compat = 'identical', title = 'Attributes'
summarizer = <function summarize_attr at 0x000002E1156813A8>, col_width = None

    def _diff_mapping_repr(a_mapping, b_mapping, compat, title, summarizer, col_width=None):
        def extra_items_repr(extra_keys, mapping, ab_side):
            extra_repr = [summarizer(k, mapping[k], col_width) for k in extra_keys]
            if extra_repr:
                header = f"{title} only on the {ab_side} object:"
                return [header] + extra_repr
            else:
                return []

        a_keys = set(a_mapping)
        b_keys = set(b_mapping)

        summary = []

        diff_items = []

        for k in a_keys & b_keys:
            try:
                # compare xarray variable
                compatible = getattr(a_mapping[k], compat)(b_mapping[k])
                is_variable = True
            except AttributeError:
                # compare attribute value
                compatible = a_mapping[k] == b_mapping[k]
                is_variable = False
>           if not compatible:
E           ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3711/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
517195073 MDU6SXNzdWU1MTcxOTUwNzM= 3483 assign_coords with mixed DataArray / array args removes coords keewis 14808389 open 0     5 2019-11-04T14:38:40Z 2019-11-07T15:46:15Z   MEMBER      

I'm not sure if using assign_coords to overwrite the data of coords is the best way to do so, but using mixed args (on current master) turns out to have surprising results:

```python
>>> obj = xr.DataArray(
...     data=[6, 3, 4, 6],
...     coords={"x": list("abcd"), "y": ("x", range(4))},
...     dims="x",
... )
>>> obj
<xarray.DataArray 'obj' (x: 4)>
array([6, 3, 4, 6])
Coordinates:
  * x        (x) <U1 'a' 'b' 'c' 'd'
    y        (x) int64 0 1 2 3

# works as expected
>>> obj.assign_coords(coords={"x": list("efgh"), "y": ("x", [0, 2, 4, 6])})
<xarray.DataArray 'obj' (x: 4)>
array([6, 3, 4, 6])
Coordinates:
  * x        (x) <U1 'e' 'f' 'g' 'h'
    y        (x) int64 0 2 4 6

# works, too (same as .data / .values)
>>> obj.assign_coords(coords={
...     "x": obj.x.copy(data=list("efgh")).variable,
...     "y": ("x", [0, 2, 4, 6]),
... })
<xarray.DataArray 'obj' (x: 4)>
array([6, 3, 4, 6])
Coordinates:
  * x        (x) <U1 'e' 'f' 'g' 'h'
    y        (x) int64 0 2 4 6

# this drops "y"
>>> obj.assign_coords(coords={
...     "x": obj.x.copy(data=list("efgh")),
...     "y": ("x", [0, 2, 4, 6]),
... })
<xarray.DataArray 'obj' (x: 4)>
array([6, 3, 4, 6])
Coordinates:
  * x        (x) <U1 'e' 'f' 'g' 'h'
```

Passing a `DataArray` for `y`, like `obj.y * 2`, while also changing `x` (the type does not matter) always results in a `MergeError`:

```python
>>> obj.assign_coords(x=list("efgh"), y=obj.y * 2)
xarray.core.merge.MergeError: conflicting values for index 'x' on objects to be combined:
first value: Index(['e', 'f', 'g', 'h'], dtype='object', name='x')
second value: Index(['a', 'b', 'c', 'd'], dtype='object', name='x')
```

I would expect the result to be the same regardless of the type of the new coords.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3483/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);