id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1292142108,PR_kwDOAMm_X846vvTZ,6746,improve typing of DataArray and Dataset reductions,32801740,closed,0,,,6,2022-07-02T21:06:41Z,2023-09-16T18:18:03Z,2023-09-14T12:13:58Z,CONTRIBUTOR,,0,pydata/xarray/pulls/6746,"This PR makes the typing of reduction methods (`count`, `all`, `any`, `max`, `min`, `mean`, `prod`, `sum`, `std`, `var`, `median`) for `Dataset` and `DataArray` consistent with the typing of most other methods of `Dataset` and `DataArray` after the great recent improvements by @headtr1ck in e.g. #6661.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6746/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
807764700,MDExOlB1bGxSZXF1ZXN0NTcyOTQ4MDQ3,4904,add typing to unary and binary arithmetic operators,32801740,closed,0,,,21,2021-02-13T15:01:59Z,2021-04-14T16:01:06Z,2021-04-14T15:59:59Z,CONTRIBUTOR,,0,pydata/xarray/pulls/4904,"- [x] Closes #4054
- [x] Passes `pre-commit run --all-files`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

This draft PR is for soliciting feedback on an approach to adding typing to unary and binary operators for xarray classes. The approach involves generating a stub file using a simple script. In order to untangle the various helper classes implementing arithmetic throughout the code base, I have refactored the code, thereby getting rid of the injection of the operators after class definition. It now inserts the operators during construction of the subclasses, leveraging Python's `__init_subclass__` mechanism.

For this example Python file:
```
from typing import TYPE_CHECKING

from xarray import DataArray  # noqa: F401
from xarray import Dataset  # noqa: F401
from xarray import Variable  # noqa: F401

ds = Dataset({""v"": (""x"", [2])})
da = DataArray([3], dims=""x"")
var = Variable(""x"", [4])

ds00 = -ds
ds01 = ds + ds
ds02 = da - ds
ds03 = ds * da
ds04 = var / ds
ds05 = ds // var
ds06 = ds & 1
ds07 = 2 | ds
ds08 = ds == ds
ds09 = ds != da
ds10 = da <= ds
ds11 = ds > var
ds12 = var >= ds

da0 = +da
da1 = da % da
da2 = da ** var
da3 = var ^ da
da4 = da & 123
da5 = 1.23 * da
da6 = da == da
da7 = da != var
da8 = var < da

var0 = ~var
var1 = var + var
var2 = var & 1
var3 = 3 ^ var
var4 = var != var

if TYPE_CHECKING:
    reveal_locals()  # noqa: F821

for k, v in tuple(vars().items()):
    if k[0].islower():
        print(k, type(v))
```
The status quo gives this mypy output:
```
temp.py:11: error: Unsupported operand type for unary - (""Dataset"")
temp.py:18: error: Unsupported operand types for | (""int"" and ""Dataset"")
temp.py:25: error: Unsupported operand type for unary + (""DataArray"")
temp.py:30: error: Unsupported operand types for * (""float"" and ""DataArray"")
temp.py:35: error: Unsupported operand type for ~ (""Variable"")
temp.py:38: error: Unsupported operand types for ^ (""int"" and ""Variable"")
temp.py:42: note: Revealed local types are:
temp.py:42: note: TYPE_CHECKING: builtins.bool
temp.py:42: note: da: xarray.core.dataarray.DataArray
temp.py:42: note: da0: Any
temp.py:42: note: da1: Any
temp.py:42: note: da2: Any
temp.py:42: note: da3: Any
temp.py:42: note: da4: Any
temp.py:42: note: da5: builtins.float
temp.py:42: note: da6: Any
temp.py:42: note: da7: Any
temp.py:42: note: da8: Any
temp.py:42: note: ds: xarray.core.dataset.Dataset
temp.py:42: note: ds00: Any
temp.py:42: note: ds01: Any
temp.py:42: note: ds02: Any
temp.py:42: note: ds03: Any
temp.py:42: note: ds04: Any
temp.py:42: note: ds05: Any
temp.py:42: note: ds06: Any
temp.py:42: note: ds07: builtins.int
temp.py:42: note: ds08: Any
temp.py:42: note: ds09: Any
temp.py:42: note: ds10: Any
temp.py:42: note: ds11: Any
temp.py:42: note: ds12: Any
temp.py:42: note: var: xarray.core.variable.Variable
temp.py:42: note: var0: Any
temp.py:42: note: var1: Any
temp.py:42: note: var2: Any
temp.py:42: note: var3: builtins.int
temp.py:42: note: var4: Any
Found 6 errors in 1 file (checked 127 source files)
```
With this PR the output now becomes:
```
temp.py:42: note: Revealed local types are:
temp.py:42: note: da: xarray.core.dataarray.DataArray
temp.py:42: note: da0: xarray.core.dataarray.DataArray*
temp.py:42: note: da1: xarray.core.dataarray.DataArray*
temp.py:42: note: da2: xarray.core.dataarray.DataArray*
temp.py:42: note: da3: xarray.core.dataarray.DataArray*
temp.py:42: note: da4: xarray.core.dataarray.DataArray*
temp.py:42: note: da5: xarray.core.dataarray.DataArray*
temp.py:42: note: da6: xarray.core.dataarray.DataArray*
temp.py:42: note: da7: xarray.core.dataarray.DataArray*
temp.py:42: note: da8: xarray.core.dataarray.DataArray*
temp.py:42: note: ds: xarray.core.dataset.Dataset
temp.py:42: note: ds00: xarray.core.dataset.Dataset*
temp.py:42: note: ds01: xarray.core.dataset.Dataset*
temp.py:42: note: ds02: xarray.core.dataset.Dataset*
temp.py:42: note: ds03: xarray.core.dataset.Dataset*
temp.py:42: note: ds04: xarray.core.dataset.Dataset*
temp.py:42: note: ds05: xarray.core.dataset.Dataset*
temp.py:42: note: ds06: xarray.core.dataset.Dataset*
temp.py:42: note: ds07: xarray.core.dataset.Dataset*
temp.py:42: note: ds08: xarray.core.dataset.Dataset*
temp.py:42: note: ds09: xarray.core.dataset.Dataset*
temp.py:42: note: ds10: xarray.core.dataset.Dataset*
temp.py:42: note: ds11: xarray.core.dataset.Dataset*
temp.py:42: note: ds12: xarray.core.dataset.Dataset*
temp.py:42: note: var: xarray.core.variable.Variable
temp.py:42: note: var0: xarray.core.variable.Variable*
temp.py:42: note: var1: xarray.core.variable.Variable*
temp.py:42: note: var2: xarray.core.variable.Variable*
temp.py:42: note: var3: xarray.core.variable.Variable*
temp.py:42: note: var4: xarray.core.variable.Variable*
```
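
As a rough illustration of the `__init_subclass__` approach described above, operators can be attached at the moment a subclass is created instead of being injected after the class definition. This is a simplified sketch under stated assumptions, not the code in this PR: the names `ArithmeticMixin`, `_make_binary_method`, `_BINARY_OPS` and `Scalar` are hypothetical and exist only for this example.

```python
import operator


def _make_binary_method(op):
    # Hypothetical helper: build a method that applies op to the wrapped
    # values and rewraps the result in the type of the left operand.
    def method(self, other):
        other_value = getattr(other, 'value', other)
        return type(self)(op(self.value, other_value))

    return method


class ArithmeticMixin:
    # Hypothetical base class for illustration only, not xarray code.
    _BINARY_OPS = {
        '__add__': operator.add,
        '__sub__': operator.sub,
        '__mul__': operator.mul,
    }

    def __init_subclass__(cls, **kwargs):
        # Runs once for every subclass, while that subclass is being created.
        super().__init_subclass__(**kwargs)
        for name, op in cls._BINARY_OPS.items():
            # Keep any operator the subclass body defines itself.
            if name not in cls.__dict__:
                setattr(cls, name, _make_binary_method(op))


class Scalar(ArithmeticMixin):
    def __init__(self, value):
        self.value = value


result = Scalar(2) + Scalar(3)
print(type(result).__name__, result.value)
```

Because the methods are only attached at runtime, a static type checker cannot see them, which is presumably why a generated stub file is still needed alongside a mechanism like this.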
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4904/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 780955224,MDExOlB1bGxSZXF1ZXN0NTUwNzYyNjMx,4774,improve typing of OrderedSet,32801740,closed,0,,,1,2021-01-07T01:20:56Z,2021-01-07T23:28:53Z,2021-01-07T23:28:48Z,CONTRIBUTOR,,0,pydata/xarray/pulls/4774,"- [x] Passes `isort . && black . && mypy . && flake8` Small patch to improve typing of OrderedSet: allowing more general input type in its constructor and .update method (Iterator instead of AbstractSet) helps to get rid of ``# type: ignore``. Side-effect: some performance gain in micro-benchmark (before and after): ``` %timeit xr.core.utils.OrderedSet(range(10)) 3.46 µs ± 5.37 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) ``` ``` 2.06 µs ± 6.39 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) ```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4774/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 776542812,MDExOlB1bGxSZXF1ZXN0NTQ3MDEzNjY0,4742,speedup attribute style access and tab completion,32801740,closed,0,,,2,2020-12-30T16:45:03Z,2021-01-05T23:00:34Z,2021-01-05T23:00:29Z,CONTRIBUTOR,,0,pydata/xarray/pulls/4742," - [x] Closes #4741 - [x] Passes `isort . && black . && mypy . 
&& flake8`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

- make _attr_sources and _item_sources lazy
- make lookup of virtual dimension coordinates lazy
- make helper class more general so it can be used for lookup of both virtual dimension and level coordinates","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4742/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
776520994,MDU6SXNzdWU3NzY1MjA5OTQ=,4741,Attribute style access is slow,32801740,closed,0,,,2,2020-12-30T15:52:07Z,2021-01-05T23:00:29Z,2021-01-05T23:00:29Z,CONTRIBUTOR,,,,"I appreciate xarray's ability to use attribute style access ``ds.foo`` as an alternative to ``ds[""foo""]`` as it requires fewer characters/keystrokes and has less 'visual clutter'. A drawback is that it can be much slower, as lookup time seems to display `O(n)` behaviour instead of `O(1)`, with `n` being the number of variables in the dataset. For example, with `n=100` it is approximately 100 times slower than dictionary-style access:

    # Dataset with many (100) variables
    ds = xr.Dataset({f'var{v}': [] for v in range(100)})

    %timeit ds['var0']
    462 µs ± 1.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

    %timeit ds.var0
    47.1 ms ± 205 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

``dir()`` and ``_ipython_key_completions_()``, which are used for e.g. tab completion in IPython, are equally slow:

    %timeit dir(ds)
    47 ms ± 163 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

    %timeit ds._ipython_key_completions_()
    46.8 ms ± 210 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

I would like to see xarray having much better performance for attribute style access.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4741/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
763149369,MDExOlB1bGxSZXF1ZXN0NTM3NjMyNjUz,4683,Readd order and subok parameters to astype (GH4644),32801740,closed,0,,,3,2020-12-12T01:27:35Z,2020-12-24T16:38:49Z,2020-12-16T16:33:00Z,CONTRIBUTOR,,0,pydata/xarray/pulls/4683,"- [x] Closes #4644
- [x] Tests added
- [x] Passes `isort . && black . && mypy . && flake8`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4683/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
755607271,MDU6SXNzdWU3NTU2MDcyNzE=,4644,astype method lost its order parameter,32801740,closed,0,,,7,2020-12-02T21:02:02Z,2020-12-16T16:33:00Z,2020-12-16T16:33:00Z,CONTRIBUTOR,,,," **What happened**: I upgraded from xarray 0.15.1 to 0.16.2 and the `astype` method seems to have lost the `order` parameter.
```python
In [1]: import xarray as xr

In [2]: xr.__version__
Out[2]: '0.16.2'

In [3]: xr.DataArray([[1.0, 2.0], [3.0, 4.0]]).astype(dtype='d', order='F').values.strides
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
in
----> 1 xr.DataArray([[1.0, 2.0], [3.0, 4.0]]).astype(dtype='d', order='F').values.strides

TypeError: astype() got an unexpected keyword argument 'order'
```

**What you expected to happen**: I was expecting to get the same result as with xarray 0.15.1:

```python
In [1]: import xarray as xr

In [2]: xr.__version__
Out[2]: '0.15.1'

In [3]: xr.DataArray([[1.0, 2.0], [3.0, 4.0]]).astype(dtype='d', order='F').values.strides
Out[3]: (8, 16)
```

**Anything else we need to know?**: Looking at the documentation it seems it disappeared between 0.16.0 and 0.16.1. The documentation at http://xarray.pydata.org/en/v0.16.0/generated/xarray.DataArray.astype.html still has this snippet

> order ({'C', 'F', 'A', 'K'}, optional) – Controls the memory layout order of the result. ‘C’ means C order, ‘F’ means Fortran order, ‘A’ means ‘F’ order if all the arrays are Fortran contiguous, ‘C’ order otherwise, and ‘K’ means as close to the order the array elements appear in memory as possible. Default is ‘K’.

(which was identical to the documentation from numpy.ndarray.astype at https://numpy.org/doc/stable/reference/generated/numpy.ndarray.astype.html) while http://xarray.pydata.org/en/v0.16.1/generated/xarray.DataArray.astype.html seems to lack that part.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4644/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue