issues


14 rows where comments = 6, repo = 13221727 and user = 5635139 sorted by updated_at descending


id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1975574237 I_kwDOAMm_X851wN7d 8409 Task graphs on `.map_blocks` with many chunks can be huge max-sixty 5635139 closed 0     6 2023-11-03T07:14:45Z 2024-01-03T04:10:16Z 2024-01-03T04:10:16Z MEMBER      

What happened?

I'm getting task graphs > 1GB, I think possibly because the full indexes are being included in every task?

What did you expect to happen?

Only the relevant sections of the index would be included

Minimal Complete Verifiable Example

```Python
import cloudpickle
import xarray as xr

da = xr.tutorial.load_dataset('air_temperature')

# Dropping the index doesn't generally matter that much...
len(cloudpickle.dumps(da.chunk(lat=1, lon=1)))
# 15569320

len(cloudpickle.dumps(da.chunk().drop_vars(da.indexes)))
# 15477313

# But with .map_blocks, it really matters — it's really big with the
# indexes, and about the same size without:
len(cloudpickle.dumps(da.chunk(lat=1, lon=1).map_blocks(lambda x: x)))
# 79307120

len(cloudpickle.dumps(da.chunk(lat=1, lon=1).drop_vars(da.indexes).map_blocks(lambda x: x)))
# 16016173
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.18 (main, Aug 24 2023, 21:19:58) [Clang 14.0.3 (clang-1403.0.22.14.1)]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: en_US.UTF-8
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None

xarray: 2023.10.1
pandas: 2.1.1
numpy: 1.26.1
scipy: 1.11.1
netCDF4: None
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: 2.16.0
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.5.0
distributed: 2023.5.0
matplotlib: 3.6.0
cartopy: None
seaborn: 0.12.2
numbagg: 0.6.0
fsspec: 2022.8.2
cupy: None
pint: 0.22
sparse: 0.14.0
flox: 0.7.2
numpy_groupies: 0.9.22
setuptools: 68.1.2
pip: 23.2.1
conda: None
pytest: 7.4.0
mypy: 1.6.1
IPython: 8.14.0
sphinx: 5.2.1
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8409/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
988158051 MDU6SXNzdWU5ODgxNTgwNTE= 5764 Implement __sizeof__ on objects? max-sixty 5635139 open 0     6 2021-09-03T23:36:53Z 2023-12-19T18:23:08Z   MEMBER      

Is your feature request related to a problem? Please describe.

Currently ds.nbytes returns the size of the data.

But sys.getsizeof(ds) returns a very small number.

Describe the solution you'd like

If we implement __sizeof__ on DataArrays & Datasets, this would work.

I think that would be something like ds.nbytes, plus the size of the ds container, plus maybe attrs if those aren't covered by .nbytes?
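A minimal sketch of the idea (not xarray's actual implementation): a container whose `__sizeof__` adds the payload's size to the object's own footprint, so `sys.getsizeof` reports something meaningful:

```python
import sys

# Hypothetical sketch, not xarray's implementation: report the payload's size
# through __sizeof__ so sys.getsizeof reflects the underlying data.
class Container:
    def __init__(self, data: bytes):
        self.data = data

    def __sizeof__(self):
        # the object's own footprint plus the data it holds
        return object.__sizeof__(self) + sys.getsizeof(self.data)

c = Container(b"x" * 1_000_000)
print(sys.getsizeof(c) > 1_000_000)  # True: the payload now counts
```

Note that `sys.getsizeof` calls `__sizeof__` and adds a small GC-header overhead on top.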

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5764/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  reopened xarray 13221727 issue
866826033 MDU6SXNzdWU4NjY4MjYwMzM= 5215 Add an Cumulative aggregation, similar to Rolling max-sixty 5635139 closed 0     6 2021-04-24T19:59:49Z 2023-12-08T22:06:53Z 2023-12-08T22:06:53Z MEMBER      

Is your feature request related to a problem? Please describe.

Pandas has a .expanding aggregation, which is basically rolling with a full lookback. I often end up supplying rolling with the length of the dimension, and this is some nice sugar for that.

Describe the solution you'd like

Basically the same as pandas — a .expanding method that returns an Expanding class, which implements the same methods as a Rolling class.

Describe alternatives you've considered

Some options:
  • This.
  • Don't add anything — the sugar isn't worth the additional API.
  • Go full out and write specialized expanding algos — which will be faster since they don't have to keep track of the window. But not that much faster, likely not worth the effort.
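For reference, the pandas behavior being mirrored: `.expanding()` agrees with a rolling window that spans the whole series.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0])

# expanding() is rolling() with a full lookback:
exp = s.expanding().mean()
roll = s.rolling(window=len(s), min_periods=1).mean()
print(exp.tolist())  # [1.0, 1.5, 2.0, 2.5]
print(exp.equals(roll))  # True
```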

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5215/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
1878288525 PR_kwDOAMm_X85ZYos5 8139 Fix pandas' `interpolate(fill_value=)` error max-sixty 5635139 closed 0     6 2023-09-02T02:41:45Z 2023-09-28T16:48:51Z 2023-09-04T18:05:14Z MEMBER   0 pydata/xarray/pulls/8139

Pandas no longer has a fill_value parameter for interpolate.

Weirdly I wasn't getting this locally on pandas 2.1.0 — only in CI, on https://github.com/pydata/xarray/actions/runs/6054400455/job/16431747966?pr=8138.

Removing it passes locally; let's see whether this works in CI.

Would close #8125
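The surviving call path is the plain method-based interpolation, which covers the common case without a fill_value argument:

```python
import pandas as pd

# interpolate() without fill_value: gaps are filled by linear interpolation.
s = pd.Series([1.0, None, 3.0])
print(s.interpolate().tolist())  # [1.0, 2.0, 3.0]
```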

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8139/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
967854972 MDExOlB1bGxSZXF1ZXN0NzEwMDA1NzY4 5694 Ask PRs to annotate tests max-sixty 5635139 closed 0     6 2021-08-12T02:19:28Z 2023-09-28T16:46:19Z 2023-06-19T05:46:36Z MEMBER   0 pydata/xarray/pulls/5694
  • [x] Passes pre-commit run --all-files
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst

As discussed https://github.com/pydata/xarray/pull/5690#issuecomment-897280353

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5694/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
1874148181 I_kwDOAMm_X85vtTtV 8123 `.rolling_exp` arguments could be clearer max-sixty 5635139 open 0     6 2023-08-30T18:09:04Z 2023-09-01T00:25:08Z   MEMBER      

Is your feature request related to a problem?

Currently we call .rolling_exp like:

da.rolling_exp(date=20).mean()

20 refers to a "standard" window type — broadly "the same average distance as a simple rolling window". That works well, and matches the .rolling(date=20).mean() format.

But we also have different window types, and this makes it a bit incongruent:

da.rolling_exp(date=0.5, window_type="alpha").mean()

...since the window_type is completely changing the meaning of the value we pass to the dimension argument. A bit like someone asking "how many apples would you like to buy", and replying "5", and then separately saying "when I said 5, I meant 5 tonnes".

Describe the solution you'd like

One option would be:

.rolling_exp(date={"alpha": 0.5})

We pass a dict if we want a non-standard window type — so the value is attached to its type.

We could still have the original form for da.rolling_exp(date=20).mean().
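A sketch of how that dict form could be normalized internally (names are hypothetical, not the merged API):

```python
# Hypothetical sketch of parsing the proposed argument: a bare number keeps
# the default "span" window type; a {window_type: value} dict binds the value
# to its type explicitly.
def parse_window(value, default_type="span"):
    if isinstance(value, dict):
        if len(value) != 1:
            raise ValueError("expected a single {window_type: value} pair")
        (window_type, v), = value.items()
        return v, window_type
    return value, default_type

print(parse_window(20))              # (20, 'span')
print(parse_window({"alpha": 0.5}))  # (0.5, 'alpha')
```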

Describe alternatives you've considered

No response

Additional context

(I realize I wrote this originally, all criticism directed at me! This is based on feedback from a colleague, which on reflection I agree with.)

Unless anyone disagrees, I'll try and do this soon-ish™

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8123/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
729208432 MDExOlB1bGxSZXF1ZXN0NTA5NzM0NTM2 4540 numpy_groupies max-sixty 5635139 closed 0     6 2020-10-26T03:37:19Z 2022-02-05T22:24:12Z 2021-10-24T00:18:52Z MEMBER   0 pydata/xarray/pulls/4540
  • [x] Closes https://github.com/pydata/xarray/issues/4473
  • [ ] Tests added
  • [x] Passes isort . && black . && mypy . && flake8
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst

Very early effort — I found this harder than I expected. I was trying to use the existing groupby infra, but I think maybe I should start afresh. The result of the numpy_groupies operation is a fully formed array, whereas we're used to handling an iterable of results which need to be concatenated.

I also added some type signatures / notes as I was going through the existing code, mostly for my own understanding.

If anyone has any thoughts, feel free to comment — otherwise I'll resume this soon
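The "fully formed array" result shape can be sketched with `np.bincount` standing in for `numpy_groupies.aggregate`:

```python
import numpy as np

# Each element is assigned to a group; the aggregation returns one dense
# array indexed by group — no per-group iterable to concatenate afterwards.
group_idx = np.array([0, 0, 1, 2, 1])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sums = np.bincount(group_idx, weights=values)
print(sums)  # [3. 8. 4.]
```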

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4540/reactions",
    "total_count": 4,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 2,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
399164733 MDExOlB1bGxSZXF1ZXN0MjQ0NjU3NTk5 2674 Skipping variables in datasets that don't have the core dim max-sixty 5635139 closed 0     6 2019-01-15T02:43:11Z 2021-05-13T22:02:19Z 2021-05-13T22:02:19Z MEMBER   0 pydata/xarray/pulls/2674

ref https://github.com/pydata/xarray/pull/2650#issuecomment-454164295

This seems an ugly way of accomplishing the goal; any ideas for a better way of doing this?

And stepping back, do others think a) it's helpful to skip variables in a dataset, and b) apply_ufunc should do this?
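The skipping itself is simple to sketch outside of apply_ufunc (variable names are hypothetical):

```python
# Hypothetical sketch: keep only the variables whose dims include the core dim,
# which is the behavior the PR proposes for datasets.
variables = {
    "temp": ("time", "space"),
    "elevation": ("space",),   # lacks the core dim -> skipped
    "pressure": ("time",),
}
core_dim = "time"
kept = {name: dims for name, dims in variables.items() if core_dim in dims}
print(sorted(kept))  # ['pressure', 'temp']
```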

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2674/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
298421965 MDU6SXNzdWUyOTg0MjE5NjU= 1923 Local test failure in test_backends max-sixty 5635139 closed 0     6 2018-02-19T22:53:37Z 2020-09-05T20:32:17Z 2020-09-05T20:32:17Z MEMBER      

I'm happy to debug this further, but before I do: is this an issue people have seen before? I'm running tests on master and hit a failure very early on.

FWIW I don't use netCDF, and don't think I have it installed.

Code Sample, a copy-pastable example if possible

```python
========================================================================== FAILURES ==========================================================================
_________ ScipyInMemoryDataTest.test_bytesio_pickle __________

self = <xarray.tests.test_backends.ScipyInMemoryDataTest testMethod=test_bytesio_pickle>

    @pytest.mark.skipif(PY2, reason='cannot pickle BytesIO on Python 2')
    def test_bytesio_pickle(self):
        data = Dataset({'foo': ('x', [1, 2, 3])})
        fobj = BytesIO(data.to_netcdf())
        with open_dataset(fobj, autoclose=self.autoclose) as ds:
>           unpickled = pickle.loads(pickle.dumps(ds))
E           TypeError: can't pickle _thread.lock objects

xarray/tests/test_backends.py:1384: TypeError
```
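For reference, the error reproduces with any object holding a thread lock, which suggests the backend file object carries one:

```python
import pickle
import threading

# A bare lock cannot be pickled; a dataset whose backend holds one fails the
# same way when pickled.
try:
    pickle.dumps(threading.Lock())
except TypeError as err:
    print(err)  # e.g. "cannot pickle '_thread.lock' object"
```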

Problem description

Expected Output

Skip or pass backends tests

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: d00721a3560f57a1b9226c5dbf5bf3af0356619d
python: 3.6.4.final.0
python-bits: 64
OS: Darwin
OS-release: 17.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.7.0-38-g1005a9e  # not sure why this is tagged so early. I'm running on latest master
pandas: 0.22.0
numpy: 1.14.0
scipy: 1.0.0
netCDF4: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: None
distributed: None
matplotlib: 2.1.2
cartopy: None
seaborn: 0.8.1
setuptools: 38.5.1
pip: 9.0.1
conda: None
pytest: 3.4.0
IPython: 6.2.1
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1923/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
575088962 MDExOlB1bGxSZXF1ZXN0MzgzMzAwMjgw 3826 Allow ellipsis to be used in stack max-sixty 5635139 closed 0     6 2020-03-04T02:21:21Z 2020-03-20T01:20:54Z 2020-03-19T22:55:09Z MEMBER   0 pydata/xarray/pulls/3826
  • [x] Closes https://github.com/pydata/xarray/issues/3814
  • [x] Tests added
  • [x] Passes isort -rc . && black . && mypy . && flake8
  • [x] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3826/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
577283480 MDExOlB1bGxSZXF1ZXN0Mzg1MTA3OTU4 3846 Doctests fixes max-sixty 5635139 closed 0     6 2020-03-07T05:44:27Z 2020-03-10T14:03:05Z 2020-03-10T14:03:00Z MEMBER   0 pydata/xarray/pulls/3846
  • [ ] Closes #xxxx
  • [ ] Tests added
  • [x] Passes isort -rc . && black . && mypy . && flake8
  • [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API

Starting to get some fixes in.

It's going to be a long journey though. I think maybe we whitelist some files and move through them gradually before whitelisting the whole library.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3846/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
485437811 MDU6SXNzdWU0ODU0Mzc4MTE= 3265 Sparse tests failing on master max-sixty 5635139 closed 0     6 2019-08-26T20:34:21Z 2019-08-27T00:01:18Z 2019-08-27T00:01:07Z MEMBER      

https://dev.azure.com/xarray/xarray/_build/results?buildId=695

```python
=================================== FAILURES ===================================
___ TestSparseVariable.test_unary_op ___

self = <xarray.tests.test_sparse.TestSparseVariable object at 0x7f24f0b21b70>

    def test_unary_op(self):
>       sparse.utils.assert_eq(-self.var.data, -self.data)
E       AttributeError: module 'sparse' has no attribute 'utils'

xarray/tests/test_sparse.py:285: AttributeError

___ TestSparseVariable.test_univariate_ufunc _____

self = <xarray.tests.test_sparse.TestSparseVariable object at 0x7f24ebc2bb38>

    def test_univariate_ufunc(self):
>       sparse.utils.assert_eq(np.sin(self.data), xu.sin(self.var).data)
E       AttributeError: module 'sparse' has no attribute 'utils'

xarray/tests/test_sparse.py:290: AttributeError

___ TestSparseVariable.test_bivariate_ufunc ______

self = <xarray.tests.test_sparse.TestSparseVariable object at 0x7f24f02a7e10>

    def test_bivariate_ufunc(self):
>       sparse.utils.assert_eq(np.maximum(self.data, 0), xu.maximum(self.var, 0).data)
E       AttributeError: module 'sparse' has no attribute 'utils'

xarray/tests/test_sparse.py:293: AttributeError

___ TestSparseVariable.test_pickle ____

self = <xarray.tests.test_sparse.TestSparseVariable object at 0x7f24f04f2c50>

    def test_pickle(self):
        v1 = self.var
        v2 = pickle.loads(pickle.dumps(v1))
>       sparse.utils.assert_eq(v1.data, v2.data)
E       AttributeError: module 'sparse' has no attribute 'utils'

xarray/tests/test_sparse.py:307: AttributeError
```

Any ideas?
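One defensive option (a sketch, not necessarily the fix that was applied): a local assert_eq that doesn't depend on the removed sparse.utils helper.

```python
import numpy as np

# Hypothetical fallback: densify sparse-like inputs, then compare with numpy.
def assert_eq(a, b):
    a = a.todense() if hasattr(a, "todense") else np.asarray(a)
    b = b.todense() if hasattr(b, "todense") else np.asarray(b)
    np.testing.assert_allclose(a, b)

assert_eq(np.array([1.0, 2.0]), [1.0, 2.0])  # passes silently
```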

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3265/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
457080809 MDExOlB1bGxSZXF1ZXN0Mjg4OTY1MzQ4 3029 Fix pandas-dev tests max-sixty 5635139 closed 0     6 2019-06-17T18:15:16Z 2019-06-28T15:31:33Z 2019-06-28T15:31:28Z MEMBER   0 pydata/xarray/pulls/3029

Currently pandas-dev tests get 'stuck' on the conda install. The last instruction to run is the standard install:

```sh
$ if [[ "$CONDA_ENV" == "docs" ]]; then
    conda env create -n test_env --file doc/environment.yml;
  elif [[ "$CONDA_ENV" == "lint" ]]; then
    conda env create -n test_env --file ci/requirements-py37.yml;
  else
    conda env create -n test_env --file ci/requirements-$CONDA_ENV.yml;
  fi
```

And after installing the libraries, it prints this and then stops:

```
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
```

I'm not that familiar with conda. Anyone have any ideas as to why this would fail while the other builds would succeed?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3029/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
168901028 MDU6SXNzdWUxNjg5MDEwMjg= 934 Should indexing be possible on 1D coords, even if not dims? max-sixty 5635139 closed 0     6 2016-08-02T14:33:43Z 2019-01-27T06:49:52Z 2019-01-27T06:49:52Z MEMBER      

``` python
In [1]: arr = xr.DataArray(np.random.rand(4, 3),
   ...:                    [('time', pd.date_range('2000-01-01', periods=4)),
   ...:                     ('space', ['IA', 'IL', 'IN'])])

In [17]: arr.coords['space2'] = ('space', ['A', 'B', 'C'])

In [18]: arr
Out[18]:
<xarray.DataArray (time: 4, space: 3)>
array([[ 0.05187049,  0.04743067,  0.90329666],
       [ 0.59482538,  0.71014366,  0.86588207],
       [ 0.51893157,  0.49442107,  0.10697737],
       [ 0.16068189,  0.60756757,  0.31935279]])
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04
  * space    (space) |S2 'IA' 'IL' 'IN'
    space2   (space) |S1 'A' 'B' 'C'
```

Now try to select on the space2 coord:

``` python
In [19]: arr.sel(space2='A')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-19-eae5e4b64758> in <module>()
----> 1 arr.sel(space2='A')

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/xarray/core/dataarray.pyc in sel(self, method, tolerance, **indexers)
    601         """
    602         return self.isel(**indexing.remap_label_indexers(
--> 603             self, indexers, method=method, tolerance=tolerance))
    604
    605     def isel_points(self, dim='points', **indexers):

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/xarray/core/dataarray.pyc in isel(self, **indexers)
    588         DataArray.sel
    589         """
--> 590         ds = self._to_temp_dataset().isel(**indexers)
    591         return self._from_temp_dataset(ds)
    592

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/xarray/core/dataset.pyc in isel(self, **indexers)
    908         invalid = [k for k in indexers if k not in self.dims]
    909         if invalid:
--> 910             raise ValueError("dimensions %r do not exist" % invalid)
    911
    912         # all indexers should be int, slice or np.ndarrays

ValueError: dimensions ['space2'] do not exist
```

Is there an easier way to do this? I couldn't think of anything...
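The manual workaround reduces to translating the label on the non-dimension coordinate into a positional index first, sketched here with plain numpy:

```python
import numpy as np

# Sketch of what sel(space2='A') would have to do under the hood:
space2 = np.array(["A", "B", "C"])          # non-dimension coordinate
data = np.arange(12).reshape(4, 3)          # (time, space)
pos = int(np.nonzero(space2 == "A")[0][0])  # label -> position
print(data[:, pos])  # [0 3 6 9]
```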

CC @justinkuosixty

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/934/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
