
issues


2 rows where repo = 13221727, state = "open" and user = 1634164 sorted by updated_at descending

Row 1 of 2:

id: 606846911 · node_id: MDExOlB1bGxSZXF1ZXN0NDA4OTY0MTM3 · number: 4007
title: Allow DataArray.to_series() without invoking sparse.COO.todense()
user: khaeru (1634164) · state: open · locked: 0 · comments: 1
created_at: 2020-04-25T20:15:16Z · updated_at: 2022-06-09T14:50:17Z
author_association: FIRST_TIME_CONTRIBUTOR · draft: 0 · pull_request: pydata/xarray/pulls/4007
body:

This adds some code (from iiasa/ixmp#317) that allows DataArray.to_series() to be called without invoking sparse.COO.todense() when that is the backing data type.

I'm aware this needs some improvement to meet the standard of the existing codebase, so I would like to ask for guidance on how to address the following points (including whom to ask about them):

- [ ] Make the same improvement in {DataArray,Dataset}.to_dataframe().
- [ ] Possibly move the code out of dataarray.py to a more appropriate location (where?).
- [ ] Possibly check for sparse.COO explicitly instead of xarray.core.pycompat.sparse_array_type. Other SparseArray subclasses, e.g. DOK, may not have the same attributes.

Standard items:

- [ ] Tests added.
- [x] Passes isort -rc . && black . && mypy . && flake8 (sort of: these tools wanted to modify 7 files beyond the one I touched; I didn't commit those changes.)
- [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API.

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4007/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: pull
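The pull request above describes converting a sparse-backed DataArray to a pandas Series without densifying. As a rough illustration of that idea only (not the PR's actual patch; the example array and variable names below are made up), the stored entries of a sparse.COO can be read directly into a Series:

```python
# Illustrative sketch, not the code from pydata/xarray/pulls/4007:
# build a pandas Series from a sparse.COO-backed DataArray without .todense().
import numpy as np
import pandas as pd
import sparse
import xarray as xr

da = xr.DataArray(
    sparse.COO.from_numpy(np.eye(3)),
    coords={"x": ["a", "b", "c"], "y": [0, 1, 2]},
    dims=("x", "y"),
)

coo = da.data  # sparse.COO: .coords holds integer indices, .data the stored values
index = pd.MultiIndex.from_arrays(
    [np.asarray(da.coords[dim])[coo.coords[i]] for i, dim in enumerate(da.dims)],
    names=list(da.dims),
)
# Only the stored (non-fill) entries appear, unlike a dense round-trip.
series = pd.Series(coo.data, index=index)
print(series)
```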
Row 2 of 2:

id: 503711327 · node_id: MDU6SXNzdWU1MDM3MTEzMjc= · number: 3381
title: concat() fails when args have sparse.COO data and different fill values
user: khaeru (1634164) · state: open · locked: 0 · comments: 4
created_at: 2019-10-07T21:54:06Z · updated_at: 2021-07-08T17:43:57Z
author_association: NONE
body:

MCVE Code Sample

```python
import numpy as np
import pandas as pd
import sparse
import xarray as xr

# Indices and raw data
foo = [f'foo{i}' for i in range(6)]
bar = [f'bar{i}' for i in range(6)]
raw = np.random.rand(len(foo) // 2, len(bar))

# DataArray
a = xr.DataArray(
    data=sparse.COO.from_numpy(raw),
    coords=[foo[:3], bar],
    dims=['foo', 'bar'])

print(a.data.fill_value)  # 0.0

# Created from a pd.Series
b_series = pd.DataFrame(raw, index=foo[3:], columns=bar) \
    .stack() \
    .rename_axis(index=['foo', 'bar'])
b = xr.DataArray.from_series(b_series, sparse=True)

print(b.data.fill_value)  # nan

# Works despite inconsistent fill-values
a + b
a * b

# Fails: complains about inconsistent fill-values
xr.concat([a, b], dim='foo')  # ***

# The fill_value argument doesn't help
xr.concat([a, b], dim='foo', fill_value=np.nan)


def fill_value(da):
    """Try to coerce one argument to a consistent fill-value."""
    return xr.DataArray(
        data=sparse.as_coo(da.data, fill_value=np.nan),
        coords=da.coords,
        dims=da.dims,
        name=da.name,
        attrs=da.attrs,
    )


# Fails: "Cannot provide a fill-value in combination with something that
# already has a fill-value"
print(xr.concat([a.pipe(fill_value), b], dim='foo'))

# If we cheat by recreating 'a' from scratch, copying the fill value of the
# intended other argument, it works again:
a = xr.DataArray(
    data=sparse.COO.from_numpy(raw, fill_value=b.data.fill_value),
    coords=[foo[:3], bar],
    dims=['foo', 'bar'])
c = xr.concat([a, b], dim='foo')

print(c.data.fill_value)  # nan

# But simple operations again create objects with potentially incompatible
# fill-values
d = c.sum(dim='bar')
print(d.data.fill_value)  # 0.0
```

Expected

concat() can be used without having to create new objects; i.e. the line marked *** just works.

Problem Description

Some basic xarray manipulations don't work on sparse.COO-backed objects.

xarray should automatically coerce objects into a compatible state, or at least provide users with methods to do so. Behaviour should also be documented, e.g. in this instance, which operations (here, .sum()) modify the underlying storage format in ways that necessitate some kind of (re-)conversion.
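One possible workaround, not part of the original report and not something xarray itself provides: rebuild the COO object with the desired fill value before concatenating. Note that this only relabels the fill value of the unstored positions; stored values equal to the old fill value are not re-encoded, so it is only safe when that distinction does not matter. The helper name below is made up.

```python
# Hypothetical helper (assumption: rebuilding the COO directly sidesteps the
# "Cannot provide a fill-value..." error raised by sparse.as_coo above).
import numpy as np
import sparse


def with_fill_value(da, fill_value=np.nan):
    """Return a copy of ``da`` whose sparse.COO data uses ``fill_value``.

    Caveat: only the stored entries are kept; positions equal to the *old*
    fill value are silently reinterpreted as the new one.
    """
    coo = da.data
    new = sparse.COO(coo.coords, coo.data, shape=coo.shape,
                     fill_value=fill_value)
    return da.copy(data=new)


# e.g. xr.concat([with_fill_value(a), b], dim='foo') with a, b as above
```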

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Aug 20 2019, 17:04:43) [GCC 8.3.0]
python-bits: 64
OS: Linux
OS-release: 5.0.0-32-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2
xarray: 0.13.0
pandas: 0.25.0
numpy: 1.17.2
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: 0.7.1
h5py: 2.8.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.1.0
distributed: None
matplotlib: 3.1.1
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 40.8.0
pip: 19.2.3
conda: None
pytest: 5.0.1
IPython: 5.8.0
sphinx: 2.2.0
reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3381/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: xarray (13221727) · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
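For reference, the filtered view shown above (repo = 13221727, state = "open", user = 1634164, sorted by updated_at descending) corresponds to a query along the following lines. This is a sketch for running it against a local copy of the database; the file name "github.db" is an assumption, and this is not necessarily the exact SQL Datasette generated for the page.

```python
# Hypothetical local reproduction of this page's filter against a
# github-to-sqlite style database (assumed file name: "github.db").
import sqlite3

con = sqlite3.connect("github.db")
rows = con.execute(
    """
    SELECT id, number, title, state, type
    FROM issues
    WHERE repo = ? AND state = ? AND user = ?
    ORDER BY updated_at DESC
    """,
    (13221727, "open", 1634164),
).fetchall()
for row in rows:
    print(row)
con.close()
```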