issues
3 rows where comments = 4 and "updated_at" is on date 2021-07-08, sorted by updated_at descending
Columns: id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type
503711327 | MDU6SXNzdWU1MDM3MTEzMjc= | 3381 | concat() fails when args have sparse.COO data and different fill values | user: khaeru 1634164 | state: open | locked: 0 | comments: 4 | created_at: 2019-10-07T21:54:06Z | updated_at: 2021-07-08T17:43:57Z | author_association: NONE

**MCVE Code Sample**

```python
import numpy as np
import pandas as pd
import sparse
import xarray as xr

# Indices and raw data
foo = [f'foo{i}' for i in range(6)]
bar = [f'bar{i}' for i in range(6)]
raw = np.random.rand(len(foo) // 2, len(bar))

# DataArray
a = xr.DataArray(
    data=sparse.COO.from_numpy(raw),
    coords=[foo[:3], bar],
    dims=['foo', 'bar'])
print(a.data.fill_value)  # 0.0

# Created from a pd.Series
b_series = pd.DataFrame(raw, index=foo[3:], columns=bar) \
    .stack() \
    .rename_axis(index=['foo', 'bar'])
b = xr.DataArray.from_series(b_series, sparse=True)
print(b.data.fill_value)  # nan

# Works despite inconsistent fill-values
a + b
a * b

# Fails: complains about inconsistent fill-values
xr.concat([a, b], dim='foo')  # ***

# The fill_value argument doesn't help
xr.concat([a, b], dim='foo', fill_value=np.nan)

def fill_value(da):
    """Try to coerce one argument to a consistent fill-value."""
    return xr.DataArray(
        data=sparse.as_coo(da.data, fill_value=np.nan),
        coords=da.coords,
        dims=da.dims,
        name=da.name,
        attrs=da.attrs,
    )

# Fails: "Cannot provide a fill-value in combination with something that
# already has a fill-value"
print(xr.concat([a.pipe(fill_value), b], dim='foo'))

# If we cheat by recreating 'a' from scratch, copying the fill value of the
# intended other argument, it works again:
a = xr.DataArray(
    data=sparse.COO.from_numpy(raw, fill_value=b.data.fill_value),
    coords=[foo[:3], bar],
    dims=['foo', 'bar'])
c = xr.concat([a, b], dim='foo')
print(c.data.fill_value)  # nan

# But simple operations again create objects with potentially incompatible
# fill-values
d = c.sum(dim='bar')
print(d.data.fill_value)  # 0.0
```

**Expected Output**

**Problem Description**

Some basic xarray manipulations don't work on `sparse.COO`-backed objects with different fill values. xarray should automatically coerce objects into a compatible state, or at least provide users with methods to do so. Behaviour should also be documented, e.g. in this instance, which operations (here, `sum`) change an array's fill-value.

**Output of `xr.show_versions()`**
{ "url": "https://api.github.com/repos/pydata/xarray/issues/3381/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
repo: xarray 13221727 | type: issue
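A workaround implied by the reporter's `fill_value` helper is to rebuild each argument with a common fill value before calling `xr.concat`. The sketch below is a generalisation of that idea, not code from the issue; it round-trips through a dense array because, as the MCVE shows, `sparse.as_coo` refuses to override an existing fill value (note that densifying defeats the point of sparse storage for large arrays):

```python
import numpy as np
import sparse
import xarray as xr

def with_fill_value(da: xr.DataArray, fill_value=np.nan) -> xr.DataArray:
    """Rebuild a sparse-backed DataArray so its data uses `fill_value`."""
    coo = sparse.COO.from_numpy(da.data.todense(), fill_value=fill_value)
    return da.copy(data=coo)

# Force a common fill value on every argument, then concatenate:
# c = xr.concat([with_fill_value(a), with_fill_value(b)], dim='foo')
```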
489825483 | MDU6SXNzdWU0ODk4MjU0ODM= | 3281 | [proposal] concatenate by axis, ignore dimension names | user: Hoeze 1200058 | state: open | locked: 0 | comments: 4 | created_at: 2019-09-05T15:06:22Z | updated_at: 2021-07-08T17:42:53Z | author_association: NONE

Hi, I wrote a helper function which allows concatenating arrays the way `np.concatenate` does. I often need this to combine very different feature types.

```python
from typing import Union, Tuple, List

import numpy as np
import xarray as xr


def concat_by_axis(
        darrs: Union[List[xr.DataArray], Tuple[xr.DataArray]],
        dims: Union[List[str], Tuple[str]],
        axis: int = None,
        **kwargs
):
    """
    Concat arrays along some axis similar to `np.concatenate`.
    """
```

Would it make sense to include this in xarray?
{ "url": "https://api.github.com/repos/pydata/xarray/issues/3281/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
repo: xarray 13221727 | type: issue
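The implementation body of `concat_by_axis` is not shown above. As a rough sketch of how such a helper could work, with the renaming strategy and the reuse of `xr.concat` being assumptions rather than the author's code:

```python
import numpy as np
import xarray as xr

def concat_by_axis_sketch(darrs, dims, axis=0, **kwargs):
    """Concatenate DataArrays positionally along `axis`, ignoring their
    original dimension names and relabelling every array to `dims`."""
    if len(dims) != darrs[0].ndim:
        raise ValueError("need exactly one name in `dims` per dimension")
    # Force a common set of dimension names so xr.concat lines the arrays
    # up by position rather than by label.
    renamed = [da.rename(dict(zip(da.dims, dims))) for da in darrs]
    return xr.concat(renamed, dim=dims[axis], **kwargs)

# Two arrays with unrelated dimension names, joined along axis 0:
x = xr.DataArray(np.zeros((2, 3)), dims=["a", "b"])
y = xr.DataArray(np.ones((4, 3)), dims=["c", "d"])
print(concat_by_axis_sketch([x, y], dims=["row", "col"], axis=0).shape)  # (6, 3)
```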
223231729 | MDU6SXNzdWUyMjMyMzE3Mjk= | 1379 | xr.concat consuming too much resources | user: rafa-guedes 7799184 | state: open | locked: 0 | comments: 4 | created_at: 2017-04-20T23:33:52Z | updated_at: 2021-07-08T17:42:18Z | author_association: CONTRIBUTOR

Hi, I am reading several (~1000) small ascii files into Dataset objects and trying to concatenate them over one specific dimension, but I eventually blow my memory up. The file glob is not huge (~700M; my computer has ~16G), and I can read the files fine if I only append the Datasets to a list without concatenating them: my memory use increases by only ~5% by the time I have read them all. However, when I concatenate each file into a single Dataset as I read it in a loop, the processing speed drops drastically before I have read ~10% of the files, and memory usage keeps climbing until it eventually blows up before I have read and concatenated 30% of them (the screenshot below was taken before it blew up; memory usage was under 20% at the start of the processing). I was wondering if this is expected, or if there is something that could be improved to make this work more efficiently please. I'm changing my approach now: extracting numpy arrays from the individual Datasets, concatenating those numpy arrays, and defining the final Dataset only at the end. Thanks.
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1379/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
repo: xarray 13221727 | type: issue
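The slowdown the reporter describes is characteristic of growing a Dataset with `xr.concat` inside the read loop: each call copies everything accumulated so far, so the loop does quadratic work. A minimal sketch of the usual fix, reading first and concatenating once (the file parser and the `time` dimension are placeholders, not details from the issue):

```python
import glob
import xarray as xr

def read_one(path: str) -> xr.Dataset:
    """Placeholder for parsing one small ascii file into a Dataset."""
    raise NotImplementedError

paths = sorted(glob.glob("data/*.asc"))  # hypothetical file glob

# Anti-pattern from the report: concatenating inside the loop.
# combined = read_one(paths[0])
# for p in paths[1:]:
#     combined = xr.concat([combined, read_one(p)], dim="time")  # O(n^2) copying

# Cheaper: collect the ~1000 small Datasets, then concatenate once.
datasets = [read_one(p) for p in paths]
combined = xr.concat(datasets, dim="time")
```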
```sql
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo] ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone] ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee] ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
```
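For reference, the filtered view at the top of this page corresponds to a query along these lines (a reconstruction; the SQL Datasette actually generates may differ):

```sql
select *
from issues
where comments = 4
  and date(updated_at) = '2021-07-08'
order by updated_at desc;
```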