
issues


6 rows where comments = 4, type = "issue" and user = 6213168 sorted by updated_at descending


#2027 · square-bracket slice a Dataset with a DataArray
id: 309686915 · node_id: MDU6SXNzdWUzMDk2ODY5MTU= · user: crusaderky (6213168) · author_association: MEMBER
state: open · comments: 4 · created_at: 2018-03-29T09:39:57Z · updated_at: 2022-04-18T03:51:25Z · repo: xarray (13221727) · type: issue

Given this:

```python
ds = xarray.Dataset(
    data_vars={
        'vote': ('pupil', [5, 7, 8]),
        'age': ('pupil', [15, 14, 16]),
    },
    coords={'pupil': ['Alice', 'Bob', 'Charlie']})
```

```
<xarray.Dataset>
Dimensions:  (pupil: 3)
Coordinates:
  * pupil    (pupil) <U7 'Alice' 'Bob' 'Charlie'
Data variables:
    vote     (pupil) int64 5 7 8
    age      (pupil) int64 15 14 16
```

Why does this work:

```python
ds.age[ds.vote >= 6]
```

```
<xarray.DataArray 'age' (pupil: 2)>
array([14, 16])
Coordinates:
  * pupil    (pupil) <U7 'Bob' 'Charlie'
```

But this doesn't?

```python
ds[ds.vote >= 6]
```

```
KeyError: False
```

`ds.vote >= 6` is a DataArray with dims=('pupil',) and dtype=bool, so I can't think of any ambiguity in what I want to achieve.

Workaround:

```python
ds.sel(pupil=ds.vote >= 6)
```

```
<xarray.Dataset>
Dimensions:  (pupil: 2)
Coordinates:
  * pupil    (pupil) <U7 'Bob' 'Charlie'
Data variables:
    vote     (pupil) int64 7 8
    age      (pupil) int64 14 16
```
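For convenience, the snippets above combine into a single runnable script (a minimal sketch; the failing line behaves as reported):

```python
import xarray

# The example Dataset from the report above.
ds = xarray.Dataset(
    data_vars={
        'vote': ('pupil', [5, 7, 8]),
        'age': ('pupil', [15, 14, 16]),
    },
    coords={'pupil': ['Alice', 'Bob', 'Charlie']},
)

print(ds.age[ds.vote >= 6])        # works: boolean mask on a DataArray
# ds[ds.vote >= 6]                 # raises KeyError: False, as described
print(ds.sel(pupil=ds.vote >= 6))  # workaround: boolean indexer on the dim
```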

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2027/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
#3397 · "How Do I..." formatting issues
id: 506885041 · node_id: MDU6SXNzdWU1MDY4ODUwNDE= · user: crusaderky (6213168) · author_association: MEMBER
state: closed (completed) · comments: 4 · created_at: 2019-10-14T21:32:27Z · updated_at: 2019-10-16T21:41:06Z · closed_at: 2019-10-16T21:41:06Z · repo: xarray (13221727) · type: issue

@dcherian The new page http://xarray.pydata.org/en/stable/howdoi.html (#3357) is somewhat painful to read on readthedocs. The table overflows the screen, forcing the reader to scroll left and right non-stop.

Maybe a better alternative would be Sphinx definition-list syntax (which allows automatic reflowing)?

```rst
How do I ...
============

Add variables from other datasets to my dataset?
    :py:meth:`Dataset.merge`
```

(that's a 4-space indent)

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3397/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
#926 · stack() on dask array produces inefficient chunking
id: 168469112 · node_id: MDU6SXNzdWUxNjg0NjkxMTI= · user: crusaderky (6213168) · author_association: MEMBER
state: closed (completed) · comments: 4 · created_at: 2016-07-30T14:12:34Z · updated_at: 2019-02-01T16:04:43Z · closed_at: 2019-02-01T16:04:43Z · repo: xarray (13221727) · type: issue

When the stack() method is used on an xarray with a dask backend, one would expect every output chunk to be produced by exactly one input chunk.

This is not the case, as stack() actually produces an extremely fragmented dask array: https://gist.github.com/crusaderky/07991681d49117bfbef7a8870e3cba67
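A minimal sketch of how to observe this (array and chunk sizes are hypothetical, not taken from the gist):

```python
import numpy as np
import xarray

# A 4x4 array split into four 2x2 chunks; sizes are illustrative only.
a = xarray.DataArray(np.zeros((4, 4)), dims=['x', 'y']).chunk({'x': 2, 'y': 2})

# stack() flattens (x, y) into a single dimension; inspecting .chunks shows
# how fragmented the resulting dask array is.
stacked = a.stack(z=('x', 'y'))
print(stacked.chunks)
```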

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/926/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
#979 · align() should align chunks
id: 172291585 · node_id: MDU6SXNzdWUxNzIyOTE1ODU= · user: crusaderky (6213168) · author_association: MEMBER
state: closed (completed) · comments: 4 · created_at: 2016-08-20T21:25:01Z · updated_at: 2019-01-24T17:19:30Z · closed_at: 2019-01-24T17:19:30Z · repo: xarray (13221727) · type: issue

In the xarray docs I read:

> With the current version of dask, there is no automatic alignment of chunks when performing operations between dask arrays with different chunk sizes. If your computation involves multiple dask arrays with different chunks, you may need to explicitly rechunk each array to ensure compatibility.

While chunk auto-alignment could be done within the dask library, that would be limited to arrays with the same dimensionality and the same dims order. For example, it would not be possible to have a dask library call align the chunks on xarrays with the following dims:

  • (time, latitude, longitude)
  • (time)
  • (longitude, latitude)

even if it makes perfect sense in xarray.

I think xarray.align() should take care of it automatically.

A safe algorithm would be to always scale down the chunksize when in conflict. This would prevent having chunks larger than expected, and should minimise (in a greedy way) the number of operations. It's also a good idea on dask.distributed, where merging two chunks could cause one of them to travel on the network, which is very expensive.

E.g., to reconcile the chunk sizes a: (5, 10, 6) and b: (5, 7, 9), the algorithm would rechunk both arrays to (5, 7, 3, 6).
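A minimal sketch of that reconciliation rule, taking the union of the two layouts' block boundaries (common_chunks is a hypothetical helper, not an xarray API):

```python
from itertools import accumulate

def common_chunks(a, b):
    """Refine two chunk layouts along one dimension: every boundary of either
    layout becomes a boundary of the result, so chunks only ever shrink."""
    assert sum(a) == sum(b), "both layouts must cover the same extent"
    # Merge the cumulative end offsets of both layouts, then rebuild sizes.
    bounds = sorted(set(accumulate(a)) | set(accumulate(b)))
    return tuple(hi - lo for lo, hi in zip([0] + bounds, bounds))

print(common_chunks((5, 10, 6), (5, 7, 9)))  # -> (5, 7, 3, 6)
```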

Finally, when served with a numpy-based array and a dask-based array, align() should convert the numpy array to dask. The critical use case that would benefit from this behaviour is when align() is invoked inside a broadcast() between a tiny constant you just loaded from csv/pandas/a pure-Python list/whatever (e.g. dims=(time,) shape=(100,)) and a huge dask-backed array (e.g. dims=(time, scenario) shape=(100, 2**30) chunks=(25, 2**20)).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/979/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
#2273 · to_netcdf uses deprecated and unnecessary dask call
id: 339611449 · node_id: MDU6SXNzdWUzMzk2MTE0NDk= · user: crusaderky (6213168) · author_association: MEMBER
state: closed (completed) · comments: 4 · created_at: 2018-07-09T21:20:20Z · updated_at: 2018-07-31T20:03:41Z · closed_at: 2018-07-31T19:42:20Z · repo: xarray (13221727) · type: issue

```python
ds = xarray.Dataset({'x': 1})
ds.to_netcdf('foo.nc')
```

```
dask/utils.py:1010: UserWarning: Deprecated, see dask.base.get_scheduler instead
```

Stack trace:

```
xarray/backends/common.py(44)get_scheduler()
     43     from dask.utils import effective_get
---> 44     actual_get = effective_get(get, collection)
```

There are two separate problems here:

  • dask recently changed API from get(get=callable) to get(scheduler=str). Should we
      • just increase the minimum version of dask (I doubt anybody will complain),
      • go through the hoops of dynamically invoking a different API depending on the dask version :sweat: (see the sketch after this list), or
      • silence the warning now, and then increase the minimum version of dask the day that dask removes the old API entirely (risky)?
  • xarray is calling dask even when it's unnecessary, as none of the variables in the example Dataset has a dask backend. I don't think there are any CI suites for NetCDF without dask. I'm also wondering if they would bring any actual added value, as dask is small, has no exotic dependencies, and is pure Python; so I doubt anybody will have problems installing it whatever their setup is.
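A minimal sketch of the second option, selecting the import at runtime from the installed dask version (the 0.18.0 cutoff is an assumption, not verified against dask's changelog):

```python
import dask
from distutils.version import LooseVersion

# Assumption: the scheduler=str API and dask.base.get_scheduler (named in the
# warning above) appeared around dask 0.18; adjust the cutoff as needed.
if LooseVersion(dask.__version__) >= LooseVersion('0.18.0'):
    from dask.base import get_scheduler
else:
    from dask.utils import effective_get as get_scheduler

# NB: the two functions take different keyword arguments (scheduler= vs get=),
# so call sites still need a matching branch; this only selects the import.
```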

@shoyer opinion?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2273/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
#978 · broadcast() broken on dask backend
id: 172290413 · node_id: MDU6SXNzdWUxNzIyOTA0MTM= · user: crusaderky (6213168) · author_association: MEMBER
state: closed (completed) · comments: 4 · created_at: 2016-08-20T20:56:33Z · updated_at: 2016-12-09T20:28:42Z · closed_at: 2016-12-09T20:28:42Z · repo: xarray (13221727) · type: issue

```python
>>> a = xarray.DataArray([1, 2]).chunk(1)
>>> a
<xarray.DataArray (dim_0: 2)>
dask.array<xarray-..., shape=(2,), dtype=int64, chunksize=(1,)>
Coordinates:
  * dim_0    (dim_0) int64 0 1
>>> xarray.broadcast(a)
(<xarray.DataArray (dim_0: 2)>
array([1, 2])
Coordinates:
  * dim_0    (dim_0) int64 0 1,)
```

The problem is actually somewhere in the constructor of DataArray. In alignment.py:362 we have `return DataArray(data, ...)`, where data is a Variable with a dask backend, yet the returned DataArray object has a numpy backend. As a workaround, changing that line to `return DataArray(data.data, ...)` (thus passing a dask array) fixes the problem.

After that, however, there's a new issue: whenever broadcast() adds a dimension to an array, it creates it as a single chunk, as opposed to copying the chunking of the other arrays. This can easily cause a host to go out of memory, and makes it harder to work with the arrays afterwards because chunks won't match.
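A sketch of how one might observe that second problem (array names and chunk sizes are hypothetical):

```python
import xarray

a = xarray.DataArray([1, 2], dims=['x']).chunk(1)
b = xarray.DataArray([10, 20, 30], dims=['y']).chunk(1)

# broadcast() gives both arrays dims ('x', 'y'). Per the report, the dimension
# each array gains arrives as one big chunk instead of mirroring the other
# array's fine-grained layout.
a2, b2 = xarray.broadcast(a, b)
print(a2.chunks)
print(b2.chunks)
```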

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/978/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
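The filter this page applies can be reproduced directly against that schema. A minimal sketch, assuming the data sits in a SQLite file named github.db (hypothetical name):

```python
import sqlite3

conn = sqlite3.connect("github.db")  # hypothetical database file name
rows = conn.execute(
    """
    SELECT number, title, state, updated_at
    FROM issues
    WHERE comments = 4 AND type = 'issue' AND [user] = 6213168
    ORDER BY updated_at DESC
    """
).fetchall()
for number, title, state, updated_at in rows:
    print(f"#{number} [{state}] {title} (updated {updated_at})")
```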