github: issues: 8 rows where user = 1322974 sorted by updated

8 rows where user = 1322974 sorted by updated_at descending

id	node_id	number	title	user	state	comments	created_at	updated_at	closed_at	author_association	draft	pull_request	body	reactions	state_reason	repo	type
117039129	MDU6SXNzdWUxMTcwMzkxMjk=	659	groupby very slow compared to pandas	anntzer 1322974	closed	9	2015-11-16T02:43:57Z	2022-05-15T02:38:30Z	2022-05-15T02:38:30Z	CONTRIBUTOR			``` import timeit import numpy as np from pandas import DataFrame from xray import Dataset, DataArray df = DataFrame({"a": np.r_[np.arange(500.), np.arange(500.)], "b": np.arange(1000.)}) print(timeit.repeat('df.groupby("a").agg("mean")', globals={"df": df}, number=10)) print(timeit.repeat('df.groupby("a").agg(np.mean)', globals={"df": df, "np": np}, number=10)) ds = Dataset({"a": DataArray(np.r_[np.arange(500.), np.arange(500.)]), "b": DataArray(np.arange(1000.))}) print(timeit.repeat('ds.groupby("a").mean()', globals={"ds": ds}, number=10)) ``` This outputs `[0.010462284000823274, 0.009770361997652799, 0.01081446700845845] [0.02622630601399578, 0.024328112005605362, 0.018717073995503597] [2.2804569930012804, 2.1666158599982737, 2.2688316510029836]` i.e. xray's groupby is ~100 times slower than pandas' one (and 200 times slower than passing `"mean"` to pandas' groupby, which I assume involves some specialization). (This is the actual order of magnitude of the data size and redundancy I want to handle, i.e. thousands of points with very limited duplication.)	{ "url": "https://api.github.com/repos/pydata/xarray/issues/659/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
111795064	MDU6SXNzdWUxMTE3OTUwNjQ=	627	string coordinate gets converted to object coordinate upon addition of variable to dataset	anntzer 1322974	closed	10	2015-10-16T09:29:58Z	2021-03-27T21:19:33Z	2021-03-27T21:19:33Z	CONTRIBUTOR			With the current HEAD, consider ``` import numpy as np from xray import * ds = Dataset({"1": DataArray(np.zeros(3), dims=["a"], coords={"a": list("xyz")})}) print(ds) ds["2"] = DataArray(np.zeros(2), dims=["a"], coords={"a": list("xy")}) print(ds) ``` This outputs `<xray.Dataset> Dimensions: (a: 3) Coordinates: * a (a) <U1 'x' 'y' 'z' Data variables: 1 (a) float64 0.0 0.0 0.0 <xray.Dataset> Dimensions: (a: 3) Coordinates: * a (a) object 'x' 'y' 'z' Data variables: 1 (a) float64 0.0 0.0 0.0 2 (a) float64 0.0 0.0 nan` Note that the dtype of the `a` coordinate got changed after the assignment. Python3.5, numpy 1.10.1, xray master (6ea7eb2b388075cc838c5ddf0ddaa47020cfcb89) With 0.6.0 the coordinate is of object dtype both before and after. I forgot why I tried master but I must have had a good reason...	{ "url": "https://api.github.com/repos/pydata/xarray/issues/627/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
125708367	MDU6SXNzdWUxMjU3MDgzNjc=	712	DataArrays should display their coordinates in the natural order	anntzer 1322974	open	13	2016-01-08T22:33:05Z	2020-11-06T18:48:54Z		CONTRIBUTOR			Consider ``` from collections import * import numpy as np from xray import * d1 = DataArray(np.empty((2, 2)), coords=OrderedDict([("foo", [0, 1]), ("bar", [0, 1])])) d2 = DataArray(np.empty((2, 2)), coords=OrderedDict([("bar", [0, 1]), ("foo", [0, 1])])) ds = Dataset({"d1": d1, "d2": d2}) print(ds.d1) print(ds.d2) ``` This outputs `<xray.DataArray 'd1' (foo: 2, bar: 2)> array([[ 6.91516848e-310, 1.64244654e-316], [ 6.91516881e-310, 6.91516881e-310]]) Coordinates: * foo (foo) int64 0 1 * bar (bar) int64 0 1 <xray.DataArray 'd2' (bar: 2, foo: 2)> array([[ 1.59987863e-316, 6.91516883e-310], [ 6.91515690e-310, 2.12670320e-316]]) Coordinates: * foo (foo) int64 0 1 * bar (bar) int64 0 1` I understand that internally both DataArrays use the same coords object and thus the same coords order, but it would be helpful if, when printing d2 by itself, the coordinates were printed in the natural order ("bar", "foo"). In particular, when working interactively, the list of coordinates at the end of the repr is the most easy thing to spot, and thus most helpful to know how to format the call to `array.loc[...]`.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/712/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
112254767	MDU6SXNzdWUxMTIyNTQ3Njc=	631	Confusing error (or lack thereof) when coordinate and variable share the same name	anntzer 1322974	open	5	2015-10-19T23:39:22Z	2019-04-19T15:39:55Z		CONTRIBUTOR			It probably makes sense to prevent dataset to have variables sharing the names of coordinates (what would `dataset.varname` return?) but currently `Dataset({"a": DataArray(np.zeros((3, 4)), dims=["a", "b"], coords={"a": list("xyz"), "b": list("xyzt")})})` fails with `ValueError: an index variable must be defined with 1-dimensional data`, and `Dataset({"a": DataArray(np.zeros(3), coords={"a": list("xyz")})})` actually creates an empty dataset using `[0, 0, 0]` as values for the `a` coordinate instead of `x y z`: `<xray.Dataset> Dimensions: (a: 3) Coordinates: * a (a) float64 0.0 0.0 0.0 Data variables: empty`	{ "url": "https://api.github.com/repos/pydata/xarray/issues/631/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
112253425	MDU6SXNzdWUxMTIyNTM0MjU=	630	Whether a DataArray is copied when inserted into a Dataset depends on whether coordinates match exactly	anntzer 1322974	open	16	2015-10-19T23:27:15Z	2019-01-31T18:40:58Z		CONTRIBUTOR			Consider ``` import numpy as np from xray import * ds = Dataset({"a": DataArray(np.zeros((3, 4)))}) ds["b"] = b = DataArray(np.zeros((3, 4))) b[0, 0] = 1 print(ds["b"][0, 0]) # ==> prints 1 ds = Dataset({"a": DataArray(np.zeros((3, 4)))}) ds["b"] = b = DataArray(np.zeros((3, 3))) # !!! we implicitly fill the last column with nans. b[0, 0] = 1 print(ds["b"][0, 0]) # ==> prints 0 ``` In the first case, the dataset was modified when the dataarray was modified, but not in the second case.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/630/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	issue
114732169	MDU6SXNzdWUxMTQ3MzIxNjk=	643	"naive" iteration is very slow	anntzer 1322974	closed	2	2015-11-03T02:53:04Z	2019-01-15T21:09:07Z	2019-01-15T21:09:07Z	CONTRIBUTOR			``` $ ipython Python 3.5.0 (default, Sep 20 2015, 11:28:25) Type "copyright", "credits" or "license" for more information. IPython 4.0.0 -- An enhanced Interactive Python. ? -> Introduction and overview of IPython's features. %quickref -> Quick reference. help -> Python's own help system. object? -> Details about 'object', use 'object??' for extra details. Using matplotlib backend: Qt4Agg In [1]: from xray import DataArray Iteration over a Python list In [2]: %%timeit t = list(range(10000)) for _ in t: pass ...: 10000 loops, best of 3: 87.3 µs per loop Iteration over a ndarray In [3]: %%timeit t = np.arange(10000) for _ in t: pass ...: 1000 loops, best of 3: 472 µs per loop Iteration over a DataArray In [4]: %%timeit t = DataArray(np.arange(10000)) for _ in t: pass ...: 1 loops, best of 3: 818 ms per loop ``` I'm not sure how much can be done about this as iterating over a DataArray needs to create a bunch of temporary objects (and I understand the emphasis is as usual on vectorized operations, etc.) but a >1500 fold difference certainly doesn't look good.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/643/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
170458908	MDU6SXNzdWUxNzA0NTg5MDg=	958	Test failure with matplotlib 2.0b3	anntzer 1322974	closed	1	2016-08-10T16:21:16Z	2018-10-26T23:12:28Z	2018-10-26T23:12:28Z	CONTRIBUTOR			mpl 2.0b3 / xarray HEAD Arch Linux, Python 3.5.2 ``` ============================================================================================= FAILURES ============================================================================================= ____________ TestPlot.test_subplot_kws _____________ self = <xarray.test.test_plot.TestPlot testMethod=test_subplot_kws> `def test_subplot_kws(self): a = easy_array((10, 15, 4)) d = DataArray(a, dims=['y', 'x', 'z']) d.coords['z'] = list('abcd') g = d.plot(x='x', y='y', col='z', col_wrap=2, cmap='cool', subplot_kws=dict(axisbg='r')) for ax in g.axes.flat:` `self.assertEqual(ax.get_axis_bgcolor(), 'r')` xarray/test/test_plot.py:148: self = <xarray.test.test_plot.TestPlot testMethod=test_subplot_kws>, a1 = (1.0, 0.0, 0.0, 1), a2 = 'r' `def assertEqual(self, a1, a2):` `assert a1 == a2 or (a1 != a1 and a2 != a2)` E AssertionError: assert ((1.0, 0.0, 0.0, 1) == 'r' or ((1.0, 0.0, 0.0, 1) != (1.0, 0.0, 0.0, 1))) xarray/test/init.py:164: AssertionError --------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------- /usr/lib/python3.5/site-packages/matplotlib/cbook.py:137: MatplotlibDeprecationWarning: The axisbg attribute was deprecated in version 2.0. Use facecolor instead. warnings.warn(message, mplDeprecation, stacklevel=1) /home/antony/src/extern/xarray/xarray/test/test_plot.py:148: MatplotlibDeprecationWarning: The get_axis_bgcolor function was deprecated in version 2.0. Use get_facecolor instead. 
self.assertEqual(ax.get_axis_bgcolor(), 'r') ```	{ "url": "https://api.github.com/repos/pydata/xarray/issues/958/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	completed	xarray 13221727	issue
117297089	MDExOlB1bGxSZXF1ZXN0NTA5MTEzMzQ=	661	Document pandas' better groupby performance.	anntzer 1322974	closed	1	2015-11-17T07:04:50Z	2015-11-17T09:10:04Z	2015-11-17T08:54:31Z	CONTRIBUTOR	0	pydata/xarray/pulls/661	cf. #659.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/661/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		xarray 13221727	pull


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
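
The filter described at the top of this page ("rows where user = 1322974 sorted by updated_at descending") can be reproduced against this schema with a plain SQL query. The following is a minimal sketch using Python's standard `sqlite3` module. It assumes an in-memory database with a reduced subset of the `[issues]` columns; the helper name `issues_for_user` and the extra row for user `42` are hypothetical and exist only for illustration. Three of the rows are copied from the table above.

```python
import sqlite3

# Assumption: an in-memory database holding a reduced subset of the
# [issues] columns from the schema above (not the full table).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE [issues] (
        [id] INTEGER PRIMARY KEY,
        [number] INTEGER,
        [title] TEXT,
        [user] INTEGER,
        [state] TEXT,
        [updated_at] TEXT
    )
    """
)
conn.execute("CREATE INDEX [idx_issues_user] ON [issues] ([user])")

rows = [
    # Rows taken from the table above.
    (117297089, 661, "Document pandas' better groupby performance.",
     1322974, "closed", "2015-11-17T09:10:04Z"),
    (117039129, 659, "groupby very slow compared to pandas",
     1322974, "closed", "2022-05-15T02:38:30Z"),
    (111795064, 627,
     "string coordinate gets converted to object coordinate upon "
     "addition of variable to dataset",
     1322974, "closed", "2021-03-27T21:19:33Z"),
    # Hypothetical row for a different user, to show the filter working.
    (1, 1, "unrelated issue", 42, "open", "2023-01-01T00:00:00Z"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?)", rows)


def issues_for_user(connection, user_id):
    """Return (number, title, state, updated_at) for one user, newest first.

    Timestamps are ISO 8601 strings in a single format, so sorting them
    as text also sorts them chronologically.
    """
    query = (
        "SELECT [number], [title], [state], [updated_at] "
        "FROM [issues] WHERE [user] = ? "
        "ORDER BY [updated_at] DESC"
    )
    return connection.execute(query, (user_id,)).fetchall()


result = issues_for_user(conn, 1322974)
for number, title, state, updated_at in result:
    print(number, state, updated_at, title)
```

The `WHERE [user] = ?` clause is the kind of lookup that `idx_issues_user` is there to serve, and the parameter placeholder keeps the user id out of the SQL string.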
