issue_comments

9 rows where issue = 168470276 ("align() and broadcast() before concat()"), sorted by updated_at descending

Suggested facets: created_at (date), updated_at (date)

user (2): crusaderky (5), shoyer (4)
issue (1): align() and broadcast() before concat() (9)
author_association (1): MEMBER (9)
shoyer · 2016-08-21T01:00:27Z
https://github.com/pydata/xarray/issues/927#issuecomment-241232663

Fixed by #963

crusaderky · 2016-08-11T20:57:00Z
https://github.com/pydata/xarray/issues/927#issuecomment-239289914

Finished first complete part, ready for merging: https://github.com/pydata/xarray/pull/963

Note: there is a failing unit test, TestDataset.test_broadcast_nocopy, which shows that broadcast on a Dataset performs a data copy when it shouldn't. Could you look into it?

I'm now moving to auto-calling broadcast inside concat()...
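The failing no-copy check can be illustrated with a small sketch (the real test lives in xarray's test suite; the names and shapes here are illustrative, not the actual test code):

```python
import numpy as np
import xarray as xr

# 'a' already carries both dimensions, so broadcasting it should be a
# no-op that does not need to copy the underlying data.
a = xr.DataArray(np.arange(4.0).reshape(2, 2), dims=["x", "y"])
b = xr.DataArray(np.arange(2.0), dims=["x"])

a2, b2 = xr.broadcast(a, b)

# b gains the missing 'y' dimension; a keeps its original shape.
print(dict(b2.sizes))
# The property under discussion: whether a2 still shares memory with a.
shares = np.shares_memory(a2.values, a.values)
```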

shoyer · 2016-08-07T23:14:44Z
https://github.com/pydata/xarray/issues/927#issuecomment-238114831

First of all -- thanks for diving into this! You are really getting into the messy guts here, so thanks for sticking with it.

> I can't put any sense in the skip_single_target hack or in the whole special treatment inside align() for when there is only one arg. What's the benefit of the whole thing? What would happen if we simply removed the special logic?

I actually just added this hack in a few days ago to fix a regression in the v0.8.0 release (https://github.com/pydata/xarray/issues/943). The problem is that __setitem__ assignment in xarray does automatic alignment, but arrays with non-unique indexes cannot be aligned. This meant that it was impossible to assign to an array with a non-unique index, even if the new array did not have any labels.

I agree that the way I fixed this is really messy. It would be better simply not to align as part of merge if there is only a single labeled argument, but it wasn't obvious how to do last week when I was rushing to get the bug fix out (the logic is currently buried in deep_align and needs to be extracted).

I'll take a look now and see if I can come up with a cleaner fix, but for now I would stick with a separate internal only private_align that allows the extra argument.

> DataArray.reindex(copy=False) still performs a copy, even if there's nothing to do. I'm a bit afraid to go and fix it right now as I don't want to trigger domino effects

One issue here is that the purpose and functionality of the copy argument is currently poorly documented. It should say something like:

If `copy=True`, the return value is independent of the input. If `copy=False` and reindexing is unnecessary, or can be performed with only slice operations, then the data may include references to arrays also found in the input.

(I'll update this soon.)

We should not be making any promises about returning exactly the same input object; in fact, for consistency we should (as we currently do) always return a new Dataset or DataArray, because constructing these objects is pretty cheap. Rather, the issue is whether we need to copy the numpy.ndarray input, which can be very expensive if the array is large.
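A minimal sketch of the proposed semantics, assuming reindexing onto labels the array already has (the target labels here are illustrative):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(4.0), dims="x", coords={"x": [0, 1, 2, 3]})

# Reindexing onto matching labels: a new object is always returned...
out = da.reindex(x=[0, 1, 2, 3], copy=False)
assert out is not da

# ...but with copy=False the result's data may reference the input's
# arrays rather than an independent copy, per the docstring above.
```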

> I'm experiencing a lot of grief because assertDatasetIdentical expects both the coords and the data vars to have the same order, which in some situations it's simply impossible to control without touching vast areas of your code.

This shouldn't be true. assertDatasetIdentical calls the Dataset.identical method, which doesn't care about order:

```
In [29]: from collections import OrderedDict

In [30]: ds1 = xr.Dataset(OrderedDict([('x', 0), ('y', 1)]))

In [31]: ds2 = xr.Dataset(OrderedDict([('y', 1), ('x', 0)]))

In [32]: ds1
Out[32]:
<xarray.Dataset>
Dimensions:  ()
Coordinates:
    empty
Data variables:
    x        int64 0
    y        int64 1

In [33]: ds2
Out[33]:
<xarray.Dataset>
Dimensions:  ()
Coordinates:
    empty
Data variables:
    y        int64 1
    x        int64 0

In [34]: ds1.identical(ds2)
Out[34]: True
```

> As a more general and fundamental point, I cannot understand what's the benefit of using OrderedDict instead of a plain dict for coords, attrs, and Dataset.data_vars?

This is something we considered before (https://github.com/pydata/xarray/issues/75). I think the main advantage is that OrderedDict is more intuitive (for people) and makes it easier to see at a glance how data changes.


One other thing to note: concat already handles broadcasting for each variable (independently) with ensure_common_dims. This would definitely be handled more cleanly with support for an exclude argument to broadcast_variables. It wouldn't work to use the main broadcast function for this because it would broadcast unnecessary variables when concatenating datasets.

crusaderky · 2016-08-07T13:19:12Z
https://github.com/pydata/xarray/issues/927#issuecomment-238081972

Started working on it. Still a lot of cleanup to be done. https://github.com/crusaderky/xarray/commit/8830437865c43e472101ca91a12f714ee43546cc

Observations:

- I can't put any sense in the skip_single_target hack or in the whole special treatment inside align() for when there is only one arg. What's the benefit of the whole thing? What would happen if we simply removed the special logic?
- DataArray.reindex(copy=False) still performs a copy, even if there's nothing to do. I'm a bit afraid to go and fix it right now as I don't want to trigger domino effects
- I'm experiencing a lot of grief because assertDatasetIdentical expects both the coords and the data vars to have the same order, which in some situations it's simply impossible to control without touching vast areas of your code. As a more general and fundamental point, I cannot understand what's the benefit of using OrderedDict instead of a plain dict for coords, attrs, and Dataset.data_vars?

shoyer · 2016-08-02T14:21:06Z
https://github.com/pydata/xarray/issues/927#issuecomment-236919314

Awesome! We usually stick to outer joins by default so we don't inadvertently drop data. If datasets are already aligned, calling align with copy=False should have minimal performance costs.

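A minimal sketch of that point: when the inputs already share an index, an outer join changes nothing, and copy=False lets align just wrap the existing data (the arrays below are illustrative):

```python
import numpy as np
import xarray as xr

x = xr.DataArray(np.arange(3.0), dims="t", coords={"t": [0, 1, 2]})
y = xr.DataArray(np.arange(3.0) * 10, dims="t", coords={"t": [0, 1, 2]})

# Already aligned: the outer join is a no-op, and copy=False avoids
# duplicating the underlying arrays.
x2, y2 = xr.align(x, y, join="outer", copy=False)
assert x2.identical(x) and y2.identical(y)
```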

crusaderky · 2016-08-02T07:20:13Z
https://github.com/pydata/xarray/issues/927#issuecomment-236822408

I can work on it. Should we go for a default outer join + broadcast inside concat? If the input already arrives aligned, or if the user wants a different type of join, this will slow things down with unnecessary work, though. What's your policy on this?

shoyer · 2016-08-02T01:43:45Z
https://github.com/pydata/xarray/issues/927#issuecomment-236767844

Indeed, we could probably replace align with partial_align, and use this inside concat (see #930 for an example of that). This is probably worth doing.

I didn't add exclude to align before mostly because I wasn't sure if the functionality would be useful to users, and I wanted to avoid making the mistake of expanding the API prematurely (it's harder to remove features than add them). Also, I didn't write tests or a good docstring for partial_align :).

crusaderky · 2016-08-01T10:26:41Z (edited 2016-08-01T10:40:27Z)
https://github.com/pydata/xarray/issues/927#issuecomment-236544958

I'm now facing the equivalent problem with broadcast(). I need to concat 3 arrays:

```python
a = xarray.DataArray([[1, 2], [3, 4]], dims=['time', 'scenario'],
                     coords={'time': ['t1', 't2']})
b = xarray.DataArray([5, 6], dims=['time'], coords={'time': ['t3', 't4']})
c = xarray.DataArray(7, coords={'time': 't5'})
```

I want to broadcast dimension 'scenario' and concatenate on dimension 'time'. However, I can't invoke broadcast() because:

1. a and b have misaligned time
2. c has a time coord, but not a time dim; if you broadcast a and c, you will get incorrect results

Again the solution would be to have an optional parameter exclude=['time']
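For reference, this is essentially the API that later landed: current xarray's broadcast() accepts an exclude argument (the sketch below assumes a reasonably recent xarray):

```python
import xarray as xr

a = xr.DataArray([[1, 2], [3, 4]], dims=["time", "scenario"],
                 coords={"time": ["t1", "t2"]})
b = xr.DataArray([5, 6], dims=["time"], coords={"time": ["t3", "t4"]})

# Excluding 'time' skips alignment and broadcasting on that dimension:
# the misaligned time coordinates are left untouched instead of being
# outer-joined and NaN-filled, while 'scenario' is still broadcast.
a2, b2 = xr.broadcast(a, b, exclude=["time"])
assert "scenario" in b2.dims and b2.sizes["scenario"] == 2
```

After this, each piece can be concatenated along 'time' as intended.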

crusaderky · 2016-08-01T10:10:53Z
https://github.com/pydata/xarray/issues/927#issuecomment-236541952

Just found xarray.core.alignment.partial_align(), which does exactly that. Not sure why the functionality is not part of the public API? Looks like the solution is to simply merge partial_align() into align()?
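That is indeed roughly how things ended up: the public align() today accepts an exclude argument (a sketch assuming a recent xarray; the arrays are illustrative):

```python
import xarray as xr

a = xr.DataArray([1, 2], dims="time", coords={"time": ["t1", "t2"]})
b = xr.DataArray([3, 4], dims="time", coords={"time": ["t3", "t4"]})

# With the default join='inner', the disjoint time labels would align
# to zero-length arrays; excluding 'time' leaves both inputs untouched.
a2, b2 = xr.align(a, b, exclude=["time"])
assert a2.identical(a) and b2.identical(b)
```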


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · About: xarray-datasette