issue_comments

9 rows where author_association = "CONTRIBUTOR" and issue = 329575874 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
407547050 https://github.com/pydata/xarray/issues/2217#issuecomment-407547050 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDQwNzU0NzA1MA== WeatherGod 291576 2018-07-24T20:48:53Z 2018-07-24T20:48:53Z CONTRIBUTOR

I have created a PR for my work-in-progress: pandas-dev/pandas#22043

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  tolerance for alignment 329575874
400043753 https://github.com/pydata/xarray/issues/2217#issuecomment-400043753 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDQwMDA0Mzc1Mw== WeatherGod 291576 2018-06-25T18:07:49Z 2018-06-25T18:07:49Z CONTRIBUTOR

Do we want to dive straight into that? Or would it make more sense to first submit some PRs piping support for a tolerance kwarg through more of the API? Or perhaps we should propose a "tolerance" attribute as an optional attribute that methods like get_indexer() and such could always check for? Not being a pandas dev, I am not sure how piecemeal we should approach this.

In addition, we are likely going to have to implement a decent chunk of code ourselves for compatibility's sake, I think.

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  tolerance for alignment 329575874
399612490 https://github.com/pydata/xarray/issues/2217#issuecomment-399612490 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTYxMjQ5MA== WeatherGod 291576 2018-06-22T23:56:41Z 2018-06-22T23:56:41Z CONTRIBUTOR

I am not concerned about the non-commutativity of the indexer itself. There is no way around that. At some point, you have to choose values, whether that is done by an indexer or by some particular set operation.

As for the different sizes, that happens when the tolerance is greater than half the smallest delta. I figure a final implementation would enforce such a constraint on the tolerance.

On Fri, Jun 22, 2018 at 5:56 PM, Stephan Hoyer notifications@github.com wrote:

@WeatherGod https://github.com/WeatherGod One problem with your definition of tolerance is that it isn't commutative, even if both indexes have the same tolerance:

``` python
a = ImpreciseIndex([0.1, 0.2, 0.3, 0.4])
a.tolerance = 0.1
b = ImpreciseIndex([0.301, 0.401, 0.501, 0.601])
b.tolerance = 0.1
print(a.union(b))  # ImpreciseIndex([0.1, 0.2, 0.3, 0.4, 0.501, 0.601], dtype='float64')
print(b.union(a))  # ImpreciseIndex([0.1, 0.2, 0.301, 0.401, 0.501, 0.601], dtype='float64')
```

If you try a little harder, you could even have cases where the result has a different size, e.g.,

``` python
a = ImpreciseIndex([1, 2, 3])
a.tolerance = 0.5
b = ImpreciseIndex([1, 1.9, 2.1, 3])
b.tolerance = 0.5
print(a.union(b))  # ImpreciseIndex([1.0, 2.0, 3.0], dtype='float64')
print(b.union(a))  # ImpreciseIndex([1.0, 1.9, 2.1, 3.0], dtype='float64')
```

Maybe these aren't really problems in practice, but it's at least a little strange/surprising.

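The constraint mentioned above (keep the tolerance below half the smallest spacing between index values) can be made concrete with a small check. This is only a sketch of the kind of guard a final implementation might enforce; the helper and its name are hypothetical:

``` python
import numpy as np

def validate_tolerance(values, tolerance):
    # Hypothetical guard: require the tolerance to be less than half the
    # smallest spacing between adjacent index values, so a target can match
    # at most one value and results can't change size with operand order
    # (as in the quoted [1, 1.9, 2.1, 3] example with tolerance 0.5).
    min_delta = np.diff(np.sort(np.asarray(values, dtype=float))).min()
    if tolerance >= min_delta / 2:
        raise ValueError("tolerance %r must be smaller than half the "
                         "smallest spacing (%r)" % (tolerance, min_delta))
    return tolerance

validate_tolerance([1, 1.9, 2.1, 3], 0.05)   # fine: smallest spacing is 0.2
# validate_tolerance([1, 1.9, 2.1, 3], 0.5)  # would raise ValueError
```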

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  tolerance for alignment 329575874
399584169 https://github.com/pydata/xarray/issues/2217#issuecomment-399584169 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTU4NDE2OQ== WeatherGod 291576 2018-06-22T21:15:06Z 2018-06-22T21:15:06Z CONTRIBUTOR

Actually, I disagree. Pandas's set-operation methods are mostly index-based. For union and intersection, there is an optimization that dives down into some C code when the Indexes are monotonic, but everywhere else it all works off the results of get_indexer(). I have put together a quick toy demo that seems to work. Note: I didn't know how to properly write a constructor for a subclassed Index, so I set the tolerance attribute after construction just for the purposes of this demo.

``` python
from __future__ import print_function
import warnings

from pandas import Index
import numpy as np

from pandas.indexes.base import is_object_dtype, algos, is_dtype_equal
from pandas.indexes.base import (_ensure_index, _concat, _values_from_object,
                                 _unsortable_types)
from pandas.indexes.numeric import Float64Index


def _choose_tolerance(this, that, tolerance):
    # Unless a tolerance is given explicitly, use the larger of the two
    # objects' tolerances (0.0 if `that` is not an ImpreciseIndex).
    if tolerance is None:
        tolerance = max(this.tolerance, getattr(that, 'tolerance', 0.0))
    return tolerance


class ImpreciseIndex(Float64Index):
    def astype(self, dtype, copy=True):
        return ImpreciseIndex(self.values.astype(dtype=dtype, copy=copy),
                              name=self.name, dtype=dtype)

    @property
    def tolerance(self):
        return self._tolerance

    @tolerance.setter
    def tolerance(self, tolerance):
        self._tolerance = self._convert_tolerance(tolerance)

    def union(self, other, tolerance=None):
        self._assert_can_do_setop(other)
        other = _ensure_index(other)

        if len(other) == 0 or self.equals(other, tolerance=tolerance):
            return self._get_consensus_name(other)

        if len(self) == 0:
            return other._get_consensus_name(self)

        if not is_dtype_equal(self.dtype, other.dtype):
            this = self.astype('O')
            other = other.astype('O')
            return this.union(other, tolerance=tolerance)

        tolerance = _choose_tolerance(self, other, tolerance)

        indexer = self.get_indexer(other, tolerance=tolerance)
        indexer, = (indexer == -1).nonzero()

        if len(indexer) > 0:
            other_diff = algos.take_nd(other._values, indexer,
                                       allow_fill=False)
            result = _concat._concat_compat((self._values, other_diff))

            try:
                self._values[0] < other_diff[0]
            except TypeError as e:
                warnings.warn("%s, sort order is undefined for "
                              "incomparable objects" % e, RuntimeWarning,
                              stacklevel=3)
            else:
                types = frozenset((self.inferred_type,
                                   other.inferred_type))
                if not types & _unsortable_types:
                    result.sort()
        else:
            result = self._values

            try:
                result = np.sort(result)
            except TypeError as e:
                warnings.warn("%s, sort order is undefined for "
                              "incomparable objects" % e, RuntimeWarning,
                              stacklevel=3)

        # for subclasses
        return self._wrap_union_result(other, result)

    def equals(self, other, tolerance=None):
        if self.is_(other):
            return True

        if not isinstance(other, Index):
            return False

        if is_object_dtype(self) and not is_object_dtype(other):
            # if other is not object, use other's logic for coercion
            if isinstance(other, ImpreciseIndex):
                return other.equals(self, tolerance=tolerance)
            else:
                return other.equals(self)

        if len(self) != len(other):
            return False

        tolerance = _choose_tolerance(self, other, tolerance)
        diff = np.abs(_values_from_object(self) -
                      _values_from_object(other))
        return np.all(diff < tolerance)

    def intersection(self, other, tolerance=None):
        self._assert_can_do_setop(other)
        other = _ensure_index(other)

        if self.equals(other, tolerance=tolerance):
            return self._get_consensus_name(other)

        if not is_dtype_equal(self.dtype, other.dtype):
            this = self.astype('O')
            other = other.astype('O')
            return this.intersection(other, tolerance=tolerance)

        tolerance = _choose_tolerance(self, other, tolerance)
        try:
            indexer = self.get_indexer(other._values, tolerance=tolerance)
            indexer = indexer.take((indexer != -1).nonzero()[0])
        except Exception:
            # duplicates
            # FIXME: get_indexer_non_unique() doesn't take a tolerance argument
            indexer = Index(self._values).get_indexer_non_unique(
                other._values)[0].unique()
            indexer = indexer[indexer != -1]

        taken = self.take(indexer)
        if self.name != other.name:
            taken.name = None
        return taken

    # TODO: Do I need to re-implement _get_unique_index()?

    def get_loc(self, key, method=None, tolerance=None):
        if tolerance is None:
            tolerance = self.tolerance
        if tolerance > 0 and method is None:
            method = 'nearest'
        return super(ImpreciseIndex, self).get_loc(key, method, tolerance)

    def get_indexer(self, target, method=None, limit=None, tolerance=None):
        if tolerance is None:
            tolerance = self.tolerance
        if tolerance > 0 and method is None:
            method = 'nearest'
        return super(ImpreciseIndex, self).get_indexer(target, method, limit,
                                                       tolerance)


if __name__ == '__main__':
    a = ImpreciseIndex([0.1, 0.2, 0.3, 0.4])
    a.tolerance = 0.01
    b = ImpreciseIndex([0.301, 0.401, 0.501, 0.601])
    b.tolerance = 0.025
    print(a, b)
    print("a | b :", a.union(b))
    print("a & b :", a.intersection(b))
    print("a.get_indexer(b):", a.get_indexer(b))
    print("b.get_indexer(a):", b.get_indexer(a))
```

Run this and you get the following results:

```
ImpreciseIndex([0.1, 0.2, 0.3, 0.4], dtype='float64') ImpreciseIndex([0.301, 0.401, 0.501, 0.601], dtype='float64')
a | b : ImpreciseIndex([0.1, 0.2, 0.3, 0.4, 0.501, 0.601], dtype='float64')
a & b : ImpreciseIndex([0.3, 0.4], dtype='float64')
a.get_indexer(b): [ 2 3 -1 -1]
b.get_indexer(a): [-1 -1 0 1]
```

This is mostly lifted from the Index base class methods, with the monotonic optimization path taken out and the tolerance argument supplied to the respective calls to get_indexer(). As for the choice of tolerance for a given operation: unless one is provided as a keyword argument, the larger tolerance of the two objects being compared is used (falling back to 0.0 if the other object isn't an ImpreciseIndex).

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  tolerance for alignment 329575874
399522595 https://github.com/pydata/xarray/issues/2217#issuecomment-399522595 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTUyMjU5NQ== WeatherGod 291576 2018-06-22T17:42:29Z 2018-06-22T17:42:29Z CONTRIBUTOR

Ok, I see how you implemented it for pandas's reindex. You essentially inserted an inexact filter within .get_indexer(), and intersection() and union() use these methods, so, in theory, one could pipe a tolerance argument through them (as well as through the other set operations). The work needs to be expanded a bit, though, as get_indexer_non_unique() needs the tolerance parameter too, I think.
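
For reference, a minimal illustration of that existing machinery in plain pandas (the index and target values here are made up for the example): get_indexer() already accepts a tolerance when a fill method such as 'nearest' is used.

``` python
import pandas as pd

idx = pd.Index([0.1, 0.2, 0.3, 0.4])
target = [0.1001, 0.3002, 0.7]

# Targets within the tolerance of their nearest index value get matched;
# anything farther away gets -1, just like an exact-match miss.
print(idx.get_indexer(target, method='nearest', tolerance=0.01))
# [ 0  2 -1]
```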

For xarray, though, I think we can work around the backwards-compatibility issue by having Dataset hold specialized subclasses of Index for floating-point data types that carry the needed changes to the Index class. This specialized class could have some default tolerance (say 100*finfo(dtype).resolution?), and its methods would use the stored tolerance by default, so it should be completely transparent to the end user (hopefully). This way, xr.open_mfdataset() would "just work".
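
As a rough sketch of what that default could look like (the function name and the 100x factor simply mirror the suggestion above; nothing here is an agreed-upon API):

``` python
import numpy as np

def default_index_tolerance(dtype):
    # Suggested default from the comment above: 100 times the dtype's
    # floating-point resolution, e.g. ~1e-13 for float64, ~1e-4 for float32.
    return 100 * np.finfo(dtype).resolution

print(default_index_tolerance(np.float64))  # ~1e-13
print(default_index_tolerance(np.float32))  # ~1e-4
```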

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  tolerance for alignment 329575874
399286310 https://github.com/pydata/xarray/issues/2217#issuecomment-399286310 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTI4NjMxMA== WeatherGod 291576 2018-06-22T00:45:19Z 2018-06-22T00:45:19Z CONTRIBUTOR

@shoyer, I am thinking your original intuition was right about needing to improve the Index classes, perhaps to accept an optional epsilon argument to their constructor. How receptive do you think pandas would be to that? And even if they would accept such a feature, we would probably need to implement a bit of it ourselves for situations where older pandas versions are used.

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  tolerance for alignment 329575874
399285369 https://github.com/pydata/xarray/issues/2217#issuecomment-399285369 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTI4NTM2OQ== WeatherGod 291576 2018-06-22T00:38:34Z 2018-06-22T00:38:34Z CONTRIBUTOR

Well, I need this to work for join='outer', so, it is gonna happen one way or another...

One concept I was toying with today was a distinction between aligning coords (which is what it does now) and aligning bounding boxes.

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  tolerance for alignment 329575874
399254317 https://github.com/pydata/xarray/issues/2217#issuecomment-399254317 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTI1NDMxNw== WeatherGod 291576 2018-06-21T21:48:28Z 2018-06-21T21:48:28Z CONTRIBUTOR

To be clear, my use-case would not be solved by join='override' (isn't that just join='left'?). I have moving nests of coordinates that can have some floating-point noise in them, but are otherwise identical.

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  tolerance for alignment 329575874
399253493 https://github.com/pydata/xarray/issues/2217#issuecomment-399253493 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTI1MzQ5Mw== WeatherGod 291576 2018-06-21T21:44:58Z 2018-06-21T21:44:58Z CONTRIBUTOR

I was just pointed to this issue yesterday, and I have an immediate need for this feature in xarray for a work project. I'll take responsibility to implement this feature tomorrow.

reactions: {"total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  tolerance for alignment 329575874
