
issue_comments

77 rows where user = 291576 sorted by updated_at descending


issue 20

  • Vectorized lazy indexing 17
  • Pointwise indexing -- something like sel_points 11
  • tolerance for alignment 9
  • Slicing DataArray can take longer than not slicing 6
  • Slow performance of isel 6
  • getting a "truth value of an array" error when supplying my own `concat_dim`. 4
  • can't use datetime or pandas datetime to index time dimension 3
  • open_mfdataset() on a single file drops the concat_dim 3
  • Change an `==` to an `is`. Fix tests so that this won't happen again. 3
  • API design for pointwise indexing 2
  • Possible regression with PyNIO data not being lazily loaded 2
  • Pynio tests are being skipped on TravisCI 2
  • concat_dim for auto_combine for a single object is now respected 2
  • Plot methods 1
  • align silently upcasts data arrays when NaNs are inserted 1
  • groupby reduction sometimes collapses variables into scalars 1
  • add pynio backend 1
  • operations with pd.to_timedelta() now fails 1
  • can't do in-place clip() with DataArrays. 1
  • Should we make "rasterio" an engine option? 1

user 1

  • WeatherGod · 77

author_association 1

  • CONTRIBUTOR 77
Columns: id, html_url, issue_url, node_id, user, created_at, updated_at ▲ (sorted descending), author_association, body, reactions, performed_via_github_app, issue
738189796 https://github.com/pydata/xarray/issues/2004#issuecomment-738189796 https://api.github.com/repos/pydata/xarray/issues/2004 MDEyOklzc3VlQ29tbWVudDczODE4OTc5Ng== WeatherGod 291576 2020-12-03T18:15:35Z 2020-12-03T18:15:35Z CONTRIBUTOR

I think so, at least in terms of my original problem.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slicing DataArray can take longer than not slicing 307318224
642253287 https://github.com/pydata/xarray/issues/4142#issuecomment-642253287 https://api.github.com/repos/pydata/xarray/issues/4142 MDEyOklzc3VlQ29tbWVudDY0MjI1MzI4Nw== WeatherGod 291576 2020-06-10T20:55:32Z 2020-06-10T20:55:32Z CONTRIBUTOR

So, one important difference I see off the bat is that zarr already had a DataStore implementation, while rasterio does not. I take it that implementing one would be the preferred approach?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Should we make "rasterio" an engine option? 636493109
451626366 https://github.com/pydata/xarray/pull/2648#issuecomment-451626366 https://api.github.com/repos/pydata/xarray/issues/2648 MDEyOklzc3VlQ29tbWVudDQ1MTYyNjM2Ng== WeatherGod 291576 2019-01-05T04:18:50Z 2019-01-05T04:18:50Z CONTRIBUTOR

I had completely forgotten about that little quirk of CPython. I try to ignore implementation details like that. Heck, I still don't fully trust dictionaries to be ordered!

I removed the WIP. We can deal with the concat dim default object separately, including turning it into a ReprObject (not exactly sure what the advantage of it is over just using the string, but, meh).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Change an `==` to an `is`. Fix tests so that this won't happen again. 396008054
451583970 https://github.com/pydata/xarray/pull/2648#issuecomment-451583970 https://api.github.com/repos/pydata/xarray/issues/2648 MDEyOklzc3VlQ29tbWVudDQ1MTU4Mzk3MA== WeatherGod 291576 2019-01-04T22:12:44Z 2019-01-04T22:12:44Z CONTRIBUTOR

Is the following statement True or False: "The user should be allowed to explicitly declare that they want the concatenation dimension to be inferred by passing a keyword argument". If this is True, then you need to test equivalence. If it is False, then there is nothing more I need to do for the PR, as changing this to use a ReprObject is orthogonal to these changes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Change an `==` to an `is`. Fix tests so that this won't happen again. 396008054
451581103 https://github.com/pydata/xarray/pull/2648#issuecomment-451581103 https://api.github.com/repos/pydata/xarray/issues/2648 MDEyOklzc3VlQ29tbWVudDQ1MTU4MTEwMw== WeatherGod 291576 2019-01-04T22:00:10Z 2019-01-04T22:00:10Z CONTRIBUTOR

ok, so we use the ReprObject for the default, and then test if concat_dim is of type `ReprObject`, and then test its equivalence?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Change an `==` to an `is`. Fix tests so that this won't happen again. 396008054
451504997 https://github.com/pydata/xarray/issues/2647#issuecomment-451504997 https://api.github.com/repos/pydata/xarray/issues/2647 MDEyOklzc3VlQ29tbWVudDQ1MTUwNDk5Nw== WeatherGod 291576 2019-01-04T17:06:50Z 2019-01-04T17:06:50Z CONTRIBUTOR

scratch that... the test was an `or`, not an `and`.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  getting a "truth value of an array" error when supplying my own `concat_dim`. 395994055
451504462 https://github.com/pydata/xarray/issues/2647#issuecomment-451504462 https://api.github.com/repos/pydata/xarray/issues/2647 MDEyOklzc3VlQ29tbWVudDQ1MTUwNDQ2Mg== WeatherGod 291576 2019-01-04T17:05:00Z 2019-01-04T17:05:00Z CONTRIBUTOR

actually, we could simplify the conditional to be just `concat_dim is _CONCAT_DIM_DEFAULT` and not bother with the None test.
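For illustration only, here is a standalone sketch of the identity check being proposed; the sentinel value and helper are made up, not xarray's actual `open_mfdataset`/`auto_combine` code:

```python
_CONCAT_DIM_DEFAULT = '__infer_concat_dim__'  # illustrative sentinel, not xarray's

def combine_kwargs(concat_dim=_CONCAT_DIM_DEFAULT):
    """Return the kwargs to forward for a possibly-defaulted concat_dim."""
    # Identity check against the sentinel: no `==`, so there is no separate None
    # test and no element-wise comparison when concat_dim is array-like.
    if concat_dim is _CONCAT_DIM_DEFAULT:
        return {}                      # let the concatenation dimension be inferred
    return {'concat_dim': concat_dim}  # pass the user's value through untouched

print(combine_kwargs())        # {}
print(combine_kwargs('time'))  # {'concat_dim': 'time'}
```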

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  getting a "truth value of an array" error when supplying my own `concat_dim`. 395994055
451504141 https://github.com/pydata/xarray/issues/2647#issuecomment-451504141 https://api.github.com/repos/pydata/xarray/issues/2647 MDEyOklzc3VlQ29tbWVudDQ1MTUwNDE0MQ== WeatherGod 291576 2019-01-04T17:03:54Z 2019-01-04T17:03:54Z CONTRIBUTOR

ah! that's why it snuck through! I have been racking my brain on this for the past hour! Shall I go ahead and make a PR?

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  getting a "truth value of an array" error when supplying my own `concat_dim`. 395994055
451501740 https://github.com/pydata/xarray/issues/2647#issuecomment-451501740 https://api.github.com/repos/pydata/xarray/issues/2647 MDEyOklzc3VlQ29tbWVudDQ1MTUwMTc0MA== WeatherGod 291576 2019-01-04T16:55:40Z 2019-01-04T16:55:40Z CONTRIBUTOR

To be more explicit, the issue is that `concat_dim == _CONCAT_DIM_DEFAULT` is ill-advised because the type of `concat_dim` is not guaranteed to be a scalar. In fact, the elif of that area of code in api.py explicitly tests if `concat_dim` is or is not a list.
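A minimal reproduction of that failure mode, with illustrative values (the sentinel string here is made up):

```python
import numpy as np

_CONCAT_DIM_DEFAULT = '__infer_concat_dim__'         # illustrative sentinel
concat_dim = np.array(['2018-01-01', '2018-01-02'])  # an array-like, user-supplied concat_dim

eq = concat_dim == _CONCAT_DIM_DEFAULT  # broadcasts element-wise -> array([False, False])
try:
    if eq:  # bool() of a multi-element array is ambiguous
        pass
except ValueError as err:
    print(err)  # "The truth value of an array with more than one element is ambiguous..."

print(concat_dim is _CONCAT_DIM_DEFAULT)  # False -- identity check stays a plain bool
```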

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  getting a "truth value of an array" error when supplying my own `concat_dim`. 395994055
425224969 https://github.com/pydata/xarray/issues/2227#issuecomment-425224969 https://api.github.com/repos/pydata/xarray/issues/2227 MDEyOklzc3VlQ29tbWVudDQyNTIyNDk2OQ== WeatherGod 291576 2018-09-27T20:05:05Z 2018-09-27T20:05:05Z CONTRIBUTOR

It would be ten files opened via xr.open_mfdataset() concatenated across a time dimension, each one looking like:

```
netcdf convect_gust_20180301_0000 {
dimensions:
    latitude = 3502 ;
    longitude = 7002 ;
variables:
    double latitude(latitude) ;
        latitude:_FillValue = NaN ;
        latitude:_Storage = "contiguous" ;
        latitude:_Endianness = "little" ;
    double longitude(longitude) ;
        longitude:_FillValue = NaN ;
        longitude:_Storage = "contiguous" ;
        longitude:_Endianness = "little" ;
    float gust(latitude, longitude) ;
        gust:_FillValue = NaNf ;
        gust:units = "m/s" ;
        gust:description = "gust winds" ;
        gust:_Storage = "chunked" ;
        gust:_ChunkSizes = 701, 1401 ;
        gust:_DeflateLevel = 8 ;
        gust:_Shuffle = "true" ;
        gust:_Endianness = "little" ;

// global attributes:
        :start_date = "03/01/2018 00:00" ;
        :end_date = "03/01/2018 01:00" ;
        :interval = "half-open" ;
        :init_date = "02/28/2018 22:00" ;
        :history = "Created 2018-09-12 15:53:44.468144" ;
        :description = "Convective Downscaling, format V2.0" ;
        :_NCProperties = "version=1|netcdflibversion=4.6.1|hdf5libversion=1.10.1" ;
        :_SuperblockVersion = 0 ;
        :_IsNetcdf4 = 1 ;
        :_Format = "netCDF-4" ;
}
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slow performance of isel 331668890
424795330 https://github.com/pydata/xarray/issues/2227#issuecomment-424795330 https://api.github.com/repos/pydata/xarray/issues/2227 MDEyOklzc3VlQ29tbWVudDQyNDc5NTMzMA== WeatherGod 291576 2018-09-26T17:06:44Z 2018-09-26T17:06:44Z CONTRIBUTOR

No, it does not make a difference. The example above peaks at around 5GB of memory (a bit much, but manageable). And it peaks similarly if we chunk it like you suggested.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slow performance of isel 331668890
424485235 https://github.com/pydata/xarray/issues/2227#issuecomment-424485235 https://api.github.com/repos/pydata/xarray/issues/2227 MDEyOklzc3VlQ29tbWVudDQyNDQ4NTIzNQ== WeatherGod 291576 2018-09-25T20:14:02Z 2018-09-25T20:14:02Z CONTRIBUTOR

Yeah, it looks like if `da` is backed by a dask array, and you do a `.isel(win=window.compute())` (because otherwise isel barfs on dask indexers, it seems), then the memory usage shoots through the roof. Note that in my case, the dask chunks are (1, 3000, 7000). If I do a `window.load()` prior to `window.isel()`, then the memory usage is perfectly reasonable.
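A compact sketch of the two paths being compared, shrunk to toy sizes so it runs quickly; in the report the array was a lazily loaded, (1, 3000, 7000)-chunked dask array, so only the call pattern carries over, not the memory behaviour:

```python
import numpy as np
import xarray as xr

# Toy stand-in for the dask-backed DataArray described above.
da = xr.DataArray(np.random.randn(4, 50, 60),
                  dims=('time', 'latitude', 'longitude')).chunk({'time': 1})

window = da.rolling(time=2).construct('win')
indexes = window.argmax(dim='win')

# Reported to blow up: index the still-lazy window with a computed (NumPy) indexer.
result_lazy = window.isel(win=indexes.compute())

# Reported to behave: load the window first, then do the same vectorized isel.
result_loaded = window.load().isel(win=indexes.compute())
```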

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slow performance of isel 331668890
424479421 https://github.com/pydata/xarray/issues/2227#issuecomment-424479421 https://api.github.com/repos/pydata/xarray/issues/2227 MDEyOklzc3VlQ29tbWVudDQyNDQ3OTQyMQ== WeatherGod 291576 2018-09-25T19:54:59Z 2018-09-25T19:54:59Z CONTRIBUTOR

Just for posterity, though, here is my simplified (working!) example:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.random.randn(10, 3000, 7000),
                  dims=('time', 'latitude', 'longitude'))
window = da.rolling(time=2).construct('win')
indexes = window.argmax(dim='win')
result = window.isel(win=indexes)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slow performance of isel 331668890
424477465 https://github.com/pydata/xarray/issues/2227#issuecomment-424477465 https://api.github.com/repos/pydata/xarray/issues/2227 MDEyOklzc3VlQ29tbWVudDQyNDQ3NzQ2NQ== WeatherGod 291576 2018-09-25T19:48:20Z 2018-09-25T19:48:20Z CONTRIBUTOR

Huh, strange... I just tried a simplified version of what I was doing (particularly, no dask arrays), and everything worked fine. I'll have to investigate further.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slow performance of isel 331668890
424470752 https://github.com/pydata/xarray/issues/2227#issuecomment-424470752 https://api.github.com/repos/pydata/xarray/issues/2227 MDEyOklzc3VlQ29tbWVudDQyNDQ3MDc1Mg== WeatherGod 291576 2018-09-25T19:27:28Z 2018-09-25T19:27:28Z CONTRIBUTOR

I am looking into a similar performance issue with isel, but it seems that the issue is that it is creating arrays that are much bigger than needed. For my multidimensional case (time/x/y/window), what should end up taking only a few hundred MB spikes up to tens of GB of used RAM. I don't know if this might be a possible source of the performance issues.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slow performance of isel 331668890
407547050 https://github.com/pydata/xarray/issues/2217#issuecomment-407547050 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDQwNzU0NzA1MA== WeatherGod 291576 2018-07-24T20:48:53Z 2018-07-24T20:48:53Z CONTRIBUTOR

I have created a PR for my work-in-progress: pandas-dev/pandas#22043

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tolerance for alignment 329575874
400043753 https://github.com/pydata/xarray/issues/2217#issuecomment-400043753 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDQwMDA0Mzc1Mw== WeatherGod 291576 2018-06-25T18:07:49Z 2018-06-25T18:07:49Z CONTRIBUTOR

Do we want to dive straight to that? Or, would it make more sense to first submit some PRs piping the support for a tolerance kwarg through more of the API? Or perhaps we should propose that a "tolerance" attribute should be an optional attribute that methods like get_indexer() and such could always check for? Not being a pandas dev, I am not sure how piecemeal we should approach this.

In addition, we are likely going to have to implement a decent chunk of code ourselves for compatibility's sake, I think.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tolerance for alignment 329575874
399612490 https://github.com/pydata/xarray/issues/2217#issuecomment-399612490 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTYxMjQ5MA== WeatherGod 291576 2018-06-22T23:56:41Z 2018-06-22T23:56:41Z CONTRIBUTOR

I am not concerned about the non-commutativeness of the indexer itself. There is no way around that. At some point, you have to choose values, whether it is done by an indexer or done by some particular set operation.

As for the different sizes, that happens when the tolerance is greater than half the smallest delta. I figure a final implementation would enforce such a constraint on the tolerance.
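A small, hypothetical sketch of that constraint (the function name is made up):

```python
import numpy as np

def validate_tolerance(values, tolerance):
    """Hypothetical guard for the constraint suggested above: the tolerance must stay
    below half of the smallest spacing between index values, otherwise nearest-neighbour
    matches become ambiguous and set operations can change size."""
    smallest_delta = np.diff(np.sort(np.asarray(values, dtype=float))).min()
    if tolerance >= 0.5 * smallest_delta:
        raise ValueError("tolerance %g must be < half the smallest spacing %g"
                         % (tolerance, smallest_delta))
    return tolerance

validate_tolerance([1, 1.9, 2.1, 3], 0.05)   # fine
# validate_tolerance([1, 1.9, 2.1, 3], 0.5)  # would raise: 0.5 >= 0.1
```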

On Fri, Jun 22, 2018 at 5:56 PM, Stephan Hoyer notifications@github.com wrote:

@WeatherGod https://github.com/WeatherGod One problem with your definition of tolerance is that it isn't commutative, even if both indexes have the same tolerance:

```python
a = ImpreciseIndex([0.1, 0.2, 0.3, 0.4])
a.tolerance = 0.1
b = ImpreciseIndex([0.301, 0.401, 0.501, 0.601])
b.tolerance = 0.1
print(a.union(b))  # ImpreciseIndex([0.1, 0.2, 0.3, 0.4, 0.501, 0.601], dtype='float64')
print(b.union(a))  # ImpreciseIndex([0.1, 0.2, 0.301, 0.401, 0.501, 0.601], dtype='float64')
```

If you try a little harder, you could even have cases where the result has a different size, e.g.,

```python
a = ImpreciseIndex([1, 2, 3])
a.tolerance = 0.5
b = ImpreciseIndex([1, 1.9, 2.1, 3])
b.tolerance = 0.5
print(a.union(b))  # ImpreciseIndex([1.0, 2.0, 3.0], dtype='float64')
print(b.union(a))  # ImpreciseIndex([1.0, 1.9, 2.1, 3.0], dtype='float64')
```

Maybe these aren't really problems in practice, but it's at least a little strange/surprising.


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tolerance for alignment 329575874
399584169 https://github.com/pydata/xarray/issues/2217#issuecomment-399584169 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTU4NDE2OQ== WeatherGod 291576 2018-06-22T21:15:06Z 2018-06-22T21:15:06Z CONTRIBUTOR

Actually, I disagree. Pandas's set operations methods are mostly index-based. For union and intersection, they have an optimization that dives down into some c-code when the Indexes are monotonic, but everywhere else, it all works off of results from get_indexer(). I have made a quick toy demo code that seems to work. Note, I didn't know how to properly make a constructor for a subclassed Index, so I added the tolerance attribute after construction just for the purposes of this demo.

```python
from __future__ import print_function
import warnings
from pandas import Index
import numpy as np

from pandas.indexes.base import is_object_dtype, algos, is_dtype_equal
from pandas.indexes.base import _ensure_index, _concat, _values_from_object, _unsortable_types
from pandas.indexes.numeric import Float64Index


def _choose_tolerance(this, that, tolerance):
    if tolerance is None:
        tolerance = max(this.tolerance, getattr(that, 'tolerance', 0.0))
    return tolerance


class ImpreciseIndex(Float64Index):
    def astype(self, dtype, copy=True):
        return ImpreciseIndex(self.values.astype(dtype=dtype, copy=copy),
                              name=self.name, dtype=dtype)

    @property
    def tolerance(self):
        return self._tolerance

    @tolerance.setter
    def tolerance(self, tolerance):
        self._tolerance = self._convert_tolerance(tolerance)

    def union(self, other, tolerance=None):
        self._assert_can_do_setop(other)
        other = _ensure_index(other)

        if len(other) == 0 or self.equals(other, tolerance=tolerance):
            return self._get_consensus_name(other)

        if len(self) == 0:
            return other._get_consensus_name(self)

        if not is_dtype_equal(self.dtype, other.dtype):
            this = self.astype('O')
            other = other.astype('O')
            return this.union(other, tolerance=tolerance)

        tolerance = _choose_tolerance(self, other, tolerance)

        indexer = self.get_indexer(other, tolerance=tolerance)
        indexer, = (indexer == -1).nonzero()

        if len(indexer) > 0:
            other_diff = algos.take_nd(other._values, indexer,
                                       allow_fill=False)
            result = _concat._concat_compat((self._values, other_diff))

            try:
                self._values[0] < other_diff[0]
            except TypeError as e:
                warnings.warn("%s, sort order is undefined for "
                              "incomparable objects" % e, RuntimeWarning,
                              stacklevel=3)
            else:
                types = frozenset((self.inferred_type,
                                   other.inferred_type))
                if not types & _unsortable_types:
                    result.sort()
        else:
            result = self._values

            try:
                result = np.sort(result)
            except TypeError as e:
                warnings.warn("%s, sort order is undefined for "
                              "incomparable objects" % e, RuntimeWarning,
                              stacklevel=3)

        # for subclasses
        return self._wrap_union_result(other, result)

    def equals(self, other, tolerance=None):
        if self.is_(other):
            return True

        if not isinstance(other, Index):
            return False

        if is_object_dtype(self) and not is_object_dtype(other):
            # if other is not object, use other's logic for coercion
            if isinstance(other, ImpreciseIndex):
                return other.equals(self, tolerance=tolerance)
            else:
                return other.equals(self)

        if len(self) != len(other):
            return False

        tolerance = _choose_tolerance(self, other, tolerance)
        diff = np.abs(_values_from_object(self) -
                      _values_from_object(other))
        return np.all(diff < tolerance)

    def intersection(self, other, tolerance=None):
        self._assert_can_do_setop(other)
        other = _ensure_index(other)

        if self.equals(other, tolerance=tolerance):
            return self._get_consensus_name(other)

        if not is_dtype_equal(self.dtype, other.dtype):
            this = self.astype('O')
            other = other.astype('O')
            return this.intersection(other, tolerance=tolerance)

        tolerance = _choose_tolerance(self, other, tolerance)
        try:
            indexer = self.get_indexer(other._values, tolerance=tolerance)
            indexer = indexer.take((indexer != -1).nonzero()[0])
        except:
            # duplicates
            # FIXME: get_indexer_non_unique() doesn't take a tolerance argument
            indexer = Index(self._values).get_indexer_non_unique(
                other._values)[0].unique()
            indexer = indexer[indexer != -1]

        taken = self.take(indexer)
        if self.name != other.name:
            taken.name = None
        return taken

    # TODO: Do I need to re-implement _get_unique_index()?

    def get_loc(self, key, method=None, tolerance=None):
        if tolerance is None:
            tolerance = self.tolerance
        if tolerance > 0 and method is None:
            method = 'nearest'
        return super(ImpreciseIndex, self).get_loc(key, method, tolerance)

    def get_indexer(self, target, method=None, limit=None, tolerance=None):
        if tolerance is None:
            tolerance = self.tolerance
        if tolerance > 0 and method is None:
            method = 'nearest'
        return super(ImpreciseIndex, self).get_indexer(target, method, limit, tolerance)


if __name__ == '__main__':
    a = ImpreciseIndex([0.1, 0.2, 0.3, 0.4])
    a.tolerance = 0.01
    b = ImpreciseIndex([0.301, 0.401, 0.501, 0.601])
    b.tolerance = 0.025
    print(a, b)
    print("a | b :", a.union(b))
    print("a & b :", a.intersection(b))
    print("a.get_indexer(b):", a.get_indexer(b))
    print("b.get_indexer(a):", b.get_indexer(a))
```

Run this and you get the following results:

```
ImpreciseIndex([0.1, 0.2, 0.3, 0.4], dtype='float64') ImpreciseIndex([0.301, 0.401, 0.501, 0.601], dtype='float64')
a | b : ImpreciseIndex([0.1, 0.2, 0.3, 0.4, 0.501, 0.601], dtype='float64')
a & b : ImpreciseIndex([0.3, 0.4], dtype='float64')
a.get_indexer(b): [ 2  3 -1 -1]
b.get_indexer(a): [-1 -1  0  1]
```

This is mostly lifted from the Index base class methods, just with me taking out the monotonic optimization path and supplying the tolerance argument to the respective calls to get_indexer. The tolerance for a given operation, unless provided as a keyword argument, is the larger of the tolerances of the two objects being compared (with a fallback if the other isn't an ImpreciseIndex).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tolerance for alignment 329575874
399522595 https://github.com/pydata/xarray/issues/2217#issuecomment-399522595 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTUyMjU5NQ== WeatherGod 291576 2018-06-22T17:42:29Z 2018-06-22T17:42:29Z CONTRIBUTOR

Ok, I see how you implemented it for pandas's reindex. You essentially inserted an inexact filter within .get_indexer(). And the intersection() and union() uses these methods, so, in theory, one could pipe a tolerance argument through them (as well as for the other set operations). The work needs to be expanded a bit, though, as get_indexer_non_unique() needs the tolerance parameter, too, I think.

For xarray, though, I think we can work around backwards compatibility by having Dataset hold specialized subclasses of Index for floating-point data types that would have the needed changes to the Index class. We can have this specialized class have some default tolerance (say 100*finfo(dtype).resolution?), and it would have its methods use the stored tolerance by default, so it should be completely transparent to the end-user (hopefully). This way, xr.open_mfdataset() would "just work".
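For concreteness, a tiny sketch of the suggested default-tolerance rule (the function name is made up):

```python
import numpy as np

def default_tolerance(dtype):
    """Hypothetical default alignment tolerance for a floating-point coordinate
    dtype, following the 100 * finfo(dtype).resolution suggestion above."""
    return 100 * np.finfo(dtype).resolution

print(default_tolerance(np.float64))  # 1e-13
print(default_tolerance(np.float32))  # ~1e-04
```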

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tolerance for alignment 329575874
399286310 https://github.com/pydata/xarray/issues/2217#issuecomment-399286310 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTI4NjMxMA== WeatherGod 291576 2018-06-22T00:45:19Z 2018-06-22T00:45:19Z CONTRIBUTOR

@shoyer, I am thinking your original intuition was right about needing to improve the Index classes, perhaps to accept an optional epsilon argument in the constructor. How receptive do you think pandas would be to that? And even if they would accept such a feature, we would probably need to implement it a bit ourselves for situations where older pandas versions are used.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tolerance for alignment 329575874
399285369 https://github.com/pydata/xarray/issues/2217#issuecomment-399285369 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTI4NTM2OQ== WeatherGod 291576 2018-06-22T00:38:34Z 2018-06-22T00:38:34Z CONTRIBUTOR

Well, I need this to work for join='outer', so, it is gonna happen one way or another...

One concept I was toying with today was a distinction between aligning coords (which is what it does now) and aligning bounding boxes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tolerance for alignment 329575874
399254317 https://github.com/pydata/xarray/issues/2217#issuecomment-399254317 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTI1NDMxNw== WeatherGod 291576 2018-06-21T21:48:28Z 2018-06-21T21:48:28Z CONTRIBUTOR

To be clear, my use-case would not be solved by join='override' (isn't that just join='left'?). I have moving nests of coordinates that can have some floating-point noise in them, but are otherwise identical.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tolerance for alignment 329575874
399253493 https://github.com/pydata/xarray/issues/2217#issuecomment-399253493 https://api.github.com/repos/pydata/xarray/issues/2217 MDEyOklzc3VlQ29tbWVudDM5OTI1MzQ5Mw== WeatherGod 291576 2018-06-21T21:44:58Z 2018-06-21T21:44:58Z CONTRIBUTOR

I was just pointed to this issue yesterday, and I have an immediate need for this feature in xarray for a work project. I'll take responsibility to implement this feature tomorrow.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  tolerance for alignment 329575874
380241636 https://github.com/pydata/xarray/pull/2048#issuecomment-380241636 https://api.github.com/repos/pydata/xarray/issues/2048 MDEyOklzc3VlQ29tbWVudDM4MDI0MTYzNg== WeatherGod 291576 2018-04-10T20:48:25Z 2018-04-10T20:48:25Z CONTRIBUTOR

What's new entry added.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  concat_dim for auto_combine for a single object is now respected 312998259
380203653 https://github.com/pydata/xarray/pull/2048#issuecomment-380203653 https://api.github.com/repos/pydata/xarray/issues/2048 MDEyOklzc3VlQ29tbWVudDM4MDIwMzY1Mw== WeatherGod 291576 2018-04-10T18:34:32Z 2018-04-10T18:34:32Z CONTRIBUTOR

Travis failures seem to be unrelated?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  concat_dim for auto_combine for a single object is now respected 312998259
380137124 https://github.com/pydata/xarray/issues/1988#issuecomment-380137124 https://api.github.com/repos/pydata/xarray/issues/1988 MDEyOklzc3VlQ29tbWVudDM4MDEzNzEyNA== WeatherGod 291576 2018-04-10T15:12:05Z 2018-04-10T15:12:05Z CONTRIBUTOR

Yup... looks like that did the trick (for auto_combine and open_mfdataset). I even have a simple test to demonstrate it. PR coming shortly.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset() on a single file drops the concat_dim 305327479
379939574 https://github.com/pydata/xarray/issues/1988#issuecomment-379939574 https://api.github.com/repos/pydata/xarray/issues/1988 MDEyOklzc3VlQ29tbWVudDM3OTkzOTU3NA== WeatherGod 291576 2018-04-10T00:55:48Z 2018-04-10T00:55:48Z CONTRIBUTOR

I'll give it a go tomorrow. My work has gotten to this point now, and I have some unit tests that happen to exercise this edge case.

On a somewhat related note, would an allow_missing feature be welcomed in open_mfdataset()? I have written up some code that expects a concat_dim and a list of filenames. It then passes to open_mfdataset() only the files (and corresponding concat_dim values) that exist, and then calls reindex() with the original concat_dim to get a NaN-filled slab wherever there was a missing file (see the sketch below).

Any interest?
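A rough, hypothetical sketch of that allow_missing idea; the wrapper name is made up and the `combine='nested'` keyword assumes a newer xarray API, so this is not the code referred to above:

```python
import os
import xarray as xr

def open_mfdataset_allow_missing(paths, dim_name, dim_values):
    """Open only the files that exist, then reindex so missing files become NaN slabs."""
    existing = [(p, v) for p, v in zip(paths, dim_values) if os.path.exists(p)]
    kept_paths = [p for p, _ in existing]
    kept_values = [v for _, v in existing]

    ds = xr.open_mfdataset(
        kept_paths,
        concat_dim=xr.DataArray(kept_values, dims=dim_name, name=dim_name),
        combine='nested')
    # NaN-filled slab wherever a file was missing:
    return ds.reindex({dim_name: dim_values})
```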

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset() on a single file drops the concat_dim 305327479
379901414 https://github.com/pydata/xarray/issues/1988#issuecomment-379901414 https://api.github.com/repos/pydata/xarray/issues/1988 MDEyOklzc3VlQ29tbWVudDM3OTkwMTQxNA== WeatherGod 291576 2018-04-09T21:35:11Z 2018-04-09T21:35:11Z CONTRIBUTOR

Could the fix be as simple as `if len(datasets) == 1 and dim is None:`?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset() on a single file drops the concat_dim 305327479
375056363 https://github.com/pydata/xarray/issues/2004#issuecomment-375056363 https://api.github.com/repos/pydata/xarray/issues/2004 MDEyOklzc3VlQ29tbWVudDM3NTA1NjM2Mw== WeatherGod 291576 2018-03-21T18:50:58Z 2018-03-21T18:50:58Z CONTRIBUTOR

Ah, never mind, I see that our examples only had one greater-than-one stride.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slicing DataArray can take longer than not slicing 307318224
375056077 https://github.com/pydata/xarray/issues/2004#issuecomment-375056077 https://api.github.com/repos/pydata/xarray/issues/2004 MDEyOklzc3VlQ29tbWVudDM3NTA1NjA3Nw== WeatherGod 291576 2018-03-21T18:50:01Z 2018-03-21T18:50:01Z CONTRIBUTOR

Dunno. I can't seem to get that engine working on my system.

Reading through that thread, I wonder if the optimization they added only applies if there is only one stride greater than one?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slicing DataArray can take longer than not slicing 307318224
375036951 https://github.com/pydata/xarray/issues/2004#issuecomment-375036951 https://api.github.com/repos/pydata/xarray/issues/2004 MDEyOklzc3VlQ29tbWVudDM3NTAzNjk1MQ== WeatherGod 291576 2018-03-21T17:51:54Z 2018-03-21T17:51:54Z CONTRIBUTOR

This might be relevant: https://github.com/Unidata/netcdf4-python/issues/680

Still reading through the thread.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slicing DataArray can take longer than not slicing 307318224
375034973 https://github.com/pydata/xarray/issues/2004#issuecomment-375034973 https://api.github.com/repos/pydata/xarray/issues/2004 MDEyOklzc3VlQ29tbWVudDM3NTAzNDk3Mw== WeatherGod 291576 2018-03-21T17:46:09Z 2018-03-21T17:46:09Z CONTRIBUTOR

my bet is probably netCDF4-python. Don't want to write up the C code though to confirm it. Sigh... this isn't going to be a fun one to track down. Shall I open a bug report over there?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slicing DataArray can take longer than not slicing 307318224
375014480 https://github.com/pydata/xarray/issues/2004#issuecomment-375014480 https://api.github.com/repos/pydata/xarray/issues/2004 MDEyOklzc3VlQ29tbWVudDM3NTAxNDQ4MA== WeatherGod 291576 2018-03-21T16:50:59Z 2018-03-21T16:56:13Z CONTRIBUTOR

Yeah, good example. Eliminates a lot of possible variables such as problems with netcdf4 compression and such. Probably should see if it happens in v0.10.0 to see if the changes to the indexing system caused this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Slicing DataArray can take longer than not slicing 307318224
373840044 https://github.com/pydata/xarray/issues/1997#issuecomment-373840044 https://api.github.com/repos/pydata/xarray/issues/1997 MDEyOklzc3VlQ29tbWVudDM3Mzg0MDA0NA== WeatherGod 291576 2018-03-16T20:45:39Z 2018-03-16T20:45:39Z CONTRIBUTOR

MaskedArrays had a similar problem, IIRC, because it was blindly copying the NDArray docstrings. Not going to be easy to do, though.

"we don't support out": Is that a general rule for xarray? Any notes on how to do what I want for clip? The function this was in was supposed to be general use (ndarrays and xarrays).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't do in-place clip() with DataArrays. 306067267
370986433 https://github.com/pydata/xarray/pull/1899#issuecomment-370986433 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM3MDk4NjQzMw== WeatherGod 291576 2018-03-07T01:08:36Z 2018-03-07T01:08:36Z CONTRIBUTOR

:tada:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
367077311 https://github.com/pydata/xarray/pull/1899#issuecomment-367077311 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NzA3NzMxMQ== WeatherGod 291576 2018-02-20T18:43:56Z 2018-02-20T18:43:56Z CONTRIBUTOR

I did some more investigation into the memory usage problem I was having. I had assumed that the vectorized indexed result of a lazily indexed data array would be an in-memory array. So, when I then started to use the result, it was then doing a read of all the data at once, resulting in a near-complete load of the data into memory.

I have adjusted my code to chunk out the indexing in order to keep the memory usage under control, at a reasonable performance penalty. I haven't looked into trying to identify the ideal chunking scheme to follow for an arbitrary data array and indexer. Perhaps we can make that a task for another day. At this point, I am satisfied with the features (negative step-sizes aside, of course).
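A hedged sketch of that blockwise workaround; the helper name and the block scheme are illustrative, not the author's actual code, and the toy data is in-memory:

```python
import numpy as np
import xarray as xr

def chunked_pointwise_isel(arr, indexer, dim, block_dim, block_size):
    """Apply a vectorized isel in blocks along block_dim so only one slab of the
    (possibly lazily loaded) array is realised at a time."""
    pieces = []
    for start in range(0, indexer.sizes[block_dim], block_size):
        sl = slice(start, start + block_size)
        block = indexer.isel({block_dim: sl})
        # Restrict the source array to the same slab, index pointwise, then load it.
        pieces.append(arr.isel({block_dim: sl, dim: block}).load())
    return xr.concat(pieces, dim=block_dim)

# Toy usage mirroring the (scales, latitude, longitude, wind_direction) case:
da = xr.DataArray(np.random.randn(5, 8, 10),
                  dims=('wind_direction', 'latitude', 'longitude'))
idx = xr.DataArray(np.random.randint(0, 5, size=(8, 10)),
                   dims=('latitude', 'longitude'))
out = chunked_pointwise_isel(da, idx, dim='wind_direction',
                             block_dim='latitude', block_size=3)
print(dict(out.sizes))  # {'latitude': 8, 'longitude': 10}
```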

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
366379465 https://github.com/pydata/xarray/pull/1899#issuecomment-366379465 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NjM3OTQ2NQ== WeatherGod 291576 2018-02-16T22:40:06Z 2018-02-16T22:40:06Z CONTRIBUTOR

Ah-hah! Ok, so, the problem isn't some weird difference between the two examples I gave. The issue is that calling np.asarray(foo) triggered a full loading of the data!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
366376400 https://github.com/pydata/xarray/pull/1899#issuecomment-366376400 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NjM3NjQwMA== WeatherGod 291576 2018-02-16T22:25:59Z 2018-02-16T22:25:59Z CONTRIBUTOR

huh... now I am not so sure about that... must be something else triggering the load.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
366374917 https://github.com/pydata/xarray/pull/1899#issuecomment-366374917 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NjM3NDkxNw== WeatherGod 291576 2018-02-16T22:19:08Z 2018-02-16T22:19:08Z CONTRIBUTOR

also, at this point, I don't know if this is limited to the netcdf4 backend, as this type of indexing was only done on a variable I have in a netcdf file. I don't have 4-D variables in other file types.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
366374041 https://github.com/pydata/xarray/pull/1899#issuecomment-366374041 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NjM3NDA0MQ== WeatherGod 291576 2018-02-16T22:14:49Z 2018-02-16T22:14:49Z CONTRIBUTOR

`CD`, by the way, has dimensions of scales, latitude, longitude, wind_direction.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
366373479 https://github.com/pydata/xarray/pull/1899#issuecomment-366373479 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NjM3MzQ3OQ== WeatherGod 291576 2018-02-16T22:12:18Z 2018-02-16T22:12:18Z CONTRIBUTOR

Ah, not a change in behavior, but a possible bug exposed by a tiny change on my part. So, I have a 4D data array, `CD`, and a data array for indexing, `wind_inds`. The following does not trigger a full loading: `CD[0][wind_direction=wind_inds]`, which is good! But this does: `CD[scales=0, wind_direction=wind_inds]`, which is bad.

So, somehow, the indexing system is effectively treating these two things as different.
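For clarity, a toy restatement of the two call patterns, assuming the bracket shorthand above stands for `.isel()`; the data here is in-memory, so it only mirrors the shapes, not the lazy-loading difference being reported:

```python
import numpy as np
import xarray as xr

CD = xr.DataArray(np.random.randn(2, 6, 8, 5),
                  dims=('scales', 'latitude', 'longitude', 'wind_direction'))
wind_inds = xr.DataArray(np.random.randint(0, 5, size=(6, 8)),
                         dims=('latitude', 'longitude'))

ok = CD.isel(scales=0).isel(wind_direction=wind_inds)   # two steps: reported as staying lazy
bad = CD.isel(scales=0, wind_direction=wind_inds)       # one step: reported as loading everything

print(ok.dims, bad.dims)  # both ('latitude', 'longitude')
```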

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
366363419 https://github.com/pydata/xarray/pull/1899#issuecomment-366363419 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NjM2MzQxOQ== WeatherGod 291576 2018-02-16T21:28:09Z 2018-02-16T21:28:09Z CONTRIBUTOR

correction... the problem isn't with pynio... it is in the netcdf4 backend

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
366360382 https://github.com/pydata/xarray/pull/1899#issuecomment-366360382 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NjM2MDM4Mg== WeatherGod 291576 2018-02-16T21:15:17Z 2018-02-16T21:15:17Z CONTRIBUTOR

Something changed. Now the indexing for pynio is forcing a full loading of the data.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
366059694 https://github.com/pydata/xarray/pull/1899#issuecomment-366059694 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NjA1OTY5NA== WeatherGod 291576 2018-02-15T20:59:20Z 2018-02-15T20:59:20Z CONTRIBUTOR

I can confirm that with the latest changes, the pynio tests now pass locally for me. Now, as to whether or not the tests in there are actually exercising anything useful is a different question.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
365734783 https://github.com/pydata/xarray/issues/1910#issuecomment-365734783 https://api.github.com/repos/pydata/xarray/issues/1910 MDEyOklzc3VlQ29tbWVudDM2NTczNDc4Mw== WeatherGod 291576 2018-02-14T20:27:38Z 2018-02-14T20:27:38Z CONTRIBUTOR

Looking through the travis logs, I do see that pynio is getting installed.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pynio tests are being skipped on TravisCI 297227247
365734285 https://github.com/pydata/xarray/issues/1910#issuecomment-365734285 https://api.github.com/repos/pydata/xarray/issues/1910 MDEyOklzc3VlQ29tbWVudDM2NTczNDI4NQ== WeatherGod 291576 2018-02-14T20:25:52Z 2018-02-14T20:25:52Z CONTRIBUTOR

Zarr tests and pydap tests are also being skipped

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pynio tests are being skipped on TravisCI 297227247
365729433 https://github.com/pydata/xarray/pull/1899#issuecomment-365729433 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NTcyOTQzMw== WeatherGod 291576 2018-02-14T20:07:55Z 2018-02-14T20:07:55Z CONTRIBUTOR

I am working on re-activating those tests. I think PyNio is now available for python3, too.

On Wed, Feb 14, 2018 at 2:59 PM, Joe Hamman notifications@github.com wrote:

@WeatherGod https://github.com/weathergod - you are right, all the pynio tests are being skipped on travis. I'll open a separate issue for that. Yikes!


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
365722413 https://github.com/pydata/xarray/pull/1899#issuecomment-365722413 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NTcyMjQxMw== WeatherGod 291576 2018-02-14T19:43:07Z 2018-02-14T19:43:07Z CONTRIBUTOR

It looks like the pynio backend isn't regularly tested, as several of them currently fail when I run the tests locally. Some of them are failing because they are asserting NotImplementedErrors that are now implemented.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
365708385 https://github.com/pydata/xarray/pull/1899#issuecomment-365708385 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NTcwODM4NQ== WeatherGod 291576 2018-02-14T18:55:43Z 2018-02-14T18:55:43Z CONTRIBUTOR

Just did some more debugging, putting in some debug statements within NioArrayWrapper.__getitem__():

```diff
diff --git a/xarray/backends/pynio_.py b/xarray/backends/pynio_.py
index c7e0ddf..b9f7151 100644
--- a/xarray/backends/pynio_.py
+++ b/xarray/backends/pynio_.py
@@ -27,16 +27,24 @@ class NioArrayWrapper(BackendArray):
         return self.datastore.ds.variables[self.variable_name]
 
     def __getitem__(self, key):
+        import logging
+        logger = logging.getLogger(__name__)
+        logger.addHandler(logging.NullHandler())
+        logger.debug("initial key: %s", key)
         key, np_inds = indexing.decompose_indexer(key, self.shape, mode='outer')
+        logger.debug("Decomposed indexers:\n%s\n%s", key, np_inds)
 
         with self.datastore.ensure_open(autoclose=True):
             array = self.get_array()
+            logger.debug("initial array: %r", array)
             if key == () and self.ndim == 0:
                 return array.get_value()
 
             for ind in np_inds:
+                logger.debug("indexer: %s", ind)
                 array = indexing.NumpyIndexingAdapter(array)[ind]
+                logger.debug("intermediate array: %r", array)
 
             return array
```

And here is the test script (data not included):

```python
import logging
import xarray as xr

logging.basicConfig(level=logging.DEBUG)
fname1 = '../hrrr.t12z.wrfnatf02.grib2'
ds = xr.open_dataset(fname1, engine='pynio')
subset_isel = ds.isel(lv_HYBL0=7)
sp = subset_isel['UGRD_P0_L105_GLC0'].values.shape
```

And here is the relevant output:

```
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((slice(None, None, None),))
()
DEBUG:xarray.backends.pynio_:initial array: <Nio.NioVariable object at 0x7f0f3c339210>
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((slice(None, None, None),))
()
DEBUG:xarray.backends.pynio_:initial array: <Nio.NioVariable object at 0x7f0f3c339b90>
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((slice(None, None, None),))
()
DEBUG:xarray.backends.pynio_:initial array: <Nio.NioVariable object at 0x7f0f3c339d50>
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((slice(None, None, None),))
()
DEBUG:xarray.backends.pynio_:initial array: <Nio.NioVariable object at 0x7f0f3c339d90>
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((7, slice(None, None, None), slice(None, None, None)))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((7, slice(None, None, None), slice(None, None, None)))
()
DEBUG:xarray.backends.pynio_:initial array: <Nio.NioVariable object at 0x7f0f3c339190>
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((7, slice(None, None, None), slice(None, None, None)))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((7, slice(None, None, None), slice(None, None, None)))
()
DEBUG:xarray.backends.pynio_:initial array: <Nio.NioVariable object at 0x7f0f3c339190>
(50, 1059, 1799)
```

So, the BasicIndexer((7, slice(None, None, None), slice(None, None, None))) isn't getting decomposed correctly, it looks like?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
365692868 https://github.com/pydata/xarray/pull/1899#issuecomment-365692868 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NTY5Mjg2OA== WeatherGod 291576 2018-02-14T18:02:17Z 2018-02-14T18:06:24Z CONTRIBUTOR

Ah, interesting... so, this dataset was created by doing an isel() on the original:

```
>>> ds['UGRD_P0_L105_GLC0']
<xarray.DataArray 'UGRD_P0_L105_GLC0' (lv_HYBL0: 50, ygrid_0: 1059, xgrid_0: 1799)>
[95257050 values with dtype=float32]
Coordinates:
  * lv_HYBL0   (lv_HYBL0) float32 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 ...
    gridlat_0  (ygrid_0, xgrid_0) float32 ...
    gridlon_0  (ygrid_0, xgrid_0) float32 ...
Dimensions without coordinates: ygrid_0, xgrid_0
```

So, the original data has a 50x1059x1799 grid, and the new indexer isn't properly composing the indexer so that it fetches `[7, slice(None), slice(None)]` when I grab its `.values`.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
365689883 https://github.com/pydata/xarray/pull/1899#issuecomment-365689883 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NTY4OTg4Mw== WeatherGod 291576 2018-02-14T17:52:24Z 2018-02-14T17:52:24Z CONTRIBUTOR

I can also confirm that the shape comes out correctly using master, so this is definitely isolated to this PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
365689003 https://github.com/pydata/xarray/pull/1899#issuecomment-365689003 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NTY4OTAwMw== WeatherGod 291576 2018-02-14T17:49:20Z 2018-02-14T17:49:20Z CONTRIBUTOR

Hmm, came across a bug with the pynio backend. Working on making a reproducible example, but just for your own inspection, here is some logging output:

```
<xarray.Dataset>
Dimensions:    (xgrid_0: 1799, ygrid_0: 1059)
Coordinates:
    lv_HYBL0   float32 8.0
    longitude  (ygrid_0, xgrid_0) float32 ...
    latitude   (ygrid_0, xgrid_0) float32 ...
Dimensions without coordinates: xgrid_0, ygrid_0
Data variables:
    UGRD       (ygrid_0, xgrid_0) float32 ...
    VGRD       (ygrid_0, xgrid_0) float32 ...
DEBUG:hiresWind.downscale:shape of a data: (50, 1059, 1799)
```

The first bit is the repr of my Dataset. The last line is the output of `ds['UGRD'].values.shape`; it comes back 3D when it is supposed to be 2D.

If I revert back to v0.10.0, then the shape is (1059, 1799), just as expected.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
365657502 https://github.com/pydata/xarray/pull/1899#issuecomment-365657502 https://api.github.com/repos/pydata/xarray/issues/1899 MDEyOklzc3VlQ29tbWVudDM2NTY1NzUwMg== WeatherGod 291576 2018-02-14T16:13:16Z 2018-02-14T16:13:16Z CONTRIBUTOR

Oh, wow... this worked like a charm for the netcdf4 backend! I have a ~13GB (uncompressed) 4-D netcdf4 variable that was giving me trouble for slicing a 2D surface out of. Here is a snippet where I am grabbing data at random indices in the last dimension. First for a specific latitude, then for the entire domain.

```
>>> CD_subset = rough['CD'][0]
>>> wind_inds_decorated
<xarray.DataArray (latitude: 3501, longitude: 7001)>
array([[33, 15, 25, ..., 52, 66, 35],
       [ 6,  8, 55, ..., 59,  6, 50],
       [54,  2, 40, ..., 32, 19,  9],
       ...,
       [53, 18, 23, ..., 19,  3, 43],
       [ 9, 11, 66, ..., 51, 39, 58],
       [21, 54, 37, ...,  3,  0, 65]])
Dimensions without coordinates: latitude, longitude
>>> foo = CD_subset.isel(latitude=0, wind_direction=wind_inds_decorated[0])
>>> foo
<xarray.DataArray 'CD' (longitude: 7001)>
array([ 0.004052,  0.005915,  0.002771, ...,  0.005604,  0.004715,  0.002756], dtype=float32)
Coordinates:
    scales          int16 60
    latitude        float64 54.99
  * longitude       (longitude) float64 -130.0 -130.0 -130.0 -130.0 -130.0 ...
    wind_direction  (longitude) int16 165 75 125 5 235 345 315 175 85 35 290 ...
>>> foo = CD_subset.isel(wind_direction=wind_inds_decorated)
>>> foo
<xarray.DataArray 'CD' (latitude: 3501, longitude: 7001)>
[24510501 values with dtype=float32]
Coordinates:
    scales          int16 60
  * latitude        (latitude) float64 54.99 54.98 54.97 54.96 54.95 54.95 ...
  * longitude       (longitude) float64 -130.0 -130.0 -130.0 -130.0 -130.0 ...
    wind_direction  (latitude, longitude) int64 165 75 125 5 235 345 315 175 ...
```

All previous attempts at this would result in having to load the entire 13GB array into memory just to get 93.5 MB out. Or, I would try to fetch each individual point, which took way too long. This worked faster than loading the entire thing into memory, and it used less memory, too (I think I maxed out at about 1.2GB of total usage, which is totally acceptable for my use case).

I will try out similar things with the pynio and rasterio backends, and get back to you. Thanks for this work!

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Vectorized lazy indexing 295838143
345310488 https://github.com/pydata/xarray/issues/1720#issuecomment-345310488 https://api.github.com/repos/pydata/xarray/issues/1720 MDEyOklzc3VlQ29tbWVudDM0NTMxMDQ4OA== WeatherGod 291576 2017-11-17T17:33:13Z 2017-11-17T17:33:13Z CONTRIBUTOR

Awesome! Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Possible regression with PyNIO data not being lazily loaded 274308380
345124033 https://github.com/pydata/xarray/issues/1720#issuecomment-345124033 https://api.github.com/repos/pydata/xarray/issues/1720 MDEyOklzc3VlQ29tbWVudDM0NTEyNDAzMw== WeatherGod 291576 2017-11-17T02:08:50Z 2017-11-17T02:08:50Z CONTRIBUTOR

Is there a convenient sentinel I can check for loaded-ness? The only reason I noticed this was I was debugging another problem with my processing of HRRR files (~600mb each) and the memory usage shot up (did you know that top will report memory usage as fractions of terabytes when you get high enough?). I could test this with some smaller netcdf4 files if I could just loop through the variables and assert some sentinel.

On Thu, Nov 16, 2017 at 8:57 PM, Stephan Hoyer notifications@github.com wrote:

@WeatherGod https://github.com/weathergod can you verify that you don't get immediate loading when loading netCDF files, e.g., with scipy or netCDF4-python?

We did change how loading of data works with printing in this release (#1532, https://github.com/pydata/xarray/pull/1532), but if anything the changes should go the other way, to do less loading of data.

I'm having trouble debugging this locally because I can't seem to get a working version of pynio installed from conda-forge on OS X (running into various ABI incompatibility issues when I try this in a new conda environment).

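For what it's worth, one possible loaded-ness check, assuming xarray's private `Variable._in_memory` flag (an implementation detail, not a public sentinel):

```python
def assert_all_lazy(ds):
    """Assert that no data variable in the Dataset has been loaded into memory yet."""
    # Index coordinates are always realised, so only data variables are checked;
    # _in_memory is a private xarray attribute and may change between versions.
    for name, da in ds.data_vars.items():
        assert not da.variable._in_memory, "%s was eagerly loaded" % name

# e.g. assert_all_lazy(xr.open_dataset('some_file.nc', engine='pynio'))
```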

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Possible regression with PyNIO data not being lazily loaded 274308380
342576941 https://github.com/pydata/xarray/issues/475#issuecomment-342576941 https://api.github.com/repos/pydata/xarray/issues/475 MDEyOklzc3VlQ29tbWVudDM0MjU3Njk0MQ== WeatherGod 291576 2017-11-07T18:29:12Z 2017-11-07T18:29:12Z CONTRIBUTOR

Yeah, we need to move something forward, because the main benefit of xarray is the ability to manage datasets from multiple sources in a consistent way. And data from different sources will almost always be in different projections.

My current problem that I need to solve right now is that I am ingesting model data that is in a LCC projection and ingesting radar data that is in a simple regular lat/lon grid. Both dataset objects have latitude and longitude coordinate arrays, I just need to get both datasets to have the same lat/lon grid.

I guess I could continue using my old scipy-based solution (using map_coordinates() or RectBivariateSpline), but at the very least, it would make sense to have some documentation demonstrating how one might go about this very common problem, even if it is showing how to use the scipy-based tools with xarrays. If that is of interest, I can see what I can write up after I am done my immediate task.
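Along the lines of the scipy-based approach mentioned, a rough sketch of regridding one regular lat/lon grid onto another with RectBivariateSpline (toy grids, not the LCC-projection workflow described above):

```python
import numpy as np
import xarray as xr
from scipy.interpolate import RectBivariateSpline

# Source field on one regular lat/lon grid.
src = xr.DataArray(
    np.random.randn(50, 80),
    dims=('lat', 'lon'),
    coords={'lat': np.linspace(30, 50, 50), 'lon': np.linspace(-110, -90, 80)},
)
# Target grid to interpolate onto.
target_lat = np.linspace(32, 48, 25)
target_lon = np.linspace(-108, -92, 40)

spline = RectBivariateSpline(src['lat'].values, src['lon'].values, src.values)
regridded = xr.DataArray(
    spline(target_lat, target_lon),  # evaluate on the target grid
    dims=('lat', 'lon'),
    coords={'lat': target_lat, 'lon': target_lon},
)
print(regridded.shape)  # (25, 40)
```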

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  API design for pointwise indexing 95114700
342553465 https://github.com/pydata/xarray/issues/475#issuecomment-342553465 https://api.github.com/repos/pydata/xarray/issues/475 MDEyOklzc3VlQ29tbWVudDM0MjU1MzQ2NQ== WeatherGod 291576 2017-11-07T17:11:49Z 2017-11-07T17:11:49Z CONTRIBUTOR

So, what has become the consensus for performing regridding/resampling? I see a lot of suggestions, but I have no sense of what is mature enough to use in production-level code. I also haven't seen anything in the documentation about this topic, even if it just refers people to another project.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  API design for pointwise indexing 95114700
147797539 https://github.com/pydata/xarray/pull/459#issuecomment-147797539 https://api.github.com/repos/pydata/xarray/issues/459 MDEyOklzc3VlQ29tbWVudDE0Nzc5NzUzOQ== WeatherGod 291576 2015-10-13T18:03:56Z 2015-10-13T18:03:56Z CONTRIBUTOR

That's all the time I have at the moment. I do have some more notes from my old, incomplete implementation, though. I'll try to finish the review tomorrow.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  add pynio backend 94100328
146976549 https://github.com/pydata/xarray/issues/615#issuecomment-146976549 https://api.github.com/repos/pydata/xarray/issues/615 MDEyOklzc3VlQ29tbWVudDE0Njk3NjU0OQ== WeatherGod 291576 2015-10-09T20:15:49Z 2015-10-09T20:15:49Z CONTRIBUTOR

hmm, good point. I wish I knew why I ended up using pd.to_timedelta() in the first place. Did numpy not support converting timedelta objects at one point?
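For reference, a quick sanity check with current numpy/pandas (not necessarily the versions I was using at the time) suggests the plain-numpy conversion works fine:

```
import datetime
import numpy as np
import pandas as pd

delta = datetime.timedelta(hours=6)
print(np.timedelta64(delta))                  # 21600000000 microseconds
print(np.timedelta64(delta).astype("m8[h]"))  # 6 hours
print(pd.to_timedelta(delta))                 # 0 days 06:00:00
```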

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  operations with pd.to_timedelta() now fails 110726841
60429213 https://github.com/pydata/xarray/issues/268#issuecomment-60429213 https://api.github.com/repos/pydata/xarray/issues/268 MDEyOklzc3VlQ29tbWVudDYwNDI5MjEz WeatherGod 291576 2014-10-24T18:27:30Z 2014-10-24T18:27:30Z CONTRIBUTOR

Note, I mean that I at first thought that collapsing variables into scalars was a useful feature, not that it would happen only for datasets and not data arrays.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby reduction sometimes collapses variables into scalars 46768521
60425242 https://github.com/pydata/xarray/issues/267#issuecomment-60425242 https://api.github.com/repos/pydata/xarray/issues/267 MDEyOklzc3VlQ29tbWVudDYwNDI1MjQy WeatherGod 291576 2014-10-24T17:58:37Z 2014-10-24T17:58:37Z CONTRIBUTOR

So, is the string approach I used above to grab a single day's data a bug or a feature? It is a nice short-hand, but I don't want to rely on it if it isn't intended to be a feature. Similarly, if I supply a Year-Month string, I get data for that month.
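For context, the short-hand I'm referring to looks like this (a toy example with made-up data, written against the modern xarray import name):

```
import numpy as np
import pandas as pd
import xarray as xr

da = xr.DataArray(
    np.arange(48),
    coords={"time": pd.date_range("2013-01-01", periods=48, freq="H")},
    dims="time",
)
day = da.sel(time="2013-01-01")   # all 24 hours of Jan 1
month = da.sel(time="2013-01")    # everything in January
```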

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't use datetime or pandas datetime to index time dimension 46756880
60413505 https://github.com/pydata/xarray/issues/267#issuecomment-60413505 https://api.github.com/repos/pydata/xarray/issues/267 MDEyOklzc3VlQ29tbWVudDYwNDEzNTA1 WeatherGod 291576 2014-10-24T16:37:26Z 2014-10-24T16:37:26Z CONTRIBUTOR

Gah, I am sorry, please disregard my last comment. I can't add/subtract...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't use datetime or pandas datetime to index time dimension 46756880
60413356 https://github.com/pydata/xarray/issues/267#issuecomment-60413356 https://api.github.com/repos/pydata/xarray/issues/267 MDEyOklzc3VlQ29tbWVudDYwNDEzMzU2 WeatherGod 291576 2014-10-24T16:36:18Z 2014-10-24T16:36:18Z CONTRIBUTOR

A further wrinkle is that, because of this, date selection appears to work only in local time. Consider the following:

```
>>> c['time'][:25]
<xray.DataArray 'time' (time: 25)>
array(['2013-01-01T06:15:00.000000000-0500', '2013-01-01T07:00:00.000000000-0500',
       '2013-01-01T08:00:00.000000000-0500', '2013-01-01T09:00:00.000000000-0500',
       '2013-01-01T10:00:00.000000000-0500', '2013-01-01T11:00:00.000000000-0500',
       '2013-01-01T12:00:00.000000000-0500', '2013-01-01T13:00:00.000000000-0500',
       '2013-01-01T14:00:00.000000000-0500', '2013-01-01T15:00:00.000000000-0500',
       '2013-01-01T16:00:00.000000000-0500', '2013-01-01T17:00:00.000000000-0500',
       '2013-01-01T18:00:00.000000000-0500', '2013-01-01T19:00:00.000000000-0500',
       '2013-01-01T20:00:00.000000000-0500', '2013-01-01T21:00:00.000000000-0500',
       '2013-01-01T22:00:00.000000000-0500', '2013-01-01T23:00:00.000000000-0500',
       '2013-01-02T00:00:00.000000000-0500', '2013-01-02T01:00:00.000000000-0500',
       '2013-01-02T02:00:00.000000000-0500', '2013-01-02T03:00:00.000000000-0500',
       '2013-01-02T04:00:00.000000000-0500', '2013-01-02T05:00:00.000000000-0500',
       '2013-01-02T06:00:00.000000000-0500'], dtype='datetime64[ns]')
Coordinates:
  * time       (time) datetime64[ns] 2013-01-01T11:15:00 ...
    latitude   float32 64.833
    elevation  float32 137.5
    longitude  float32 -147.6

>>> c.sel(time='2013-01-01')['time']
<xray.DataArray 'time' (time: 13)>
array(['2013-01-01T06:15:00.000000000-0500', '2013-01-01T07:00:00.000000000-0500',
       '2013-01-01T08:00:00.000000000-0500', '2013-01-01T09:00:00.000000000-0500',
       '2013-01-01T10:00:00.000000000-0500', '2013-01-01T11:00:00.000000000-0500',
       '2013-01-01T12:00:00.000000000-0500', '2013-01-01T13:00:00.000000000-0500',
       '2013-01-01T14:00:00.000000000-0500', '2013-01-01T15:00:00.000000000-0500',
       '2013-01-01T16:00:00.000000000-0500', '2013-01-01T17:00:00.000000000-0500',
       '2013-01-01T18:00:00.000000000-0500'], dtype='datetime64[ns]')
Coordinates:
  * time       (time) datetime64[ns] 2013-01-01T11:15:00 ...
    latitude   float32 64.833
    elevation  float32 137.5
    longitude  float32 -147.6
```

I don't know how I would (easily) slice this data array so as to grab only the data for a UTC day.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can't use datetime or pandas datetime to index time dimension 46756880
60404650 https://github.com/pydata/xarray/issues/185#issuecomment-60404650 https://api.github.com/repos/pydata/xarray/issues/185 MDEyOklzc3VlQ29tbWVudDYwNDA0NjUw WeatherGod 291576 2014-10-24T15:37:00Z 2014-10-24T15:37:00Z CONTRIBUTOR

May I propose a name? xray.glasses

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Plot methods 38109425
60399616 https://github.com/pydata/xarray/issues/264#issuecomment-60399616 https://api.github.com/repos/pydata/xarray/issues/264 MDEyOklzc3VlQ29tbWVudDYwMzk5NjE2 WeatherGod 291576 2014-10-24T15:04:23Z 2014-10-24T15:04:23Z CONTRIBUTOR

I should note that if an inner join is performed, then no NaNs are inserted and the arrays remain float32.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  align silently upcasts data arrays when NaNs are inserted 46745063
58570858 https://github.com/pydata/xarray/issues/214#issuecomment-58570858 https://api.github.com/repos/pydata/xarray/issues/214 MDEyOklzc3VlQ29tbWVudDU4NTcwODU4 WeatherGod 291576 2014-10-09T20:19:12Z 2014-10-09T20:19:12Z CONTRIBUTOR

Ok, I think I got it (for reals this time...)

```
import numpy as np
import xray
from scipy.spatial import cKDTree as KDTree


def bcast(spat_only, coord_names):
    # Broadcast the named coordinate arrays against each other, inserting
    # np.newaxis where a coordinate is 1-D (independent of the others).
    coords = []
    for i, n in enumerate(coord_names):
        if spat_only[n].ndim != len(spat_only.dims):
            # Needs new axes
            slices = [np.newaxis] * len(spat_only.dims)
            slices[i] = slice(None)
        else:
            slices = [slice(None)] * len(spat_only.dims)
        coords.append(spat_only[n].values[slices])
    return np.broadcast_arrays(*coords)


def grid_to_points2(grid, points, coord_names):
    if not coord_names:
        raise ValueError("No coordinate names provided")
    spat_dims = {d for n in coord_names for d in grid[n].dims}
    not_spatial = set(grid.dims) - spat_dims
    spatial_selection = {n: 0 for n in not_spatial}
    spat_only = grid.isel(**spatial_selection)

    coords = bcast(spat_only, coord_names)

    # Nearest-neighbor lookup of the requested points in the flattened grid
    # (list() around zip() so this also works under Python 3).
    kd = KDTree(list(zip(*[c.ravel() for c in coords])))
    _, indx = kd.query(list(zip(*[points[n].values for n in coord_names])))
    indx = np.unravel_index(indx, coords[0].shape)

    return xray.concat(
        (grid.isel(**{n: j for n, j in zip(spat_only.dims, i)})
         for i in zip(*indx)),
        dim='station')
```

Needs a lot more tests and comments and such, but I think this works. Best part is that it seems to do a very decent job of keeping memory usage low, and only operates upon the coordinates that I specify. Everything else is left alone. So, I have used this on 4-D data, picking out grid points at specified lat/lon positions, and get back a 3D result (time, level, station). And I have used this on just 2D data, getting back just a 1D result (dimension='station').
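To make the usage concrete, here is roughly how I'm calling it (the names `model` and `obs` are placeholders for the two kinds of datasets described in this thread):

```
# 'model': Dataset with 2-D latitude/longitude over (ygrid_0, xgrid_0);
# 'obs':   Dataset with 1-D latitude/longitude along a 'station' dimension.
station_data = grid_to_points2(model, obs, ['latitude', 'longitude'])

# A (time, level, ygrid_0, xgrid_0) variable comes back as (time, level, station),
# and a purely 2-D variable comes back as 1-D over 'station'.
```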

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pointwise indexing -- something like sel_points 40395257
58568933 https://github.com/pydata/xarray/issues/214#issuecomment-58568933 https://api.github.com/repos/pydata/xarray/issues/214 MDEyOklzc3VlQ29tbWVudDU4NTY4OTMz WeatherGod 291576 2014-10-09T20:05:01Z 2014-10-09T20:05:01Z CONTRIBUTOR

Consider the following Dataset:

```
<xray.Dataset>
Dimensions:           (lv_HTGL1: 2, lv_HTGL3: 2, lv_HTGL5: 2, lv_HTGL6: 2, lv_ISBL0: 37, lv_SPDL2: 6, lv_SPDL4: 3, time: 9, xgrid_0: 451, ygrid_0: 337)
Coordinates:
  * xgrid_0           (xgrid_0) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
  * ygrid_0           (ygrid_0) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
  * lv_ISBL0          (lv_ISBL0) float32 10000.0 12500.0 15000.0 17500.0 20000.0 ...
  * lv_HTGL6          (lv_HTGL6) float32 1000.0 4000.0
  * lv_HTGL1          (lv_HTGL1) float32 2.0 80.0
  * lv_HTGL3          (lv_HTGL3) float32 10.0 80.0
    latitude          (ygrid_0, xgrid_0) float32 16.281 16.3084 16.3356 16.3628 16.3898 ...
    longitude         (ygrid_0, xgrid_0) float32 233.862 233.984 234.106 234.229 ...
  * lv_HTGL5          (lv_HTGL5) int64 0 1
  * lv_SPDL2          (lv_SPDL2) int64 0 1 2 3 4 5
  * lv_SPDL4          (lv_SPDL4) int64 0 1 2
  * time              (time) datetime64[ns] 2014-09-25T01:00:00 ...
Variables:
    gridrot_0         (ygrid_0, xgrid_0) float32 -0.229676 -0.228775 -0.227873 ...
    TMP_P0_L103_GLC0  (time, lv_HTGL1, ygrid_0, xgrid_0) float64 295.8 295.7 295.7 295.7 ...
```

The latitude and longitude variables are both dependent upon xgrid_0 and ygrid_0. Meanwhile...

```
<xray.Dataset>
Dimensions:    (station: 120, time: 4)
Coordinates:
    latitude   (station) float32 34.805 34.795 34.585 36.705 34.245 34.915 34.195 36.075 ...
  * station    (station) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 ...
    sixhourly  (time) int64 0 1 2 3
    longitude  (station) float32 -98.025 -96.665 -99.335 -98.705 -95.665 -98.295 ...
  * time       (time) datetime64[ns] 2014-10-07 2014-10-07T06:00:00 ...
Variables:
    MaxGust    (station, time) float64 7.794 7.47 8.675 4.788 7.071 7.903 8.641 5.533 ...
```

the latitude and longitude variables are independent of each other (they are 1-D).

The variable in the first one cannot be accessed directly by lat/lon values, while the MaxGust variable in the second one can. This poses some difficulties.
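A toy version of the two layouts, in case it helps (modern xarray, fake values):

```
import numpy as np
import xarray as xr

# Curvilinear grid: latitude/longitude are 2-D auxiliary coordinates, so they
# cannot be used directly for label-based selection.
grid = xr.Dataset(
    {"TMP": (("ygrid_0", "xgrid_0"), np.zeros((3, 4)))},
    coords={"latitude": (("ygrid_0", "xgrid_0"), np.random.uniform(16, 17, (3, 4))),
            "longitude": (("ygrid_0", "xgrid_0"), np.random.uniform(233, 235, (3, 4)))},
)

# Station data: latitude/longitude are 1-D along 'station', so the nearest
# station can be found with ordinary operations on those coordinates.
stations = xr.Dataset(
    {"MaxGust": (("station",), np.random.rand(5))},
    coords={"station": np.arange(5),
            "latitude": ("station", np.linspace(34.0, 36.0, 5)),
            "longitude": ("station", np.linspace(-99.0, -95.0, 5))},
)
closest = stations.isel(station=int(abs(stations["latitude"] - 34.8).argmin()))
```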

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pointwise indexing -- something like sel_points 40395257
58565934 https://github.com/pydata/xarray/issues/214#issuecomment-58565934 https://api.github.com/repos/pydata/xarray/issues/214 MDEyOklzc3VlQ29tbWVudDU4NTY1OTM0 WeatherGod 291576 2014-10-09T19:43:08Z 2014-10-09T19:43:08Z CONTRIBUTOR

Hmmm, limitation that I just encountered. When there are dependent coordinates, the variables representing those coordinates are not the index arrays (and thus, are not "dimensions" either), so my solution is completely broken for dependent coordinates. If I were to go back to my DataArray-only solution, then I still need to correct the code to use the dimension names of the coordinate variables, and still need to fix the coordinates != dimensions issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pointwise indexing -- something like sel_points 40395257
58562506 https://github.com/pydata/xarray/issues/214#issuecomment-58562506 https://api.github.com/repos/pydata/xarray/issues/214 MDEyOklzc3VlQ29tbWVudDU4NTYyNTA2 WeatherGod 291576 2014-10-09T19:16:52Z 2014-10-09T19:16:52Z CONTRIBUTOR

to/from_dataframe just ate up all my memory. I think I am going to stick with my broadcasting approach...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pointwise indexing -- something like sel_points 40395257
58558069 https://github.com/pydata/xarray/issues/214#issuecomment-58558069 https://api.github.com/repos/pydata/xarray/issues/214 MDEyOklzc3VlQ29tbWVudDU4NTU4MDY5 WeatherGod 291576 2014-10-09T18:47:22Z 2014-10-09T18:47:22Z CONTRIBUTOR

Oooh, I didn't realize that dims is different for Dataset and DataArray... gonna have to fix that, too. I am checking out the broadcasting functions you pointed out. The one limitation I see right away with xray.core.variable.broadcast_variables is that it is limited to two variables (presumably I would be broadcasting N coordinates, because the variables may or may not have extraneous dimensions that I don't care to broadcast).
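A toy illustration of what I mean by broadcasting N coordinates at once with plain numpy (made-up coordinate values):

```
import numpy as np

lat = np.linspace(30.0, 40.0, 3)
lon = np.linspace(-100.0, -90.0, 4)
lev = np.array([850.0, 500.0])

# Each independent 1-D coordinate gets np.newaxis in the other dimensions,
# then all of them are broadcast against each other in a single call.
lat2, lon2, lev2 = np.broadcast_arrays(
    lat[:, np.newaxis, np.newaxis],
    lon[np.newaxis, :, np.newaxis],
    lev[np.newaxis, np.newaxis, :],
)
print(lat2.shape, lon2.shape, lev2.shape)   # (3, 4, 2) for each
```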

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pointwise indexing -- something like sel_points 40395257
58553935 https://github.com/pydata/xarray/issues/214#issuecomment-58553935 https://api.github.com/repos/pydata/xarray/issues/214 MDEyOklzc3VlQ29tbWVudDU4NTUzOTM1 WeatherGod 291576 2014-10-09T18:21:16Z 2014-10-09T18:21:16Z CONTRIBUTOR

And, actually, the example I gave above has a bug in the dependent dimension case. This one should be much better (not fully tested yet, though):

```
def grid_to_points2(grid, points, coord_names):
    if not coord_names:
        raise ValueError("No coordinate names provided")
    not_spatial = set(grid.dims) - set(coord_names)
    spatial_selection = {n: 0 for n in not_spatial}
    spat_only = grid.isel(**spatial_selection)
    coords = []
    for i, n in enumerate(spat_only.dims):
        if spat_only[n].ndim != len(spat_only.dims):
            # Needs new axes
            slices = [np.newaxis] * len(spat_only.dims)
            slices[i] = slice(None)
        else:
            slices = [slice(None)] * len(spat_only.dims)
        coords.append(spat_only[n].values[slices])
    coords = np.broadcast_arrays(*coords)

    kd = KDTree(zip(*[c.flatten() for c in coords]))
    _, indx = kd.query(zip(*[points[n].values for n in spat_only.dims]))
    indx = np.unravel_index(indx, coords[0].shape)

    return xray.concat(
        (grid.sel(**{n: c[i] for n, c in zip(spat_only.dims, coords)})
         for i in zip(*indx)),
        dim='station')
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pointwise indexing -- something like sel_points 40395257
58551759 https://github.com/pydata/xarray/issues/214#issuecomment-58551759 https://api.github.com/repos/pydata/xarray/issues/214 MDEyOklzc3VlQ29tbWVudDU4NTUxNzU5 WeatherGod 291576 2014-10-09T18:06:56Z 2014-10-09T18:06:56Z CONTRIBUTOR

And, I think I just realized how I could generalize it even more. Right now, grid can only be a DataArray, but I would like this to work for a DataSet as well. I bet if I use .sel() instead of .isel() and access the elements of the broadcasted arrays, I could make this work very nicely for both DataArray and DataSet.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pointwise indexing -- something like sel_points 40395257
58550741 https://github.com/pydata/xarray/issues/214#issuecomment-58550741 https://api.github.com/repos/pydata/xarray/issues/214 MDEyOklzc3VlQ29tbWVudDU4NTUwNzQx WeatherGod 291576 2014-10-09T18:00:33Z 2014-10-09T18:00:33Z CONTRIBUTOR

Oh, and it does take advantage of a bunch of Python 2.7 features such as dictionary comprehensions and generator expressions, so...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pointwise indexing -- something like sel_points 40395257
58550403 https://github.com/pydata/xarray/issues/214#issuecomment-58550403 https://api.github.com/repos/pydata/xarray/issues/214 MDEyOklzc3VlQ29tbWVudDU4NTUwNDAz WeatherGod 291576 2014-10-09T17:58:25Z 2014-10-09T17:58:25Z CONTRIBUTOR

Started using the above snippet for more datasets, some with interdependent coordinates and some without (so the coordinates would be 1-D). I think I have generalized it significantly...

```
def grid_to_points(grid, points, coord_names):
    not_spatial = set(grid.dims) - set(coord_names)
    spatial_selection = {n: 0 for n in not_spatial}
    spat_only = grid.isel(**spatial_selection)
    coords = []
    for i, n in enumerate(spat_only.dims):
        if spat_only[n].ndim != len(spat_only.dims):
            # Needs new axes
            slices = [np.newaxis] * len(spat_only.dims)
            slices[i] = slice(None)
        else:
            slices = [slice(None)] * len(spat_only.dims)
        coords.append(spat_only[n].values[slices])
    coords = [c.flatten() for c in np.broadcast_arrays(*coords)]

    kd = KDTree(zip(*coords))
    _, indx = kd.query(zip(*[points[n].values for n in spat_only.dims]))
    indx = np.unravel_index(indx, spat_only.shape)

    return xray.concat((grid.isel(**{n: j for n, j in zip(spat_only.dims, i)})
                        for i in zip(*indx)), dim='station')
```

I can still imagine some situations where this won't work, such as a requested set of dimensions that are a mix of dependent and independent variables. Currently, if the dimensions are independent, then the number of dimensions of each one is assumed to be 1 and np.newaxis is used for the others. Meanwhile, if the dimensions are dependent, then the number of dimensions for each one is assumed to be the same as the number of dependent variables and is merely flattened (the broadcast is essentially no-op).

I should also note that this is technically not restricted to spatial coordinates even though the code says so; it works for anything that can be represented in Euclidean space.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pointwise indexing -- something like sel_points 40395257
57857522 https://github.com/pydata/xarray/issues/214#issuecomment-57857522 https://api.github.com/repos/pydata/xarray/issues/214 MDEyOklzc3VlQ29tbWVudDU3ODU3NTIy WeatherGod 291576 2014-10-03T20:48:35Z 2014-10-03T20:48:35Z CONTRIBUTOR

Just managed to implement this using your suggestion for my data:

```
from scipy.spatial import cKDTree as KDTree
import numpy as np
import xray

kd = KDTree(zip(model['longitude'].values.ravel(),
                model['latitude'].values.ravel()))
dists, indx = kd.query(zip(obs['longitude'], obs['latitude']))
indx = np.unravel_index(indx, model['longitude'].shape)
mod_points = xray.concat([model.isel(x=x, y=y) for y, x in zip(*indx)],
                         dim='station')
```

Not entirely certain why I needed to reverse y and x in that last part, but, oh well...
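(Thinking about it some more, it is probably just that np.unravel_index returns row-major indices, i.e. the y index first for a (ny, nx) shape; a quick toy check:)

```
import numpy as np

flat = np.array([0, 5, 7])
ys, xs = np.unravel_index(flat, (3, 4))
# Pairs come out as (y, x) -- hence the reversal when calling isel(x=x, y=y).
print([(int(y), int(x)) for y, x in zip(ys, xs)])   # [(0, 0), (1, 1), (1, 3)]
```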

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pointwise indexing -- something like sel_points 40395257
57847940 https://github.com/pydata/xarray/issues/214#issuecomment-57847940 https://api.github.com/repos/pydata/xarray/issues/214 MDEyOklzc3VlQ29tbWVudDU3ODQ3OTQw WeatherGod 291576 2014-10-03T19:56:16Z 2014-10-03T19:56:16Z CONTRIBUTOR

Unless I am missing something about xray, that selection operation could only work if pts had values that exactly matched coordinate values in ds. In most scenarios, that would not be the case. One would have to first build pts from a computation of nearest-neighbor indices between the stations and the model grid.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Pointwise indexing -- something like sel_points 40395257

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);