html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2004#issuecomment-738189796,https://api.github.com/repos/pydata/xarray/issues/2004,738189796,MDEyOklzc3VlQ29tbWVudDczODE4OTc5Ng==,291576,2020-12-03T18:15:35Z,2020-12-03T18:15:35Z,CONTRIBUTOR,"I think so, at least in terms of my original problem.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/4142#issuecomment-642253287,https://api.github.com/repos/pydata/xarray/issues/4142,642253287,MDEyOklzc3VlQ29tbWVudDY0MjI1MzI4Nw==,291576,2020-06-10T20:55:32Z,2020-06-10T20:55:32Z,CONTRIBUTOR,"So, one important difference I see off the bat is that zarr already had a DataStore implementation, while rasterio does not. I take it that implementing one would be the preferred approach?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,636493109
https://github.com/pydata/xarray/pull/2648#issuecomment-451626366,https://api.github.com/repos/pydata/xarray/issues/2648,451626366,MDEyOklzc3VlQ29tbWVudDQ1MTYyNjM2Ng==,291576,2019-01-05T04:18:50Z,2019-01-05T04:18:50Z,CONTRIBUTOR,"I had completely forgotten about that little quirk of CPython. I try to ignore implementation details like that. Heck, I still don't fully trust dictionaries to be ordered! I removed the WIP. We can deal with the concat-dim default object separately, including turning it into a ReprObject (not exactly sure what the advantage of it is over just using the string, but, meh).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396008054
https://github.com/pydata/xarray/pull/2648#issuecomment-451583970,https://api.github.com/repos/pydata/xarray/issues/2648,451583970,MDEyOklzc3VlQ29tbWVudDQ1MTU4Mzk3MA==,291576,2019-01-04T22:12:44Z,2019-01-04T22:12:44Z,CONTRIBUTOR,"Is the following statement True or False: ""The user should be allowed to explicitly declare that they want the concatenation dimension to be inferred by passing a keyword argument"". If it is True, then you need to test equivalence. If it is False, then there is nothing more I need to do for this PR, as changing this to use a ReprObject is orthogonal to these changes.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396008054
https://github.com/pydata/xarray/pull/2648#issuecomment-451581103,https://api.github.com/repos/pydata/xarray/issues/2648,451581103,MDEyOklzc3VlQ29tbWVudDQ1MTU4MTEwMw==,291576,2019-01-04T22:00:10Z,2019-01-04T22:00:10Z,CONTRIBUTOR,"ok, so we use the ReprObject for the default, then test if `concat_dim` is of type `ReprObject`, and then test its equivalence?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396008054
https://github.com/pydata/xarray/issues/2647#issuecomment-451504997,https://api.github.com/repos/pydata/xarray/issues/2647,451504997,MDEyOklzc3VlQ29tbWVudDQ1MTUwNDk5Nw==,291576,2019-01-04T17:06:50Z,2019-01-04T17:06:50Z,CONTRIBUTOR,"scratch that...
the test was an `or`, not an `and`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,395994055
https://github.com/pydata/xarray/issues/2647#issuecomment-451504462,https://api.github.com/repos/pydata/xarray/issues/2647,451504462,MDEyOklzc3VlQ29tbWVudDQ1MTUwNDQ2Mg==,291576,2019-01-04T17:05:00Z,2019-01-04T17:05:00Z,CONTRIBUTOR,"actually, we could simplify the conditional to be just `concat_dim is _CONCAT_DIM_DEFAULT` and not bother with the `None` test.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,395994055
https://github.com/pydata/xarray/issues/2647#issuecomment-451504141,https://api.github.com/repos/pydata/xarray/issues/2647,451504141,MDEyOklzc3VlQ29tbWVudDQ1MTUwNDE0MQ==,291576,2019-01-04T17:03:54Z,2019-01-04T17:03:54Z,CONTRIBUTOR,ah! that's why it snuck through! I have been racking my brain on this for the past hour! shall I go ahead and make a PR?,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,395994055
https://github.com/pydata/xarray/issues/2647#issuecomment-451501740,https://api.github.com/repos/pydata/xarray/issues/2647,451501740,MDEyOklzc3VlQ29tbWVudDQ1MTUwMTc0MA==,291576,2019-01-04T16:55:40Z,2019-01-04T16:55:40Z,CONTRIBUTOR,"To be more explicit, the issue is that `concat_dim == _CONCAT_DIM_DEFAULT` is ill-advised because the type of `concat_dim` is not guaranteed to be a scalar. In fact, the elif in that area of code in api.py explicitly tests whether concat_dim is a list.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,395994055
https://github.com/pydata/xarray/issues/2227#issuecomment-425224969,https://api.github.com/repos/pydata/xarray/issues/2227,425224969,MDEyOklzc3VlQ29tbWVudDQyNTIyNDk2OQ==,291576,2018-09-27T20:05:05Z,2018-09-27T20:05:05Z,CONTRIBUTOR,"It would be ten files opened via xr.open_mfdataset() and concatenated across a time dimension, each one looking like:
```
netcdf convect_gust_20180301_0000 {
dimensions:
    latitude = 3502 ;
    longitude = 7002 ;
variables:
    double latitude(latitude) ;
        latitude:_FillValue = NaN ;
        latitude:_Storage = ""contiguous"" ;
        latitude:_Endianness = ""little"" ;
    double longitude(longitude) ;
        longitude:_FillValue = NaN ;
        longitude:_Storage = ""contiguous"" ;
        longitude:_Endianness = ""little"" ;
    float gust(latitude, longitude) ;
        gust:_FillValue = NaNf ;
        gust:units = ""m/s"" ;
        gust:description = ""gust winds"" ;
        gust:_Storage = ""chunked"" ;
        gust:_ChunkSizes = 701, 1401 ;
        gust:_DeflateLevel = 8 ;
        gust:_Shuffle = ""true"" ;
        gust:_Endianness = ""little"" ;

// global attributes:
        :start_date = ""03/01/2018 00:00"" ;
        :end_date = ""03/01/2018 01:00"" ;
        :interval = ""half-open"" ;
        :init_date = ""02/28/2018 22:00"" ;
        :history = ""Created 2018-09-12 15:53:44.468144"" ;
        :description = ""Convective Downscaling, format V2.0"" ;
        :_NCProperties = ""version=1|netcdflibversion=4.6.1|hdf5libversion=1.10.1"" ;
        :_SuperblockVersion = 0 ;
        :_IsNetcdf4 = 1 ;
        :_Format = ""netCDF-4"" ;
}
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,331668890
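A minimal sketch of the `open_mfdataset` pattern described in the comment above, for reference. The file names here are hypothetical (the comment does not list them); `combine=""nested""` with an explicit `concat_dim` matches the explicit time-concatenation being discussed:

```python
import xarray as xr

# Hypothetical list of the ten hourly files described above, each holding
# one (latitude, longitude) slab of the 'gust' variable.
paths = [f"convect_gust_20180301_{h:02d}00.nc" for h in range(10)]

# Concatenate along a new 'time' dimension; each file has no time coordinate,
# so concat_dim creates the dimension during the combine step.
ds = xr.open_mfdataset(paths, combine="nested", concat_dim="time")
print(ds["gust"].shape)  # expected: (10, 3502, 7002)
```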
https://github.com/pydata/xarray/issues/2227#issuecomment-424795330,https://api.github.com/repos/pydata/xarray/issues/2227,424795330,MDEyOklzc3VlQ29tbWVudDQyNDc5NTMzMA==,291576,2018-09-26T17:06:44Z,2018-09-26T17:06:44Z,CONTRIBUTOR,"No, it does not make a difference. The example above peaks at around 5GB of memory (a bit much, but manageable). And it peaks similarly if we chunk it like you suggested.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,331668890
https://github.com/pydata/xarray/issues/2227#issuecomment-424485235,https://api.github.com/repos/pydata/xarray/issues/2227,424485235,MDEyOklzc3VlQ29tbWVudDQyNDQ4NTIzNQ==,291576,2018-09-25T20:14:02Z,2018-09-25T20:14:02Z,CONTRIBUTOR,"Yeah, it looks like if `da` is backed by a dask array and you do a `.isel(win=window.compute())` (because otherwise isel barfs on dask indexers, it seems), then the memory usage shoots through the roof. Note that in my case, the dask chunks are (1, 3000, 7000). If I do a `window.load()` prior to `window.isel()`, then the memory usage is perfectly reasonable.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,331668890
https://github.com/pydata/xarray/issues/2227#issuecomment-424479421,https://api.github.com/repos/pydata/xarray/issues/2227,424479421,MDEyOklzc3VlQ29tbWVudDQyNDQ3OTQyMQ==,291576,2018-09-25T19:54:59Z,2018-09-25T19:54:59Z,CONTRIBUTOR,"Just for posterity, though, here is my simplified (working!) example:
```
import numpy as np
import xarray as xr

da = xr.DataArray(np.random.randn(10, 3000, 7000),
                  dims=('time', 'latitude', 'longitude'))
window = da.rolling(time=2).construct('win')
indexes = window.argmax(dim='win')
result = window.isel(win=indexes)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,331668890
https://github.com/pydata/xarray/issues/2227#issuecomment-424477465,https://api.github.com/repos/pydata/xarray/issues/2227,424477465,MDEyOklzc3VlQ29tbWVudDQyNDQ3NzQ2NQ==,291576,2018-09-25T19:48:20Z,2018-09-25T19:48:20Z,CONTRIBUTOR,"Huh, strange... I just tried a simplified version of what I was doing (particularly, no dask arrays), and everything worked fine. I'll have to investigate further.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,331668890
https://github.com/pydata/xarray/issues/2227#issuecomment-424470752,https://api.github.com/repos/pydata/xarray/issues/2227,424470752,MDEyOklzc3VlQ29tbWVudDQyNDQ3MDc1Mg==,291576,2018-09-25T19:27:28Z,2018-09-25T19:27:28Z,CONTRIBUTOR,"I am looking into a similar performance issue with isel, but the issue seems to be that it creates arrays that are much bigger than needed. For my multidimensional case (time/x/y/window), what should end up only taking a few hundred MB is spiking up to tens of GB of used RAM.
Don't know if this might be a possible source of performance issues.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,331668890
https://github.com/pydata/xarray/issues/2217#issuecomment-407547050,https://api.github.com/repos/pydata/xarray/issues/2217,407547050,MDEyOklzc3VlQ29tbWVudDQwNzU0NzA1MA==,291576,2018-07-24T20:48:53Z,2018-07-24T20:48:53Z,CONTRIBUTOR,I have created a PR for my work-in-progress: pandas-dev/pandas#22043,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,329575874
https://github.com/pydata/xarray/issues/2217#issuecomment-400043753,https://api.github.com/repos/pydata/xarray/issues/2217,400043753,MDEyOklzc3VlQ29tbWVudDQwMDA0Mzc1Mw==,291576,2018-06-25T18:07:49Z,2018-06-25T18:07:49Z,CONTRIBUTOR,"Do we want to dive straight to that? Or would it make more sense to first submit some PRs piping support for a tolerance kwarg through more of the API? Or perhaps we should propose a ""tolerance"" attribute as an optional attribute that methods like `get_indexer()` could always check for? Not being a pandas dev, I am not sure how piecemeal we should approach this. In addition, we are likely going to have to implement a decent chunk of code ourselves for compatibility's sake, I think.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,329575874
https://github.com/pydata/xarray/issues/2217#issuecomment-399612490,https://api.github.com/repos/pydata/xarray/issues/2217,399612490,MDEyOklzc3VlQ29tbWVudDM5OTYxMjQ5MA==,291576,2018-06-22T23:56:41Z,2018-06-22T23:56:41Z,CONTRIBUTOR,"I am not concerned about the non-commutativity of the indexer itself. There is no way around that. At some point, you have to choose values, whether that is done by an indexer or by some particular set operation. As for the different sizes, that happens when the tolerance is greater than half the smallest delta. I figure a final implementation would enforce such a constraint on the tolerance.

On Fri, Jun 22, 2018 at 5:56 PM, Stephan Hoyer wrote:
> @WeatherGod One problem with your definition of tolerance is that it isn't commutative, even if both indexes have the same tolerance:
>
> a = ImpreciseIndex([0.1, 0.2, 0.3, 0.4])
> a.tolerance = 0.1
> b = ImpreciseIndex([0.301, 0.401, 0.501, 0.601])
> b.tolerance = 0.1
> print(a.union(b))  # ImpreciseIndex([0.1, 0.2, 0.3, 0.4, 0.501, 0.601], dtype='float64')
> print(b.union(a))  # ImpreciseIndex([0.1, 0.2, 0.301, 0.401, 0.501, 0.601], dtype='float64')
>
> If you try a little harder, you could even have cases where the result has a different size, e.g.,
>
> a = ImpreciseIndex([1, 2, 3])
> a.tolerance = 0.5
> b = ImpreciseIndex([1, 1.9, 2.1, 3])
> b.tolerance = 0.5
> print(a.union(b))  # ImpreciseIndex([1.0, 2.0, 3.0], dtype='float64')
> print(b.union(a))  # ImpreciseIndex([1.0, 1.9, 2.1, 3.0], dtype='float64')
>
> Maybe these aren't really problems in practice, but it's at least a little strange/surprising.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,329575874
https://github.com/pydata/xarray/issues/2217#issuecomment-399584169,https://api.github.com/repos/pydata/xarray/issues/2217,399584169,MDEyOklzc3VlQ29tbWVudDM5OTU4NDE2OQ==,291576,2018-06-22T21:15:06Z,2018-06-22T21:15:06Z,CONTRIBUTOR,"Actually, I disagree. Pandas's set-operation methods are mostly index-based. For union and intersection, they have an optimization that dives down into some C code when the Indexes are monotonic, but everywhere else it all works off of results from `get_indexer()`. I have made a quick toy demo that seems to work. Note: I didn't know how to properly make a constructor for a subclassed Index, so I added the `tolerance` attribute after construction, just for the purposes of this demo.

``` python
from __future__ import print_function

import warnings

import numpy as np
from pandas import Index
from pandas.indexes.base import is_object_dtype, algos, is_dtype_equal
from pandas.indexes.base import _ensure_index, _concat, _values_from_object, _unsortable_types
from pandas.indexes.numeric import Float64Index


def _choose_tolerance(this, that, tolerance):
    if tolerance is None:
        tolerance = max(this.tolerance, getattr(that, 'tolerance', 0.0))
    return tolerance


class ImpreciseIndex(Float64Index):
    def astype(self, dtype, copy=True):
        return ImpreciseIndex(self.values.astype(dtype=dtype, copy=copy),
                              name=self.name, dtype=dtype)

    @property
    def tolerance(self):
        return self._tolerance

    @tolerance.setter
    def tolerance(self, tolerance):
        self._tolerance = self._convert_tolerance(tolerance)

    def union(self, other, tolerance=None):
        self._assert_can_do_setop(other)
        other = _ensure_index(other)

        if len(other) == 0 or self.equals(other, tolerance=tolerance):
            return self._get_consensus_name(other)

        if len(self) == 0:
            return other._get_consensus_name(self)

        if not is_dtype_equal(self.dtype, other.dtype):
            this = self.astype('O')
            other = other.astype('O')
            return this.union(other, tolerance=tolerance)

        tolerance = _choose_tolerance(self, other, tolerance)
        indexer = self.get_indexer(other, tolerance=tolerance)
        indexer, = (indexer == -1).nonzero()

        if len(indexer) > 0:
            other_diff = algos.take_nd(other._values, indexer,
                                       allow_fill=False)
            result = _concat._concat_compat((self._values, other_diff))
            try:
                self._values[0] < other_diff[0]
            except TypeError as e:
                warnings.warn(""%s, sort order is undefined for ""
                              ""incomparable objects"" % e, RuntimeWarning,
                              stacklevel=3)
            else:
                types = frozenset((self.inferred_type, other.inferred_type))
                if not types & _unsortable_types:
                    result.sort()
        else:
            result = self._values
            try:
                result = np.sort(result)
            except TypeError as e:
                warnings.warn(""%s, sort order is undefined for ""
                              ""incomparable objects"" % e, RuntimeWarning,
                              stacklevel=3)

        # for subclasses
        return self._wrap_union_result(other, result)

    def equals(self, other, tolerance=None):
        if self.is_(other):
            return True

        if not isinstance(other, Index):
            return False

        if is_object_dtype(self) and not is_object_dtype(other):
            # if other is not object, use other's logic for coercion
            if isinstance(other, ImpreciseIndex):
                return other.equals(self, tolerance=tolerance)
            else:
                return other.equals(self)

        if len(self) != len(other):
            return False

        tolerance = _choose_tolerance(self, other, tolerance)
        diff = np.abs(_values_from_object(self) - _values_from_object(other))
        return np.all(diff < tolerance)

    def intersection(self, other, tolerance=None):
        self._assert_can_do_setop(other)
        other = _ensure_index(other)

        if self.equals(other, tolerance=tolerance):
            return self._get_consensus_name(other)

        if not is_dtype_equal(self.dtype, other.dtype):
            this = self.astype('O')
            other = other.astype('O')
            return this.intersection(other, tolerance=tolerance)

        tolerance = _choose_tolerance(self, other, tolerance)
        try:
            indexer = self.get_indexer(other._values, tolerance=tolerance)
            indexer = indexer.take((indexer != -1).nonzero()[0])
        except:
            # duplicates
            # FIXME: get_indexer_non_unique() doesn't take a tolerance argument
            indexer = Index(self._values).get_indexer_non_unique(
                other._values)[0].unique()
            indexer = indexer[indexer != -1]

        taken = self.take(indexer)
        if self.name != other.name:
            taken.name = None
        return taken

    # TODO: Do I need to re-implement _get_unique_index()?

    def get_loc(self, key, method=None, tolerance=None):
        if tolerance is None:
            tolerance = self.tolerance
        if tolerance > 0 and method is None:
            method = 'nearest'
        return super(ImpreciseIndex, self).get_loc(key, method, tolerance)

    def get_indexer(self, target, method=None, limit=None, tolerance=None):
        if tolerance is None:
            tolerance = self.tolerance
        if tolerance > 0 and method is None:
            method = 'nearest'
        return super(ImpreciseIndex, self).get_indexer(target, method, limit,
                                                       tolerance)


if __name__ == '__main__':
    a = ImpreciseIndex([0.1, 0.2, 0.3, 0.4])
    a.tolerance = 0.01
    b = ImpreciseIndex([0.301, 0.401, 0.501, 0.601])
    b.tolerance = 0.025
    print(a, b)
    print(""a | b :"", a.union(b))
    print(""a & b :"", a.intersection(b))
    print(""a.get_indexer(b):"", a.get_indexer(b))
    print(""b.get_indexer(a):"", b.get_indexer(a))
```

Run this and get the following results:

```
ImpreciseIndex([0.1, 0.2, 0.3, 0.4], dtype='float64') ImpreciseIndex([0.301, 0.401, 0.501, 0.601], dtype='float64')
a | b : ImpreciseIndex([0.1, 0.2, 0.3, 0.4, 0.501, 0.601], dtype='float64')
a & b : ImpreciseIndex([0.3, 0.4], dtype='float64')
a.get_indexer(b): [ 2  3 -1 -1]
b.get_indexer(a): [-1 -1  0  1]
```

This is mostly lifted from the `Index` base class methods, just with me taking out the monotonic optimization path and supplying the tolerance argument to the respective calls to `get_indexer`. The choice of tolerance for a given operation is: unless provided as a keyword argument, use the larger tolerance of the two objects being compared (with a fallback if the other isn't an ImpreciseIndex).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,329575874
https://github.com/pydata/xarray/issues/2217#issuecomment-399522595,https://api.github.com/repos/pydata/xarray/issues/2217,399522595,MDEyOklzc3VlQ29tbWVudDM5OTUyMjU5NQ==,291576,2018-06-22T17:42:29Z,2018-06-22T17:42:29Z,CONTRIBUTOR,"Ok, I see how you implemented it for pandas's reindex. You essentially inserted an inexact filter within `.get_indexer()`. And `intersection()` and `union()` use these methods, so, in theory, one could pipe a tolerance argument through them (as well as through the other set operations). The work needs to be expanded a bit, though, as `get_indexer_non_unique()` needs the tolerance parameter too, I think. For xarray, though, I think we can work around backwards compatibility by having Dataset hold specialized subclasses of Index for floating-point data types that would have the needed changes to the Index class.
We can have this specialized class have some default tolerance (say, 100 * finfo(dtype).resolution?), and its methods would use the stored tolerance by default, so it should be completely transparent to the end-user (hopefully). This way, `xr.open_mfdataset()` would ""just work"".","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,329575874
https://github.com/pydata/xarray/issues/2217#issuecomment-399286310,https://api.github.com/repos/pydata/xarray/issues/2217,399286310,MDEyOklzc3VlQ29tbWVudDM5OTI4NjMxMA==,291576,2018-06-22T00:45:19Z,2018-06-22T00:45:19Z,CONTRIBUTOR,"@shoyer, I am thinking your original intuition was right about needing to improve the Index classes, perhaps to work with an optional epsilon argument to their constructors. How receptive do you think pandas would be to that? And even if they would accept such a feature, we would probably need to implement it a bit ourselves in situations where older pandas versions are used.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,329575874
https://github.com/pydata/xarray/issues/2217#issuecomment-399285369,https://api.github.com/repos/pydata/xarray/issues/2217,399285369,MDEyOklzc3VlQ29tbWVudDM5OTI4NTM2OQ==,291576,2018-06-22T00:38:34Z,2018-06-22T00:38:34Z,CONTRIBUTOR,"Well, I need this to work for join='outer', so, it is gonna happen one way or another... One concept I was toying with today was a distinction between aligning coords (which is what it does now) and aligning bounding boxes.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,329575874
https://github.com/pydata/xarray/issues/2217#issuecomment-399254317,https://api.github.com/repos/pydata/xarray/issues/2217,399254317,MDEyOklzc3VlQ29tbWVudDM5OTI1NDMxNw==,291576,2018-06-21T21:48:28Z,2018-06-21T21:48:28Z,CONTRIBUTOR,"To be clear, my use-case would not be solved by `join='override'` (isn't that just `join='left'`?). I have moving nests of coordinates that can have some floating-point noise in them, but are otherwise identical.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,329575874
https://github.com/pydata/xarray/issues/2217#issuecomment-399253493,https://api.github.com/repos/pydata/xarray/issues/2217,399253493,MDEyOklzc3VlQ29tbWVudDM5OTI1MzQ5Mw==,291576,2018-06-21T21:44:58Z,2018-06-21T21:44:58Z,CONTRIBUTOR,"I was just pointed to this issue yesterday, and I have an immediate need for this feature in xarray for a work project.
I'll take responsibility to implement this feature tomorrow.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,329575874
https://github.com/pydata/xarray/pull/2048#issuecomment-380241636,https://api.github.com/repos/pydata/xarray/issues/2048,380241636,MDEyOklzc3VlQ29tbWVudDM4MDI0MTYzNg==,291576,2018-04-10T20:48:25Z,2018-04-10T20:48:25Z,CONTRIBUTOR,What's new entry added.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,312998259
https://github.com/pydata/xarray/pull/2048#issuecomment-380203653,https://api.github.com/repos/pydata/xarray/issues/2048,380203653,MDEyOklzc3VlQ29tbWVudDM4MDIwMzY1Mw==,291576,2018-04-10T18:34:32Z,2018-04-10T18:34:32Z,CONTRIBUTOR,Travis failures seem to be unrelated?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,312998259
https://github.com/pydata/xarray/issues/1988#issuecomment-380137124,https://api.github.com/repos/pydata/xarray/issues/1988,380137124,MDEyOklzc3VlQ29tbWVudDM4MDEzNzEyNA==,291576,2018-04-10T15:12:05Z,2018-04-10T15:12:05Z,CONTRIBUTOR,Yup... looks like that did the trick (for auto_combine and open_mfdataset). I even have a simple test to demonstrate it. PR coming shortly.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,305327479
https://github.com/pydata/xarray/issues/1988#issuecomment-379939574,https://api.github.com/repos/pydata/xarray/issues/1988,379939574,MDEyOklzc3VlQ29tbWVudDM3OTkzOTU3NA==,291576,2018-04-10T00:55:48Z,2018-04-10T00:55:48Z,CONTRIBUTOR,"I'll give it a go tomorrow. My work has gotten to this point now, and I have some unit tests that happen to exercise this edge case. On a somewhat related note, would an `allow_missing` feature be welcomed in `open_mfdataset()`? I have written up some code that expects a `concat_dim` and a list of filenames. It will then pass to `open_mfdataset()` only the files (and corresponding concat_dim values) that exist, and then calls `reindex()` with the original concat_dim to have a NaN-filled slab wherever there was a missing file.
Any interest?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,305327479
https://github.com/pydata/xarray/issues/1988#issuecomment-379901414,https://api.github.com/repos/pydata/xarray/issues/1988,379901414,MDEyOklzc3VlQ29tbWVudDM3OTkwMTQxNA==,291576,2018-04-09T21:35:11Z,2018-04-09T21:35:11Z,CONTRIBUTOR,Could the fix be as simple as `if len(datasets) == 1 and dim is None:`?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,305327479
https://github.com/pydata/xarray/issues/2004#issuecomment-375056363,https://api.github.com/repos/pydata/xarray/issues/2004,375056363,MDEyOklzc3VlQ29tbWVudDM3NTA1NjM2Mw==,291576,2018-03-21T18:50:58Z,2018-03-21T18:50:58Z,CONTRIBUTOR,"Ah, never mind; I see that our examples only had one greater-than-one stride","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375056077,https://api.github.com/repos/pydata/xarray/issues/2004,375056077,MDEyOklzc3VlQ29tbWVudDM3NTA1NjA3Nw==,291576,2018-03-21T18:50:01Z,2018-03-21T18:50:01Z,CONTRIBUTOR,"Dunno. I can't seem to get that engine working on my system. Reading through that thread, I wonder if the optimization they added only applies if there is only one stride greater than one?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375036951,https://api.github.com/repos/pydata/xarray/issues/2004,375036951,MDEyOklzc3VlQ29tbWVudDM3NTAzNjk1MQ==,291576,2018-03-21T17:51:54Z,2018-03-21T17:51:54Z,CONTRIBUTOR,"This might be relevant: https://github.com/Unidata/netcdf4-python/issues/680 Still reading through the thread.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375034973,https://api.github.com/repos/pydata/xarray/issues/2004,375034973,MDEyOklzc3VlQ29tbWVudDM3NTAzNDk3Mw==,291576,2018-03-21T17:46:09Z,2018-03-21T17:46:09Z,CONTRIBUTOR,my bet is probably netCDF4-python. I don't want to write up the C code to confirm it though. Sigh... this isn't going to be a fun one to track down. Shall I open a bug report over there?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/2004#issuecomment-375014480,https://api.github.com/repos/pydata/xarray/issues/2004,375014480,MDEyOklzc3VlQ29tbWVudDM3NTAxNDQ4MA==,291576,2018-03-21T16:50:59Z,2018-03-21T16:56:13Z,CONTRIBUTOR,"Yeah, good example. It eliminates a lot of possible variables, such as problems with netCDF4 compression.
Probably should check whether it happens in v0.10.0, to see if the changes to the indexing system caused this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,307318224
https://github.com/pydata/xarray/issues/1997#issuecomment-373840044,https://api.github.com/repos/pydata/xarray/issues/1997,373840044,MDEyOklzc3VlQ29tbWVudDM3Mzg0MDA0NA==,291576,2018-03-16T20:45:39Z,2018-03-16T20:45:39Z,CONTRIBUTOR,"MaskedArrays had a similar problem, IIRC, because they blindly copied the ndarray docstrings. Not going to be easy to do, though. ""We don't support `out`"": is that a general rule for xarray? Any notes on how to do what I want for clip? The function this was in was supposed to be general-use (ndarrays and xarrays).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,306067267
https://github.com/pydata/xarray/pull/1899#issuecomment-370986433,https://api.github.com/repos/pydata/xarray/issues/1899,370986433,MDEyOklzc3VlQ29tbWVudDM3MDk4NjQzMw==,291576,2018-03-07T01:08:36Z,2018-03-07T01:08:36Z,CONTRIBUTOR,:tada: ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-367077311,https://api.github.com/repos/pydata/xarray/issues/1899,367077311,MDEyOklzc3VlQ29tbWVudDM2NzA3NzMxMQ==,291576,2018-02-20T18:43:56Z,2018-02-20T18:43:56Z,CONTRIBUTOR,"I did some more investigation into the memory usage problem I was having. I had assumed that the vectorized-indexing result of a lazily indexed data array would be an in-memory array. So when I started to use the result, it did a read of all the data at once, resulting in a near-complete load of the data into memory. I have adjusted my code to chunk out the indexing in order to keep the memory usage under control, at a reasonable performance penalty. I haven't looked into trying to identify the ideal chunking scheme for an arbitrary dataarray and indexing. Perhaps we can make that a task for another day. At this point, I am satisfied with the features (negative step-sizes aside, of course).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-366379465,https://api.github.com/repos/pydata/xarray/issues/1899,366379465,MDEyOklzc3VlQ29tbWVudDM2NjM3OTQ2NQ==,291576,2018-02-16T22:40:06Z,2018-02-16T22:40:06Z,CONTRIBUTOR,"Ah-hah! Ok, so, the problem isn't some weird difference between the two examples I gave. The issue is that calling `np.asarray(foo)` triggered a full loading of the data!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-366376400,https://api.github.com/repos/pydata/xarray/issues/1899,366376400,MDEyOklzc3VlQ29tbWVudDM2NjM3NjQwMA==,291576,2018-02-16T22:25:59Z,2018-02-16T22:25:59Z,CONTRIBUTOR,"huh... now I am not so sure about that...
must be something else triggering the load.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-366374917,https://api.github.com/repos/pydata/xarray/issues/1899,366374917,MDEyOklzc3VlQ29tbWVudDM2NjM3NDkxNw==,291576,2018-02-16T22:19:08Z,2018-02-16T22:19:08Z,CONTRIBUTOR,"also, at this point, I don't know if this is limited to the netcdf4 backend, as this type of indexing was only done on a variable I have in a netcdf file. I don't have 4-D variables in other file types.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-366374041,https://api.github.com/repos/pydata/xarray/issues/1899,366374041,MDEyOklzc3VlQ29tbWVudDM2NjM3NDA0MQ==,291576,2018-02-16T22:14:49Z,2018-02-16T22:14:49Z,CONTRIBUTOR,"`CD`, by the way, has dimensions of `scales, latitude, longitude, wind_direction`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-366373479,https://api.github.com/repos/pydata/xarray/issues/1899,366373479,MDEyOklzc3VlQ29tbWVudDM2NjM3MzQ3OQ==,291576,2018-02-16T22:12:18Z,2018-02-16T22:12:18Z,CONTRIBUTOR,"Ah, not a change in behavior, but a possible bug exposed by a tiny change on my part. So, I have a 4D data array, `CD`, and a data array for indexing, `wind_inds`. The following does not trigger a full loading: `CD[0][wind_direction=wind_inds]`, which is good! But this does: `CD[scales=0, wind_direction=wind_inds]`, which is bad. So, somehow, the indexing system is effectively treating these two things as different.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-366363419,https://api.github.com/repos/pydata/xarray/issues/1899,366363419,MDEyOklzc3VlQ29tbWVudDM2NjM2MzQxOQ==,291576,2018-02-16T21:28:09Z,2018-02-16T21:28:09Z,CONTRIBUTOR,correction... the problem isn't with pynio... it is in the netcdf4 backend,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-366360382,https://api.github.com/repos/pydata/xarray/issues/1899,366360382,MDEyOklzc3VlQ29tbWVudDM2NjM2MDM4Mg==,291576,2018-02-16T21:15:17Z,2018-02-16T21:15:17Z,CONTRIBUTOR,Something changed. Now the indexing for pynio is forcing a full loading of the data.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-366059694,https://api.github.com/repos/pydata/xarray/issues/1899,366059694,MDEyOklzc3VlQ29tbWVudDM2NjA1OTY5NA==,291576,2018-02-15T20:59:20Z,2018-02-15T20:59:20Z,CONTRIBUTOR,"I can confirm that with the latest changes, the pynio tests now pass locally for me.
Now, whether or not the tests in there are actually exercising anything useful is a different question.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/issues/1910#issuecomment-365734783,https://api.github.com/repos/pydata/xarray/issues/1910,365734783,MDEyOklzc3VlQ29tbWVudDM2NTczNDc4Mw==,291576,2018-02-14T20:27:38Z,2018-02-14T20:27:38Z,CONTRIBUTOR,"Looking through the Travis logs, I do see that pynio is getting installed.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297227247
https://github.com/pydata/xarray/issues/1910#issuecomment-365734285,https://api.github.com/repos/pydata/xarray/issues/1910,365734285,MDEyOklzc3VlQ29tbWVudDM2NTczNDI4NQ==,291576,2018-02-14T20:25:52Z,2018-02-14T20:25:52Z,CONTRIBUTOR,Zarr tests and pydap tests are also being skipped,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,297227247
https://github.com/pydata/xarray/pull/1899#issuecomment-365729433,https://api.github.com/repos/pydata/xarray/issues/1899,365729433,MDEyOklzc3VlQ29tbWVudDM2NTcyOTQzMw==,291576,2018-02-14T20:07:55Z,2018-02-14T20:07:55Z,CONTRIBUTOR,"I am working on re-activating those tests. I think PyNio is now available for Python 3, too.

On Wed, Feb 14, 2018 at 2:59 PM, Joe Hamman wrote:
> @WeatherGod - you are right, all the pynio tests are being skipped on travis. I'll open a separate issue for that. Yikes!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-365722413,https://api.github.com/repos/pydata/xarray/issues/1899,365722413,MDEyOklzc3VlQ29tbWVudDM2NTcyMjQxMw==,291576,2018-02-14T19:43:07Z,2018-02-14T19:43:07Z,CONTRIBUTOR,"It looks like the pynio backend isn't regularly tested, as several of its tests currently fail when I run them locally.
Some of them are failing because they are asserting NotImplementedErrors that are now implemented.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-365708385,https://api.github.com/repos/pydata/xarray/issues/1899,365708385,MDEyOklzc3VlQ29tbWVudDM2NTcwODM4NQ==,291576,2018-02-14T18:55:43Z,2018-02-14T18:55:43Z,CONTRIBUTOR,"Just did some more debugging, putting in some debug statements within `NioArrayWrapper.__getitem__()`:
```
diff --git a/xarray/backends/pynio_.py b/xarray/backends/pynio_.py
index c7e0ddf..b9f7151 100644
--- a/xarray/backends/pynio_.py
+++ b/xarray/backends/pynio_.py
@@ -27,16 +27,24 @@ class NioArrayWrapper(BackendArray):
         return self.datastore.ds.variables[self.variable_name]

     def __getitem__(self, key):
+        import logging
+        logger = logging.getLogger(__name__)
+        logger.addHandler(logging.NullHandler())
+        logger.debug(""initial key: %s"", key)
         key, np_inds = indexing.decompose_indexer(key, self.shape, mode='outer')
+        logger.debug(""Decomposed indexers:\n%s\n%s"", key, np_inds)

         with self.datastore.ensure_open(autoclose=True):
             array = self.get_array()
+            logger.debug(""initial array: %r"", array)
             if key == () and self.ndim == 0:
                 return array.get_value()

             for ind in np_inds:
+                logger.debug(""indexer: %s"", ind)
                 array = indexing.NumpyIndexingAdapter(array)[ind]
+                logger.debug(""intermediate array: %r"", array)

             return array
```
And here is the test script (data not included):
```
import logging

import xarray as xr

logging.basicConfig(level=logging.DEBUG)

fname1 = '../hrrr.t12z.wrfnatf02.grib2'
ds = xr.open_dataset(fname1, engine='pynio')
subset_isel = ds.isel(lv_HYBL0=7)
sp = subset_isel['UGRD_P0_L105_GLC0'].values.shape
```
And here is the relevant output:
```
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((slice(None, None, None),))
()
DEBUG:xarray.backends.pynio_:initial array:
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((slice(None, None, None),))
()
DEBUG:xarray.backends.pynio_:initial array:
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((slice(None, None, None),))
()
DEBUG:xarray.backends.pynio_:initial array:
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((slice(None, None, None),))
()
DEBUG:xarray.backends.pynio_:initial array:
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((7, slice(None, None, None), slice(None, None, None)))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((7, slice(None, None, None), slice(None, None, None)))
()
DEBUG:xarray.backends.pynio_:initial array:
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((7, slice(None, None, None), slice(None, None, None)))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((7, slice(None, None, None), slice(None, None, None)))
()
DEBUG:xarray.backends.pynio_:initial array:
(50, 1059, 1799)
```
So, the `BasicIndexer((7, slice(None, None, None), slice(None, None, None)))` isn't getting decomposed correctly, it looks like?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-365692868,https://api.github.com/repos/pydata/xarray/issues/1899,365692868,MDEyOklzc3VlQ29tbWVudDM2NTY5Mjg2OA==,291576,2018-02-14T18:02:17Z,2018-02-14T18:06:24Z,CONTRIBUTOR,"Ah, interesting... so, this dataset was created by doing an isel() on the original:
```
>>> ds['UGRD_P0_L105_GLC0']
[95257050 values with dtype=float32]
Coordinates:
  * lv_HYBL0   (lv_HYBL0) float32 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 ...
    gridlat_0  (ygrid_0, xgrid_0) float32 ...
    gridlon_0  (ygrid_0, xgrid_0) float32 ...
Dimensions without coordinates: ygrid_0, xgrid_0
```
So, the original data has a 50x1059x1799 grid, and the new indexer isn't properly composing the indexer so that it fetches [7, slice(None), slice(None)] when I grab its `.values`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-365689883,https://api.github.com/repos/pydata/xarray/issues/1899,365689883,MDEyOklzc3VlQ29tbWVudDM2NTY4OTg4Mw==,291576,2018-02-14T17:52:24Z,2018-02-14T17:52:24Z,CONTRIBUTOR,"I can also confirm that the shape comes out correctly using master, so this is definitely isolated to this PR.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-365689003,https://api.github.com/repos/pydata/xarray/issues/1899,365689003,MDEyOklzc3VlQ29tbWVudDM2NTY4OTAwMw==,291576,2018-02-14T17:49:20Z,2018-02-14T17:49:20Z,CONTRIBUTOR,"Hmm, came across a bug with the pynio backend. Working on making a reproducible example, but just for your own inspection, here is some logging output:
```
Dimensions:  (xgrid_0: 1799, ygrid_0: 1059)
Coordinates:
    lv_HYBL0   float32 8.0
    longitude  (ygrid_0, xgrid_0) float32 ...
    latitude   (ygrid_0, xgrid_0) float32 ...
Dimensions without coordinates: xgrid_0, ygrid_0
Data variables:
    UGRD       (ygrid_0, xgrid_0) float32 ...
    VGRD       (ygrid_0, xgrid_0) float32 ...

DEBUG:hiresWind.downscale:shape of a data: (50, 1059, 1799)
```
The first bit is the repr of my DataSet. The last line is the output of `ds['UGRD'].values.shape`. It comes out 3D when it is supposed to be 2D. If I revert back to v0.10.0, then the shape is (1059, 1799), just as expected.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/pull/1899#issuecomment-365657502,https://api.github.com/repos/pydata/xarray/issues/1899,365657502,MDEyOklzc3VlQ29tbWVudDM2NTY1NzUwMg==,291576,2018-02-14T16:13:16Z,2018-02-14T16:13:16Z,CONTRIBUTOR,"Oh, wow... this worked like a charm for the netcdf4 backend! I have a ~13GB (uncompressed) 4-D netcdf4 variable that was giving me trouble when slicing a 2D surface out of it. Here is a snippet where I am grabbing data at random indices in the last dimension. First for a specific latitude, then for the entire domain.
```
>>> CD_subset = rough['CD'][0]
>>> wind_inds_decorated
array([[33, 15, 25, ..., 52, 66, 35],
       [ 6,  8, 55, ..., 59,  6, 50],
       [54,  2, 40, ..., 32, 19,  9],
       ...,
       [53, 18, 23, ..., 19,  3, 43],
       [ 9, 11, 66, ..., 51, 39, 58],
       [21, 54, 37, ...,  3,  0, 65]])
Dimensions without coordinates: latitude, longitude
>>> foo = CD_subset.isel(latitude=0, wind_direction=wind_inds_decorated[0])
>>> foo
array([ 0.004052,  0.005915,  0.002771, ...,  0.005604,  0.004715,  0.002756], dtype=float32)
Coordinates:
    scales          int16 60
    latitude        float64 54.99
  * longitude       (longitude) float64 -130.0 -130.0 -130.0 -130.0 -130.0 ...
    wind_direction  (longitude) int16 165 75 125 5 235 345 315 175 85 35 290 ...
>>> foo = CD_subset.isel(wind_direction=wind_inds_decorated)
>>> foo
[24510501 values with dtype=float32]
Coordinates:
    scales          int16 60
  * latitude        (latitude) float64 54.99 54.98 54.97 54.96 54.95 54.95 ...
  * longitude       (longitude) float64 -130.0 -130.0 -130.0 -130.0 -130.0 ...
    wind_direction  (latitude, longitude) int64 165 75 125 5 235 345 315 175 ...
```
All previous attempts at this would result in having to load the entire 13GB array into memory just to get 93.5 MB out. Or, I would try to fetch each individual point, which took way too long. This worked faster than loading the entire thing into memory, and it used less memory, too (I think I maxed out at about 1.2GB of total usage, which is totally acceptable for my use case). I will try out similar things with the pynio and rasterio backends, and get back to you. Thanks for this work!","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
https://github.com/pydata/xarray/issues/1720#issuecomment-345310488,https://api.github.com/repos/pydata/xarray/issues/1720,345310488,MDEyOklzc3VlQ29tbWVudDM0NTMxMDQ4OA==,291576,2017-11-17T17:33:13Z,2017-11-17T17:33:13Z,CONTRIBUTOR,Awesome! Thanks!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,274308380
https://github.com/pydata/xarray/issues/1720#issuecomment-345124033,https://api.github.com/repos/pydata/xarray/issues/1720,345124033,MDEyOklzc3VlQ29tbWVudDM0NTEyNDAzMw==,291576,2017-11-17T02:08:50Z,2017-11-17T02:08:50Z,CONTRIBUTOR,"Is there a convenient sentinel I can check for loaded-ness? The only reason I noticed this was that I was debugging another problem with my processing of HRRR files (~600MB each) and the memory usage shot up (did you know that `top` will report memory usage as fractions of terabytes when you get high enough?). I could test this with some smaller netCDF4 files if I could just loop through the variables and assert some sentinel.

On Thu, Nov 16, 2017 at 8:57 PM, Stephan Hoyer wrote:
> @WeatherGod can you verify that you don't get immediate loading when loading netCDF files, e.g., with scipy or netCDF4-python?
>
> We did change how loading of data works with printing in this release (#1532), but if anything the changes should go the other way, to do less loading of data.
>
> I'm having trouble debugging this locally because I can't seem to get a working version of pynio installed from conda-forge on OS X (running into various ABI incompatibility issues when I try this in a new conda environment).
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,274308380
https://github.com/pydata/xarray/issues/475#issuecomment-342576941,https://api.github.com/repos/pydata/xarray/issues/475,342576941,MDEyOklzc3VlQ29tbWVudDM0MjU3Njk0MQ==,291576,2017-11-07T18:29:12Z,2017-11-07T18:29:12Z,CONTRIBUTOR,"Yeah, we need to move something forward, because the main benefit of xarray is the ability to manage datasets from multiple sources in a consistent way. And data from different sources will almost always be in different projections. My current problem that I need to solve right now is that I am ingesting model data that is in an LCC projection and ingesting radar data that is in a simple regular lat/lon grid. Both dataset objects have latitude and longitude coordinate arrays; I just need to get both datasets onto the same lat/lon grid. I guess I could continue using my old scipy-based solution (using map_coordinates() or RectBivariateSpline), but at the very least it would make sense to have some documentation demonstrating how one might go about this very common problem, even if it just shows how to use the scipy-based tools with xarrays. If that is of interest, I can see what I can write up after I am done with my immediate task.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700
https://github.com/pydata/xarray/issues/475#issuecomment-342553465,https://api.github.com/repos/pydata/xarray/issues/475,342553465,MDEyOklzc3VlQ29tbWVudDM0MjU1MzQ2NQ==,291576,2017-11-07T17:11:49Z,2017-11-07T17:11:49Z,CONTRIBUTOR,"So, what has become the consensus for performing regridding/resampling? I see a lot of suggestions, but I have no sense of what is mature enough to use in production-level code. I also haven't seen anything in the documentation about this topic, even if it just refers people to another project.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700
https://github.com/pydata/xarray/pull/459#issuecomment-147797539,https://api.github.com/repos/pydata/xarray/issues/459,147797539,MDEyOklzc3VlQ29tbWVudDE0Nzc5NzUzOQ==,291576,2015-10-13T18:03:56Z,2015-10-13T18:03:56Z,CONTRIBUTOR,"That's all the time I have at the moment. I do have some more notes from my old, incomplete implementation, though. I'll try to finish the review tomorrow.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,94100328
https://github.com/pydata/xarray/issues/615#issuecomment-146976549,https://api.github.com/repos/pydata/xarray/issues/615,146976549,MDEyOklzc3VlQ29tbWVudDE0Njk3NjU0OQ==,291576,2015-10-09T20:15:49Z,2015-10-09T20:15:49Z,CONTRIBUTOR,"hmm, good point. I wish I knew why I ended up using `pd.to_timedelta()` in the first place. Did numpy not support converting timedelta objects at one point?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,110726841
https://github.com/pydata/xarray/issues/268#issuecomment-60429213,https://api.github.com/repos/pydata/xarray/issues/268,60429213,MDEyOklzc3VlQ29tbWVudDYwNDI5MjEz,291576,2014-10-24T18:27:30Z,2014-10-24T18:27:30Z,CONTRIBUTOR,"Note: I mean that I at first thought collapsing variables into scalars was a useful feature, not that it should happen only for datasets and not data arrays.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,46768521
https://github.com/pydata/xarray/issues/267#issuecomment-60425242,https://api.github.com/repos/pydata/xarray/issues/267,60425242,MDEyOklzc3VlQ29tbWVudDYwNDI1MjQy,291576,2014-10-24T17:58:37Z,2014-10-24T17:58:37Z,CONTRIBUTOR,"So, is the string approach I used above to grab a single day's data a bug or a feature? It is a nice short-hand, but I don't want to rely on it if it isn't intended to be a feature. Similarly, if I supply a Year-Month string, I get data for that month.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,46756880
https://github.com/pydata/xarray/issues/267#issuecomment-60413505,https://api.github.com/repos/pydata/xarray/issues/267,60413505,MDEyOklzc3VlQ29tbWVudDYwNDEzNTA1,291576,2014-10-24T16:37:26Z,2014-10-24T16:37:26Z,CONTRIBUTOR,"Gah, I am sorry, please disregard my last comment. I can't add/subtract...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,46756880
https://github.com/pydata/xarray/issues/267#issuecomment-60413356,https://api.github.com/repos/pydata/xarray/issues/267,60413356,MDEyOklzc3VlQ29tbWVudDYwNDEzMzU2,291576,2014-10-24T16:36:18Z,2014-10-24T16:36:18Z,CONTRIBUTOR,"A bit of a further wrinkle is that date selection seems to be limited to local time only because of this limitation. Consider the following:
```
>>> c['time'][:25]
array(['2013-01-01T06:15:00.000000000-0500', '2013-01-01T07:00:00.000000000-0500',
       '2013-01-01T08:00:00.000000000-0500', '2013-01-01T09:00:00.000000000-0500',
       '2013-01-01T10:00:00.000000000-0500', '2013-01-01T11:00:00.000000000-0500',
       '2013-01-01T12:00:00.000000000-0500', '2013-01-01T13:00:00.000000000-0500',
       '2013-01-01T14:00:00.000000000-0500', '2013-01-01T15:00:00.000000000-0500',
       '2013-01-01T16:00:00.000000000-0500', '2013-01-01T17:00:00.000000000-0500',
       '2013-01-01T18:00:00.000000000-0500', '2013-01-01T19:00:00.000000000-0500',
       '2013-01-01T20:00:00.000000000-0500', '2013-01-01T21:00:00.000000000-0500',
       '2013-01-01T22:00:00.000000000-0500', '2013-01-01T23:00:00.000000000-0500',
       '2013-01-02T00:00:00.000000000-0500', '2013-01-02T01:00:00.000000000-0500',
       '2013-01-02T02:00:00.000000000-0500', '2013-01-02T03:00:00.000000000-0500',
       '2013-01-02T04:00:00.000000000-0500', '2013-01-02T05:00:00.000000000-0500',
       '2013-01-02T06:00:00.000000000-0500'], dtype='datetime64[ns]')
Coordinates:
  * time       (time) datetime64[ns] 2013-01-01T11:15:00 ...
    latitude   float32 64.833
    elevation  float32 137.5
    longitude  float32 -147.6
>>> c.sel(time='2013-01-01')['time']
array(['2013-01-01T06:15:00.000000000-0500', '2013-01-01T07:00:00.000000000-0500',
       '2013-01-01T08:00:00.000000000-0500', '2013-01-01T09:00:00.000000000-0500',
       '2013-01-01T10:00:00.000000000-0500', '2013-01-01T11:00:00.000000000-0500',
       '2013-01-01T12:00:00.000000000-0500', '2013-01-01T13:00:00.000000000-0500',
       '2013-01-01T14:00:00.000000000-0500', '2013-01-01T15:00:00.000000000-0500',
       '2013-01-01T16:00:00.000000000-0500', '2013-01-01T17:00:00.000000000-0500',
       '2013-01-01T18:00:00.000000000-0500'], dtype='datetime64[ns]')
Coordinates:
  * time       (time) datetime64[ns] 2013-01-01T11:15:00 ...
    latitude   float32 64.833
    elevation  float32 137.5
    longitude  float32 -147.6
```
I don't know how I would (easily) slice this data array such as to grab only data for a UTC day.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,46756880
https://github.com/pydata/xarray/issues/185#issuecomment-60404650,https://api.github.com/repos/pydata/xarray/issues/185,60404650,MDEyOklzc3VlQ29tbWVudDYwNDA0NjUw,291576,2014-10-24T15:37:00Z,2014-10-24T15:37:00Z,CONTRIBUTOR,"May I propose a name? xray.glasses","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,38109425
https://github.com/pydata/xarray/issues/264#issuecomment-60399616,https://api.github.com/repos/pydata/xarray/issues/264,60399616,MDEyOklzc3VlQ29tbWVudDYwMzk5NjE2,291576,2014-10-24T15:04:23Z,2014-10-24T15:04:23Z,CONTRIBUTOR,"I should note that if an inner join is performed, then no NaNs are inserted and the arrays remain float32.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,46745063
https://github.com/pydata/xarray/issues/214#issuecomment-58570858,https://api.github.com/repos/pydata/xarray/issues/214,58570858,MDEyOklzc3VlQ29tbWVudDU4NTcwODU4,291576,2014-10-09T20:19:12Z,2014-10-09T20:19:12Z,CONTRIBUTOR,"Ok, I think I got it (for reals this time...)
```
def bcast(spat_only, coord_names):
    coords = []
    for i, n in enumerate(coord_names):
        if spat_only[n].ndim != len(spat_only.dims):
            # Needs new axes
            slices = [np.newaxis] * len(spat_only.dims)
            slices[i] = slice(None)
        else:
            slices = [slice(None)] * len(spat_only.dims)
        coords.append(spat_only[n].values[slices])
    return np.broadcast_arrays(*coords)


def grid_to_points2(grid, points, coord_names):
    if not coord_names:
        raise ValueError(""No coordinate names provided"")
    spat_dims = {d for n in coord_names for d in grid[n].dims}
    not_spatial = set(grid.dims) - spat_dims
    spatial_selection = {n: 0 for n in not_spatial}
    spat_only = grid.isel(**spatial_selection)

    coords = bcast(spat_only, coord_names)

    kd = KDTree(zip(*[c.ravel() for c in coords]))
    _, indx = kd.query(zip(*[points[n].values for n in coord_names]))
    indx = np.unravel_index(indx, coords[0].shape)

    return xray.concat(
        (grid.isel(**{n: j for n, j in zip(spat_only.dims, i)})
         for i in zip(*indx)),
        dim='station')
```
Needs a lot more tests and comments and such, but I think this works. Best part is that it seems to do a very decent job of keeping memory usage low, and only operates upon the coordinates that I specify. Everything else is left alone. So, I have used this on 4-D data, picking out grid points at specified lat/lon positions, and get back a 3D result (time, level, station).
And I have used this on just 2D data, getting back just a 1D result (dimension='station'). ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,40395257 https://github.com/pydata/xarray/issues/214#issuecomment-58568933,https://api.github.com/repos/pydata/xarray/issues/214,58568933,MDEyOklzc3VlQ29tbWVudDU4NTY4OTMz,291576,2014-10-09T20:05:01Z,2014-10-09T20:05:01Z,CONTRIBUTOR,"Consider the following Dataset: ``` Dimensions: (lv_HTGL1: 2, lv_HTGL3: 2, lv_HTGL5: 2, lv_HTGL6: 2, lv_ISBL0: 37, lv_SPDL2: 6, lv_SPDL4: 3, time: 9, xgrid_0: 451, ygrid_0: 337) Coordinates: * xgrid_0 (xgrid_0) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ... * ygrid_0 (ygrid_0) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ... * lv_ISBL0 (lv_ISBL0) float32 10000.0 12500.0 15000.0 17500.0 20000.0 ... * lv_HTGL6 (lv_HTGL6) float32 1000.0 4000.0 * lv_HTGL1 (lv_HTGL1) float32 2.0 80.0 * lv_HTGL3 (lv_HTGL3) float32 10.0 80.0 latitude (ygrid_0, xgrid_0) float32 16.281 16.3084 16.3356 16.3628 16.3898 ... longitude (ygrid_0, xgrid_0) float32 233.862 233.984 234.106 234.229 ... * lv_HTGL5 (lv_HTGL5) int64 0 1 * lv_SPDL2 (lv_SPDL2) int64 0 1 2 3 4 5 * lv_SPDL4 (lv_SPDL4) int64 0 1 2 * time (time) datetime64[ns] 2014-09-25T01:00:00 ... Variables: gridrot_0 (ygrid_0, xgrid_0) float32 -0.229676 -0.228775 -0.227873 ... TMP_P0_L103_GLC0 (time, lv_HTGL1, ygrid_0, xgrid_0) float64 295.8 295.7 295.7 295.7 ... ``` The latitude and longitude variables are both dependent upon xgrid_0 and ygrid_0. Meanwhile... ``` Dimensions: (station: 120, time: 4) Coordinates: latitude (station) float32 34.805 34.795 34.585 36.705 34.245 34.915 34.195 36.075 ... * station (station) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 ... sixhourly (time) int64 0 1 2 3 longitude (station) float32 -98.025 -96.665 -99.335 -98.705 -95.665 -98.295 ... * time (time) datetime64[ns] 2014-10-07 2014-10-07T06:00:00 ... Variables: MaxGust (station, time) float64 7.794 7.47 8.675 4.788 7.071 7.903 8.641 5.533 ... ``` the latitude and longitude variables are independent of each other (they are 1-D). The variable in the first one can not be accessed directly by lat/lon values, while the MaxGust variable in the second one can. This poses some difficulties. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,40395257 https://github.com/pydata/xarray/issues/214#issuecomment-58565934,https://api.github.com/repos/pydata/xarray/issues/214,58565934,MDEyOklzc3VlQ29tbWVudDU4NTY1OTM0,291576,2014-10-09T19:43:08Z,2014-10-09T19:43:08Z,CONTRIBUTOR,"Hmmm, limitation that I just encountered. When there are dependent coordinates, the variables representing those coordinates are not the index arrays (and thus, are not ""dimensions"" either), so my solution is completely broken for dependent coordinates. If I were to go back to my DataArray-only solution, then I still need to correct the code to use the dimension names of the coordinate variables, and still need to fix the coordinates != dimensions issue. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,40395257 https://github.com/pydata/xarray/issues/214#issuecomment-58562506,https://api.github.com/repos/pydata/xarray/issues/214,58562506,MDEyOklzc3VlQ29tbWVudDU4NTYyNTA2,291576,2014-10-09T19:16:52Z,2014-10-09T19:16:52Z,CONTRIBUTOR,"to/from_dateframe just ate up all my memory. I think I am going to stick with my broadcasting approach... ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,40395257 https://github.com/pydata/xarray/issues/214#issuecomment-58558069,https://api.github.com/repos/pydata/xarray/issues/214,58558069,MDEyOklzc3VlQ29tbWVudDU4NTU4MDY5,291576,2014-10-09T18:47:22Z,2014-10-09T18:47:22Z,CONTRIBUTOR,"oooh, didn't realize that `dims` is different for DataSet and DataArray... Gonna have to fix that, too. I am checking out the broadcasting functions you pointed out. The one limitation I see right away with xray.core.variable.broadcast_variables is that it is limited to two variables (presumedly, I would be broadcasting N number of coordinates because the variables may or may not have extraneous dimensions that I don't care to broadcast) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,40395257 https://github.com/pydata/xarray/issues/214#issuecomment-58553935,https://api.github.com/repos/pydata/xarray/issues/214,58553935,MDEyOklzc3VlQ29tbWVudDU4NTUzOTM1,291576,2014-10-09T18:21:16Z,2014-10-09T18:21:16Z,CONTRIBUTOR,"And, actually, the example I gave above has a bug in the dependent dimension case. This one should be much better (not fully tested yet, though): ``` def grid_to_points2(grid, points, coord_names): if not coord_names: raise ValueError(""No coordinate names provided"") not_spatial = set(grid.dims) - set(coord_names) spatial_selection = {n:0 for n in not_spatial} spat_only = grid.isel(**spatial_selection) coords = [] for i, n in enumerate(spat_only.dims): if spat_only[n].ndim != len(spat_only.dims): # Needs new axes slices = [np.newaxis] * len(spat_only.dims) slices[i] = slice(None) else: slices = [slice(None)] * len(spat_only.dims) coords.append(spat_only[n].values[slices]) coords = np.broadcast_arrays(*coords) kd = KDTree(zip(*[c.flatten() for c in coords])) _, indx = kd.query(zip(*[points[n].values for n in spat_only.dims])) indx = np.unravel_index(indx, coords[0].shape) return xray.concat( (grid.sel(**{n:c[i] for n, c in zip(spat_only.dims, coords)}) for i in zip(*indx)), dim='station') ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,40395257 https://github.com/pydata/xarray/issues/214#issuecomment-58551759,https://api.github.com/repos/pydata/xarray/issues/214,58551759,MDEyOklzc3VlQ29tbWVudDU4NTUxNzU5,291576,2014-10-09T18:06:56Z,2014-10-09T18:06:56Z,CONTRIBUTOR,"And, I think I just realized how I could generalize it even more. Right now, `grid` can only be a DataArray, but I would like this to work for a DataSet as well. I bet if I use .sel() instead of .isel() and access the elements of the broadcasted arrays, I could make this work very nicely for both DataArray and DataSet. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,40395257 https://github.com/pydata/xarray/issues/214#issuecomment-58550741,https://api.github.com/repos/pydata/xarray/issues/214,58550741,MDEyOklzc3VlQ29tbWVudDU4NTUwNzQx,291576,2014-10-09T18:00:33Z,2014-10-09T18:00:33Z,CONTRIBUTOR,"Oh, and it does take advantage of a bunch of python2.7 features such as dictionary comprehensions and generator statements, so... ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,40395257 https://github.com/pydata/xarray/issues/214#issuecomment-58550403,https://api.github.com/repos/pydata/xarray/issues/214,58550403,MDEyOklzc3VlQ29tbWVudDU4NTUwNDAz,291576,2014-10-09T17:58:25Z,2014-10-09T17:58:25Z,CONTRIBUTOR,"Starting using the above snippet for more datasets, some with interdependent coordinates and some without (so the coordinates would be 1-d). I think I have generalized it significantly... ``` def grid_to_points(grid, points, coord_names): not_spatial = set(grid.dims) - set(coord_names) spatial_selection = {n:0 for n in not_spatial} spat_only = grid.isel(**spatial_selection) coords = [] for i, n in enumerate(spat_only.dims): if spat_only[n].ndim != len(spat_only.dims): # Needs new axes slices = [np.newaxis] * len(spat_only.dims) slices[i] = slice(None) else: slices = [slice(None)] * len(spat_only.dims) coords.append(spat_only[n].values[slices]) coords = [c.flatten() for c in np.broadcast_arrays(*coords)] kd = KDTree(zip(*coords)) _, indx = kd.query(zip(*[points[n].values for n in spat_only.dims])) indx = np.unravel_index(indx, spat_only.shape) return xray.concat((grid.isel(**{n:j for n, j in zip(spat_only.dims, i)}) for i in zip(*indx)), dim='station') ``` I can still imagine some situations where this won't work, such as a requested set of dimensions that are a mix of dependent and independent variables. Currently, if the dimensions are independent, then the number of dimensions of each one is assumed to be 1 and np.newaxis is used for the others. Meanwhile, if the dimensions are dependent, then the number of dimensions for each one is assumed to be the same as the number of dependent variables and is merely flattened (the broadcast is essentially no-op). I should also note that this is technically not restricted to spatial coordinates even though the code says so. Just anything that can be represented in euclidean space. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,40395257 https://github.com/pydata/xarray/issues/214#issuecomment-57857522,https://api.github.com/repos/pydata/xarray/issues/214,57857522,MDEyOklzc3VlQ29tbWVudDU3ODU3NTIy,291576,2014-10-03T20:48:35Z,2014-10-03T20:48:35Z,CONTRIBUTOR,"Just managed to implement this using your suggestion for my data: ``` from scipy.spatial import cKDTree as KDTree kd = KDTree(zip(model['longitude'].values.ravel(), model['latitude'].values.ravel())) dists, indx = kd.query(zip(obs['longitude'], obs['latitude'])) indx = np.unravel_index(indx, mod['longitude'].shape) mod_points = xray.concat([mod.isel(x=x, y=y) for y, x in zip(*indx)], dim='station') ``` Not entirely certain why I needed to reverse y and x in that last part, but, oh well... 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,40395257 https://github.com/pydata/xarray/issues/214#issuecomment-57847940,https://api.github.com/repos/pydata/xarray/issues/214,57847940,MDEyOklzc3VlQ29tbWVudDU3ODQ3OTQw,291576,2014-10-03T19:56:16Z,2014-10-03T19:56:16Z,CONTRIBUTOR,"Unless I am missing something about xray, that selection operation could only work if `pts` had values that exactly matched coordinate values in `ds`. In most scenarios, that would not be the case. One would have to first build `pts` from a computation of nearest-neighbor indexs between the stations and the model grid. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,40395257