html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/pull/1899#issuecomment-374351614,https://api.github.com/repos/pydata/xarray/issues/1899,374351614,MDEyOklzc3VlQ29tbWVudDM3NDM1MTYxNA==,221526,2018-03-19T20:01:29Z,2018-03-19T20:01:29Z,CONTRIBUTOR,"So did this remove/rename `LazilyIndexedArray` in 0.10.2? Because I'm getting an attribute error in the custom xarray backend I wrote: https://github.com/Unidata/siphon/blob/master/siphon/cdmr/xarray_support.py I don't mind updating, but I wanted to make sure this was intentional.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-370986433,https://api.github.com/repos/pydata/xarray/issues/1899,370986433,MDEyOklzc3VlQ29tbWVudDM3MDk4NjQzMw==,291576,2018-03-07T01:08:36Z,2018-03-07T01:08:36Z,CONTRIBUTOR,:tada: ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-367077311,https://api.github.com/repos/pydata/xarray/issues/1899,367077311,MDEyOklzc3VlQ29tbWVudDM2NzA3NzMxMQ==,291576,2018-02-20T18:43:56Z,2018-02-20T18:43:56Z,CONTRIBUTOR,"I did some more investigation into the memory usage problem I was having. I had assumed that the vectorized indexed result of a lazily indexed data array would be an in-memory array. So, when I then started to use the result, it was then doing a read of all the data at once, resulting in a near-complete load of the data into memory. I have adjusted my code to chunk out the indexing in order to keep the memory usage under control at a reasonable performance penalty. I haven't looked into trying to identify the ideal chunking scheme to follow for an arbitrary dataarray and indexing. 
Perhaps we can make that a task for another day. At this point, I am satisfied with the features (negative step-sizes aside, of course).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-366379465,https://api.github.com/repos/pydata/xarray/issues/1899,366379465,MDEyOklzc3VlQ29tbWVudDM2NjM3OTQ2NQ==,291576,2018-02-16T22:40:06Z,2018-02-16T22:40:06Z,CONTRIBUTOR,"Ah-hah! Ok, so, the problem isn't some weird difference between the two examples I gave. The issue is that calling `np.asarray(foo)` triggered a full loading of the data!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-366376400,https://api.github.com/repos/pydata/xarray/issues/1899,366376400,MDEyOklzc3VlQ29tbWVudDM2NjM3NjQwMA==,291576,2018-02-16T22:25:59Z,2018-02-16T22:25:59Z,CONTRIBUTOR,huh... now I am not so sure about that... must be something else triggering the load.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-366374917,https://api.github.com/repos/pydata/xarray/issues/1899,366374917,MDEyOklzc3VlQ29tbWVudDM2NjM3NDkxNw==,291576,2018-02-16T22:19:08Z,2018-02-16T22:19:08Z,CONTRIBUTOR,"also, at this point, I don't know if this is limited to the netcdf4 backend, as this type of indexing was only done on a variable I have in a netcdf file. 
I don't have 4-D variables in other file types.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-366374041,https://api.github.com/repos/pydata/xarray/issues/1899,366374041,MDEyOklzc3VlQ29tbWVudDM2NjM3NDA0MQ==,291576,2018-02-16T22:14:49Z,2018-02-16T22:14:49Z,CONTRIBUTOR,"`CD` by the way, has dimensions of `scales, latitude, longitude, wind_direction`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-366373479,https://api.github.com/repos/pydata/xarray/issues/1899,366373479,MDEyOklzc3VlQ29tbWVudDM2NjM3MzQ3OQ==,291576,2018-02-16T22:12:18Z,2018-02-16T22:12:18Z,CONTRIBUTOR,"Ah, not a change in behavior, but a possible bug exposed by a tiny change on my part. So, I have a 4D data array, `CD` and a data array for indexing, `wind_inds`. The following does not trigger a full loading: `CD[0][wind_direction=wind_inds]`, which is good! But, this does: `CD[scales=0, wind_direction=wind_inds]`, which is bad. So, somehow, the indexing system is effectively treating these two things as different.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-366363419,https://api.github.com/repos/pydata/xarray/issues/1899,366363419,MDEyOklzc3VlQ29tbWVudDM2NjM2MzQxOQ==,291576,2018-02-16T21:28:09Z,2018-02-16T21:28:09Z,CONTRIBUTOR,correction... the problem isn't with pynio... 
it is in the netcdf4 backend,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-366360382,https://api.github.com/repos/pydata/xarray/issues/1899,366360382,MDEyOklzc3VlQ29tbWVudDM2NjM2MDM4Mg==,291576,2018-02-16T21:15:17Z,2018-02-16T21:15:17Z,CONTRIBUTOR,Something changed. Now the indexing for pynio is forcing a full loading of the data.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-366059694,https://api.github.com/repos/pydata/xarray/issues/1899,366059694,MDEyOklzc3VlQ29tbWVudDM2NjA1OTY5NA==,291576,2018-02-15T20:59:20Z,2018-02-15T20:59:20Z,CONTRIBUTOR,"I can confirm that with the latest changes, the pynio tests now pass locally for me. Now, as to whether or not the tests in there are actually exercising anything useful is a different question.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-365729433,https://api.github.com/repos/pydata/xarray/issues/1899,365729433,MDEyOklzc3VlQ29tbWVudDM2NTcyOTQzMw==,291576,2018-02-14T20:07:55Z,2018-02-14T20:07:55Z,CONTRIBUTOR,"I am working on re-activating those tests. I think PyNio is now available for python3, too. On Wed, Feb 14, 2018 at 2:59 PM, Joe Hamman wrote: > @WeatherGod - you are right, all the > pynio tests are being skipped on travis. I'll open a separate issue for > that. Yikes!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143
array ``` And here is the test script (data not included): ``` import logging import xarray as xr logging.basicConfig(level=logging.DEBUG) fname1 = '../hrrr.t12z.wrfnatf02.grib2' ds = xr.open_dataset(fname1, engine='pynio') subset_isel = ds.isel(lv_HYBL0=7) sp = subset_isel['UGRD_P0_L105_GLC0'].values.shape ``` And here is the relevant output: ``` DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),)) DEBUG:xarray.backends.pynio_:Decomposed indexers: BasicIndexer((slice(None, None, None),)) () DEBUG:xarray.backends.pynio_:initial array: DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),)) DEBUG:xarray.backends.pynio_:Decomposed indexers: BasicIndexer((slice(None, None, None),)) () DEBUG:xarray.backends.pynio_:initial array: DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),)) DEBUG:xarray.backends.pynio_:Decomposed indexers: BasicIndexer((slice(None, None, None),)) () DEBUG:xarray.backends.pynio_:initial array: DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),)) DEBUG:xarray.backends.pynio_:Decomposed indexers: BasicIndexer((slice(None, None, None),)) () DEBUG:xarray.backends.pynio_:initial array: DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((7, slice(None, None, None), slice(None, None, None))) DEBUG:xarray.backends.pynio_:Decomposed indexers: BasicIndexer((7, slice(None, None, None), slice(None, None, None))) () DEBUG:xarray.backends.pynio_:initial array: DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((7, slice(None, None, None), slice(None, None, None))) DEBUG:xarray.backends.pynio_:Decomposed indexers: BasicIndexer((7, slice(None, None, None), slice(None, None, None))) () DEBUG:xarray.backends.pynio_:initial array: (50, 1059, 1799) ``` So, the `BasicIndexer((7, slice(None, None, None), slice(None, None, None)))` isn't getting decomposed correctly, it looks like?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, 
""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-365692868,https://api.github.com/repos/pydata/xarray/issues/1899,365692868,MDEyOklzc3VlQ29tbWVudDM2NTY5Mjg2OA==,291576,2018-02-14T18:02:17Z,2018-02-14T18:06:24Z,CONTRIBUTOR,"Ah, interesting... so, this dataset was created by doing an isel() on the original: ``` >>> ds['UGRD_P0_L105_GLC0'] [95257050 values with dtype=float32] Coordinates: * lv_HYBL0 (lv_HYBL0) float32 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 ... gridlat_0 (ygrid_0, xgrid_0) float32 ... gridlon_0 (ygrid_0, xgrid_0) float32 ... Dimensions without coordinates: ygrid_0, xgrid_0 ``` So, the original data has a 50x1059x1799 grid, and the new indexer isn't properly composing the indexer so that it fetches [7, slice(None), slice(None)] when I grab its `.values`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-365689883,https://api.github.com/repos/pydata/xarray/issues/1899,365689883,MDEyOklzc3VlQ29tbWVudDM2NTY4OTg4Mw==,291576,2018-02-14T17:52:24Z,2018-02-14T17:52:24Z,CONTRIBUTOR,"I can also confirm that the shape comes out correctly using master, so this is definitely isolated to this PR.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-365689003,https://api.github.com/repos/pydata/xarray/issues/1899,365689003,MDEyOklzc3VlQ29tbWVudDM2NTY4OTAwMw==,291576,2018-02-14T17:49:20Z,2018-02-14T17:49:20Z,CONTRIBUTOR,"Hmm, came across a bug with the pynio backend. 
Working on making a reproducible example, but just for your own inspection, here is some logging output: ``` Dimensions: (xgrid_0: 1799, ygrid_0: 1059) Coordinates: lv_HYBL0 float32 8.0 longitude (ygrid_0, xgrid_0) float32 ... latitude (ygrid_0, xgrid_0) float32 ... Dimensions without coordinates: xgrid_0, ygrid_0 Data variables: UGRD (ygrid_0, xgrid_0) float32 ... VGRD (ygrid_0, xgrid_0) float32 ... DEBUG:hiresWind.downscale:shape of a data: (50, 1059, 1799) ``` The first bit is the repr of my DataSet. The last line is output of `ds['UGRD'].values.shape`. It is supposed to be 2D, not 3D. If I revert back to v0.10.0, then the shape is (1059, 1799), just as expected.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143 https://github.com/pydata/xarray/pull/1899#issuecomment-365657502,https://api.github.com/repos/pydata/xarray/issues/1899,365657502,MDEyOklzc3VlQ29tbWVudDM2NTY1NzUwMg==,291576,2018-02-14T16:13:16Z,2018-02-14T16:13:16Z,CONTRIBUTOR,"Oh, wow... this worked like a charm for the netcdf4 backend! I have a ~13GB (uncompressed) 4-D netcdf4 variable that was giving me trouble for slicing a 2D surface out of. Here is a snippet where I am grabbing data at random indices in the last dimension. First for a specific latitude, then for the entire domain. ``` >>> CD_subset = rough['CD'][0] >>> wind_inds_decorated array([[33, 15, 25, ..., 52, 66, 35], [ 6, 8, 55, ..., 59, 6, 50], [54, 2, 40, ..., 32, 19, 9], ..., [53, 18, 23, ..., 19, 3, 43], [ 9, 11, 66, ..., 51, 39, 58], [21, 54, 37, ..., 3, 0, 65]]) Dimensions without coordinates: latitude, longitude >>> foo = CD_subset.isel(latitude=0, wind_direction=wind_inds_decorated[0]) >>> foo array([ 0.004052, 0.005915, 0.002771, ..., 0.005604, 0.004715, 0.002756], dtype=float32) Coordinates: scales int16 60 latitude float64 54.99 * longitude (longitude) float64 -130.0 -130.0 -130.0 -130.0 -130.0 ... 
wind_direction (longitude) int16 165 75 125 5 235 345 315 175 85 35 290 ... >>> foo = CD_subset.isel(wind_direction=wind_inds_decorated) >>> foo [24510501 values with dtype=float32] Coordinates: scales int16 60 * latitude (latitude) float64 54.99 54.98 54.97 54.96 54.95 54.95 ... * longitude (longitude) float64 -130.0 -130.0 -130.0 -130.0 -130.0 ... wind_direction (latitude, longitude) int64 165 75 125 5 235 345 315 175 ... ``` All previous attempts at this would result in having to load the entire 13GB array into memory just to get 93.5 MB out. Or, I would try to fetch each individual point, which took way too long. This worked faster than loading the entire thing into memory, and it used less memory, too (I think I maxed out at about 1.2GB of total usage, which is totally acceptable for my use case). I will try out similar things with the pynio and rasterio backends, and get back to you. Thanks for this work!","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295838143