issue_comments

37 comments where issue = 295838143 ("Vectorized lazy indexing", pydata/xarray#1899), sorted by updated_at descending

Commenters (5):
  • WeatherGod: 17
  • fujiisoup: 12
  • shoyer: 5
  • jhamman: 2
  • dopplershift: 1

Author association:
  • MEMBER: 19
  • CONTRIBUTOR: 18

374422762 · fujiisoup (MEMBER) · 2018-03-19T23:40:52Z · https://github.com/pydata/xarray/pull/1899#issuecomment-374422762

Yes, LazilyIndexedArray was renamed to LazilyOuterIndexedArray and LazilyVectorizedIndexedArray was newly added. These two backend arrays are selected depending on what kind of indexer is used.
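
For context, here is a minimal sketch of the two indexer kinds that select between those wrappers; the file name, variable name, and dimension names are hypothetical:

```python
import numpy as np
import xarray as xr

ds = xr.open_dataset("example.nc")  # hypothetical file; backend arrays stay lazy
var = ds["air"]                     # assume dims ("x", "y")

# 1-d indexers applied dimension-by-dimension -> outer (orthogonal) indexing,
# handled by LazilyOuterIndexedArray
outer = var.isel(x=[0, 2, 4], y=slice(0, 10))

# a multi-dimensional DataArray indexer -> vectorized (pointwise) indexing,
# handled by LazilyVectorizedIndexedArray
idx = xr.DataArray(np.array([[0, 1], [2, 3]]), dims=["a", "b"])
vectorized = var.isel(x=idx)
```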

374351614 · dopplershift (CONTRIBUTOR) · 2018-03-19T20:01:29Z · https://github.com/pydata/xarray/pull/1899#issuecomment-374351614

So did this remove/rename LazilyIndexedArray in 0.10.2? Because I'm getting an attribute error in the custom xarray backend I wrote: https://github.com/Unidata/siphon/blob/master/siphon/cdmr/xarray_support.py

I don't mind updating, but I wanted to make sure this was intentional.

370986433 · WeatherGod (CONTRIBUTOR) · 2018-03-07T01:08:36Z · https://github.com/pydata/xarray/pull/1899#issuecomment-370986433

:tada:

370970309 · fujiisoup (MEMBER) · 2018-03-06T23:45:13Z · https://github.com/pydata/xarray/pull/1899#issuecomment-370970309

Thanks, @WeatherGod, for your feedback. This is finally merged!

370944391 · shoyer (MEMBER) · 2018-03-06T22:01:04Z · https://github.com/pydata/xarray/pull/1899#issuecomment-370944391

OK, in it goes. Thanks, @fujiisoup!

370125916 · fujiisoup (MEMBER) · 2018-03-03T07:11:24Z · https://github.com/pydata/xarray/pull/1899#issuecomment-370125916

All done :)

368385680 · fujiisoup (MEMBER) · 2018-02-26T04:16:03Z · https://github.com/pydata/xarray/pull/1899#issuecomment-368385680

I think it's ready :)

368383877 · jhamman (MEMBER) · 2018-02-26T04:00:24Z · https://github.com/pydata/xarray/pull/1899#issuecomment-368383877

@fujiisoup - is this ready for a final review? I see you have all the tests passing 💯 !

367077311 · WeatherGod (CONTRIBUTOR) · 2018-02-20T18:43:56Z · https://github.com/pydata/xarray/pull/1899#issuecomment-367077311

I did some more investigation into the memory usage problem I was having. I had assumed that the vectorized indexed result of a lazily indexed data array would be an in-memory array. So, when I then started to use the result, it was then doing a read of all the data at once, resulting in a near-complete load of the data into memory.

I have adjusted my code to chunk out the indexing in order to keep the memory usage under control, at a reasonable performance penalty. I haven't looked into trying to identify the ideal chunking scheme for an arbitrary DataArray and indexer; perhaps we can make that a task for another day. At this point, I am satisfied with the features (negative step-sizes aside, of course).
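
A rough sketch of that chunking workaround; the helper, its chunk size, and the flattening of the indexer are my own illustration, not code from this PR:

```python
import numpy as np
import xarray as xr

def take_in_chunks(var, inds, dim, chunk=500_000):
    # Apply a large pointwise indexer in pieces so that each piece only
    # forces a small read, instead of one near-complete load.
    inds = np.asarray(inds).ravel()
    pieces = [
        var.isel({dim: xr.DataArray(inds[i:i + chunk], dims="points")}).load()
        for i in range(0, inds.size, chunk)
    ]
    # Stitch the loaded pieces back together along the flattened dimension.
    return xr.concat(pieces, dim="points")
```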

366618866 · fujiisoup (MEMBER) · 2018-02-19T08:30:01Z · https://github.com/pydata/xarray/pull/1899#issuecomment-366618866

It looks like some backends do not support negative-step slices. I'm going to wrap this up, maybe this weekend.

366379465 · WeatherGod (CONTRIBUTOR) · 2018-02-16T22:40:06Z · https://github.com/pydata/xarray/pull/1899#issuecomment-366379465

Ah-hah! OK, so the problem isn't some weird difference between the two examples I gave. The issue is that calling `np.asarray(foo)` triggered a full load of the data!
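
That behavior is by design: `np.asarray` invokes `DataArray.__array__`, which evaluates the lazy wrapper. A small sketch (the file and variable names are hypothetical, and `_data` is private xarray API):

```python
import numpy as np
import xarray as xr

ds = xr.open_dataset("rough.nc")        # hypothetical file
foo = ds["CD"].isel(wind_direction=0)   # stays lazy: the indexer is only recorded
print(type(foo.variable._data))         # a lazy indexing wrapper, not np.ndarray

arr = np.asarray(foo)                   # __array__ fires: the data is read from disk here
```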

366377467 · fujiisoup (MEMBER) · 2018-02-16T22:30:32Z · https://github.com/pydata/xarray/pull/1899#issuecomment-366377467

@WeatherGod, thanks for testing. Can you share more details? With your example, what does `wind_inds` look like? Can you share the shape and dimension names?

366376400 · WeatherGod (CONTRIBUTOR) · 2018-02-16T22:25:59Z · https://github.com/pydata/xarray/pull/1899#issuecomment-366376400

huh... now I am not so sure about that... must be something else triggering the load.

366374917 · WeatherGod (CONTRIBUTOR) · 2018-02-16T22:19:08Z · https://github.com/pydata/xarray/pull/1899#issuecomment-366374917

Also, at this point, I don't know if this is limited to the netcdf4 backend, as this type of indexing was only done on a variable I have in a netCDF file. I don't have 4-D variables in other file types.

366373577 · fujiisoup (MEMBER) · 2018-02-16T22:12:44Z (edited 2018-02-16T22:16:13Z) · https://github.com/pydata/xarray/pull/1899#issuecomment-366373577

Can you share how you tested this? The test I added says it is still in memory after vectorized indexing.

edit: `wind_inds` is a 1d-array? If this is the case, both should trigger OuterIndexing. But in both cases it should be indexed lazily...

366374041 · WeatherGod (CONTRIBUTOR) · 2018-02-16T22:14:49Z · https://github.com/pydata/xarray/pull/1899#issuecomment-366374041

`CD`, by the way, has dimensions of scales, latitude, longitude, wind_direction.

366373479 · WeatherGod (CONTRIBUTOR) · 2018-02-16T22:12:18Z · https://github.com/pydata/xarray/pull/1899#issuecomment-366373479

Ah, not a change in behavior, but a possible bug exposed by a tiny change on my part. So, I have a 4D data array, `CD`, and a data array for indexing, `wind_inds`. The following does not trigger a full load: `CD[0][wind_direction=wind_inds]`, which is good! But this does: `CD[scales=0, wind_direction=wind_inds]`, which is bad.

So, somehow, the indexing system is effectively treating these two things as different.
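
A self-contained sketch of the two access patterns, with synthetic stand-ins (shrunken shapes) for the arrays in the comment:

```python
import numpy as np
import xarray as xr

CD = xr.DataArray(
    np.random.rand(2, 4, 5, 8),
    dims=["scales", "latitude", "longitude", "wind_direction"],
)
wind_inds = xr.DataArray(
    np.random.randint(0, 8, size=(4, 5)), dims=["latitude", "longitude"]
)

# getitem-then-isel: reportedly stayed lazy against a file-backed variable
foo = CD[0].isel(wind_direction=wind_inds)

# one combined isel: reportedly triggered a full load
bar = CD.isel(scales=0, wind_direction=wind_inds)

# the results agree either way; only the laziness differed
xr.testing.assert_equal(foo, bar)
```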

366363419 · WeatherGod (CONTRIBUTOR) · 2018-02-16T21:28:09Z · https://github.com/pydata/xarray/pull/1899#issuecomment-366363419

Correction... the problem isn't with pynio... it is in the netcdf4 backend.

366360382 · WeatherGod (CONTRIBUTOR) · 2018-02-16T21:15:17Z · https://github.com/pydata/xarray/pull/1899#issuecomment-366360382

Something changed. Now the indexing for pynio is forcing a full loading of the data.

366059694 · WeatherGod (CONTRIBUTOR) · 2018-02-15T20:59:20Z · 👍 1 · https://github.com/pydata/xarray/pull/1899#issuecomment-366059694

I can confirm that with the latest changes, the pynio tests now pass locally for me. Now, as to whether or not the tests in there are actually exercising anything useful is a different question.

365729433 · WeatherGod (CONTRIBUTOR) · 2018-02-14T20:07:55Z · https://github.com/pydata/xarray/pull/1899#issuecomment-365729433

I am working on re-activating those tests. I think PyNio is now available for python3, too.

365727175 · jhamman (MEMBER) · 2018-02-14T19:59:36Z · https://github.com/pydata/xarray/pull/1899#issuecomment-365727175

@WeatherGod - you are right, all the pynio tests are being skipped on travis. I'll open a separate issue for that. Yikes!

365722413 · WeatherGod (CONTRIBUTOR) · 2018-02-14T19:43:07Z · https://github.com/pydata/xarray/pull/1899#issuecomment-365722413

It looks like the pynio backend isn't regularly tested, as several of its tests currently fail when I run them locally. Some of them are failing because they assert NotImplementedErrors for operations that are now implemented.

365708385 · WeatherGod (CONTRIBUTOR) · 2018-02-14T18:55:43Z · https://github.com/pydata/xarray/pull/1899#issuecomment-365708385

Just did some more debugging, putting in some debug statements within NioArrayWrapper.__getitem__():

```diff
diff --git a/xarray/backends/pynio_.py b/xarray/backends/pynio_.py
index c7e0ddf..b9f7151 100644
--- a/xarray/backends/pynio_.py
+++ b/xarray/backends/pynio_.py
@@ -27,16 +27,24 @@ class NioArrayWrapper(BackendArray):
         return self.datastore.ds.variables[self.variable_name]
 
     def __getitem__(self, key):
+        import logging
+        logger = logging.getLogger(__name__)
+        logger.addHandler(logging.NullHandler())
+        logger.debug("initial key: %s", key)
         key, np_inds = indexing.decompose_indexer(key, self.shape, mode='outer')
+        logger.debug("Decomposed indexers:\n%s\n%s", key, np_inds)
         with self.datastore.ensure_open(autoclose=True):
             array = self.get_array()
+            logger.debug("initial array: %r", array)
             if key == () and self.ndim == 0:
                 return array.get_value()
 
             for ind in np_inds:
+                logger.debug("indexer: %s", ind)
                 array = indexing.NumpyIndexingAdapter(array)[ind]
+                logger.debug("intermediate array: %r", array)
 
             return array
```

And here is the test script (data not included):

```python
import logging

import xarray as xr

logging.basicConfig(level=logging.DEBUG)

fname1 = '../hrrr.t12z.wrfnatf02.grib2'
ds = xr.open_dataset(fname1, engine='pynio')
subset_isel = ds.isel(lv_HYBL0=7)
sp = subset_isel['UGRD_P0_L105_GLC0'].values.shape
```

And here is the relevant output:

```
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((slice(None, None, None),))
()
DEBUG:xarray.backends.pynio_:initial array: <Nio.NioVariable object at 0x7f0f3c339210>
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((slice(None, None, None),))
()
DEBUG:xarray.backends.pynio_:initial array: <Nio.NioVariable object at 0x7f0f3c339b90>
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((slice(None, None, None),))
()
DEBUG:xarray.backends.pynio_:initial array: <Nio.NioVariable object at 0x7f0f3c339d50>
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((slice(None, None, None),))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((slice(None, None, None),))
()
DEBUG:xarray.backends.pynio_:initial array: <Nio.NioVariable object at 0x7f0f3c339d90>
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((7, slice(None, None, None), slice(None, None, None)))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((7, slice(None, None, None), slice(None, None, None)))
()
DEBUG:xarray.backends.pynio_:initial array: <Nio.NioVariable object at 0x7f0f3c339190>
DEBUG:xarray.backends.pynio_:initial key: BasicIndexer((7, slice(None, None, None), slice(None, None, None)))
DEBUG:xarray.backends.pynio_:Decomposed indexers:
BasicIndexer((7, slice(None, None, None), slice(None, None, None)))
()
DEBUG:xarray.backends.pynio_:initial array: <Nio.NioVariable object at 0x7f0f3c339190>
(50, 1059, 1799)
```

So, the `BasicIndexer((7, slice(None, None, None), slice(None, None, None)))` isn't getting decomposed correctly, it looks like?

365692868 · WeatherGod (CONTRIBUTOR) · 2018-02-14T18:02:17Z (edited 2018-02-14T18:06:24Z) · https://github.com/pydata/xarray/pull/1899#issuecomment-365692868

Ah, interesting... so, this dataset was created by doing an isel() on the original:

```
>>> ds['UGRD_P0_L105_GLC0']
<xarray.DataArray 'UGRD_P0_L105_GLC0' (lv_HYBL0: 50, ygrid_0: 1059, xgrid_0: 1799)>
[95257050 values with dtype=float32]
Coordinates:
  * lv_HYBL0   (lv_HYBL0) float32 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 ...
    gridlat_0  (ygrid_0, xgrid_0) float32 ...
    gridlon_0  (ygrid_0, xgrid_0) float32 ...
Dimensions without coordinates: ygrid_0, xgrid_0
```

So, the original data has a 50x1059x1799 grid, and the new indexer isn't properly composing the indexer so that it fetches [7, slice(None), slice(None)] when I grab its .values.

365689883 · WeatherGod (CONTRIBUTOR) · 2018-02-14T17:52:24Z · https://github.com/pydata/xarray/pull/1899#issuecomment-365689883

I can also confirm that the shape comes out correctly using master, so this is definitely isolated to this PR.

365689003 · WeatherGod (CONTRIBUTOR) · 2018-02-14T17:49:20Z · https://github.com/pydata/xarray/pull/1899#issuecomment-365689003

Hmm, came across a bug with the pynio backend. Working on making a reproducible example, but just for your own inspection, here is some logging output:

```
<xarray.Dataset>
Dimensions:    (xgrid_0: 1799, ygrid_0: 1059)
Coordinates:
    lv_HYBL0   float32 8.0
    longitude  (ygrid_0, xgrid_0) float32 ...
    latitude   (ygrid_0, xgrid_0) float32 ...
Dimensions without coordinates: xgrid_0, ygrid_0
Data variables:
    UGRD       (ygrid_0, xgrid_0) float32 ...
    VGRD       (ygrid_0, xgrid_0) float32 ...
DEBUG:hiresWind.downscale:shape of a data: (50, 1059, 1799)
```

The first bit is the repr of my Dataset. The last line is the output of ds['UGRD'].values.shape: it comes out 3-D when it is supposed to be 2-D.

If I revert back to v0.10.0, then the shape is (1059, 1799), just as expected.

365657502 · WeatherGod (CONTRIBUTOR) · 2018-02-14T16:13:16Z · 👍 1 · https://github.com/pydata/xarray/pull/1899#issuecomment-365657502

Oh, wow... this worked like a charm for the netcdf4 backend! I have a ~13GB (uncompressed) 4-D netcdf4 variable that was giving me trouble when slicing a 2D surface out of it. Here is a snippet where I am grabbing data at random indices in the last dimension, first for a specific latitude, then for the entire domain:

```
>>> CD_subset = rough['CD'][0]
>>> wind_inds_decorated
<xarray.DataArray (latitude: 3501, longitude: 7001)>
array([[33, 15, 25, ..., 52, 66, 35],
       [ 6,  8, 55, ..., 59,  6, 50],
       [54,  2, 40, ..., 32, 19,  9],
       ...,
       [53, 18, 23, ..., 19,  3, 43],
       [ 9, 11, 66, ..., 51, 39, 58],
       [21, 54, 37, ...,  3,  0, 65]])
Dimensions without coordinates: latitude, longitude
>>> foo = CD_subset.isel(latitude=0, wind_direction=wind_inds_decorated[0])
>>> foo
<xarray.DataArray 'CD' (longitude: 7001)>
array([ 0.004052,  0.005915,  0.002771, ...,  0.005604,  0.004715,  0.002756], dtype=float32)
Coordinates:
    scales          int16 60
    latitude        float64 54.99
  * longitude       (longitude) float64 -130.0 -130.0 -130.0 -130.0 -130.0 ...
    wind_direction  (longitude) int16 165 75 125 5 235 345 315 175 85 35 290 ...
>>> foo = CD_subset.isel(wind_direction=wind_inds_decorated)
>>> foo
<xarray.DataArray 'CD' (latitude: 3501, longitude: 7001)>
[24510501 values with dtype=float32]
Coordinates:
    scales          int16 60
  * latitude        (latitude) float64 54.99 54.98 54.97 54.96 54.95 54.95 ...
  * longitude       (longitude) float64 -130.0 -130.0 -130.0 -130.0 -130.0 ...
    wind_direction  (latitude, longitude) int64 165 75 125 5 235 345 315 175 ...
```

All previous attempts at this would result in having to load the entire 13GB array into memory just to get 93.5 MB out. Or, I would try to fetch each individual point, which took way too long. This worked faster than loading the entire thing into memory, and it used less memory, too (I think I maxed out at about 1.2GB of total usage, which is totally acceptable for my use case).

I will try out similar things with the pynio and rasterio backends, and get back to you. Thanks for this work!

364755370 · fujiisoup (MEMBER) · 2018-02-11T14:25:40Z (edited 2018-02-11T19:49:04Z) · https://github.com/pydata/xarray/pull/1899#issuecomment-364755370

Based on the suggestion, I implemented the lazy vectorized indexing with index-consolidation.

Now, every backend is virtually compatible with all the indexer types, i.e. basic, outer, and vectorized indexers.

It sometimes consumes a large amount of memory if the indexer cannot be decomposed efficiently, but it is always better than loading the full slice. The drawback is that it is hard to predict how much data will be loaded.

364625973 · fujiisoup (MEMBER) · 2018-02-10T04:47:04Z · https://github.com/pydata/xarray/pull/1899#issuecomment-364625973

> There are some obvious fail cases, e.g., if they want to pull out indices array[[1, -1], [1, -1]], in which case the entire array needs to be sliced.

If the backend supports orthogonal indexing (not only basic indexing), we can do `array[[1, -1]][:, [1, -1]]`, load the 2x2 array, then apply the vectorized indexing `[[0, 1], [0, 1]]`.

But if we want a full diagonal, we need a full slice anyway...
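
A numpy sketch of that decomposition for the `[[1, -1], [1, -1]]` case:

```python
import numpy as np

arr = np.arange(36).reshape(6, 6)

direct = arr[[1, -1], [1, -1]]     # pointwise: elements (1, 1) and (-1, -1)

# orthogonal step: only the 2x2 block spanned by the index sets is loaded ...
block = arr[[1, -1]][:, [1, -1]]
# ... then a small in-memory vectorized indexer picks the wanted elements
decomposed = block[[0, 1], [0, 1]]

np.testing.assert_array_equal(direct, decomposed)
```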

> Also, we would want to avoid separating basic/vectorized for backends that support efficient vectorized indexing (scipy and zarr).

OK. Agreed. We may need a flag that can be accessed from the array wrapper.

364625429 · shoyer (MEMBER) · 2018-02-10T04:33:44Z · https://github.com/pydata/xarray/pull/1899#issuecomment-364625429

> in case we want to get three diagonal elements (1, 1), (2, 2), (3, 3) from a 1000x1000 array. What we want is array[[1, 2, 3], [1, 2, 3]]. It can be decomposed to array[1:4, 1:4][[0, 1, 2], [0, 1, 2]]. We only need to load the 3 x 3 part of the 1000 x 1000 array, then take its diagonal elements.

OK, this is pretty clever.

There are some obvious fail cases, e.g., if they want to pull out indices array[[1, -1], [1, -1]], in which case the entire array needs to be sliced. I wonder if we should try to detect these with some heuristics, e.g., if the size of the result is much (maybe 10x or 100x) smaller than the size of sliced arrays.

Also, we would want to avoid separating basic/vectorized for backends that support efficient vectorized indexing (scipy and zarr).

364616100 · fujiisoup (MEMBER) · 2018-02-10T01:47:54Z · https://github.com/pydata/xarray/pull/1899#issuecomment-364616100

I am inclined toward option 1, as there is some benefit even for backends without vectorized-indexing support, e.g. in case we want to get three diagonal elements (1, 1), (2, 2), (3, 3) from a 1000x1000 array. What we want is `array[[1, 2, 3], [1, 2, 3]]`. It can be decomposed to `array[1:4, 1:4][[0, 1, 2], [0, 1, 2]]`: we only need to load the 3 x 3 part of the 1000 x 1000 array, then take its diagonal elements.

A drawback is that it is difficult for users to predict how much memory is necessary.
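
The diagonal example written out in numpy, to make the decomposition concrete:

```python
import numpy as np

arr = np.arange(1000 * 1000).reshape(1000, 1000)

direct = arr[[1, 2, 3], [1, 2, 3]]   # three diagonal elements

# a basic slice covering the indexer's bounding box: a backend would only
# need to read these 3x3 values ...
window = arr[1:4, 1:4]
# ... followed by a vectorized index expressed relative to the window
decomposed = window[[0, 1, 2], [0, 1, 2]]

np.testing.assert_array_equal(direct, decomposed)
```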

364583951 · shoyer (MEMBER) · 2018-02-09T22:10:43Z · https://github.com/pydata/xarray/pull/1899#issuecomment-364583951

I think the design choice here really comes down to whether we want to enable VectorizedIndexing on arbitrary data on disk or not:

Is it better to:

1. Always allow vectorized indexing by means of (lazily) loading all indexed data into memory as a single chunk. This could potentially be very expensive for IO or memory in hard-to-predict ways.
2. Only allow vectorized indexing if a backend supports it directly. This ensures that when vectorized indexing works, it works efficiently. Vectorized indexing is still possible, but you have to explicitly write .compute()/.load().

I think I slightly prefer option (2) but I can see the merits in either decision.

364573996 · shoyer (MEMBER) · 2018-02-09T21:30:40Z · https://github.com/pydata/xarray/pull/1899#issuecomment-364573996

Reason 2 is the primary one. We want to load the minimum amount of data possible into memory, mostly because pulling data from disk is slow.

364573328 · fujiisoup (MEMBER) · 2018-02-09T21:28:26Z · https://github.com/pydata/xarray/pull/1899#issuecomment-364573328

Thanks, @shoyer. Do you think it is possible to consolidate transpose also? We need it to keep our logic in Variable._broadcast_indexing.

I am wondering what computation cost we want to avoid by the lazy indexing:

1. The indexing itself is expensive, so we want to minimize the number of indexing operations?
2. The original data is too large to fit into memory, and we want to load the smallest subset of the original array by the lazy indexing?

If reason 2 is the common case, I think it is not a good idea to consolidate all the lazy indexing as VectorizedIndexer, since most backends do not support vectorized indexing, which means we would need to load the whole array into memory before any indexing operation. (But still, it would be valuable to consolidate all the indexers after the first vectorized indexer, since we can decompose any VectorizedIndexer into a pair of successive outer and smaller vectorized indexers.)

And I am also wondering whether, as pointed out in #1725, what I am doing now has already been implemented in dask.

364529325 · shoyer (MEMBER) · 2018-02-09T19:07:39Z · https://github.com/pydata/xarray/pull/1899#issuecomment-364529325

I figured out how to consolidate two vectorized indexers, as long as they don't include any slice objects:

```python
import numpy as np

def index_vectorized_indexer(old_indexer, applied_indexer):
    return tuple(o[applied_indexer] for o in np.broadcast_arrays(*old_indexer))

for x, old, applied in [
    (np.arange(10), (np.arange(2, 7),), (np.array([3, 2, 1]),)),
    (np.arange(10), (np.arange(6).reshape(2, 3),), (np.arange(2), np.arange(1, 3))),
    (-np.arange(1, 21).reshape(4, 5),
     (np.arange(3)[:, None], np.arange(4)[None, :]),
     (np.arange(3), np.arange(3))),
]:
    new_key = index_vectorized_indexer(old, applied)
    np.testing.assert_array_equal(x[old][applied], x[new_key])
```

We could probably make this work with VectorizedIndexer if we converted the slice objects to arrays. I think we might even already have some code to do that conversion somewhere. So another option would be to convert BasicIndexer and OuterIndexer -> VectorizedIndexer if necessary and then use this path.
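
One possible way to materialize those slices, so that everything becomes an integer-array indexer (a sketch; `slice.indices` is standard Python):

```python
import numpy as np

def slice_to_array(sl, size):
    # expand a slice into the equivalent integer index array
    return np.arange(*sl.indices(size))

x = np.arange(10)
np.testing.assert_array_equal(
    x[slice_to_array(slice(2, 8, 2), x.size)], x[2:8:2]
)
```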

364442081 · fujiisoup (MEMBER) · 2018-02-09T14:04:16Z · https://github.com/pydata/xarray/pull/1899#issuecomment-364442081

I noticed the lazy vectorized indexing can be (sometimes) optimized by decomposing the vectorized indexers into successive outer and vectorized indexers, so that the size of the array to be loaded into memory is minimized.


Table schema:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);