
issue_comments


14 rows where issue = 1475567394 sorted by updated_at descending


Issue: Avoid loading entire dataset by getting the nbytes in an array (1475567394)
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1474176353 https://github.com/pydata/xarray/pull/7356#issuecomment-1474176353 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85X3iVh dcherian 2448579 2023-03-17T17:30:51Z 2023-03-17T17:31:22Z MEMBER

Because we have lazy data reading functionality:

```python
import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature")
var = ds.air.variable

print(type(var._data))              # memory cached array
print(type(var._data.array.array))  # ah, that's wrapping a lazy array; no data read in yet
print(var._data.size)               # can access size
print(type(var._data.array.array))  # still a lazy array
```

`.data` forces a disk load:

```python
print(type(var.data))               # oops, disk load
print(type(var._data))              # still a memory cached array
print(type(var._data.array.array))  # but that's now wrapping numpy data in memory
```

Output:

```
<class 'xarray.core.indexing.MemoryCachedArray'>
<class 'xarray.core.indexing.LazilyIndexedArray'>
3869000
<class 'xarray.core.indexing.LazilyIndexedArray'>
<class 'numpy.ndarray'>
<class 'xarray.core.indexing.MemoryCachedArray'>
<class 'numpy.ndarray'>
```

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Avoid loading entire dataset by getting the nbytes in an array 1475567394
1474149056 https://github.com/pydata/xarray/pull/7356#issuecomment-1474149056 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85X3brA TomNicholas 35968931 2023-03-17T17:10:44Z 2023-03-17T17:10:44Z MEMBER

This came up in the xarray office hours today, and I'm confused about why this PR made any difference to the behavior at all. The `.data` property just points to `._data`, so why would it matter which one we check?
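The distinction can be sketched with a minimal stand-in (this `LazyArray` class is hypothetical, not xarray's actual wrapper): metadata access through the wrapper leaves the values untouched, while coercing to an ndarray, which is roughly what `.data` does, forces the load.

```python
import numpy as np

# Hypothetical stand-in for xarray's lazy-loading wrapper: it knows its
# shape/dtype without materializing values, and records whether values
# were ever read.
class LazyArray:
    def __init__(self, shape, dtype):
        self.shape = shape
        self.dtype = np.dtype(dtype)
        self.loaded = False

    @property
    def size(self):
        return int(np.prod(self.shape))

    def __array__(self, dtype=None):
        # Simulates the disk read that coercion to ndarray triggers.
        self.loaded = True
        return np.zeros(self.shape, dtype=self.dtype)

lazy = LazyArray((1000, 1000), "float64")

# Metadata access (the `._data` route) does not load anything:
nbytes = lazy.size * lazy.dtype.itemsize
assert not lazy.loaded

# Coercing to ndarray (the `.data` route) forces the load:
arr = np.asarray(lazy)
assert lazy.loaded
```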

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1362322800 https://github.com/pydata/xarray/pull/7356#issuecomment-1362322800 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85RM2Vw hmaarrfk 90008 2022-12-22T02:40:59Z 2022-12-22T02:40:59Z CONTRIBUTOR

Any chance of a release? This is quite breaking for large datasets that can only be held out of memory.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1346924547 https://github.com/pydata/xarray/pull/7356#issuecomment-1346924547 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85QSHAD hmaarrfk 90008 2022-12-12T17:27:47Z 2022-12-12T17:27:47Z CONTRIBUTOR

👍🏾

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1339624818 https://github.com/pydata/xarray/pull/7356#issuecomment-1339624818 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85P2Q1y hmaarrfk 90008 2022-12-06T16:19:19Z 2022-12-06T16:19:19Z CONTRIBUTOR

Yes, without chunks or anything.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1339624418 https://github.com/pydata/xarray/pull/7356#issuecomment-1339624418 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85P2Qvi hmaarrfk 90008 2022-12-06T16:18:59Z 2022-12-06T16:18:59Z CONTRIBUTOR

Very smart test!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1339575144 https://github.com/pydata/xarray/pull/7356#issuecomment-1339575144 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85P2Eto Illviljan 14371165 2022-12-06T15:44:01Z 2022-12-06T15:44:01Z MEMBER

I'm not really opposed to this change; `shape` and `dtype` use `self._data` as well.

Without using `chunks={}` in `open_dataset`? I just find it a little odd that it's not a duck array. What type is `self._data`?

This test just looked so similar to the tests in #6797. I think you can do a similar lazy test, taking inspiration from: https://github.com/pydata/xarray/blob/ed60c6ccd3d6725cd91190b8796af4355f3085c2/xarray/tests/test_formatting.py#L715-L727

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1339457617 https://github.com/pydata/xarray/pull/7356#issuecomment-1339457617 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85P1oBR hmaarrfk 90008 2022-12-06T14:18:11Z 2022-12-06T14:18:11Z CONTRIBUTOR

The data is loaded from a NetCDF store through `open_dataset`.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1339452942 https://github.com/pydata/xarray/pull/7356#issuecomment-1339452942 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85P1m4O hmaarrfk 90008 2022-12-06T14:14:57Z 2022-12-06T14:14:57Z CONTRIBUTOR

No explicit test was added to ensure that the data wasn't loaded. I've just experienced this bug enough (we would accidentally load 100 GB files in our code base) that I knew exactly how to fix it.

If you want, I can add a test to ensure that future optimizations to `nbytes` do not trigger a data load.

I was hoping the one-line fix would be a shoo-in.
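Such a guard test might look like the following standalone sketch (the `InaccessibleArray` class here is hypothetical; in xarray's real suite one would wrap the array in an `xr.Variable`). Any attempt to read values raises, so the test fails loudly if `nbytes` ever loads data again.

```python
import numpy as np

# Hypothetical array whose values must never be read: metadata is free,
# but materialization raises.
class InaccessibleArray:
    def __init__(self, shape, dtype):
        self.shape = shape
        self.dtype = np.dtype(dtype)

    @property
    def size(self):
        return int(np.prod(self.shape))

    @property
    def nbytes(self):
        # Computed purely from metadata; no values touched.
        return self.size * self.dtype.itemsize

    def __array__(self, dtype=None):
        raise RuntimeError("values were loaded from disk")

def test_nbytes_does_not_load():
    arr = InaccessibleArray((100, 100), "float32")
    # Reading metadata must succeed without touching values...
    assert arr.nbytes == 100 * 100 * 4
    # ...while value access is exactly what we guard against:
    try:
        np.asarray(arr)
    except RuntimeError:
        pass
    else:
        raise AssertionError("expected the load attempt to raise")

test_nbytes_does_not_load()
```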

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1339423992 https://github.com/pydata/xarray/pull/7356#issuecomment-1339423992 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85P1fz4 Illviljan 14371165 2022-12-06T13:53:03Z 2022-12-06T13:53:03Z MEMBER

Is that test targeting your issue with RAM crashing the laptop? Shouldn't there be some check that the values were not loaded?

How did you import your data? `self.data` looks like this: https://github.com/pydata/xarray/blob/ed60c6ccd3d6725cd91190b8796af4355f3085c2/xarray/core/variable.py#L420-L435

I was expecting your data to be a duck array?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1336731702 https://github.com/pydata/xarray/pull/7356#issuecomment-1336731702 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85PrOg2 hmaarrfk 90008 2022-12-05T04:20:08Z 2022-12-05T04:20:08Z CONTRIBUTOR

It seems that checking `hasattr` on the `_data` variable achieves both purposes.
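In sketch form, the property then reads roughly like this (a simplified, hypothetical rendition; the real property lives in `xarray/core/variable.py` and differs in detail):

```python
import numpy as np

class VariableSketch:
    """Hypothetical stand-in for xarray's Variable, for illustration only."""

    def __init__(self, data):
        self._data = data  # lazy wrapper, duck array, or plain ndarray

    @property
    def nbytes(self):
        if hasattr(self._data, "nbytes"):
            # Ask the wrapped object directly; lazy wrappers and duck
            # arrays (e.g. sparse) report nbytes without loading values.
            return self._data.nbytes
        # Fallback: compute from metadata alone, still no load.
        return self._data.size * self._data.dtype.itemsize

v = VariableSketch(np.arange(10, dtype="int64"))
assert v.nbytes == 80
```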

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1336711830 https://github.com/pydata/xarray/pull/7356#issuecomment-1336711830 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85PrJqW hmaarrfk 90008 2022-12-05T03:58:50Z 2022-12-05T03:58:50Z CONTRIBUTOR

I think that, at the very least, the current implementation works as well as the old one for arrays defined by the sparse package.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1336700669 https://github.com/pydata/xarray/pull/7356#issuecomment-1336700669 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85PrG79 hmaarrfk 90008 2022-12-05T03:36:31Z 2022-12-05T03:36:31Z CONTRIBUTOR

Looking into the history a little more, I seem to be proposing to revert: https://github.com/pydata/xarray/commit/60f8c3d3488d377b0b21009422c6121e1c8f1f70

I think this is important since many users have arrays that are larger than memory. I found this bug when trying to access the number of bytes in a 16 GB dataset that I was trying to load on my wimpy laptop. Not fun to start swapping. I suspect others are hitting this too.

xref: https://github.com/pydata/xarray/pull/6797, https://github.com/pydata/xarray/issues/4842

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
1336696899 https://github.com/pydata/xarray/pull/7356#issuecomment-1336696899 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85PrGBD hmaarrfk 90008 2022-12-05T03:30:31Z 2022-12-05T03:30:31Z CONTRIBUTOR

I personally do not even think the `hasattr` check is really that useful. You might as well use `size` and `itemsize`.
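For what it's worth, on a plain dense ndarray the two routes agree, so the metadata-only computation is already exact there:

```python
import numpy as np

# size * itemsize equals nbytes for a dense ndarray, so computing from
# metadata alone gives the same answer without any hasattr check.
a = np.zeros((3, 4), dtype="float32")
assert a.size * a.dtype.itemsize == a.nbytes == 48
```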

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 13.635ms · About: xarray-datasette