issue_comments

4 rows where author_association = "MEMBER" and issue = 1475567394 sorted by updated_at descending

user 3

  • Illviljan 2
  • dcherian 1
  • TomNicholas 1

issue 1

  • Avoid loading entire dataset by getting the nbytes in an array · 4

author_association 1

  • MEMBER · 4
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1474176353 https://github.com/pydata/xarray/pull/7356#issuecomment-1474176353 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85X3iVh dcherian 2448579 2023-03-17T17:30:51Z 2023-03-17T17:31:22Z MEMBER

Because we have lazy data reading functionality
```python
import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature")
var = ds.air.variable

print(type(var._data))  # memory cached array
print(type(var._data.array.array))  # ah that's wrapping a lazy array, no data read in yet
print(var._data.size)  # can access size
print(type(var._data.array.array))  # still a lazy array

# .data forces a disk load
print(type(var.data))  # oops disk-load
print(type(var._data))  # "still memory cached array"
print(type(var._data.array.array))  # but that's wrapping numpy data in memory
```

<class 'xarray.core.indexing.MemoryCachedArray'>
<class 'xarray.core.indexing.LazilyIndexedArray'>
3869000
<class 'xarray.core.indexing.LazilyIndexedArray'>
<class 'numpy.ndarray'>
<class 'xarray.core.indexing.MemoryCachedArray'>
<class 'numpy.ndarray'>

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Avoid loading entire dataset by getting the nbytes in an array 1475567394
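The behaviour described in dcherian's comment above can be mimicked with a toy model. This is a minimal sketch under stated assumptions, not xarray's implementation: `LazyDiskArray`, `CachedArray`, and `ToyVariable` are hypothetical stand-ins, the 8-byte item size is assumed, and the shape is chosen only so that the element count matches the 3869000 printed above. It shows why reading metadata through `._data` leaves the lazy array alone while `.data` triggers a load.

```python
import math


class LazyDiskArray:
    """Hypothetical stand-in for a lazily indexed on-disk array."""

    def __init__(self, shape, itemsize):
        self.shape = shape
        self.itemsize = itemsize
        self.reads = 0  # counts simulated disk reads

    def load(self):
        self.reads += 1
        return [0.0] * math.prod(self.shape)  # pretend this came from disk


class CachedArray:
    """Hypothetical stand-in for a memory-caching wrapper."""

    def __init__(self, lazy):
        self.array = lazy  # stays lazy until the first materialize()
        self.shape = lazy.shape
        self.itemsize = lazy.itemsize

    @property
    def size(self):
        # Metadata only: no read is needed to know the element count.
        return math.prod(self.shape)

    def materialize(self):
        if isinstance(self.array, LazyDiskArray):
            self.array = self.array.load()  # swap lazy array for in-memory data
        return self.array


class ToyVariable:
    def __init__(self, data):
        self._data = data

    @property
    def data(self):
        # In this toy model, going through .data always materializes the values.
        return self._data.materialize()

    @property
    def nbytes(self):
        # Touches only metadata on ._data, so nothing is read.
        return self._data.size * self._data.itemsize


disk = LazyDiskArray(shape=(2920, 25, 53), itemsize=8)
var = ToyVariable(CachedArray(disk))

nbytes = var.nbytes
reads_after_nbytes = disk.reads  # still lazy at this point

_ = var.data
reads_after_data = disk.reads  # the wrapper now holds in-memory data
```

In this sketch the wrapper type seen through `var._data` never changes; only the object it wraps does, which mirrors the `type(...)` output in the comment.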
1474149056 https://github.com/pydata/xarray/pull/7356#issuecomment-1474149056 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85X3brA TomNicholas 35968931 2023-03-17T17:10:44Z 2023-03-17T17:10:44Z MEMBER

This came up in the xarray office hours today, and I'm confused why this PR made any difference to the behavior at all? The .data property just points to ._data, so why would it matter which one we check?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Avoid loading entire dataset by getting the nbytes in an array 1475567394
1339575144 https://github.com/pydata/xarray/pull/7356#issuecomment-1339575144 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85P2Eto Illviljan 14371165 2022-12-06T15:44:01Z 2022-12-06T15:44:01Z MEMBER

I'm not really opposed to this change; shape and dtype use self._data as well.

Without using chunks={} in open_dataset? I just find it a little odd that it's not a duck_array. What type is self._data?

This test just looked so similar to the tests in #6797. I think you can do a similar lazy test taking inspiration from: https://github.com/pydata/xarray/blob/ed60c6ccd3d6725cd91190b8796af4355f3085c2/xarray/tests/test_formatting.py#L715-L727

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Avoid loading entire dataset by getting the nbytes in an array 1475567394
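One way to write the kind of lazy test suggested in the comment above is to use an array that raises as soon as its values are requested. The following is a rough sketch only, not the test from the PR or from xarray's test suite: `NoLoadArray`, `UnexpectedLoad`, and `nbytes_from_metadata` are hypothetical names introduced for illustration.

```python
import math


class UnexpectedLoad(Exception):
    """Raised when the toy array's values are requested."""


class NoLoadArray:
    """Hypothetical array whose values must never be read."""

    def __init__(self, shape, itemsize):
        self.shape = shape
        self.itemsize = itemsize

    def __array__(self, dtype=None, copy=None):
        # Any attempt to convert to an in-memory array fails loudly.
        raise UnexpectedLoad("values were loaded")


def nbytes_from_metadata(arr):
    # Prefer a native nbytes attribute; otherwise derive it from metadata.
    if hasattr(arr, "nbytes"):
        return arr.nbytes
    return math.prod(arr.shape) * arr.itemsize


def test_nbytes_does_not_load():
    # Far larger than RAM on a laptop, so a real load would be obvious.
    arr = NoLoadArray(shape=(1000, 1000, 1000), itemsize=8)
    assert nbytes_from_metadata(arr) == 8_000_000_000


test_nbytes_does_not_load()
```

Because `__array__` raises, the test fails with `UnexpectedLoad` if the size computation ever touches the values, which addresses the "check if the values were loaded" concern without allocating any memory.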
1339423992 https://github.com/pydata/xarray/pull/7356#issuecomment-1339423992 https://api.github.com/repos/pydata/xarray/issues/7356 IC_kwDOAMm_X85P1fz4 Illviljan 14371165 2022-12-06T13:53:03Z 2022-12-06T13:53:03Z MEMBER

Is that test targeting your issue with RAM crashing the laptop? Shouldn't there be some check if the values were loaded?

How did you import your data? self.data looks like this: https://github.com/pydata/xarray/blob/ed60c6ccd3d6725cd91190b8796af4355f3085c2/xarray/core/variable.py#L420-L435

I was expecting your data to be a duck_array?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Avoid loading entire dataset by getting the nbytes in an array 1475567394


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
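
The filter shown at the top of this page (author_association = "MEMBER", issue = 1475567394, sorted by updated_at descending) can be reproduced against a table of this shape. The sketch below uses Python's built-in `sqlite3` with an in-memory database and a reduced set of columns; the row values are copied from the four comments listed above, and the trimmed schema is an assumption made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced version of the issue_comments schema (subset of columns only).
conn.execute(
    """
    CREATE TABLE issue_comments (
        id INTEGER PRIMARY KEY,
        user INTEGER,
        created_at TEXT,
        updated_at TEXT,
        author_association TEXT,
        issue INTEGER
    )
    """
)
conn.execute("CREATE INDEX idx_issue_comments_issue ON issue_comments (issue)")

rows = [
    (1339423992, 14371165, "2022-12-06T13:53:03Z", "2022-12-06T13:53:03Z", "MEMBER", 1475567394),
    (1339575144, 14371165, "2022-12-06T15:44:01Z", "2022-12-06T15:44:01Z", "MEMBER", 1475567394),
    (1474149056, 35968931, "2023-03-17T17:10:44Z", "2023-03-17T17:10:44Z", "MEMBER", 1475567394),
    (1474176353, 2448579, "2023-03-17T17:30:51Z", "2023-03-17T17:31:22Z", "MEMBER", 1475567394),
]
conn.executemany("INSERT INTO issue_comments VALUES (?, ?, ?, ?, ?, ?)", rows)

# Parameterized form of the page's filter and sort order.
query = """
    SELECT id, user, updated_at
    FROM issue_comments
    WHERE author_association = ? AND issue = ?
    ORDER BY updated_at DESC
"""
result = conn.execute(query, ("MEMBER", 1475567394)).fetchall()

for comment_id, user_id, updated_at in result:
    print(comment_id, user_id, updated_at)
```

Sorting on the TEXT column works here because the timestamps are ISO 8601 strings in the same timezone, so lexicographic order matches chronological order.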
Powered by Datasette · Queries took 13.912ms · About: xarray-datasette