issue_comments


4 rows where author_association = "MEMBER" and issue = 662505658 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
774774033 https://github.com/pydata/xarray/issues/4240#issuecomment-774774033 https://api.github.com/repos/pydata/xarray/issues/4240 MDEyOklzc3VlQ29tbWVudDc3NDc3NDAzMw== shoyer 1217238 2021-02-07T21:48:38Z 2021-02-07T21:48:38Z MEMBER

I have a tentative fix for this in https://github.com/pydata/xarray/pull/4879. It would be great if someone could give this a try to verify that it resolves the issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  jupyter repr caching deleted netcdf file 662505658
764726612 https://github.com/pydata/xarray/issues/4240#issuecomment-764726612 https://api.github.com/repos/pydata/xarray/issues/4240 MDEyOklzc3VlQ29tbWVudDc2NDcyNjYxMg== kmuehlbauer 5821660 2021-01-21T15:34:42Z 2021-01-21T15:46:36Z MEMBER

I've stumbled over this weird behaviour many times and was wondering why this happens. So AFAICT @shoyer hit the nail on the head but the root cause is that the Dataset is added to the notebook namespace somehow, if one just evaluates it in the cell.

This doesn't happen if you invoke the __repr__ via

    display(xr.open_dataset("saved_on_disk.nc"))

I've forced myself to use either print or display for xarray data. As this also happens if the Dataset is attached to a variable you would need to specifically delete (or .close()) the variable in question before opening again.

    try:
        del ds
    except NameError:
        pass
    ds = xr.open_dataset("saved_on_disk.nc")

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  jupyter repr caching deleted netcdf file 662505658
663794065 https://github.com/pydata/xarray/issues/4240#issuecomment-663794065 https://api.github.com/repos/pydata/xarray/issues/4240 MDEyOklzc3VlQ29tbWVudDY2Mzc5NDA2NQ== shoyer 1217238 2020-07-25T02:05:18Z 2020-07-25T02:05:18Z MEMBER

Probably the easiest workaround is to call .close() on the original dataset. Failing that, the file is cached in xarray.backends.file_manager.FILE_CACHE, which you could muck around with.

I believe it only gets activated by repr() because array values from the netCDF file are loaded lazily. I'm not 100% sure without more testing, though.
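
The lazy-loading point can be illustrated with a toy class. This is only a sketch under stated assumptions: `LazyValues` and `fake_read` are hypothetical names, not xarray APIs, and the real backends are more involved. It shows that constructing the object reads nothing, while `repr()` forces a read.

```python
class LazyValues:
    """Toy stand-in for lazily loaded array values (hypothetical, not an xarray class)."""

    def __init__(self, loader):
        self._loader = loader   # callable that performs the actual read
        self._values = None     # nothing is loaded yet
        self.load_count = 0

    def _load(self):
        # Read the data only on first access, then reuse it.
        if self._values is None:
            self._values = self._loader()
            self.load_count += 1
        return self._values

    def __repr__(self):
        # Showing a preview needs real values, so repr() triggers the read.
        return f"LazyValues({self._load()!r})"


def fake_read():
    # Hypothetical reader standing in for a read from a netCDF file.
    return [1, 2, 3]


values = LazyValues(fake_read)
loads_before_repr = values.load_count   # no read has happened yet
text = repr(values)                     # forces the read
loads_after_repr = values.load_count
```

If the real behaviour is similar, a notebook cell that merely evaluates a dataset would be the first point at which the underlying file is actually touched.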

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  jupyter repr caching deleted netcdf file 662505658
663790991 https://github.com/pydata/xarray/issues/4240#issuecomment-663790991 https://api.github.com/repos/pydata/xarray/issues/4240 MDEyOklzc3VlQ29tbWVudDY2Mzc5MDk5MQ== shoyer 1217238 2020-07-25T01:33:36Z 2020-07-25T01:33:36Z MEMBER

Thanks for the clear example!

This happens due to xarray's caching logic for files: https://github.com/pydata/xarray/blob/b1c7e315e8a18e86c5751a0aa9024d41a42ca5e8/xarray/backends/file_manager.py#L50-L76

This means that when you open the same filename, xarray doesn't actually reopen the file from disk -- instead it points to the same file object already cached in memory.
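
A minimal sketch of that effect, assuming a cache keyed only by file name. `DISK`, `TOY_CACHE`, `Handle`, and `open_cached` are hypothetical names for illustration; this is not xarray's actual file manager, whose cache key contains more than the file name.

```python
# Hypothetical in-memory "disk" so the sketch does not touch real files.
DISK = {"saved_on_disk.nc": "old contents"}

# Toy cache keyed only by file name (an assumption for illustration).
TOY_CACHE = {}


class Handle:
    """Toy file object that snapshots the contents at open time."""

    def __init__(self, path):
        self.contents = DISK[path]


def open_cached(path):
    # Reuse the cached handle if this file name was opened before.
    if path not in TOY_CACHE:
        TOY_CACHE[path] = Handle(path)
    return TOY_CACHE[path]


first = open_cached("saved_on_disk.nc")

# Delete the file and write a new one under the same name.
del DISK["saved_on_disk.nc"]
DISK["saved_on_disk.nc"] = "new contents"

second = open_cached("saved_on_disk.nc")
same_object = second is first        # the cache hands back the old handle
stale_contents = second.contents     # still the contents from the first open
```

Because the lookup never consults the disk on a cache hit, replacing the file under the same name goes unnoticed until the cached entry is closed or evicted.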

I can see why this could be confusing. We do need this caching logic for files opened from the same backends.*DataStore class, but this could include some sort of unique identifier (e.g., from uuid) to ensure that each separate call to xr.open_dataset results in a separately cached/opened file object: https://github.com/pydata/xarray/blob/b1c7e315e8a18e86c5751a0aa9024d41a42ca5e8/xarray/backends/netCDF4_.py#L355-L357
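
A rough sketch of the unique-identifier idea, under the assumption that the identifier simply becomes part of the cache key. `DISK`, `TOY_CACHE`, `Handle`, and `open_unique` are hypothetical names; this is not necessarily how the eventual fix is implemented.

```python
import uuid

# Hypothetical in-memory "disk" so the sketch does not touch real files.
DISK = {"saved_on_disk.nc": "old contents"}

# Toy cache; keys are (file name, unique id) pairs.
TOY_CACHE = {}


class Handle:
    """Toy file object that snapshots the contents at open time."""

    def __init__(self, path):
        self.contents = DISK[path]


def open_unique(path):
    # A fresh random identifier per call means no two calls share a key,
    # so every call opens (and caches) its own handle.
    key = (path, uuid.uuid4().hex)
    TOY_CACHE[key] = Handle(path)
    return TOY_CACHE[key]


first = open_unique("saved_on_disk.nc")

# Replace the file under the same name.
DISK["saved_on_disk.nc"] = "new contents"

second = open_unique("saved_on_disk.nc")
```

In this sketch the second call sees the new contents because it never matches the first call's key; sharing a handle would then only be possible for code that already holds the same key.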

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  jupyter repr caching deleted netcdf file 662505658


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);