issue_comments

where author_association = "MEMBER", issue = 277538485 and user = 1217238 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
356390513 https://github.com/pydata/xarray/issues/1745#issuecomment-356390513 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1NjM5MDUxMw== shoyer 1217238 2018-01-09T19:36:10Z 2018-01-09T19:36:10Z MEMBER

Both the warning message and the upstream anaconda issue seem like good ideas to me.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset() memory error in v0.10 277538485
352152392 https://github.com/pydata/xarray/issues/1745#issuecomment-352152392 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MjE1MjM5Mg== shoyer 1217238 2017-12-16T01:58:02Z 2017-12-16T01:58:02Z MEMBER

If upgrading to a newer version of netcdf4-python isn't an option, we might need to figure out a workaround for xarray.

It seems that anaconda is still distributing netCDF4 1.2.4, which doesn't help here.

351788352 https://github.com/pydata/xarray/issues/1745#issuecomment-351788352 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTc4ODM1Mg== shoyer 1217238 2017-12-14T17:58:05Z 2017-12-14T17:58:05Z MEMBER

Can you reproduce this just using netCDF4-python?

Try:

```python
import netCDF4

ds = netCDF4.Dataset(path)
print(ds)
print(ds.filepath())
```

If so, it would be good to file a bug upstream.

Actually, it looks like this might be https://github.com/Unidata/netcdf4-python/issues/506

351783850 https://github.com/pydata/xarray/issues/1745#issuecomment-351783850 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTc4Mzg1MA== shoyer 1217238 2017-12-14T17:41:05Z 2017-12-14T17:41:11Z MEMBER

I think there is probably a bug buried inside the netCDF4.Dataset.filepath() method somewhere. For example, on netCDF4-python 1.2.4, this would crash if you have any non-ASCII characters in the path. But that doesn't seem to be the issue here.

351780487 https://github.com/pydata/xarray/issues/1745#issuecomment-351780487 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTc4MDQ4Nw== shoyer 1217238 2017-12-14T17:28:37Z 2017-12-14T17:28:37Z MEMBER

@braaannigan can you try adding `print(repr(path))` to `is_remote_uri()` so we can see exactly what these offending strings look like?
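A minimal sketch of the suggested instrumentation, assuming `is_remote_uri` looks like the helper quoted later in this thread (the name and regex come from that comment; the rest is illustrative):

```python
import re

def is_remote_uri(path):
    # Temporary debugging aid: show exactly which string is being tested,
    # including any hidden characters in the path
    print(repr(path))
    return bool(re.search(r'^https?\://', path))
```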

351779445 https://github.com/pydata/xarray/issues/1745#issuecomment-351779445 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTc3OTQ0NQ== shoyer 1217238 2017-12-14T17:24:40Z 2017-12-14T17:24:40Z MEMBER

`re.match(pattern, string)` is equivalent to `re.search('^' + pattern, string)`, so arguably this is the cleaner solution anyway. But ideally I'd like to understand why this is a problem for you, so we can fix the underlying cause and not hit it again.
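The equivalence can be checked directly; the sample paths below are illustrative:

```python
import re

pattern = r'https?\://'
for s in ['https://example.com/data.nc', 'http://host/x.nc', '/local/path.nc']:
    # re.match is anchored at the start of the string, so it agrees with
    # re.search using an explicit '^' anchor on the same pattern
    assert bool(re.match(pattern, s)) == bool(re.search('^' + pattern, s))
```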

351765967 https://github.com/pydata/xarray/issues/1745#issuecomment-351765967 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTc2NTk2Nw== shoyer 1217238 2017-12-14T16:41:19Z 2017-12-14T16:41:19Z MEMBER

@braaannigan what about replacing `re.search('^https?\://', path)` with `re.match('https?\://', path)`? Can you share the output of running `python -c 'import sys; print(sys.getfilesystemencoding())'` at the command line? Also, please try `engine='scipy'` or `engine='h5netcdf'` with `open_dataset`. The output of `xarray.show_versions()` would also be helpful.
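A minimal sketch of the proposed change (the helper name and regex are taken from this thread; treat it as illustrative rather than the actual xarray source):

```python
import re

def is_remote_uri(path):
    # re.match is implicitly anchored at the start of the string,
    # so the explicit '^' needed with re.search is dropped
    return bool(re.match(r'https?\://', path))
```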

351470450 https://github.com/pydata/xarray/issues/1745#issuecomment-351470450 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM1MTQ3MDQ1MA== shoyer 1217238 2017-12-13T17:54:54Z 2017-12-13T17:54:54Z MEMBER

@braaannigan Can you share the name of your problematic file?

One possibility is that re.search() is not thread-safe, even though I don't think we call is_remote_uri from multiple threads. We can test that by adding a lock and seeing if that resolves the issue. Try replacing is_remote_uri with:

```python
import re
import threading

LOCK = threading.Lock()

def is_remote_uri(path):
    with LOCK:
        return bool(re.search('^https?\://', path))
```

347819491 https://github.com/pydata/xarray/issues/1745#issuecomment-347819491 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM0NzgxOTQ5MQ== shoyer 1217238 2017-11-29T10:34:25Z 2017-11-29T10:34:25Z MEMBER

(405*282*37)*20*8 bytes = 676 MB, so running out of memory here seems plausible to me.
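The arithmetic above, spelled out (assuming float64 values, i.e. 8 bytes each):

```python
# One array of shape (405, 282, 37), times 20 files, times 8 bytes per float64
nbytes = 405 * 282 * 37 * 20 * 8
print(nbytes)        # 676123200 bytes
print(nbytes / 1e6)  # ~676 MB
```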

347811473 https://github.com/pydata/xarray/issues/1745#issuecomment-347811473 https://api.github.com/repos/pydata/xarray/issues/1745 MDEyOklzc3VlQ29tbWVudDM0NzgxMTQ3Mw== shoyer 1217238 2017-11-29T10:03:51Z 2017-11-29T10:03:51Z MEMBER

I think this was introduced by https://github.com/pydata/xarray/pull/1551, where we started loading coordinates that are compared for equality into memory. This speeds up open_mfdataset, but does increase memory usage.

We might consider adding an option for reduced memory usage at the price of speed. @crusaderky @jhamman @rabernat any thoughts?

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · About: xarray-datasette