
issue_comments

6 rows where issue = 509285415 sorted by updated_at descending

user (5 distinct values)

  • shoyer 2
  • max-sixty 1
  • crusaderky 1
  • ngreenwald 1
  • angelolab 1

author_association (2 distinct values)

  • MEMBER 4
  • NONE 2

issue (1 distinct value)

  • Cannot write array larger than 4GB with SciPy netCDF backend · 6
Columns: id, html_url, issue_url, node_id, user, created_at, updated_at, author_association, body, reactions, performed_via_github_app, issue
544312894 https://github.com/pydata/xarray/issues/3416#issuecomment-544312894 https://api.github.com/repos/pydata/xarray/issues/3416 MDEyOklzc3VlQ29tbWVudDU0NDMxMjg5NA== angelolab 56707780 2019-10-21T01:04:48Z 2019-10-21T01:04:48Z NONE

Got it, thanks everyone. I'll open this issue there. We'll also try to get our NetCDF4 compatibility issues addressed so we can avoid this size limit, since we are working with large imaging datasets.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cannot write array larger than 4GB with SciPy netCDF backend 509285415
544292045 https://github.com/pydata/xarray/issues/3416#issuecomment-544292045 https://api.github.com/repos/pydata/xarray/issues/3416 MDEyOklzc3VlQ29tbWVudDU0NDI5MjA0NQ== crusaderky 6213168 2019-10-20T20:59:40Z 2019-10-20T21:00:07Z MEMBER

The error message could clearly be improved; however, this ticket belongs on the scipy board. @angelolab, please reopen it there. All OK with closing this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cannot write array larger than 4GB with SciPy netCDF backend 509285415
544291094 https://github.com/pydata/xarray/issues/3416#issuecomment-544291094 https://api.github.com/repos/pydata/xarray/issues/3416 MDEyOklzc3VlQ29tbWVudDU0NDI5MTA5NA== shoyer 1217238 2019-10-20T20:47:59Z 2019-10-20T20:47:59Z MEMBER

SciPy's netCDF module has an additional limitation that matters for writing large files: it currently builds the entire file in memory before writing it out. Ideally it would support writing data incrementally.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cannot write array larger than 4GB with SciPy netCDF backend 509285415
544290771 https://github.com/pydata/xarray/issues/3416#issuecomment-544290771 https://api.github.com/repos/pydata/xarray/issues/3416 MDEyOklzc3VlQ29tbWVudDU0NDI5MDc3MQ== shoyer 1217238 2019-10-20T20:44:38Z 2019-10-20T20:44:38Z MEMBER

This array is 7.5 GB in size, which unfortunately is too big to be stored as a single record in the netCDF3 file format, even the 64-bit version: https://www.unidata.ucar.edu/software/netcdf/docs/file_format_specifications.html#offset_format_spec

We could potentially support writing these variables to disk with scipy, but it would require refactoring xarray's scipy netCDF backend to write record variables (with indeterminate size), which we currently don't use.

In the meantime, the best workaround is to use the netCDF4 file format, either via the netCDF4 Python package or h5netcdf (see the sketch below).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cannot write array larger than 4GB with SciPy netCDF backend 509285415
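
A minimal sketch of the workaround described in the comment above, assuming xarray plus either the netCDF4 or h5netcdf package is installed; the array shape and file name are illustrative, not taken from the issue.

import numpy as np
import xarray as xr

# Roughly 7.5 GiB of float32 data; the shape is made up for illustration, and
# actually allocating it requires a correspondingly large amount of RAM.
ds = xr.Dataset(
    {"image": (("x", "y", "z"), np.zeros((2048, 2048, 480), dtype="float32"))}
)

# The SciPy backend writes the netCDF3 format, which (as explained above)
# cannot hold a single variable this large, so this call would fail:
# ds.to_netcdf("images.nc", engine="scipy")

# The netCDF4/HDF5-based format has no such per-variable limit:
ds.to_netcdf("images.nc", engine="netcdf4")     # requires the netCDF4 package
# or, equivalently:
# ds.to_netcdf("images.nc", engine="h5netcdf")  # requires h5netcdf
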
544183597 https://github.com/pydata/xarray/issues/3416#issuecomment-544183597 https://api.github.com/repos/pydata/xarray/issues/3416 MDEyOklzc3VlQ29tbWVudDU0NDE4MzU5Nw== ngreenwald 13770365 2019-10-19T18:19:08Z 2019-10-19T18:19:33Z NONE

Yes, you're right @max-sixty, it seems like the actual IO is happening within scipy. Do you have any suggestions for how I might troubleshoot this further? Could I make a similar example that uses only the scipy netcdf functionality to drill down into where the memory error is coming from? (See the sketch below.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cannot write array larger than 4GB with SciPy netCDF backend 509285415
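
One hedged way to drill down, sketched below: reproduce the failure with scipy.io.netcdf_file alone, bypassing xarray entirely. The variable size and file name are made up, the script needs several GB of free RAM, and the exact exception may differ from the one seen through xarray.

from scipy.io import netcdf_file

n = 1_100_000_000  # ~4.4 GB of float32, just over the 4 GiB limit discussed above

f = netcdf_file("too_big.nc", "w", version=2)  # version=2 selects the 64-bit offset format
f.createDimension("x", n)
v = f.createVariable("data", "f4", ("x",))     # scipy allocates the whole variable in memory here
v[:] = 0.0
f.close()  # the file is assembled in memory and written out on close
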
544179019 https://github.com/pydata/xarray/issues/3416#issuecomment-544179019 https://api.github.com/repos/pydata/xarray/issues/3416 MDEyOklzc3VlQ29tbWVudDU0NDE3OTAxOQ== max-sixty 5635139 2019-10-19T17:44:09Z 2019-10-19T17:44:09Z MEMBER

Thanks for the issue @angelolab

Are you sure this is an xarray error? It looks like the error is being raised from scipy.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Cannot write array larger than 4GB with SciPy netCDF backend 509285415

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
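
As a usage note, the query behind this page (6 rows where issue = 509285415, ordered by updated_at descending) can also be run directly against the underlying SQLite database; the github.db file name below is an assumption, not something stated on this page.

import sqlite3

conn = sqlite3.connect("github.db")  # assumed name of the database behind this Datasette instance
conn.row_factory = sqlite3.Row

rows = conn.execute(
    """
    SELECT id, user, created_at, updated_at, author_association, body
    FROM issue_comments
    WHERE issue = 509285415      -- can use idx_issue_comments_issue
    ORDER BY updated_at DESC
    """
).fetchall()

for row in rows:
    print(row["id"], row["updated_at"], row["author_association"])

conn.close()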