issue_comments: 216086274

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/838#issuecomment-216086274 https://api.github.com/repos/pydata/xarray/issues/838 216086274 MDEyOklzc3VlQ29tbWVudDIxNjA4NjI3NA== 1217238 2016-05-02T00:52:51Z 2016-05-02T00:52:51Z MEMBER

This is (arguably) a NumPy bug: the to_dataframe() call is trying to create an array with roughly 8.7e30 elements, and the integer product of the shape silently overflows to a negative number instead of raising an error.

ipdb> shape
(72, 55, 60, 10, 4, 512, 51, 51, 12, 80, 3, 8, 6, 11, 5000, 25, 24, 6277, 24)
ipdb> np.prod(shape)
-8804073483760828416
ipdb> np.prod(np.asarray(shape, dtype=float))
8.6981676921852312e+30
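To make the overflow concrete, here is a minimal pure-Python sketch of what is probably happening in that session. It computes the exact product with arbitrary-precision integers and then reduces it into the signed 64-bit range. The helper `wrap_int64` is a hypothetical name used only for illustration; it emulates fixed-width wraparound and is not a NumPy function.

```python
import math

# Shape taken from the debugger session above.
shape = (72, 55, 60, 10, 4, 512, 51, 51, 12, 80, 3, 8, 6, 11,
         5000, 25, 24, 6277, 24)

INT64_MAX = 2**63 - 1


def wrap_int64(n: int) -> int:
    """Reduce n modulo 2**64 into the signed 64-bit range (hypothetical helper)."""
    return (n + 2**63) % 2**64 - 2**63


# Python integers never overflow, so this is the true element count.
exact = math.prod(shape)

# What a fixed-width 64-bit integer product would be left with.
wrapped = wrap_int64(exact)

print(exact > INT64_MAX)      # True: the count does not fit in int64
print(f"{float(exact):.4e}")  # 8.6982e+30
print(wrapped)                # some value inside the int64 range
```

The float product in the session avoids the wraparound only because floating point trades exactness for range.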

The problem is that these MADIS netCDFs have loads of dimensions, corresponding to strings (and other stuff, if I recall correctly):

<xarray.Dataset>
Dimensions:  (ICcheckNameLen: 72, ICcheckNum: 55, QCcheckNameLen: 60, QCcheckNum: 10, maxHomeWFOlen: 4, maxLDADmessageLen: 512, maxLDADtestLen: 51, maxNameLength: 51, maxProviderIdLen: 12, maxRemark: 80, maxSkyCover: 3, maxSkyLen: 8, maxStaIdLen: 6, maxStaTypeLen: 11, maxStaticIds: 5000, maxWeatherLen: 25, nInventoryBins: 24, recNum: 6277, totalIdLen: 24)

xarray here tries to build a MultiIndex for the DataFrame out of the outer product of all these dimensions. It would be nice to have a better fix here, but it's not immediately obvious to me what that would be.
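One possible stopgap, sketched here as an assumption rather than anything xarray is known to do, is to compute the size of the outer product with exact Python integers before building the frame, and fail with a readable error when it is absurdly large. The names `check_frame_size` and `max_rows`, and the default limit, are hypothetical.

```python
import math
from typing import Mapping


def check_frame_size(dims: Mapping[str, int], max_rows: int = 10**8) -> int:
    """Return the row count of the outer product of dims, or raise if too large.

    Hypothetical guard: `max_rows` is an arbitrary illustrative limit.
    """
    # math.prod on Python ints is exact, so the comparison cannot overflow.
    rows = math.prod(dims.values())
    if rows > max_rows:
        raise ValueError(
            f"outer product of {len(dims)} dimensions has {rows:.3e} rows, "
            f"which exceeds the limit of {max_rows}"
        )
    return rows


# A few of the dimensions from the dataset above.
some_dims = {"recNum": 6277, "maxStaticIds": 5000, "maxLDADmessageLen": 512}

print(check_frame_size({"recNum": 6277}))  # 6277

try:
    check_frame_size(some_dims)
except ValueError as err:
    print(err)
```

With a check like this, the caller would see which conversion is infeasible and could select only the variables indexed by a single dimension such as recNum before converting.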

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  152040420