html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/838#issuecomment-216086274,https://api.github.com/repos/pydata/xarray/issues/838,216086274,MDEyOklzc3VlQ29tbWVudDIxNjA4NjI3NA==,1217238,2016-05-02T00:52:51Z,2016-05-02T00:52:51Z,MEMBER,"This is (arguably) a NumPy bug -- the problem is that the `to_dataframe()` call is trying to create an array with roughly 8.7e30 elements!

```
ipdb> shape
(72, 55, 60, 10, 4, 512, 51, 51, 12, 80, 3, 8, 6, 11, 5000, 25, 24, 6277, 24)
ipdb> np.prod(shape)
-8804073483760828416
ipdb> np.prod(np.asarray(shape, dtype=float))
8.6981676921852312e+30
```

Note that the integer product silently overflows int64 and wraps around to a negative number; only the float product reflects the true size.

The underlying cause is that these MADIS netCDFs have loads of dimensions, corresponding to strings (and other stuff, if I recall correctly):

```
Dimensions: (ICcheckNameLen: 72, ICcheckNum: 55, QCcheckNameLen: 60, QCcheckNum: 10, maxHomeWFOlen: 4, maxLDADmessageLen: 512, maxLDADtestLen: 51, maxNameLength: 51, maxProviderIdLen: 12, maxRemark: 80, maxSkyCover: 3, maxSkyLen: 8, maxStaIdLen: 6, maxStaTypeLen: 11, maxStaticIds: 5000, maxWeatherLen: 25, nInventoryBins: 24, recNum: 6277, totalIdLen: 24)
```

xarray here tries to build a MultiIndex for the DataFrame out of the outer product of all these dimensions.

It would be nice to have a better fix here, but it's not immediately obvious to me what that would be.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420