issue_comments: 708594913

html_url: https://github.com/pydata/xarray/issues/2139#issuecomment-708594913
issue_url: https://api.github.com/repos/pydata/xarray/issues/2139
id: 708594913
node_id: MDEyOklzc3VlQ29tbWVudDcwODU5NDkxMw==
user: 145117
created_at: 2020-10-14T18:52:38Z
updated_at: 2020-10-14T18:52:38Z
author_association: CONTRIBUTOR
performed_via_github_app:
body:

The issue is that if you pass names=['a', 'b', 'c'] to pd.read_csv and the file has more columns than names, pandas takes all of the columns without a name and turns them into a multi-index. The bug was in my code: I had more columns than names, didn't want a multi-index, and didn't make use of usecols.
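
A minimal sketch of that behavior, using made-up inline data rather than the original file:

import io
import pandas as pd

# Five data columns but only three names: the leftover leading columns
# silently become the row index, here a two-level MultiIndex.
csv = io.StringIO("1,2,3,4,5\n6,7,8,9,10\n")
df = pd.read_csv(csv, names=['a', 'b', 'c'])
print(df.index)  # MultiIndex with two levels

# Restricting the read to the named columns keeps a plain RangeIndex.
csv.seek(0)
df = pd.read_csv(csv, names=['a', 'b', 'c'], usecols=[0, 1, 2])
print(df.index)  # RangeIndex(start=0, stop=2, step=1)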

This multi-index came from a small 12 MB file, roughly 5000 rows and 40 variables. When I then called df.to_xarray() it filled up my RAM. If I ran the code I provided above instead, it worked.
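
To make the numbers concrete, a rough sketch of why that happens (the index levels below are invented, assuming they are close to unique per row): to_xarray() unstacks the multi-index into a dense grid over the product of its levels, so memory scales with the grid, not with the number of rows.

import numpy as np
import pandas as pd

# 5000 rows with a nearly-unique two-level MultiIndex (illustrative data only).
n = 5000
df = pd.DataFrame(
    {"value": np.random.rand(n)},
    index=pd.MultiIndex.from_arrays(
        [np.arange(n), np.random.rand(n)], names=["i", "x"]
    ),
)

# df.to_xarray() would densify each column to a 5000 x 5000 float64 array
# (~0.2 GB); with 40 such variables that is roughly 8 GB, regardless of
# the 12 MB the file occupies on disk.
print(f"{n * n * 8 / 1e9:.1f} GB per densified variable, "
      f"{40 * n * n * 8 / 1e9:.0f} GB for 40 variables")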

Now that I've figured all this out, I don't think there are any bugs in xarray or pandas, just in my code. As usual :). But if the fact that I can fill RAM with df.to_xarray() but not with the 3 lines shown above sounds like an issue you want to explore, I'm happy to provide an MWE in a new ticket and tag you there. Let me know...

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: 323703742