html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6561#issuecomment-1116350454,https://api.github.com/repos/pydata/xarray/issues/6561,1116350454,IC_kwDOAMm_X85Ciif2,5635139,2022-05-03T17:19:23Z,2022-05-03T17:19:23Z,MEMBER,"> Thanks for the feedback and explanation. It seems the poorly constructed netCDF file is fundamentally to blame for triggering this behavior. A warning is a good idea, though.
I'm not sure it's necessarily poorly constructed. It can be quite useful to structure data like this: having aligned data of different dimensions in a single dataset is great. But the same property that makes a dataset a good format for this data also makes it a bad fit for a single table.
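To illustrate keeping each variable at its natural size, here is a rough sketch that builds one dataframe per set of dimensions. `dataframes_by_dims` is a hypothetical helper written for this comment, not an existing xarray method, and the toy dataset only stands in for the real file:

```python
from collections import defaultdict

import numpy as np
import xarray as xr


def dataframes_by_dims(dataset):
    # Hypothetical helper (not part of xarray): group data variables by their
    # dimension tuple, then convert each group to its own dataframe.
    groups = defaultdict(list)
    for name, var in dataset.data_vars.items():
        if var.dims:  # skip 0-d variables, which have no index to tabulate
            groups[var.dims].append(name)
    return {dims: dataset[names].to_dataframe() for dims, names in groups.items()}


# Toy dataset: one variable per station, one per record.
ds = xr.Dataset(
    {
        'lastChild': ('station', np.arange(3.0)),
        'snowfall_amount': ('recNum', np.arange(4.0)),
    }
)

frames = dataframes_by_dims(ds)
for dims, frame in frames.items():
    print(dims, frame.shape)
```

Each frame keeps the length of its own dimension, so nothing is broadcast against an unrelated one.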
Probably what we'd want is `to_dataframes()`, which would create a dataframe for each combination of dimensions...","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1223031600
https://github.com/pydata/xarray/issues/6561#issuecomment-1115419268,https://api.github.com/repos/pydata/xarray/issues/6561,1115419268,IC_kwDOAMm_X85Ce_KE,5635139,2022-05-02T22:09:40Z,2022-05-02T22:09:40Z,MEMBER,"Great, thanks for the example @sgdecker.
I think this is happening because there are variables of different dimensions that are getting broadcast together:
```python
In [5]: ncdata[['lastChild']].to_dataframe()
Out[5]:
         lastChild
station
0         127265.0
1              NaN
2         127492.0
3         124019.0
4              NaN
...            ...
5016      124375.0
5017      126780.0
5018      126781.0
5019      124902.0
5020       93468.0

[5021 rows x 1 columns]

In [6]: ncdata[['lastChild','snowfall_amount']].to_dataframe()
Out[6]:
                lastChild  snowfall_amount
station recNum
0       0        127265.0              NaN
        1        127265.0              NaN
        2        127265.0              NaN
        3        127265.0              NaN
        4        127265.0              NaN
...                   ...              ...
5020    127621    93468.0              NaN
        127622    93468.0              NaN
        127623    93468.0              NaN
        127624    93468.0              NaN
        127625    93468.0              NaN

[640810146 rows x 2 columns]
```
`640810146 rows` is the giveaway: that's 5021 stations × 127626 records, one row for every (station, recNum) pair.
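The same effect shows up on a toy dataset. This is only a sketch: the sizes are made up, and `expected_rows` is a hypothetical helper, not an xarray function. It relies on `to_dataframe()` producing one row per combination of all dimensions in the dataset:

```python
import math

import numpy as np
import xarray as xr

# Toy stand-in for the real file: one variable per station, one per record.
ds = xr.Dataset(
    {
        'lastChild': ('station', np.arange(3.0)),
        'snowfall_amount': ('recNum', np.arange(4.0)),
    }
)


def expected_rows(dataset):
    # Hypothetical helper: row count of to_dataframe() is the product of the
    # sizes of every dimension present in the dataset.
    return math.prod(dataset.sizes.values())


one_dim = ds[['lastChild']].to_dataframe()
both = ds[['lastChild', 'snowfall_amount']].to_dataframe()

print(len(one_dim), expected_rows(ds[['lastChild']]))
print(len(both), expected_rows(ds))
```

A check like `expected_rows` is cheap because it only reads dimension sizes, so it could run before any data is broadcast.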
I'm not sure what we could do here; I don't think there's a way of producing a single 2D dataframe from variables on different dimensions without this blow-up?
We could offer a warning on this behavior beyond a certain size — we'd take a PR for that...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1223031600