html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6561#issuecomment-1116350454,https://api.github.com/repos/pydata/xarray/issues/6561,1116350454,IC_kwDOAMm_X85Ciif2,5635139,2022-05-03T17:19:23Z,2022-05-03T17:19:23Z,MEMBER,"> Thanks for the feedback and explanation. It seems the poorly constructed netCDF file is fundamentally to blame for triggering this behavior. A warning is a good idea, though.

I'm not sure it's necessarily poorly constructed. Structuring data like this can be quite useful, and having aligned data of different dimensions in a single dataset is great. But the same property that makes a dataset a good format for this data makes a single table a bad one.

Probably what we'd want is `to_dataframes()`, which would create a dataframe for each combination of dimensions...","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1223031600
https://github.com/pydata/xarray/issues/6561#issuecomment-1116344892,https://api.github.com/repos/pydata/xarray/issues/6561,1116344892,IC_kwDOAMm_X85CihI8,8419421,2022-05-03T17:13:02Z,2022-05-03T17:13:02Z,NONE,"Thanks for the feedback and explanation. It seems the poorly constructed netCDF file is fundamentally to blame for triggering this behavior. A warning is a good idea, though.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1223031600
https://github.com/pydata/xarray/issues/6561#issuecomment-1115419268,https://api.github.com/repos/pydata/xarray/issues/6561,1115419268,IC_kwDOAMm_X85Ce_KE,5635139,2022-05-02T22:09:40Z,2022-05-02T22:09:40Z,MEMBER,"Great, thanks for the example, @sgdecker.

I think this is happening because there are variables of different dimensions that are getting broadcast together:

```python
In [5]: ncdata[['lastChild']].to_dataframe()
Out[5]:
         lastChild
station
0         127265.0
1              NaN
2         127492.0
3         124019.0
4              NaN
...            ...
5016      124375.0
5017      126780.0
5018      126781.0
5019      124902.0
5020       93468.0

[5021 rows x 1 columns]

In [6]: ncdata[['lastChild','snowfall_amount']].to_dataframe()
Out[6]:
                lastChild  snowfall_amount
station recNum
0       0        127265.0              NaN
        1        127265.0              NaN
        2        127265.0              NaN
        3        127265.0              NaN
        4        127265.0              NaN
...                   ...              ...
5020    127621    93468.0              NaN
        127622    93468.0              NaN
        127623    93468.0              NaN
        127624    93468.0              NaN
        127625    93468.0              NaN

[640810146 rows x 2 columns]

```

`640810146 rows` is the giveaway.
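
To spell out where that number comes from, here is a quick check. The two sizes are read off the index ranges printed above (`station` runs 0..5020, `recNum` runs 0..127625). The memory figure is only a rough estimate that assumes both columns are float64 and ignores the index.

```python
# Dimension sizes taken from the index ranges in the output above.
n_station = 5021      # station runs 0..5020
n_recnum = 127626     # recNum runs 0..127625

# Broadcasting the two variables against each other yields one row
# per (station, recNum) pair, i.e. the product of the two sizes.
rows = n_station * n_recnum
print(rows)  # 640810146

# Rough size of the values alone, ASSUMING two float64 columns (8 bytes each).
approx_bytes = rows * 2 * 8
print(f'{approx_bytes / 1e9:.1f} GB')
```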

I'm not sure what we could do here. I don't think there's a way of producing a single 2D dataframe without blowing it up to the full product of the dimensions.
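
A rough sketch of one way to sidestep the blow-up is to convert variables one group at a time, where a group is every variable sharing the same dimensions. This does not give a single table, just one dataframe per set of dimensions. `to_dataframes` below is a hypothetical helper, not an existing xarray method, and the tiny dataset is synthetic; only the variable and dimension names are borrowed from the example above.

```python
from collections import defaultdict

import numpy as np
import xarray as xr


def to_dataframes(ds):
    # Hypothetical helper (not part of xarray): build one DataFrame
    # per distinct tuple of dimensions found among the data variables.
    groups = defaultdict(list)
    for name, var in ds.data_vars.items():
        if var.dims:  # skip 0-d variables, which have no index to tabulate
            groups[var.dims].append(name)
    return {dims: ds[names].to_dataframe() for dims, names in groups.items()}


# Small synthetic stand-in for the dataset discussed in this issue.
ds = xr.Dataset(
    {
        'lastChild': ('station', np.arange(3.0)),
        'snowfall_amount': ('recNum', np.arange(4.0)),
    }
)

frames = to_dataframes(ds)
for dims, df in frames.items():
    # Each frame has only as many rows as its own dimensions require.
    print(dims, df.shape)
```

Here each frame keeps its own length (3 and 4 rows), whereas `ds.to_dataframe()` on the same dataset would broadcast to 3 * 4 = 12 rows.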

We could offer a warning on this behavior beyond a certain size — we'd take a PR for that...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1223031600