html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/838#issuecomment-589823579,https://api.github.com/repos/pydata/xarray/issues/838,589823579,MDEyOklzc3VlQ29tbWVudDU4OTgyMzU3OQ==,210858,2020-02-21T20:32:01Z,2020-02-21T20:32:01Z,NONE,"Just to note that the issue still happens today with numpy=1.18.1, xarray=0.15.0, pandas=1.0.1:
```
>>> df = nc.to_dataframe()
Traceback (most recent call last):
  File ""<stdin>"", line 1, in <module>
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/xarray/core/dataset.py"", line 4465, in to_dataframe
    return self._to_dataframe(self.dims)
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/xarray/core/dataset.py"", line 4451, in _to_dataframe
    data = [
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/xarray/core/dataset.py"", line 4452, in <listcomp>
    self._variables[k].set_dims(ordered_dims).values.reshape(-1)
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/xarray/core/variable.py"", line 1345, in set_dims
    expanded_data = duck_array_ops.broadcast_to(self.data, tmp_shape)
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/xarray/core/duck_array_ops.py"", line 47, in f
    return wrapped(*args, **kwargs)
  File ""<__array_function__ internals>"", line 5, in broadcast_to
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/numpy/lib/stride_tricks.py"", line 182, in broadcast_to
    return _broadcast_to(array, shape, subok=subok, readonly=True)
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/numpy/lib/stride_tricks.py"", line 125, in _broadcast_to
    it = np.nditer(
ValueError: iterator is too large
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420
https://github.com/pydata/xarray/issues/838#issuecomment-589784874,https://api.github.com/repos/pydata/xarray/issues/838,589784874,MDEyOklzc3VlQ29tbWVudDU4OTc4NDg3NA==,26384082,2020-02-21T18:50:16Z,2020-02-21T18:50:16Z,NONE,"In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity.
If this issue remains relevant, please comment here or remove the `stale` label; otherwise it will be marked as closed automatically.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420
https://github.com/pydata/xarray/issues/838#issuecomment-375755201,https://api.github.com/repos/pydata/xarray/issues/838,375755201,MDEyOklzc3VlQ29tbWVudDM3NTc1NTIwMQ==,26440884,2018-03-23T18:12:26Z,2018-03-23T18:12:26Z,NONE,"Something maybe of interest.
I recently converted some tools we have for doing the above from Python 2 to 3. When the files were read in, the byte chars were not converted to strings. I couldn't actually get this to work on the xarray side and had to loop through the DataFrame columns, applying `.decode(""utf-8"")` to each value to decode them properly. I'm assuming this might be in the NetCDF4 library, but I'm not 100% sure.
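What I ended up with was roughly this (just a sketch; it assumes the undecoded columns come back as `object` columns holding `bytes`):
```
# df is the DataFrame from ds.to_dataframe(); decode any leftover byte strings
for col in df.columns:
    if df[col].dtype == object:
        df[col] = df[col].map(
            lambda v: v.decode('utf-8') if isinstance(v, bytes) else v)
```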
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420
https://github.com/pydata/xarray/issues/838#issuecomment-216093339,https://api.github.com/repos/pydata/xarray/issues/838,216093339,MDEyOklzc3VlQ29tbWVudDIxNjA5MzMzOQ==,210858,2016-05-02T02:15:22Z,2016-05-02T02:15:22Z,NONE,"@mogismog Awesome, thanks so much for the workaround :)
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420
https://github.com/pydata/xarray/issues/838#issuecomment-216092859,https://api.github.com/repos/pydata/xarray/issues/838,216092859,MDEyOklzc3VlQ29tbWVudDIxNjA5Mjg1OQ==,6079398,2016-05-02T02:07:47Z,2016-05-02T02:07:47Z,NONE,"Redeeming myself (only a little bit) from my previous message here:
@akrherz I was messing around with this a bit, and this seems to work ok. It gets rid of unnecessary dimensions, concatenates string arrays, and turns the result into a pandas DataFrame:
```
In [1]: import xarray as xr
In [2]: ds = xr.open_dataset('20160430_1600.nc', decode_cf=True, mask_and_scale=False, decode_times=False) # xarray has an issue decoding the times, so you'll have to do this in pandas.
In [3]: vars_to_drop = [k for k in ds.variables.iterkeys() if ('recNum' not in ds[k].dims)]
In [4]: ds = ds.drop(vars_to_drop)
In [5]: df = ds.to_dataframe()
In [6]: df.info()
Int64Index: 6277 entries, 0 to 6276
Data columns (total 93 columns):
invTime 6277 non-null int32
prevRecord 6277 non-null int32
isOverflow 6277 non-null int32
secondsStage1_2 6277 non-null int32
secondsStage3 6277 non-null int32
providerId 6277 non-null object
stationId 6277 non-null object
handbook5Id 6277 non-null object
~snip~
```
A bit hacky, but it works.
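For the times, decoding afterwards in pandas is one more line (a sketch; I'm assuming a seconds-since-epoch time variable like `observationTime` here, so swap in the real name):
```
import pandas as pd

# assuming the time variable is seconds since 1970-01-01
df['observationTime'] = pd.to_datetime(df['observationTime'], unit='s')
```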
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420
https://github.com/pydata/xarray/issues/838#issuecomment-216090677,https://api.github.com/repos/pydata/xarray/issues/838,216090677,MDEyOklzc3VlQ29tbWVudDIxNjA5MDY3Nw==,210858,2016-05-02T01:37:59Z,2016-05-02T01:37:59Z,NONE,"I thought of something: is the issue here with the `unlimited` record dimension?
```
netcdf \20160430_1600 {
dimensions:
....
recNum = UNLIMITED ; // (2845 currently)
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420
https://github.com/pydata/xarray/issues/838#issuecomment-216090426,https://api.github.com/repos/pydata/xarray/issues/838,216090426,MDEyOklzc3VlQ29tbWVudDIxNjA5MDQyNg==,6079398,2016-05-02T01:34:38Z,2016-05-02T01:34:38Z,NONE,"@shoyer: You're right in that MADIS netCDF files are (imo) poorly formatted. There is also the issue of `xarray.decode_cf()` not concatenating the string arrays even after fixing the `_FillValue`, `missing_value` conflict (hence requiring passing `decode_cf=False` when opening up the MADIS netCDF file). After looking at the `decode_cf` code, though, I don't think this is a bug per se (some quick debugging revealed that it doesn't seem like any variable in this netCDF file gets [past this check](https://github.com/pydata/xarray/blob/master/xarray/conventions.py#L802)), though if you feel this may in fact be a bug, I can look a bit more into it.
Unfortunately, this does mean I have to do a lot of ""manual cleaning"" of the netCDF file before exporting as a DataFrame, but it is easy to write a set of functions to accomplish this for you. That said, I can't c/p the exact code (for work-related reasons). I'm not sure how helpful this is, but when working with MADIS netCDF data, I more or less do the following as a workaround:
1. Open up the MADIS netCDF file, fix the `_FillValue` and `missing_value` conflict in the variables.
2. Drop the variables I don't want (and there is _a lot_ of filler in MADIS netCDF files).
3. Concatenate the string arrays (e.g. `stationId`, `dataProvider`).
4. Turn into a pandas DataFrame.
Reading over it, that is kind of a [draw the owl](http://knowyourmeme.com/memes/how-to-draw-an-owl)-esque response, so here is a rough sketch of those four steps.
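(Just a sketch, not the exact code; the string variable names and the way the attribute conflict is resolved are assumptions, so adjust them for the file at hand.)
```
import xarray as xr
from netCDF4 import chartostring

def madis_to_dataframe(path, record_dim='recNum',
                       string_vars=('stationId', 'dataProvider')):
    # 1. open without CF decoding so the _FillValue/missing_value conflict
    #    doesn't raise, then resolve it by keeping only _FillValue
    ds = xr.open_dataset(path, decode_cf=False)
    for var in ds.variables.values():
        if '_FillValue' in var.attrs and 'missing_value' in var.attrs:
            del var.attrs['missing_value']

    # 2. drop everything that isn't a per-record variable
    to_drop = [k for k in ds.variables if record_dim not in ds[k].dims]
    ds = ds.drop(to_drop)

    # 3. concatenate the character arrays into real strings
    #    (decode_cf won't do it here, so use netCDF4.chartostring)
    for name in string_vars:
        if name in ds:
            ds[name] = (record_dim, chartostring(ds[name].values))

    # apply the rest of the CF decoding now that the attributes agree;
    # leave the times alone and decode those in pandas afterwards
    ds = xr.decode_cf(ds, decode_times=False)

    # 4. export
    return ds.to_dataframe()
```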
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420