html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/838#issuecomment-589823579,https://api.github.com/repos/pydata/xarray/issues/838,589823579,MDEyOklzc3VlQ29tbWVudDU4OTgyMzU3OQ==,210858,2020-02-21T20:32:01Z,2020-02-21T20:32:01Z,NONE,"Just to denote that the issue still happens today with numpy=1.18.1, xarray=0.15.0, pandas=1.0.1
```
>>> df = nc.to_dataframe()
Traceback (most recent call last):
  File """", line 1, in
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/xarray/core/dataset.py"", line 4465, in to_dataframe
    return self._to_dataframe(self.dims)
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/xarray/core/dataset.py"", line 4451, in _to_dataframe
    data = [
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/xarray/core/dataset.py"", line 4452, in
    self._variables[k].set_dims(ordered_dims).values.reshape(-1)
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/xarray/core/variable.py"", line 1345, in set_dims
    expanded_data = duck_array_ops.broadcast_to(self.data, tmp_shape)
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/xarray/core/duck_array_ops.py"", line 47, in f
    return wrapped(*args, **kwargs)
  File ""<__array_function__ internals>"", line 5, in broadcast_to
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/numpy/lib/stride_tricks.py"", line 182, in broadcast_to
    return _broadcast_to(array, shape, subok=subok, readonly=True)
  File ""/opt/miniconda3/envs/prod/lib/python3.8/site-packages/numpy/lib/stride_tricks.py"", line 125, in _broadcast_to
    it = np.nditer(
ValueError: iterator is too large
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420
https://github.com/pydata/xarray/issues/838#issuecomment-589784874,https://api.github.com/repos/pydata/xarray/issues/838,589784874,MDEyOklzc3VlQ29tbWVudDU4OTc4NDg3NA==,26384082,2020-02-21T18:50:16Z,2020-02-21T18:50:16Z,NONE,"In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity
If this issue remains relevant, please comment here or remove the `stale` label; otherwise it will be marked as closed automatically ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420
https://github.com/pydata/xarray/issues/838#issuecomment-375755201,https://api.github.com/repos/pydata/xarray/issues/838,375755201,MDEyOklzc3VlQ29tbWVudDM3NTc1NTIwMQ==,26440884,2018-03-23T18:12:26Z,2018-03-23T18:12:26Z,NONE,"Something maybe of interest. I recently converted some tools we have to do the above from Python 2 to 3. When the files were read in the byte chars were not converted to strings. I couldn't actually get this to work on the xarray side and had to loop through the DataFrame columns with apply(.decode(""utf-8"")) to decode them properly. I'm assuming this might be in the NetCDF4 library, but not 100% sure. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420 https://github.com/pydata/xarray/issues/838#issuecomment-216093339,https://api.github.com/repos/pydata/xarray/issues/838,216093339,MDEyOklzc3VlQ29tbWVudDIxNjA5MzMzOQ==,210858,2016-05-02T02:15:22Z,2016-05-02T02:15:22Z,NONE,"@mogismog Awesome, thanks so much for the workaround :) ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420 https://github.com/pydata/xarray/issues/838#issuecomment-216092859,https://api.github.com/repos/pydata/xarray/issues/838,216092859,MDEyOklzc3VlQ29tbWVudDIxNjA5Mjg1OQ==,6079398,2016-05-02T02:07:47Z,2016-05-02T02:07:47Z,NONE,"Redeeming myself (only a little bit) from my previous message here: @akrherz Was messing around with this a bit, this seems to work ok. This gets rid of unnecessary dimensions, concatenates string arrays, and turns it into a pandas DataFrame: ``` [In [1]: import xarray as xr In [2]: ds = xr.open_dataset('20160430_1600.nc', decode_cf=True, mask_and_scale=False, decode_times=False) # xarray has issue decoding the times, so you'll have to do this in pandas. In [3]: vars_to_drop = [k for k in ds.variables.iterkeys() if ('recNum' not in ds[k].dims)] In [4]: ds = ds.drop(vars_to_drop) In [5]: df = ds.to_dataframe() In [6]: df.info() Int64Index: 6277 entries, 0 to 6276 Data columns (total 93 columns): invTime 6277 non-null int32 prevRecord 6277 non-null int32 isOverflow 6277 non-null int32 secondsStage1_2 6277 non-null int32 secondsStage3 6277 non-null int32 providerId 6277 non-null object stationId 6277 non-null object handbook5Id 6277 non-null object](url) ~snip~ ``` A bit hacky, but it works. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420 https://github.com/pydata/xarray/issues/838#issuecomment-216090677,https://api.github.com/repos/pydata/xarray/issues/838,216090677,MDEyOklzc3VlQ29tbWVudDIxNjA5MDY3Nw==,210858,2016-05-02T01:37:59Z,2016-05-02T01:37:59Z,NONE,"I thought of something, is the issue here with the `unlimited` record dimension? ``` netcdf \20160430_1600 { dimensions: .... recNum = UNLIMITED ; // (2845 currently) ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420 https://github.com/pydata/xarray/issues/838#issuecomment-216090426,https://api.github.com/repos/pydata/xarray/issues/838,216090426,MDEyOklzc3VlQ29tbWVudDIxNjA5MDQyNg==,6079398,2016-05-02T01:34:38Z,2016-05-02T01:34:38Z,NONE,"@shoyer: You're right in that MADIS netCDF files are (imo) poorly formatted. There is also the issue of `xarray.decode_cf()` not concatenating the string arrays even after fixing the `_FillValue`, `missing_value` conflict (hence requiring passing `decode_cf=False` when opening up the MADIS netCDF file). After looking at the `decode_cf` code, though, I don't think this is a bug per se (some quick debugging revealed that it doesn't seem like any variable in this netCDF file gets [past this check](https://github.com/pydata/xarray/blob/master/xarray/conventions.py#L802)), though if you feel this may in fact be a bug, I can look a bit more into it. 
Unfortunately, this does mean I have to do a lot of ""manual cleaning"" of the netCDF file before exporting as a DataFrame, but it is easy to write a set of functions to accomplish this for you. That said, I can't c/p the exact code (for work-related reasons). I'm not sure how helpful this is, but when working with MADIS netCDF data, I more or less do the following as a workaround:

1. Open up the MADIS netCDF file, fix the `_FillValue` and `missing_value` conflict in the variables.
2. Drop the variables I don't want (and there is _a lot_ of filler in MADIS netCDF files).
3. Concatenate the string arrays (e.g. `stationId`, `dataProvider`).
4. Turn into a pandas DataFrame.

Though reading over it, that is kind of a [draw the owl](http://knowyourmeme.com/memes/how-to-draw-an-owl)-esque response, though. :/ ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420
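
The ""iterator is too large"" traceback in the first comment comes from `Dataset.to_dataframe()` broadcasting every variable against the full set of dimensions before assembling the frame. Below is a minimal sketch with toy data (not the MADIS file from the thread) showing that the resulting row count is the product of all dimension sizes, which is the quantity that overflows NumPy's iterator on large files with many independent dimensions:

```
import numpy as np
import xarray as xr

# Toy stand-in for a file whose variables live on unrelated dimensions.
ds = xr.Dataset(
    {
        "x": (("a",), np.arange(3)),
        "y": (("b",), np.arange(4)),
        "z": (("c",), np.arange(5)),
    }
)

# to_dataframe() broadcasts each variable to dims (a, b, c), so the frame has
# 3 * 4 * 5 = 60 rows; with many large dimensions this product is what makes
# numpy's broadcast_to/nditer fail with "iterator is too large".
df = ds.to_dataframe()
print(len(df))  # 60
```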
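
A hedged sketch (untested against a real MADIS file) combining the workaround from the 2016 comments with the byte-decoding fix from the 2018 comment: the filename and the `recNum` dimension name are taken from the thread and are assumptions about your file, `.items()` replaces the Python 2 `.iterkeys()` shown in the In [3] step, `drop_vars` is the newer spelling of the `ds.drop(...)` call, and UTF-8 is assumed for the character data.

```
import xarray as xr

# Open the MADIS file as in the In [2] step above; times get handled later in pandas.
ds = xr.open_dataset(
    "20160430_1600.nc",  # filename taken from the thread; adjust for your file
    decode_cf=True,
    mask_and_scale=False,
    decode_times=False,
)

# Keep only variables defined along the unlimited record dimension.
vars_to_drop = [name for name, var in ds.variables.items() if "recNum" not in var.dims]
ds = ds.drop_vars(vars_to_drop)  # older xarray releases spell this ds.drop(vars_to_drop)

df = ds.to_dataframe()

# Under Python 3 the character variables can come back as bytes objects
# (see the 2018 comment); decode those columns to str.
for col in df.columns:
    if df[col].dtype == object and df[col].map(lambda v: isinstance(v, bytes)).any():
        df[col] = df[col].str.decode("utf-8")

df.info()
```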