html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/889#issuecomment-315783849,https://api.github.com/repos/pydata/xarray/issues/889,315783849,MDEyOklzc3VlQ29tbWVudDMxNTc4Mzg0OQ==,6079398,2017-07-17T15:09:11Z,2017-07-17T15:09:11Z,NONE,"@jhamman Sorry for the delayed response (and the even _more_ delayed PR)! I'd love to finish this up; apologies for having this fall by the wayside.
Lemme look at it a bit after work and see how much work it would take to resolve the merge conflicts, though it seems like the issue is only in the `test_conventions.py` file, so it may not be that difficult to resolve.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,161435547
https://github.com/pydata/xarray/issues/862#issuecomment-224949305,https://api.github.com/repos/pydata/xarray/issues/862,224949305,MDEyOklzc3VlQ29tbWVudDIyNDk0OTMwNQ==,6079398,2016-06-09T16:26:36Z,2016-06-09T16:26:36Z,NONE,"> This seems a little too magical to me. How would we know if the dataset dimension was added intentionally or not?
Yeah, that's a fair point. I'll put together something that uses an optional list of dimensions to concatenate over. Thanks!
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,157545837
https://github.com/pydata/xarray/issues/862#issuecomment-224787831,https://api.github.com/repos/pydata/xarray/issues/862,224787831,MDEyOklzc3VlQ29tbWVudDIyNDc4NzgzMQ==,6079398,2016-06-09T02:51:11Z,2016-06-09T02:51:52Z,NONE,"Hey @shoyer,
Sorry for the delayed response. Passing a list of dimensions over which to concatenate seems like it would be the easiest workaround with the fewest questions asked. As you mentioned, every dimension gets a variable by the time it is a dataset, so another option (which I'll admit I haven't thought all the way through and may not even work) would be to first check whether `decode_cf` is working on a `Dataset` or an `AbstractDataStore` (edit: which it already does anyway), and then decide whether or not to concatenate over a dimension. I could see the latter idea not working out so well, but I'd be curious about your thoughts.
Either way, I can put something together this week and open up a PR.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,157545837
https://github.com/pydata/xarray/issues/838#issuecomment-216092859,https://api.github.com/repos/pydata/xarray/issues/838,216092859,MDEyOklzc3VlQ29tbWVudDIxNjA5Mjg1OQ==,6079398,2016-05-02T02:07:47Z,2016-05-02T02:07:47Z,NONE,"Redeeming myself (only a little bit) from my previous message here:
@akrherz I was messing around with this a bit, and this seems to work OK. It gets rid of unnecessary dimensions, concatenates string arrays, and turns the result into a pandas DataFrame:
```
In [1]: import xarray as xr
In [2]: ds = xr.open_dataset('20160430_1600.nc', decode_cf=True, mask_and_scale=False, decode_times=False)  # xarray has an issue decoding the times, so you'll have to do that in pandas.
In [3]: vars_to_drop = [k for k in ds.variables if 'recNum' not in ds[k].dims]
In [4]: ds = ds.drop(vars_to_drop)
In [5]: df = ds.to_dataframe()
In [6]: df.info()
Int64Index: 6277 entries, 0 to 6276
Data columns (total 93 columns):
invTime 6277 non-null int32
prevRecord 6277 non-null int32
isOverflow 6277 non-null int32
secondsStage1_2 6277 non-null int32
secondsStage3 6277 non-null int32
providerId 6277 non-null object
stationId 6277 non-null object
handbook5Id 6277 non-null object
~snip~
```
A bit hacky, but it works.
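Since `decode_times=False` is needed above, the times still have to be decoded on the pandas side afterwards. A minimal sketch, assuming the timestamps are seconds since the Unix epoch (the `observationTime` column name below is hypothetical, a stand-in for whatever time variable the file actually has):

```python
import pandas as pd

# MADIS-style files store observation times as epoch seconds;
# pd.to_datetime with unit='s' converts them in one shot.
epoch_seconds = pd.Series([1462032000, 1462032300])  # stand-in for df['observationTime']
times = pd.to_datetime(epoch_seconds, unit='s')
print(times[0])  # 2016-04-30 16:00:00
```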
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420
https://github.com/pydata/xarray/issues/838#issuecomment-216090426,https://api.github.com/repos/pydata/xarray/issues/838,216090426,MDEyOklzc3VlQ29tbWVudDIxNjA5MDQyNg==,6079398,2016-05-02T01:34:38Z,2016-05-02T01:34:38Z,NONE,"@shoyer: You're right in that MADIS netCDF files are (imo) poorly formatted. There is also the issue of `xarray.decode_cf()` not concatenating the string arrays even after fixing the `_FillValue`/`missing_value` conflict (hence requiring passing `decode_cf=False` when opening the MADIS netCDF file). After looking at the `decode_cf` code, though, I don't think this is a bug per se (some quick debugging revealed that no variable in this netCDF file seems to get [past this check](https://github.com/pydata/xarray/blob/master/xarray/conventions.py#L802)), but if you feel this may in fact be a bug, I can look into it a bit more.
Unfortunately, this does mean I have to do a lot of ""manual cleaning"" of the netCDF file before exporting it as a DataFrame, but it is easy to write a set of functions to accomplish this. That said, I can't copy/paste the exact code (for work-related reasons). I'm not sure how helpful this is, but when working with MADIS netCDF data, I more or less do the following as a workaround:
1. Open up the MADIS netCDF file, fix the `_FillValue` and `missing_value` conflict in the variables.
2. Drop the variables I don't want (and there is _a lot_ of filler in MADIS netCDF files).
3. Concatenate the string arrays (e.g. `stationId`, `dataProvider`).
4. Turn into a pandas DataFrame.
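Step 3 is the only genuinely fiddly part; here is a minimal numpy-only sketch of joining a 2D char array into proper strings (the helper name and the sample station IDs are made up for illustration, not from any real MADIS file):

```python
import numpy as np

# MADIS-style files store IDs as 2D char arrays, shape (recNum, maxLen).
# Join each row of single characters into one string, dropping NUL/space padding.
def join_char_array(chars):
    rows = []
    for row in chars:
        s = ''.join(c.decode() if isinstance(c, bytes) else c for c in row)
        rows.append(s.rstrip('\x00 '))
    return np.array(rows)

ids = np.array([list('KLGA '), list('KJFK ')])
print(join_char_array(ids))  # ['KLGA' 'KJFK']
```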
Reading it over, though, that is kind of a [draw the owl](http://knowyourmeme.com/memes/how-to-draw-an-owl)-esque response. :/
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,152040420