issue_comments: 216090426

html_url: https://github.com/pydata/xarray/issues/838#issuecomment-216090426
issue_url: https://api.github.com/repos/pydata/xarray/issues/838
id: 216090426
node_id: MDEyOklzc3VlQ29tbWVudDIxNjA5MDQyNg==
user: 6079398
created_at: 2016-05-02T01:34:38Z
updated_at: 2016-05-02T01:34:38Z
author_association: NONE

@shoyer: You're right that MADIS netCDF files are (imo) poorly formatted. There is also the issue that xarray.decode_cf() does not concatenate the string arrays even after the _FillValue / missing_value conflict is fixed, which is why I have to pass decode_cf=False when opening a MADIS netCDF file. After looking at the decode_cf code, though, I don't think this is a bug per se: some quick debugging showed that no variable in this netCDF file seems to get past this check. If you feel it may in fact be a bug, I can look into it a bit more.

Unfortunately, this does mean I have to do a lot of "manual cleaning" of the netCDF file before exporting it as a DataFrame, but it is easy to write a set of functions to do that. That said, I can't c/p the exact code (for work-related reasons). I'm not sure how helpful this is, but when working with MADIS netCDF data, I more or less do the following as a workaround:

1. Open the MADIS netCDF file and fix the _FillValue / missing_value conflict in the variables.
2. Drop the variables I don't want (there is a lot of filler in MADIS netCDF files).
3. Concatenate the string arrays (e.g. stationId, dataProvider).
4. Turn the result into a pandas DataFrame.
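For anyone who wants something more concrete, here is a minimal sketch of those four steps. It is not the code I actually use. The function names (join_char_array, clean_madis, madis_to_dataframe) and the keep / char_vars parameters are made up for illustration, and it assumes the string variables are stored as 2-D (record, character) arrays of single bytes. Resolving the conflict by keeping _FillValue and dropping missing_value is also just one possible choice. The sketch does not mask fill values or decode times.

```python
import numpy as np
import xarray as xr


def join_char_array(da):
    """Join a (record, char) array of single bytes into one string per record."""
    chars = np.ascontiguousarray(da.values.astype("S1"))
    # Reinterpret each row of N one-byte cells as a single N-byte string.
    joined = chars.view(f"S{chars.shape[-1]}")[..., 0]
    # Decode to text and trim any whitespace padding.
    return [b.decode("ascii", "replace").strip() for b in joined]


def clean_madis(ds, keep, char_vars):
    """Apply steps 1-3 to a Dataset that was opened with decode_cf=False."""
    ds = ds.copy()  # work on a shallow copy

    # Step 1: where both attributes are present, keep _FillValue and
    # drop missing_value (an assumption about how to resolve the conflict).
    for name in ds.variables:
        attrs = ds.variables[name].attrs
        if "_FillValue" in attrs and "missing_value" in attrs:
            del attrs["missing_value"]

    # Step 2: keep only the variables of interest.
    ds = ds[list(keep)]

    # Step 3: replace each character array with a 1-D array of strings.
    for name in char_vars:
        record_dim = ds[name].dims[0]
        text = join_char_array(ds[name])
        ds = ds.drop_vars(name).assign({name: (record_dim, text)})

    return ds


def madis_to_dataframe(path, keep, char_vars):
    """Steps 1-4 for a file on disk; `path` is a hypothetical MADIS file."""
    raw = xr.open_dataset(path, decode_cf=False)
    # Step 4: convert the cleaned Dataset to a pandas DataFrame.
    return clean_madis(raw, keep, char_vars).to_dataframe()
```

Usage would look something like madis_to_dataframe(path, keep=["stationId", "temperature"], char_vars=["stationId"]), where the variable names depend on the particular MADIS product.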

Reading it over, though, that is kind of a "draw the owl"-esque response. :/

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: 152040420