issues: 157545837
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
157545837 | MDU6SXNzdWUxNTc1NDU4Mzc= | 862 | decode_cf not concatenating string arrays | 6079398 | closed | 0 | 5 | 2016-05-30T19:05:49Z | 2019-02-26T19:51:17Z | 2019-02-26T19:51:17Z | NONE | TL;DR: OS: Tried on both OS X 11.10 and 11.11 xarray version: 0.7.2 installed via conda Python version: 2.7.11 Hey all, I'm not sure if this is a bug or the intended behavior, but running Specifically, MADIS netCDF files have
Doing this, though, doesn't result in 2D string arrays being concatenated:

```
In [50]: import xarray as xr

In [51]: fname = '20160518_1200'

In [52]: ds = xr.open_dataset(fname, decode_cf=False)

In [53]: ds.stationId
Out[53]:
<xarray.DataArray 'stationId' (recNum: 126154, maxStaIdLen: 6)>
[756924 values with dtype=|S1]
Coordinates:
  * maxStaIdLen  (maxStaIdLen) int64 0 1 2 3 4 5
  * recNum       (recNum) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 ...
Attributes:
    long_name: alphanumeric station Id
    reference: station table

In [54]: for _, v in ds.variables.iteritems():
    _fix_fillval_conflict(v)  # You can find this function in the linked gist
   ....:

In [55]: decoded_ds = xr.conventions.decode_cf(ds, concat_characters=True)

In [56]: decoded_ds.stationId
Out[56]:
<xarray.DataArray 'stationId' (recNum: 126154, maxStaIdLen: 6)>
[756924 values with dtype=|S1]
Coordinates:
  * maxStaIdLen  (maxStaIdLen) int64 0 1 2 3 4 5
  * recNum       (recNum) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 ...
Attributes:
    long_name: alphanumeric station Id
    reference: station table
```

That said, if you pass

```
In [57]: ds = xr.open_dataset(fname, decode_cf=True, mask_and_scale=False, decode_times=False)

In [58]: ds.stationId
Out[58]:
<xarray.DataArray 'stationId' (recNum: 126154)>
[126154 values with dtype=|S6]
Coordinates:
  * recNum   (recNum) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
Attributes:
    long_name: alphanumeric station Id
    reference: station table
```

We then can fix the conflict and run

To that extent, I've coded up tests and uploaded a gzipped MADIS netCDF file to DropBox if you're interested in reproducing this behavior.

Thanks! |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/862/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
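
For readers unfamiliar with what character concatenation is expected to do here, the following is a minimal sketch in plain Python, not xarray's implementation. It collapses the trailing character dimension of a 2D array of single-byte strings (the `|S1`, `maxStaIdLen: 6` layout in the report) into one byte string per record (the `|S6` layout). The helper name `join_char_rows` and the sample station IDs are hypothetical and used only for illustration.

```python
def join_char_rows(rows):
    """Join each row of single-byte strings into one byte string.

    Hypothetical helper: mimics collapsing a (recNum, maxStaIdLen)
    character array into a 1D array of strings, one per record.
    """
    joined = []
    for row in rows:
        # Each element should hold at most one byte, like an |S1 value.
        if any(len(ch) > 1 for ch in row):
            raise ValueError("expected single-character byte strings")
        # Empty elements stand in for padding and simply vanish on join.
        joined.append(b"".join(row))
    return joined


# Made-up station IDs stored as a 2 x 6 grid of single characters.
station_chars = [
    [b"K", b"D", b"E", b"N", b"", b""],
    [b"A", b"B", b"1", b"2", b"3", b"4"],
]
station_ids = join_char_rows(station_chars)
print(station_ids)
```

The behavior reported in this issue is that the equivalent step did not happen when `decode_cf` was called on an already-opened dataset, while it did happen when decoding was requested through `open_dataset`.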