issues: 53599413
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
53599413 | MDU6SXNzdWU1MzU5OTQxMw== | 305 | xray.concat incorrectly converts string arrays to numeric dtype | 10430506 | closed | 0 | 1 | 2015-01-07T05:35:47Z | 2015-01-07T18:16:02Z | 2015-01-07T18:14:31Z | NONE | String DataArrays containing mostly representations of numbers are converted to int or float during concatenation, even when they contain non-convertible values. a = xray.DataArray(np.arange(6).reshape(3, 2).astype(str), dims=['x','y']) b = xray.DataArray(np.arange(12).reshape(3, 4).astype(str), dims=['x','y']) a.dtype, b.dtype Out: (dtype('S21'), dtype('S21')) a[0,0] = 'foo' a Out: <xray.DataArray (x: 3, y: 2)> array([['foo', '1'], ['2', '3'], ['4', '5']], dtype='|S21') Coordinates: - x (x) int64 0 1 2 - y (y) int64 0 1 xray.concat([a,b], dim='y') Out: <xray.DataArray (x: 3, y: 6)> array([[ nan, 1., 0., 1., 2., 3.], [ 2., 3., 4., 5., 6., 7.], [ 4., 5., 8., 9., 10., 11.]]) Coordinates: - x (x) int64 0 1 2 - y (y) int64 0 1 0 1 2 3 Why is this converted to float, with non-convertible strings mapped to nan? Without setting the 'foo' value, the result is an int array. The same problem happens when directly calling the inner functions: xray.Dataset._concat([a._dataset, b._dataset], dim='y')[None] and xray.Variable.concat([a._dataset._arrays[None], b._dataset._arrays[None]], 'y') |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/305/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
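
For comparison with the behaviour reported in the issue body, the following is a minimal sketch of the same concatenation done with plain NumPy instead of xray. It does not reproduce or explain the xray bug; it only illustrates the result the reporter appears to expect, namely that the string dtype survives and 'foo' is kept as-is. It assumes Python 3, where `astype(str)` yields a unicode dtype rather than the `S21` byte-string dtype shown in the report.

```python
import numpy as np

# Rebuild the two string arrays from the report using plain NumPy
# (assumption: no xray involved, so labelled dimensions are replaced by axes).
a = np.arange(6).reshape(3, 2).astype(str)
b = np.arange(12).reshape(3, 4).astype(str)
a[0, 0] = "foo"  # a value that cannot be parsed as a number

# Concatenate along the second axis, the counterpart of dim='y' in the report.
result = np.concatenate([a, b], axis=1)

# dtype.kind is 'U' for unicode strings (Python 3) and 'S' for byte strings.
assert result.dtype.kind in ("U", "S")
assert result.shape == (3, 6)
assert result[0, 0] == "foo"
```

A check of this kind on `dtype.kind` could also serve as a regression assertion against the output of `xray.concat`, assuming the returned object exposes a `dtype` attribute as the `a.dtype` call in the report suggests.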