issues: 730569820
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
730569820 | MDU6SXNzdWU3MzA1Njk4MjA= | 4543 | Coordinate dtype changing to object after xr.concat | 22179202 | closed | 0 | 1 | 2020-10-27T15:41:05Z | 2021-01-13T17:09:06Z | 2021-01-13T17:09:06Z | NONE | What happened: The dtype of DataArray coordinates changes after concatenation using xr.concat.

What you expected to happen: The dtype of DataArray coordinates stays the same.

Minimal Complete Verifiable Example: Below I create two examples. The first shows the issue on the coords associated with the concatenated dimension. In the second I use different dtypes, and the problem appears on both dimensions.

Example 1:
```python
import numpy as np
import xarray as xr

da1 = xr.DataArray(data=np.arange(4).reshape([2, 2]),
                   dims=["x1", "x2"],
                   coords={"x1": np.array([0, 1]),
                           "x2": np.array(['a', 'b'])})
da2 = xr.DataArray(data=np.arange(4).reshape([2, 2]),
                   dims=["x1", "x2"],
                   coords={"x1": np.array([1, 2]),
                           "x2": np.array(['c', 'd'])})
da_joined = xr.concat([da1, da2], dim="x2")

print("coord x1 dtype:")
print("in da1:", da1.coords["x1"].data.dtype)
print("in da2:", da2.coords["x1"].data.dtype)
print("after concat:", da_joined.coords["x1"].data.dtype)
# This is in line with expectations:
# coord x1 dtype:
# in da1: int64
# in da2: int64
# after concat: int64

print("coord x2 dtype")
print("in da1:", da1.coords["x2"].data.dtype)
print("in da2:", da2.coords["x2"].data.dtype)
print("after concat:", da_joined.coords["x2"].data.dtype)
# coord x2 dtype
# in da1: <U1
# in da2: <U1
# after concat: object  # This is the problem: it should still be <U1
```

Example 2:
```python
import numpy as np
import xarray as xr

da1 = xr.DataArray(data=np.arange(4).reshape([2, 2]),
                   dims=["x1", "x2"],
                   coords={"x1": np.array([b'\x00', b'\x01']),
                           "x2": np.array(['a', 'b'])})
da2 = xr.DataArray(data=np.arange(4).reshape([2, 2]),
                   dims=["x1", "x2"],
                   coords={"x1": np.array([b'\x01', b'\x02']),
                           "x2": np.array(['c', 'd'])})
da_joined = xr.concat([da1, da2], dim="x2")

# Output of the same print statements as in Example 1:
# coord x1 dtype:
# in da1: |S1
# in da2: |S1
# after concat: object  # This is the problem: it should still be |S1
# coord x2 dtype
# in da1: <U1
# in da2: <U1
# after concat: object  # This is the problem: it should still be <U1
```

Anything else we need to know: This seems related to https://github.com/pydata/xarray/issues/1266

Environment: Ubuntu 18.04, python 3.7.9, xarray 0.16.1

Output of xr.show_versions():

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.9 (default, Aug 31 2020, 12:42:55) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-51-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None
xarray: 0.16.1
pandas: 0.25.3
numpy: 1.19.1
scipy: 1.5.3
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 50.3.0
pip: 20.2.4
conda: None
pytest: None
IPython: 7.18.1
sphinx: None |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4543/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
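As a possible stopgap for the behaviour reported above, the sketch below casts an object-dtype coordinate array back to the fixed-width string dtype of its inputs. It uses NumPy only. `restore_string_dtype` is a hypothetical helper name, not part of xarray, and the approach assumes every object in the joined array really is a `str` (or `bytes`) value taken from the original coordinates.

```python
import numpy as np


def restore_string_dtype(joined_values, original_arrays):
    """Cast an object-dtype array back to the string dtype of its inputs.

    Hypothetical helper for illustration; it is not an xarray function.
    """
    joined_values = np.asarray(joined_values)
    dtypes = [np.asarray(a).dtype for a in original_arrays]
    kinds = {dt.kind for dt in dtypes}

    # Only act when the result is object dtype and every input shared the
    # same fixed-width kind: "U" (str) or "S" (bytes).
    if joined_values.dtype != object or len(kinds) != 1 or not kinds <= {"U", "S"}:
        return joined_values

    # Promote across inputs so the widest string still fits (e.g. <U1 and <U3 give <U3).
    target = np.result_type(*dtypes)
    return joined_values.astype(target)


# Coordinates as in Example 1, plus the object array that concat produced.
x2_a = np.array(["a", "b"])
x2_b = np.array(["c", "d"])
as_object = np.array(["a", "b", "c", "d"], dtype=object)

restored = restore_string_dtype(as_object, [x2_a, x2_b])
print(restored.dtype)  # a fixed-width str dtype again, not object
```

With xarray, the restored array could then be written back using `DataArray.assign_coords`. Whether the index keeps that dtype may depend on the xarray version, so it is worth checking the result.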