
issues: 499196320


id: 499196320
node_id: MDU6SXNzdWU0OTkxOTYzMjA=
number: 3348
title: Changing dtype on v0.13.0 causes Dataset attributes to be lost
user: 17680388
state: closed
locked: 0
comments: 7
created_at: 2019-09-27T02:09:27Z
updated_at: 2020-12-24T17:47:34Z
closed_at: 2020-12-24T17:47:34Z
author_association: NONE

MCVE Code Sample

```python
import numpy as np
import pandas as pd
import xarray as xr

np.random.seed(123)

times = pd.date_range("2000-01-01", "2001-12-31", name="time")
annual_cycle = np.sin(2 * np.pi * (times.dayofyear.values / 365.25 - 0.28))

base = 10 + 15 * annual_cycle.reshape(-1, 1)
tmin_values = base + 3 * np.random.randn(annual_cycle.size, 3)
tmax_values = base + 10 + 3 * np.random.randn(annual_cycle.size, 3)

ds = xr.Dataset(
    {
        "tmin": (("time", "location"), tmin_values),
        "tmax": (("time", "location"), tmax_values),
    },
    {"time": times, "location": ["IA", "IN", "IL"]},
)

# Assign an attribute
ds = ds.assign_attrs(CRS="EPSG:4326")

# Change dtype
ds.astype(np.float32)
```

Expected Output

ds should be returned with variables of dtype np.float32 and with its attributes (e.g. CRS = 'EPSG:4326') still attached to the dataset.
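The expectation above can be written as a small check. This is only a sketch: the helper name `attrs_survive_astype` and the tiny stand-in dataset are illustrative assumptions, not part of the original report, and the result of the attribute comparison depends on the installed xarray version.

```python
import numpy as np
import xarray as xr


def attrs_survive_astype(ds, dtype):
    """Return True if Dataset-level attrs are unchanged by ds.astype(dtype).

    Hypothetical helper for illustration; not an xarray API.
    """
    converted = ds.astype(dtype)
    return dict(converted.attrs) == dict(ds.attrs)


# Small stand-in dataset (not the full MCVE data above).
small = xr.Dataset(
    {"tmin": (("time", "location"), np.zeros((2, 3)))},
    attrs={"CRS": "EPSG:4326"},
)

converted = small.astype(np.float32)
print(converted["tmin"].dtype)  # the variable itself is cast to float32
print(attrs_survive_astype(small, np.float32))  # version-dependent result
```

A `False` result from the second print would correspond to the behaviour reported for 0.13.0 below.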

Problem Description

On xarray version 0.12.1, changing the dtype of a dataset preserves any attached attributes, e.g.:

<xarray.Dataset>
Dimensions:   (location: 3, time: 731)
Coordinates:
  * location  (location) <U2 'IA' 'IN' 'IL'
  * time      (time) datetime64[ns] 2000-01-01 2000-01-02 ... 2001-12-31
Data variables:
    tmin      (time, location) float32 -8.03737 -1.7884412 ... -4.543927
    tmax      (time, location) float32 12.980549 3.3104093 ... 3.8052793
Attributes:
    CRS:      EPSG:4326

However, on xarray version 0.13.0, changing the dtype of a dataset silently drops any attached attributes, e.g.:

<xarray.Dataset>
Dimensions:   (location: 3, time: 731)
Coordinates:
  * time      (time) datetime64[ns] 2000-01-01 2000-01-02 ... 2001-12-31
  * location  (location) <U2 'IA' 'IN' 'IL'
Data variables:
    tmin      (time, location) float32 -8.03737 -1.7884412 ... -4.543927
    tmax      (time, location) float32 12.980549 3.3104093 ... 3.8052793

This causes issues with large geospatial analyses (e.g. OpenDataCube workflows), as we need to change dtype to reduce memory, but also preserve CRS information that is used for downstream tools.
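Until the behaviour is restored, one possible workaround is to copy the Dataset-level attributes back onto the converted object. The following is a minimal sketch under that assumption: `astype_keep_attrs` is a hypothetical helper name, not an xarray function, and it only restores attributes on the Dataset itself (the case reported here), not on individual variables.

```python
import numpy as np
import xarray as xr


def astype_keep_attrs(ds, dtype):
    """Cast all data variables to `dtype` and re-attach Dataset-level attrs.

    Hypothetical helper for illustration; not part of xarray.
    """
    out = ds.astype(dtype)
    # Assign a copy so the result does not share a dict with the input.
    out.attrs = dict(ds.attrs)
    return out


# Small stand-in dataset carrying the CRS attribute from the report.
example = xr.Dataset(
    {"tmin": (("time", "location"), np.zeros((2, 3)))},
    attrs={"CRS": "EPSG:4326"},
)

result = astype_keep_attrs(example, np.float32)
print(result["tmin"].dtype)
print(result.attrs)
```

Re-assigning the attributes is harmless on versions that already preserve them, so the same helper can be used across xarray releases.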

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.8 (default, Jan 14 2019, 11:02:34) [GCC 8.0.1 20180414 (experimental) [trunk revision 259383]]
python-bits: 64
OS: Linux
OS-release: 4.14.133-113.112.amzn2.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C.UTF-8
LANG: None
LOCALE: en_US.UTF-8
libhdf5: 1.10.0
libnetcdf: 4.6.0

xarray: 0.13.0
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.3.1
netCDF4: 1.3.1
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.0.24
cfgrib: None
iris: None
bottleneck: None
dask: 2.3.0
distributed: 2.3.2
matplotlib: 3.1.1
cartopy: 0.17.0
seaborn: None
numbagg: None
setuptools: 40.6.3
pip: 19.2.3
conda: None
pytest: 3.5.0
IPython: 7.8.0
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3348/reactions",
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
