1 row where user = 11723107 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date), closed_at (date)

issue 4282: Values change when writing combined Dataset loaded with open_mfdataset

  • id: 667203487 · node_id: MDU6SXNzdWU2NjcyMDM0ODc=
  • user: chpolste (11723107)
  • state: closed
  • comments: 1
  • created_at: 2020-07-28T16:20:09Z
  • updated_at: 2022-04-09T03:00:55Z
  • closed_at: 2022-04-09T03:00:55Z
  • author_association: NONE
  • repo: xarray
  • type: issue

What happened:

Loading two netCDF files with open_mfdataset and then writing the result to a combined file changes some of the values in the written file.

What you expected to happen:

That the written file, when read back, contains the same values as the in-memory Dataset.

Minimal Complete Verifiable Example:

```python
>>> import numpy as np
>>> import xarray as xr
>>> data1 = xr.open_dataset("file1.nc")
>>> data2 = xr.open_dataset("file2.nc")
>>> merged = xr.open_mfdataset(["file1.nc", "file2.nc"])
>>> np.all(np.isclose(merged["u"].values[0], data1["u"].values[0]))
True
>>> np.all(np.isclose(merged["u"].values[-1], data2["u"].values[-1]))
True
>>> merged.to_netcdf("foo.nc")
>>> merged_file = xr.load_dataset("foo.nc")
>>> np.all(np.isclose(merged_file["u"].values, merged["u"].values))
False
```

The files contain wind data from the ERA5 reanalysis, downloaded from CDS.

Anything else we need to know?:

The issue might be related to the scale and offset values of the variable. Continuing the example:

```python
>>> np.all(np.isclose(merged_file["u"].values[0], data1["u"].values[0]))
True
>>> np.all(np.isclose(merged_file["u"].values[-1], data2["u"].values[-1]))
False
```

Data from the first file seems to be correct. When writing the combined dataset, the scale and offset from the first file are written to the combined file:

```python
>>> data1_nomas = xr.open_dataset("file1.nc", mask_and_scale=False)
>>> data2_nomas = xr.open_dataset("file2.nc", mask_and_scale=False)
>>> merged_file_nomas = xr.open_dataset("foo.nc", mask_and_scale=False)
>>> data1_nomas["u"].attrs
{'scale_factor': 0.002397265127278432, 'add_offset': 25.620963232670736, '_FillValue': -32767, 'missing_value': -32767, 'units': 'm s-1', 'long_name': 'U component of wind', 'standard_name': 'eastward_wind'}
>>> data2_nomas["u"].attrs
{'scale_factor': 0.0024358825557859445, 'add_offset': 21.288035293585388, '_FillValue': -32767, 'missing_value': -32767, 'units': 'm s-1', 'long_name': 'U component of wind', 'standard_name': 'eastward_wind'}
>>> merged_file_nomas["u"].attrs
{'scale_factor': 0.002397265127278432, 'add_offset': 25.620963232670736, '_FillValue': -32767, 'units': 'm s**-1', 'long_name': 'U component of wind', 'standard_name': 'eastward_wind', 'missing_value': -32767}
```

Perhaps the data from the second file is not rescaled to fit the new scale and offset before being written.
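For context, CF-style packing stores floats as scaled 16-bit integers via `packed = round((value - add_offset) / scale_factor)`. A minimal sketch, using the two files' attribute values quoted above (the `pack`/`unpack` helpers are illustrative, not xarray internals), shows why re-encoding the second file's data with the first file's parameters shifts values by up to half a quantization step, enough to fail `np.isclose`:

```python
import numpy as np

# Illustrative CF-style packing helpers; the netCDF stack does this
# internally when scale_factor/add_offset are set.
def pack(values, scale_factor, add_offset):
    """Store floats as scaled 16-bit integers (lossy quantization)."""
    return np.round((values - add_offset) / scale_factor).astype(np.int16)

def unpack(packed, scale_factor, add_offset):
    """Inverse transform applied when a file is read back."""
    return packed * scale_factor + add_offset

# The scale/offset attributes quoted above, each tuned to its own file:
sf1, off1 = 0.002397265127278432, 25.620963232670736   # file1
sf2, off2 = 0.0024358825557859445, 21.288035293585388  # file2

# A value that lies exactly on file2's quantization grid:
v = unpack(np.int16(1000), sf2, off2)

# Round-tripping on its own grid is exact ...
v_own = unpack(pack(v, sf2, off2), sf2, off2)
print(v_own == v)                # True

# ... but re-encoding it on file1's grid (what happens when the combined
# file is written with file1's attributes) snaps it to the nearest point
# of a *different* grid, shifting it by up to scale_factor / 2.
v_foreign = unpack(pack(v, sf1, off1), sf1, off1)
print(np.isclose(v, v_foreign))  # False: the shift exceeds np.isclose's
                                 # default tolerance
```

This also accounts for the asymmetry above: data from the first file already sits on the grid that gets written, so it round-trips exactly. If this is the mechanism, a commonly suggested workaround is to drop or override the inherited packing before writing, e.g. removing `scale_factor` and `add_offset` from each variable's `.encoding`, or passing an explicit `encoding=` to `to_netcdf`, so the combined data is stored as floats; that suggestion follows xarray's encoding documentation and has not been verified against these particular files.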

Environment:

Output of <tt>xr.show_versions()</tt>

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jun 1 2020, 18:57:50) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 4.15.0-107-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.16.0
pandas: 1.0.4
numpy: 1.18.5
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: 0.9.8.2
iris: None
bottleneck: None
dask: 2.18.1
distributed: 2.21.0
matplotlib: 3.2.1
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: 0.14
setuptools: 49.2.0.post20200712
pip: 20.1.1
conda: 4.8.3
pytest: None
IPython: 7.16.1
sphinx: None
```

Powered by Datasette · About: xarray-datasette