
issues: 620514214


id: 620514214
node_id: MDU6SXNzdWU2MjA1MTQyMTQ=
number: 4077
title: open_mfdataset overwrites variables with different values but overlapping coordinates
user: 22245117
state: open
locked: 0
comments: 12
created_at: 2020-05-18T21:22:09Z
updated_at: 2022-04-28T15:08:53Z
author_association: CONTRIBUTOR

In the example below, I open and concatenate two datasets with open_mfdataset. The datasets hold different values for the same variable, but their coordinates overlap. I'm concatenating along y, which runs 0...4 in one dataset and 0...5 in the other. The y dimension of the resulting dataset is 0...5, which means that open_mfdataset has overwritten some values without raising an error or a warning.

Is this the expected default behavior? I would expect to get at least a warning, but maybe I'm misunderstanding the default arguments.

I tried to play with the arguments, but I couldn't figure out which argument I should change to get an error in these scenarios.

MCVE Code Sample

    import xarray as xr
    import numpy as np

    for i in range(2):
        ds = xr.Dataset(
            {"foo": (("x", "y"), np.random.rand(4, 5 + i))},
            coords={"x": np.arange(4), "y": np.arange(5 + i)},
        )
        print(ds)
        ds.to_netcdf(f"tmp{i}.nc")

<xarray.Dataset>
Dimensions:  (x: 4, y: 5)
Coordinates:
  * x        (x) int64 0 1 2 3
  * y        (y) int64 0 1 2 3 4
Data variables:
    foo      (x, y) float64 0.1271 0.6117 0.3769 0.1884 ... 0.853 0.5026 0.3762
<xarray.Dataset>
Dimensions:  (x: 4, y: 6)
Coordinates:
  * x        (x) int64 0 1 2 3
  * y        (y) int64 0 1 2 3 4 5
Data variables:
    foo      (x, y) float64 0.2841 0.6098 0.7761 0.0673 ... 0.2954 0.7212 0.3954

    DS = xr.open_mfdataset("tmp*.nc", concat_dim="y", combine="by_coords")
    print(DS)

<xarray.Dataset>
Dimensions:  (x: 4, y: 6)
Coordinates:
  * x        (x) int64 0 1 2 3
  * y        (y) int64 0 1 2 3 4 5
Data variables:
    foo      (x, y) float64 dask.array<chunksize=(4, 6), meta=np.ndarray>
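
Until the combine step itself warns about this, one workaround is to compare the coordinate values of the inputs before combining them. The snippet below is a minimal sketch of such a pre-flight check, not part of xarray: `find_overlapping_coords` is a hypothetical helper, and the hard-coded lists stand in for the y values of the two files in the example (with xarray they could be read with something like `xr.open_dataset(path)["y"].values.tolist()`).

```python
from itertools import combinations


def find_overlapping_coords(coords_by_source):
    """Hypothetical helper (not an xarray function).

    Takes a mapping of source name -> iterable of coordinate values and
    returns {(name_a, name_b): sorted shared values} for every pair of
    sources that has at least one coordinate value in common.
    """
    overlaps = {}
    for (name_a, vals_a), (name_b, vals_b) in combinations(coords_by_source.items(), 2):
        shared = set(vals_a) & set(vals_b)
        if shared:
            overlaps[(name_a, name_b)] = sorted(shared)
    return overlaps


# Stand-ins for the y coordinates of tmp0.nc (0...4) and tmp1.nc (0...5).
y_by_file = {"tmp0.nc": list(range(5)), "tmp1.nc": list(range(6))}

overlaps = find_overlapping_coords(y_by_file)
if overlaps:
    # For the example files this reports that y = 0...4 appears in both.
    print(f"Overlapping y values found: {overlaps}")
```

Raising an exception instead of printing would turn the silent overwrite into a hard failure before open_mfdataset is ever called.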

Versions

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.2 | packaged by conda-forge | (default, Apr 24 2020, 08:20:52) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-29-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.15.1
pandas: 1.0.3
numpy: 1.18.4
scipy: None
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.1.3
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.16.0
distributed: 2.16.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
setuptools: 46.4.0.post20200518
pip: 20.1
conda: None
pytest: None
IPython: 7.13.0
sphinx: None

reactions: 0 (https://api.github.com/repos/pydata/xarray/issues/4077/reactions)
repo: 13221727
type: issue
