issues


1 row where repo = 13221727, state = "open" and user = 35295222 sorted by updated_at descending

id: 515320303
node_id: MDU6SXNzdWU1MTUzMjAzMDM=
number: 3473
title: Xarray drops certain coordinates without warning after multiplying datasets
user: folmerkrikken (35295222)
state: open
locked: 0
comments: 1
created_at: 2019-10-31T10:27:58Z
updated_at: 2019-12-16T18:36:32Z
author_association: NONE

files.zip

Hi, I really like Xarray because it does a lot of work for me. In this case, however, it may have been overdoing it a bit. To compute a field average of some data (data.nc), I use the grid area as weights (gridarea.nc). After multiplying the data by the grid area, however, the grid changed without warning. The grid area was constructed with 'cdo gridarea other_data.nc gridarea.nc'. Here 'other_data.nc' comes from the same climate model and grid as data.nc, though from a different simulation (and possibly different post-processing). Below is a reproducible example, with the nc files attached.
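For readers without the attached files, the area-weighted field average described above can be sketched as follows. This is only an illustration on made-up numbers: the arrays `field` and `area` and their values are hypothetical stand-ins for data.nc and gridarea.nc, not the contents of those files.

```python
import numpy as np
import xarray as xr

# Hypothetical 2 x 3 grid; both arrays share exactly the same coordinates.
lat = np.array([50.0, 60.0])
lon = np.array([0.0, 1.0, 2.0])

field = xr.DataArray(
    np.array([[4.0, 5.0, 6.0], [1.0, 2.0, 3.0]]),
    coords={"lat": lat, "lon": lon},
    dims=["lat", "lon"],
)

# Made-up cell areas, larger at the lower latitude.
area = xr.DataArray(
    np.array([[3.0, 3.0, 3.0], [1.0, 1.0, 1.0]]),
    coords={"lat": lat, "lon": lon},
    dims=["lat", "lon"],
)

# Weighted mean = sum(field * area) / sum(area) over the horizontal dimensions.
weighted_mean = float(
    (field * area).sum(["lat", "lon"]) / area.sum(["lat", "lon"])
)
print(weighted_mean)
```

The result is only meaningful when `field * area` keeps every grid cell, which is exactly what goes wrong in the report below.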

MCVE Code Sample

```python
import xarray as xr
data = xr.open_dataset('data.nc')
gridarea = xr.open_dataset('gridarea.nc')
```
Then I check whether the grids are the same
```python
print((data.lat == gridarea.lat).all())
print((data.lon == gridarea.lon).all())
```
```python
<xarray.DataArray 'lat' ()>
array(True)
<xarray.DataArray 'lon' ()>
array(True)
```
Then I multiply the data with the weights
```python
new_data = data * gridarea
```
```python
print(data)

<xarray.Dataset>
Dimensions:  (lat: 16, lon: 30, time: 1)
Coordinates:
  * time     (time) datetime64[ns] 1995-01-01T12:00:00
  * lon      (lon) float64 0.0 1.125 2.25 3.375 4.5 ... 29.25 30.38 31.5 32.62
  * lat      (lat) float64 71.21 70.09 68.97 67.85 ... 57.76 56.64 55.51 54.39
Data variables:
    U10M     (time, lat, lon) float32 ...
```
```python
print(new_data)

<xarray.Dataset>
Dimensions:  (lat: 12, lon: 30, time: 1)
Coordinates:
  * lat      (lat) float64 71.21 70.09 68.97 65.61 ... 58.88 57.76 56.64 55.51
  * time     (time) datetime64[ns] 1995-01-01T12:00:00
  * lon      (lon) float64 0.0 1.125 2.25 3.375 4.5 ... 29.25 30.38 31.5 32.62
Data variables:
    *empty*
```
Note the different latitude for both datasets

Expected Output

The expected output is a dataset with the same grid as the original dataset; however, the latitude now has length 12 instead of 16.
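One possible way to get the expected grid back, sketched here on synthetic values, is to snap the weights onto the data's coordinates before multiplying, using `reindex_like` with `method="nearest"` and a small `tolerance`. The arrays and the tolerance of `1e-6` are assumptions chosen for illustration; a suitable tolerance depends on the grid spacing.

```python
import numpy as np
import xarray as xr

# Synthetic latitudes; the weights' grid differs by one rounding step in one label.
lat_data = np.array([65.0, 68.0, 71.0])
lat_area = lat_data.copy()
lat_area[1] = np.nextafter(lat_area[1], np.inf)

data = xr.DataArray(
    np.array([1.0, 2.0, 3.0]), coords={"lat": lat_data}, dims=["lat"]
)
area = xr.DataArray(
    np.array([10.0, 20.0, 30.0]), coords={"lat": lat_area}, dims=["lat"]
)

# Relabel the weights with the data's latitudes, matching each label to the
# nearest one within the (assumed) tolerance.
area_on_data_grid = area.reindex_like(data, method="nearest", tolerance=1e-6)

# Both operands now carry identical labels, so no row is dropped.
product = data * area_on_data_grid
print(product.sizes["lat"])
```

Labels with no match inside the tolerance are filled with NaN rather than dropped, so a mismatch remains visible in the result.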

Problem Description

After digging a bit more into this problem, it turns out the grids are not identical.
```python
print(data.lat == gridarea.lat)
<xarray.DataArray 'lat' (lat: 12)>
array([ True,  True,  True,  True,  True,  True,  True,  True,  True,  True,  True,  True])
Coordinates:
  * lat      (lat) float64 71.21 70.09 68.97 65.61 ... 58.88 57.76 56.64 55.51
```
whilst
```python
print(data.lat.values == gridarea.lat.values)
[ True  True  True False False  True False  True  True  True  True  True  True  True  True False]
```
So the 'problem' is that xarray identifies the values which are not identical (e.g. 67.84978441466984 instead of 67.84978441466983) and throws them out. Hence my original check of whether the grids are the same returns 'True', because all the unequal values have already been thrown out. For me this introduced a 'conclusion rewriting error', because I averaged the data straight after multiplying with the weights, so I didn't catch it at first. I'm not sure if this is 'wanted behavior'. For me, at least, it isn't ;-)
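The behaviour can be reproduced without the attached files. The sketch below uses synthetic latitudes in which one label differs by a single floating-point step (built with `np.nextafter`); the variable names are illustrative only. It shows that the product keeps only the labels common to both operands, and that comparing the coordinates with `==` goes through the same alignment and therefore hides the mismatch.

```python
import numpy as np
import xarray as xr

# Synthetic labels: identical except for one value that is off by one
# representable floating-point step.
lat_a = np.array([65.0, 68.0, 71.0])
lat_b = lat_a.copy()
lat_b[1] = np.nextafter(lat_b[1], np.inf)

a = xr.DataArray(np.ones(3), coords={"lat": lat_a}, dims=["lat"])
b = xr.DataArray(np.full(3, 2.0), coords={"lat": lat_b}, dims=["lat"])

# Arithmetic keeps only the labels present in both operands by default.
product = a * b
n_kept = product.sizes["lat"]

# The coordinate comparison is itself aligned first, so it reports all True ...
aligned_check = bool((a.lat == b.lat).all())

# ... whereas comparing the raw NumPy values exposes the difference.
raw_check = bool((a.lat.values == b.lat.values).all())

print(n_kept, aligned_check, raw_check)
```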

Would it be possible to either

- make xarray not throw out the unequal values when comparing data, or
- get a warning when you multiply two fields with the same dimensions (e.g. ['time','lat','lon'] and ['lat','lon']) and it returns a differently sized ['lat','lon']?
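As a sketch of the second option, xarray's `arithmetic_join` option can be set to `"exact"` via `xr.set_options`, which makes arithmetic raise an error instead of silently dropping labels. The helper `multiply_strict` below is hypothetical, written only for this illustration, and the input arrays are synthetic.

```python
import numpy as np
import xarray as xr

# Synthetic arrays whose 'lat' labels differ by one floating-point step.
lat_a = np.array([65.0, 68.0, 71.0])
lat_b = lat_a.copy()
lat_b[1] = np.nextafter(lat_b[1], np.inf)

a = xr.DataArray(np.ones(3), coords={"lat": lat_a}, dims=["lat"])
b = xr.DataArray(np.ones(3), coords={"lat": lat_b}, dims=["lat"])


def multiply_strict(x, y):
    """Hypothetical helper: multiply, but refuse operands with unequal labels."""
    with xr.set_options(arithmetic_join="exact"):
        return x * y


try:
    multiply_strict(a, b)
    raised = False
except ValueError:
    # With arithmetic_join="exact", mismatching indexes raise instead of shrinking.
    raised = True

print(raised)
```

Setting the option inside a `with` block keeps the stricter behaviour local to the multiplication; calling `xr.set_options(arithmetic_join="exact")` once at the top of a script would apply it globally.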

Output of xr.show_versions()

# Paste the output here xr.show_versions() here
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Apr 3 2019, 19:16:38) [GCC 8.0.1 20180414 (experimental) [trunk revision 259383]]
python-bits: 64
OS: Linux
OS-release: 4.15.0-58-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.0
libnetcdf: 4.6.0

xarray: 0.14.0
pandas: 0.25.2
numpy: 1.17.3
scipy: 1.3.1
netCDF4: 1.4.0
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: 1.1.0
PseudoNetCDF: None
rasterio: 1.0.26
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.3.0
distributed: None
matplotlib: 3.1.1
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 41.2.0
pip: 9.0.1
conda: None
pytest: None
IPython: 7.7.0
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3473/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);