issues

1 row where user = 56541075 sorted by updated_at descending

issue 5490 · "Nan/ changed values in output when only reading data, saving and reading again"

  • id: 924676925 · node_id: MDU6SXNzdWU5MjQ2NzY5MjU=
  • user: lthUniBonn (56541075)
  • state: closed · locked: 0 · comments: 9
  • assignee: (none) · milestone: (none) · draft: (none) · pull_request: (none)
  • created_at: 2021-06-18T08:35:09Z · updated_at: 2023-09-13T13:38:33Z · closed_at: 2023-09-13T13:38:32Z
  • author_association: NONE · active_lock_reason: (none)

body:

What happened: When combining monthly ERA5 data and saving it separately for single locations, different values/NaN values appear when the single-location file is read back in.

What you expected to happen: Both datasets should be identical. This works when, e.g., only one month is read.

Minimal Complete Verifiable Example:

```python
import xarray as xr  # using version 0.18.2
import numpy as np

import dask

# only as many threads as requested CPUs | only one to be requested, more threads don't seem to be used
dask.config.set(scheduler='synchronous')  # this is used only because of the Cluster I work on, but keeping it here in case it is relevant

model_level_file_name_format = "{:d}europe{:d}_130_131_132_133_135.nc"
ml_files = [model_level_file_name_format.format(2012, 9),
            model_level_file_name_format.format(2012, 10)]
ds = xr.open_mfdataset(ml_files, decode_times=True)

# Select single location data
lons = ds['longitude'].values
lats = ds['latitude'].values
i_lat, i_lon = 27, 30
ds_loc = ds.sel(latitude=lats[i_lat], longitude=lons[i_lon])

# Save to file
ds_loc.to_netcdf('europe_i_lat_{i_lat}i_lon{i_lon}.nc'.format(i_lat=i_lat, i_lon=i_lon))

# Read in again
ds_loc_1 = xr.open_dataset('europe_i_lat_{i_lat}i_lon{i_lon}.nc'.format(i_lat=i_lat, i_lon=i_lon), decode_times=True)

print('Test all q values same: ', np.all(ds_loc.q.values == ds_loc_1.q.values))
```
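
As a follow-up check (a sketch building on the snippet above, not part of the original example), one way to separate small floating-point differences from genuinely changed or NaN values after the round trip is to compare with a tolerance and count NaNs explicitly; this reuses the ds_loc and ds_loc_1 objects defined above.

```python
# Follow-up sketch: distinguish tiny precision differences from real
# discrepancies or NaNs, reusing ds_loc and ds_loc_1 from the example above.
import numpy as np

a = ds_loc.q.values
b = ds_loc_1.q.values

print('NaNs before write:', int(np.isnan(a).sum()), '| after read-back:', int(np.isnan(b).sum()))
print('max |difference|:', np.nanmax(np.abs(a - b)))
print('allclose (rtol=1e-5, equal_nan=True):', np.allclose(a, b, rtol=1e-5, equal_nan=True))
```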

Anything else we need to know?: I tested this using these two months. Sometimes saving the output works, and sometimes the values are slightly different (in the 6th digit). Using a larger timespan (2010-2012), even NaN values appear. The issue is not clearly restricted to the q variable; I have not yet found the pattern. I've included a more detailed assessment (output, data, code) at https://uni-bonn.sciebo.de/s/OLHhid8zJg65IFB:

  • only one month: no discrepancies
  • two months: discrepancies (in the second month)
  • 2010-2013: discrepancies and NaN values

I'm not sure where the issue comes from, but since the data is read in correctly at first, it does not seem to be on that side, which would leave the process of writing the netCDF output in xarray. I've tested this for a few years, and for two months I always get the result that not all q values are the same. I'm not sure where the problem might be, so I'm not sure where to start for a more minimal example. Hope this is ok. Cheers, Lavinia
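
The pattern of 6th-digit differences plus occasional NaNs would be consistent with packed netCDF encoding being re-applied on write: ERA5 files are commonly stored as int16 with scale_factor/add_offset, and xarray keeps that encoding on each variable and reuses it in to_netcdf. This is only a hypothesis, not something confirmed by the report; the sketch below reuses ds_loc, i_lat, and i_lon from the example and uses a hypothetical "_unpacked" output filename to test whether dropping the inherited encoding makes the discrepancies disappear.

```python
# Diagnostic sketch (an assumption to test, not a confirmed cause): inspect the
# encoding inherited from the source ERA5 files and write the subset without it.
# Reuses ds_loc, i_lat, i_lon from the example above; the "_unpacked" filename
# is hypothetical.
for name, da in ds_loc.data_vars.items():
    # look for scale_factor, add_offset, int16 dtype, _FillValue
    print(name, da.encoding)

ds_loc_unpacked = ds_loc.copy()
for var in ds_loc_unpacked.variables.values():
    var.encoding = {}  # drop inherited encoding so values are written as plain floats

ds_loc_unpacked.to_netcdf(
    'europe_i_lat_{i_lat}i_lon{i_lon}_unpacked.nc'.format(i_lat=i_lat, i_lon=i_lon)
)
```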

Environment:

```
INSTALLED VERSIONS

commit: None
python: 3.9.4 | packaged by conda-forge | (default, May 10 2021, 22:13:33) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.25.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.8.0

xarray: 0.18.2
pandas: 1.2.4
numpy: 1.20.3
scipy: 1.6.3
netCDF4: 1.5.6
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.5.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.06.0
distributed: 2021.06.0
matplotlib: 3.4.2
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20210108
pip: 21.1.2
conda: None
pytest: None
IPython: None
sphinx: None
```

reactions:

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5490/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  • state_reason: completed · repo: xarray (13221727) · type: issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
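
For reference, the listing above ("1 row where user = 56541075 sorted by updated_at descending") corresponds to a simple query against this schema. A minimal sketch with Python's sqlite3 module, assuming the table lives in a local SQLite file; the filename github.db is an assumption, not something stated on this page:

```python
# Minimal sketch: reproduce the "1 row where user = 56541075 sorted by
# updated_at descending" listing with plain sqlite3. The database filename
# "github.db" is hypothetical.
import sqlite3

conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT id, number, title, state, created_at, updated_at, closed_at
    FROM issues
    WHERE [user] = 56541075
    ORDER BY updated_at DESC
    """
).fetchall()
for row in rows:
    print(row)
conn.close()
```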