issues

2 rows where state = "closed" and user = 26401994 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
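id: 1468838643 · node_id: I_kwDOAMm_X85XjLLz · number: 7336 · title: Instability when calculating standard deviation · user: ShihengDuan (26401994) · state: closed · locked: 0 · comments: 4 · created_at: 2022-11-29T23:33:55Z · updated_at: 2023-03-10T20:32:51Z · closed_at: 2023-03-10T20:32:50Z · author_association: NONE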

What happened?

I noticed that for some large values (not really that large) and many samples, `data.std()` yields different values than `np.std(data)`. This seems to be related to the magnitude of the values. See the code here:

    nino34_tas_picontrol_detrend = nino34_tas_picontrol - 298
    std_dev = nino34_tas_picontrol_detrend.std()
    print(std_dev.data)
    std_dev = nino34_tas_picontrol.std()
    print(std_dev.data)
    nino34_tas_picontrol_detrend = nino34_tas_picontrol - 10
    std_dev = nino34_tas_picontrol_detrend.std()
    print(std_dev.data)

and the results are:

    1.4448999166488647
    24.911161422729492
    20.054718017578125

So I guess this is related to the magnitude, but I am not sure. Has anyone seen a similar issue?
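The behaviour described above, where a constant offset changes the computed standard deviation, is what one would expect from accumulating sums in single precision. The report does not confirm that this is the cause, so the following is only a sketch under that assumption. It uses the standard library to mimic float32 rounding and compares a one-pass formula with a two-pass float64 computation; the helper names and the synthetic series are hypothetical stand-ins, not the reporter's data or xarray's implementation.

```python
import math
import random
import struct


def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def naive_std_f32(values):
    """One-pass std, sqrt(E[x^2] - E[x]^2), rounding every step to float32."""
    total = 0.0
    total_sq = 0.0
    for v in values:
        total = to_f32(total + v)
        total_sq = to_f32(total_sq + to_f32(v * v))
    n = len(values)
    mean = to_f32(total / n)
    mean_sq = to_f32(total_sq / n)
    variance = to_f32(mean_sq - to_f32(mean * mean))
    return math.sqrt(max(variance, 0.0))  # clamp: rounding can push it below 0


def two_pass_std_f64(values):
    """Two-pass std in float64: subtract the mean first, then average squares."""
    n = len(values)
    mean = math.fsum(values) / n
    return math.sqrt(math.fsum((v - mean) ** 2 for v in values) / n)


random.seed(0)
# Hypothetical stand-in for the temperature series: float32 values near 298.
series = [to_f32(random.gauss(298.0, 1.5)) for _ in range(100_000)]
shifted = [v - 298.0 for v in series]

for label, data in (("original", series), ("minus 298", shifted)):
    print(label, two_pass_std_f64(data), naive_std_f32(data))
```

The two-pass float64 result should agree for both series, while the float32 one-pass result is free to drift with the offset. If precision is indeed the cause, casting to float64 before reducing (for example `data.astype("float64").std()`) would be a natural thing to try.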

What did you expect to happen?

Adding or subtracting a constant should not change the standard deviation. See screenshot here about what the data look like:

Minimal Complete Verifiable Example

No response

MVCE confirmation

  • [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

    INSTALLED VERSIONS
    ------------------
    commit: None
    python: 3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0]
    python-bits: 64
    OS: Linux
    OS-release: 3.10.0-1160.71.1.el7.x86_64
    machine: x86_64
    processor: x86_64
    byteorder: little
    LC_ALL: None
    LANG: en_US.UTF-8
    LOCALE: ('en_US', 'UTF-8')
    libhdf5: 1.12.2
    libnetcdf: 4.8.1
    xarray: 2022.6.0
    pandas: 1.4.4
    numpy: 1.22.3
    scipy: 1.8.1
    netCDF4: 1.6.1
    pydap: None
    h5netcdf: None
    h5py: None
    Nio: None
    zarr: None
    cftime: 1.6.2
    nc_time_axis: 1.4.1
    PseudoNetCDF: None
    rasterio: None
    cfgrib: None
    iris: None
    bottleneck: 1.3.5
    dask: 2022.9.0
    distributed: 2022.9.0
    matplotlib: 3.5.2
    cartopy: 0.21.0
    seaborn: None
    numbagg: None
    fsspec: 2022.10.0
    cupy: None
    pint: None
    sparse: 0.13.0
    flox: None
    numpy_groupies: None
    setuptools: 65.5.0
    pip: 22.2.2
    conda: None
    pytest: None
    IPython: 8.6.0
    sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7336/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue
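id: 1004873981 · node_id: I_kwDOAMm_X8475Sj9 · number: 5809 · title: DataArray to_netcdf returns invalid argument · user: ShihengDuan (26401994) · state: closed · locked: 0 · comments: 1 · created_at: 2021-09-22T23:57:56Z · updated_at: 2021-09-24T22:23:28Z · closed_at: 2021-09-24T22:23:28Z · author_association: NONE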

What happened: When I save a dataset with `to_netcdf`, it shows the following error message:

```
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_16908/4157932485.py in <module>
----> 1 pr.to_netcdf('test.nc')

/global/homes/d/duan0000/.conda/envs/duan/lib/python3.8/site-packages/xarray/core/dataarray.py in to_netcdf(self, *args, **kwargs)
   2820             dataset = self.to_dataset()
   2821
-> 2822         return dataset.to_netcdf(*args, **kwargs)
   2823
   2824     def to_dict(self, data: bool = True) -> dict:

/global/homes/d/duan0000/.conda/envs/duan/lib/python3.8/site-packages/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf)
   1898         from ..backends.api import to_netcdf
   1899
-> 1900         return to_netcdf(
   1901             self,
   1902             path,

/global/homes/d/duan0000/.conda/envs/duan/lib/python3.8/site-packages/xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf)
   1075         # TODO: allow this work (setting up the file for writing array data)
   1076         # to be parallelized with dask
-> 1077         dump_to_store(
   1078             dataset, store, writer, encoding=encoding, unlimited_dims=unlimited_dims
   1079         )

/global/homes/d/duan0000/.conda/envs/duan/lib/python3.8/site-packages/xarray/backends/api.py in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
   1122         variables, attrs = encoder(variables, attrs)
   1123
-> 1124     store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
   1125
   1126

/global/homes/d/duan0000/.conda/envs/duan/lib/python3.8/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    264         self.set_attributes(attributes)
    265         self.set_dimensions(variables, unlimited_dims=unlimited_dims)
--> 266         self.set_variables(
    267             variables, check_encoding_set, writer, unlimited_dims=unlimited_dims
    268         )

/global/homes/d/duan0000/.conda/envs/duan/lib/python3.8/site-packages/xarray/backends/common.py in set_variables(self, variables, check_encoding_set, writer, unlimited_dims)
    302             name = _encode_variable_name(vn)
    303             check = vn in check_encoding_set
--> 304             target, source = self.prepare_variable(
    305                 name, v, check, unlimited_dims=unlimited_dims
    306             )

/global/homes/d/duan0000/.conda/envs/duan/lib/python3.8/site-packages/xarray/backends/netCDF4_.py in prepare_variable(self, name, variable, check_encoding, unlimited_dims)
    484             nc4_var = self.ds.variables[name]
    485         else:
--> 486             nc4_var = self.ds.createVariable(
    487                 varname=name,
    488                 datatype=datatype,

src/netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.createVariable()

src/netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.__init__()

src/netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()

RuntimeError: NetCDF: Invalid argument
```

Minimal Complete Verifiable Example: The code is pretty simple: `pr.to_netcdf('test.nc')`

Anything else we need to know?: I checked my xarray version. `conda list` shows 0.20.0, but `xr.__version__` in my Jupyter notebook reports 0.19.0.

Environment: `netCDF4.__netcdf4libversion__` shows 4.8.1

Thanks!
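The mismatch between `conda list` (0.20.0) and `xr.__version__` (0.19.0) suggests the notebook kernel may be running from a different environment than the one inspected on the command line. A minimal sketch for checking this from inside the notebook, using only the standard library; `describe_package` is a hypothetical helper name, and nothing here is confirmed as the cause of the `Invalid argument` error:

```python
import sys
from importlib import metadata


def describe_package(name: str) -> dict:
    """Report the installed version of `name` as seen by the running interpreter."""
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        version = None  # not installed in this interpreter's environment
    return {
        "package": name,
        "version": version,
        "interpreter": sys.executable,  # path of the Python running this kernel
        "prefix": sys.prefix,           # root of the active environment
    }


for pkg in ("xarray", "netCDF4"):
    print(describe_package(pkg))
```

If the reported interpreter path does not sit under the conda environment that `conda list` was run in, the two version numbers describe different installations.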

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5809/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed · repo: xarray (13221727) · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
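
The filter behind this page (closed issues for user 26401994, newest update first) can be reproduced against a table of this shape. The sketch below uses Python's `sqlite3` with an in-memory database and only a subset of the columns; the third row is a hypothetical placeholder added to show the filter excluding something, and the exact SQL Datasette generates may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE issues (
        id INTEGER PRIMARY KEY,
        number INTEGER,
        title TEXT,
        user INTEGER,
        state TEXT,
        updated_at TEXT
    )
    """
)
conn.execute("CREATE INDEX idx_issues_user ON issues (user)")

rows = [
    (1468838643, 7336, "Instability when calculating standard deviation",
     26401994, "closed", "2023-03-10T20:32:51Z"),
    (1004873981, 5809, "DataArray to_netcdf returns invalid argument",
     26401994, "closed", "2021-09-24T22:23:28Z"),
    # Hypothetical extra row, only here to show the filter excluding something.
    (1, 1, "placeholder open issue", 26401994, "open", "2024-01-01T00:00:00Z"),
]
conn.executemany("INSERT INTO issues VALUES (?, ?, ?, ?, ?, ?)", rows)

query = """
    SELECT number, title, updated_at
    FROM issues
    WHERE state = :state AND user = :user
    ORDER BY updated_at DESC
"""
result = conn.execute(query, {"state": "closed", "user": 26401994}).fetchall()
for row in result:
    print(row)
```

Because the timestamps are stored as ISO 8601 text in a single time zone, ordering the `TEXT` column lexicographically gives chronological order.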