issue_comments


4 rows where issue = 1136315478 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1039323638 https://github.com/pydata/xarray/issues/6272#issuecomment-1039323638 https://api.github.com/repos/pydata/xarray/issues/6272 IC_kwDOAMm_X8498tH2 ArcticSnow 2042458 2022-02-14T16:59:07Z 2022-02-14T16:59:07Z NONE

Oh, and another strange thing: the time series I multiply by 1 and save to sub1.nc is not exactly the same:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ds.to_netcdf() changes values of variable 1136315478
1039317394 https://github.com/pydata/xarray/issues/6272#issuecomment-1039317394 https://api.github.com/repos/pydata/xarray/issues/6272 IC_kwDOAMm_X8498rmS ArcticSnow 2042458 2022-02-14T16:53:06Z 2022-02-14T16:53:06Z NONE

And in case I multiply the variable z by 1 prior to saving with to_netcdf():

```
In [78]: (ds.z.isel(latitude=[1,2,3], longitude=[3,4,5])*1).to_netcdf('sub1.nc')

In [79]: d1 = xr.open_dataset('sub1.nc')

In [80]: d1.z.encoding
Out[80]: {'zlib': False, 'shuffle': False, 'complevel': 0, 'fletcher32': False, 'contiguous': True, 'chunksizes': None, 'source': '/home/simonfi/github/TopoPyScale_examples/ex1_norway_finse/sub1.nc', 'original_shape': (35760, 14, 3, 3), 'dtype': dtype('float32'), '_FillValue': nan}
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ds.to_netcdf() changes values of variable 1136315478
1039303417 https://github.com/pydata/xarray/issues/6272#issuecomment-1039303417 https://api.github.com/repos/pydata/xarray/issues/6272 IC_kwDOAMm_X8498oL5 ArcticSnow 2042458 2022-02-14T16:39:49Z 2022-02-14T16:41:15Z NONE

Thank you for your reply @andersy005.

So this is the encoding before writing to netcdf, when loaded with xr.open_mfdataset():

```
In [61]: ds.z.isel(latitude=[1,2,3], longitude=[3,4,5]).encoding
Out[61]: {'source': '/home/simonfi/github/TopoPyScale_examples/ex1_norway_finse/inputs/climate/PLEV_197810.nc', 'original_shape': (744, 14, 7, 10), 'dtype': dtype('int16'), 'missing_value': -32767, '_FillValue': -32767, 'scale_factor': 0.6796473581594864, 'add_offset': 21239.89345268811}

In [63]: ds.z.isel(latitude=[1,2,3], longitude=[3,4,5]).attrs
Out[63]: {'units': 'm2 s-2', 'long_name': 'Geopotential', 'standard_name': 'geopotential'}
```

After saving to netcdf with `ds.z.isel(latitude=[1,2,3], longitude=[3,4,5]).to_netcdf('sub.nc')`:

```
In [64]: d.z.encoding
Out[64]: {'zlib': False, 'shuffle': False, 'complevel': 0, 'fletcher32': False, 'contiguous': True, 'chunksizes': None, 'source': '/home/simonfi/github/TopoPyScale_examples/ex1_norway_finse/sub.nc', 'original_shape': (35760, 14, 3, 3), 'dtype': dtype('int16'), 'missing_value': -32767, '_FillValue': -32767, 'scale_factor': 0.6796473581594864, 'add_offset': 21239.89345268811}
```

So the chunks were concatenated into this single file. Now, if I look at, for instance, the same time series before and after saving to `sub.nc`:

As this offset is not applied uniformly along the time dimension, I thought this could be seen as a "bug". Could it be that, if the encoding is not specified, each dask chunk is encoded independently?

*Sorry for the large data gap between 1980 and the 2000s.
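One possible reading of the reported jump of about 44500, offered as a hypothesis and not confirmed anywhere in this thread: if a value falls outside what int16 can hold under the `scale_factor` and `add_offset` shown above, and the packed integer wraps around modulo 2^16, the decoded value shifts by 65536 × `scale_factor`, which is roughly 44541. The sketch below reproduces that arithmetic in plain Python. The helper names `to_int16_wrapping` and `roundtrip` are hypothetical, and the wrap-around is an explicit assumption here; this is not a trace of what xarray or netCDF4 actually does.

```python
# Encoding values copied from the d.z.encoding output quoted in this thread.
SCALE_FACTOR = 0.6796473581594864
ADD_OFFSET = 21239.89345268811


def to_int16_wrapping(n: int) -> int:
    """Reduce an integer modulo 2**16 into the signed int16 range.

    Assumption: an out-of-range value wraps around instead of raising
    or saturating. That behaviour is not guaranteed by any library.
    """
    return ((n + 32768) % 65536) - 32768


def roundtrip(value: float) -> float:
    """Pack a value to int16 (with wrap-around), then decode it again."""
    packed = round((value - ADD_OFFSET) / SCALE_FACTOR)
    stored = to_int16_wrapping(packed)
    return stored * SCALE_FACTOR + ADD_OFFSET


in_range = roundtrip(500.0)       # representable: comes back almost unchanged
out_of_range = roundtrip(-1500.0)  # not representable: comes back shifted
shift = out_of_range - (-1500.0)

print(f"500.0   -> {in_range:.2f}")
print(f"-1500.0 -> {out_of_range:.2f} (shift of {shift:.2f})")
```

Under these assumptions the shift equals 65536 × `scale_factor` up to rounding, which is in the same range as the offset described in the issue.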

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ds.to_netcdf() changes values of variable 1136315478
1039149253 https://github.com/pydata/xarray/issues/6272#issuecomment-1039149253 https://api.github.com/repos/pydata/xarray/issues/6272 IC_kwDOAMm_X8498CjF andersy005 13301940 2022-02-14T14:26:44Z 2022-02-14T14:26:44Z MEMBER

@ArcticSnow,

> The value z is a float32 which varies from 2000 to -2000 along the time dimension. After being saved in the subsample, z is still a float32 but the values that are less than -1000 are being offset by 44500.

You may have scale_factor and add_offset attributes in your dataset.
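To make the role of these two attributes concrete, here is a minimal sketch of the CF-style packing convention, where the decoded value is `packed * scale_factor + add_offset`. The numeric values are the ones reported in the encoding output elsewhere in this thread; the helper names `pack` and `unpack` are hypothetical and only illustrate the arithmetic, not xarray's internal code path.

```python
# Encoding values reported for ds.z in this thread.
SCALE_FACTOR = 0.6796473581594864
ADD_OFFSET = 21239.89345268811
FILL_VALUE = -32767  # reserved for missing data, so not usable for real values


def unpack(packed: int) -> float:
    """Decode a stored int16 into a physical value."""
    return packed * SCALE_FACTOR + ADD_OFFSET


def pack(value: float) -> int:
    """Encode a physical value as the nearest integer (no range check)."""
    return round((value - ADD_OFFSET) / SCALE_FACTOR)


# Smallest and largest physical values an int16 can hold with this encoding,
# skipping the fill value at -32767.
lowest = unpack(-32766)
highest = unpack(32767)

print(f"representable range: {lowest:.2f} to {highest:.2f}")
print("packed integer for -1500.0:", pack(-1500.0))
```

With this particular pair, the representable range starts at about -1029, so any value below that would need a packed integer smaller than the int16 minimum.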

> However, if I do (ds.z.isel(latitude=[1,2,3], longitude=[3,4,5])*1).to_netcdf('sub.nc')

There's a chance xarray is discarding the attributes/encoding during the `ds.z.isel(latitude=[1,2,3], longitude=[3,4,5])*1` operation and, as a result, netCDF ends up not encoding z during the to_netcdf() call.

What's the output of `print(ds.z.encoding)` and `print(ds.z.attrs)`?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ds.to_netcdf() changes values of variable 1136315478


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 27.191ms · About: xarray-datasette