home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

16 rows where issue = 407746874 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: body, created_at (date), updated_at (date)

user 3

  • DevDaoud 10
  • shoyer 5
  • fmaussion 1

author_association 2

  • NONE 10
  • MEMBER 6

issue 1

  • Ability to force float64 instead of float32 issue #2304 · 16 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
558605566 https://github.com/pydata/xarray/pull/2751#issuecomment-558605566 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDU1ODYwNTU2Ng== fmaussion 10050469 2019-11-26T12:23:44Z 2019-11-26T12:23:44Z MEMBER

I've noted that the very recent ERA5 data is also cast to float32 at decoding, although the scale and offset params are float64. I think it would be a very valuable addition to xarray to do this properly: @daoudjahdou are you still willing to work on this PR? Otherwise I'll try to pick this up when time permits.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
495106994 https://github.com/pydata/xarray/pull/2751#issuecomment-495106994 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ5NTEwNjk5NA== DevDaoud 971382 2019-05-23T07:48:39Z 2019-05-23T07:48:39Z NONE

@shoyer i've tested the solution provided, it works like a charm with my tests however many tests are broken on test_backends.py with cases where we lose precision i'll give you more detail.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
480687178 https://github.com/pydata/xarray/pull/2751#issuecomment-480687178 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ4MDY4NzE3OA== shoyer 1217238 2019-04-08T05:19:47Z 2019-04-08T06:00:03Z MEMBER

Correct me if i'm wrong but this is a bit hard to do without knowing max,min values to avoid overlapping Evaluating those could break performance if only used for encoding and decoding.

Right, we definitely don't want to inspect the data to verify if it fits.

~~I think we will will need explicitly checks for isinstance(x, (int, float)) (for scale_factor and add_offset) to identify these cases in the encode() method. Let me try to come up with a concrete suggestion...~~ EDIT: See my suggestion below. I don't think we want these explicit checks, after all.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
479831812 https://github.com/pydata/xarray/pull/2751#issuecomment-479831812 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ3OTgzMTgxMg== DevDaoud 971382 2019-04-04T09:55:14Z 2019-04-04T09:55:14Z NONE

@shoyer sorry for the delayed response. dtypes.result_type(1, np.float32(1)) returns dtype('float64') That's what makes this behaviour for Python's int and float Keeping the consistency would then require testing if scale_factor*var_dtype + add_offset fit in var_dtype in case of Python's int and float. Correct me if i'm wrong but this is a bit hard to do without knowing max,min values to avoid overlapping Evaluating those could break performance if only used for encoding and decoding. Do have any idea how this could be acheived ? or is it simpler to keep it as it is ?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
473496064 https://github.com/pydata/xarray/pull/2751#issuecomment-473496064 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ3MzQ5NjA2NA== shoyer 1217238 2019-03-16T03:50:00Z 2019-03-16T03:50:00Z MEMBER

@shoyer do you mean that we consider that by default when we deal with Python's int and float we cast them to np.int64 and np.float64 ?

Currently we cast to int32/float64 for netCDF3 files (which don't support int64) and int64/float64 for netCDF4 files.

That default behavior doesn't really make sense if the user provided builtin Python numbers for these attributes.

For example, if I write something like the following, the data for x should probably decode to float32: python ds = xarray.Dataset({'x': np.float32(0)}) encoding = {'x': {'scale_factor': 1, 'add_offset': 1, 'dtype': 'int8'}} ds.to_netcdf('test.nc', encoding=encoding) With the current implementation in this PR, I think the data will come back as float64.

To get something that will decode as float64 without the original data being float64, the user should need to intentionally choose dtypes for scale_factor and add_offsert, e.g., python ds = xarray.Dataset({'x': np.float32(0)}) encoding = {'x': {'scale_factor': np.float64(1), 'add_offset': np.float64(1), 'dtype': 'int8'}} ds.to_netcdf('test.nc', encoding=encoding)

Does this make sense to you? From my perspective, two goals for this change should be: 1. Adhere to CF conventions for deciding how to decode data from disk. 2. Provide a consistent interface for xarray users, so that by default data is restored in the same form that it was written.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
473296942 https://github.com/pydata/xarray/pull/2751#issuecomment-473296942 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ3MzI5Njk0Mg== DevDaoud 971382 2019-03-15T14:00:34Z 2019-03-15T14:00:34Z NONE

@shoyer do you mean that we consider that by default when we deal with Python's int and float we cast them to np.int64 and np.float64 ?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
473165754 https://github.com/pydata/xarray/pull/2751#issuecomment-473165754 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ3MzE2NTc1NA== shoyer 1217238 2019-03-15T05:42:18Z 2019-03-15T05:42:18Z MEMBER

@daoudjahdou thanks for sticking with this.

I'm concerned about how lose the "roundtrip" invariant in some edge cases, where the user did not specifically indicate the desired dtype but rather used a Python type. Right now xarray seems to use inconsistent rules for casting attribute types, based on the netCDF backend: ``` In [8]: for engine in ['scipy', 'netcdf4', 'h5netcdf']: ...: ds = xarray.Dataset({'x': np.float32(0)}) ...: encoding = {'x': {'scale_factor': 1, 'add_offset': 1}} ...: ds.to_netcdf(f'test-{engine}.nc', engine=engine, encoding=encoding) ...:

In [9]: ! ncdump -h test-scipy.nc netcdf test-scipy { variables: float x ; x:add_offset = 1 ; x:scale_factor = 1 ; x:_FillValue = NaNf ; }

In [10]: ! ncdump -h test-netcdf4.nc netcdf test-netcdf4 { variables: float x ; x:_FillValue = NaNf ; x:add_offset = 1LL ; x:scale_factor = 1LL ; }

In [11]: ! ncdump -h test-h5netcdf.nc netcdf test-h5netcdf { variables: float x ; x:_FillValue = NaNf ; x:add_offset = 1LL ; x:scale_factor = 1LL ; } ```

At least for add_offset and scale_factor, I think we should be more careful when determining the attribute dtype. Python's int and float don't have any intrinsic dtype. If the attributes are specified as native Python types (not NumPy types) and can be safely cast into the dtype of the data, we should probably use that instead.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
470881715 https://github.com/pydata/xarray/pull/2751#issuecomment-470881715 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ3MDg4MTcxNQ== DevDaoud 971382 2019-03-08T10:29:32Z 2019-03-08T10:29:32Z NONE

@shoyer tests are failing but it does'nt seem to be coming from this PR, i saw the same error on other PRs as well, my tests were working fine until i made a git pull.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
468177645 https://github.com/pydata/xarray/pull/2751#issuecomment-468177645 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ2ODE3NzY0NQ== DevDaoud 971382 2019-02-28T08:09:58Z 2019-02-28T08:09:58Z NONE

@shoyer did you have a look at this ?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
466290710 https://github.com/pydata/xarray/pull/2751#issuecomment-466290710 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ2NjI5MDcxMA== DevDaoud 971382 2019-02-22T06:38:32Z 2019-02-22T06:38:59Z NONE

@shoyer i changed the implementation, and took into consideration your comments. now returning largest type takes place only when decoding. i added a test with all types of scale factor, add_offset and variable.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
464749511 https://github.com/pydata/xarray/pull/2751#issuecomment-464749511 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ2NDc0OTUxMQ== DevDaoud 971382 2019-02-18T14:25:00Z 2019-02-18T14:25:25Z NONE

@shoyer now scale_factor and add_offset are taken into account when encoding and decoding data. if none of them is present or if they're not a subtype of np.generic the old behaviour takes place.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
463097712 https://github.com/pydata/xarray/pull/2751#issuecomment-463097712 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ2MzA5NzcxMg== DevDaoud 971382 2019-02-13T08:01:39Z 2019-02-13T08:01:39Z NONE

@shoyer yes sure i'll update the pull request with the mentioned modifications.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
462876031 https://github.com/pydata/xarray/pull/2751#issuecomment-462876031 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ2Mjg3NjAzMQ== shoyer 1217238 2019-02-12T18:24:51Z 2019-02-12T18:24:51Z MEMBER

@daoudjahdou what do you think of @magau's proposed solution over in https://github.com/pydata/xarray/issues/2304#issuecomment-462592638? Checking the dtypes of of add_offset and scale_factor might be a better alternative than requiring setting a custom option.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
462681656 https://github.com/pydata/xarray/pull/2751#issuecomment-462681656 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ2MjY4MTY1Ng== DevDaoud 971382 2019-02-12T09:21:59Z 2019-02-12T09:22:12Z NONE

@shoyer the logic is now propagated down to _choose_float_dtype inside CFScaleOffsetCoder, please let me know what you think.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
461856173 https://github.com/pydata/xarray/pull/2751#issuecomment-461856173 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ2MTg1NjE3Mw== shoyer 1217238 2019-02-08T16:15:11Z 2019-02-08T16:15:11Z MEMBER

@daoudjahdou sorry I missed your comment over in #2304. Threading the new parameter down though other functions is definitely the preferred approach here -- mutating global variables makes it much harder to follow control flow. Though I would probably stop short of modifying maybe_promote and would put this logic inside CFMaskCoder and/or CFScaleOffsetCoder in xarray.coding.variables.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874
461761945 https://github.com/pydata/xarray/pull/2751#issuecomment-461761945 https://api.github.com/repos/pydata/xarray/issues/2751 MDEyOklzc3VlQ29tbWVudDQ2MTc2MTk0NQ== DevDaoud 971382 2019-02-08T10:41:34Z 2019-02-08T10:41:34Z NONE

@shoyer did you have a look at this ?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Ability to force float64 instead of float32 issue #2304 407746874

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 565.434ms · About: xarray-datasette