issue_comments
7 rows where author_association = "NONE" and issue = 343659822, sorted by updated_at descending.
852069023 · ACHMartin (18679148) · NONE · 2021-06-01T12:03:55Z (updated 2021-06-07T20:48:00Z)
https://github.com/pydata/xarray/issues/2304#issuecomment-852069023
Reactions: 1 (+1: 1)

Dear all, thank you for your work on xarray. Following up on @magau's comment: I have a netCDF file with multiple variables in different formats (float, short, byte). When opened with open_mfdataset, the 'short' and 'byte' variables are converted to 'float64' (no scaling, but some masking for the float data). This isn't a major issue for me, but it wastes a lot of memory. Below is an example of the three formats (from ncdump -h), and how they appear after opening as an xarray dataset with open_mfdataset. Is there any recommendation? Regards
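The memory inflation described above can often be avoided at read time. A minimal sketch, assuming a hypothetical file glob and hypothetical variable names; note that mask_and_scale=False keeps the on-disk dtypes but also disables _FillValue masking, which then has to be handled manually:

```python
import xarray as xr

# Option 1: keep on-disk dtypes by disabling CF mask/scale decoding.
# _FillValue masking and any scale_factor/add_offset must then be
# applied manually where needed.
ds = xr.open_mfdataset("files_*.nc", mask_and_scale=False)  # hypothetical glob

# Option 2: decode as usual, then downcast the inflated variables.
ds = xr.open_mfdataset("files_*.nc")
for name in ["var_short", "var_byte"]:  # hypothetical variable names
    ds[name] = ds[name].astype("float32")
```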
731253022 · psybot-ca (66918146) · NONE · 2020-11-20T15:59:13Z
https://github.com/pydata/xarray/issues/2304#issuecomment-731253022
Reactions: none

Hey everyone, I stumbled on this while searching for approximately the same problem, and thought I'd share since the issue is still open. On my side there are two situations that seem buggy; I haven't been using xarray for long yet, so maybe I'm missing something.

My first problem relates to the data types of dimensions with float notation. To give another answer to @shoyer's question: it is a problem in my case because I would like to slice a dataset using longitude values taken from another dataset. This operation raises KeyError: "not all values found in index 'longitude'", either because one dataset's longitude is float32 and the other's is float64, or because the float32 approximations are not exactly the same value in each dataset. I can work around this by assigning new float64 coords after reading, but it is a hassle considering I have to do it thousands of times. The same situation also causes problems when concatenating multiple netCDF files (along the time dim in my case): discrepancies between float32 approximations, or the float32 vs. float64 mismatch, introduce new dimension values where they shouldn't.

The second part of my problem involves writing and reading netCDF files (maybe closer to @daoudjahdou's problem). I changed the data type to float64 for all my files and saved them, but even though the dtype was float64 for all dimensions when writing (using default args), it is sometimes float32 and sometimes float64 when reading the files back (also with default args). If the defaults are used throughout, shouldn't decoding give the dimensions the same dtype in every file I read?
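The coordinate-casting workaround mentioned in this comment can be expressed compactly. A minimal sketch, assuming two hypothetical files a.nc and b.nc that share a longitude grid; nearest-neighbour selection is one way to sidestep float32 values that rounded differently in each file:

```python
import xarray as xr

ds_a = xr.open_dataset("a.nc")  # hypothetical files
ds_b = xr.open_dataset("b.nc")

# Force both longitude coordinates to float64 before label-based selection.
ds_a = ds_a.assign_coords(longitude=ds_a.longitude.astype("float64"))
ds_b = ds_b.assign_coords(longitude=ds_b.longitude.astype("float64"))

# Exact label matching can still fail if the float32 values rounded
# differently in each file; nearest-neighbour selection avoids that.
subset = ds_a.sel(longitude=ds_b.longitude, method="nearest")
```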
462592638 · magau (791145) · NONE · 2019-02-12T02:48:00Z
https://github.com/pydata/xarray/issues/2304#issuecomment-462592638
Reactions: none

Hi everyone, I've started using xarray recently, so I apologize if I'm saying something wrong... I've also faced the issue reported here and tried to find some answers. Unpacking netCDF files with respect to the NUG attributes (scale_factor and add_offset) is addressed by the CF Conventions, which are explicit about which data type should be applied to the unpacked data (see cf-conventions-1.7/packed-data): "If the scale_factor and add_offset attributes are of the same data type as the associated variable, the unpacked data is assumed to be of the same data type as the packed data. However, if the scale_factor and add_offset attributes are of a different data type from the variable (containing the packed data) then the unpacked data should match the type of these attributes." In my opinion this should be the default behavior of the xarray.decode_cf function, which doesn't invalidate the idea of letting the user force the unpacked dtype. However, neither the CFScaleOffsetCoder nor the CFMaskCoder de/encoder classes seem to follow these CF directives: the first doesn't look at the scale_factor or add_offset dtypes, and the second also changes the unpacked data dtype (maybe because NaN values are used to replace the fill values). Sorry for such an extensive comment without any proposed solution... Regards! :+1:
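The CF rule quoted above translates directly into a dtype-selection helper. A hedged sketch: this is not xarray's actual decoding code, and cf_unpacked_dtype is a hypothetical function name:

```python
import numpy as np

def cf_unpacked_dtype(packed_dtype, scale_factor=None, add_offset=None):
    """If scale_factor/add_offset share the packed variable's dtype, unpack
    to the packed dtype; otherwise unpack to the attributes' dtype."""
    attr_dtypes = {np.dtype(type(a)) for a in (scale_factor, add_offset) if a is not None}
    if not attr_dtypes or attr_dtypes == {np.dtype(packed_dtype)}:
        return np.dtype(packed_dtype)
    # If both attributes are present with different dtypes, take the wider one.
    return max(attr_dtypes, key=lambda d: d.itemsize)

# int16 data with a float32 scale_factor unpacks to float32 under this rule:
print(cf_unpacked_dtype(np.int16, scale_factor=np.float32(0.01)))  # float32
```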
451984471 · DevDaoud (971382) · NONE · 2019-01-07T16:04:11Z
https://github.com/pydata/xarray/issues/2304#issuecomment-451984471
Reactions: none

Hi, thank you for your effort in making xarray a great library. As mentioned in the issue, the discussion moved to a PR aiming to make xr.open_dataset configurable. This post asks for your recommendations regarding our PR. The first approach adds a parameter to the open_dataset function called "force_promote", a boolean defaulting to False (and thus not mandatory), and propagates it down to the function maybe_promote in dtypes.py, where we would do the following:

```python
if dtype.itemsize <= 2 and not force_promote:
    dtype = np.float32
else:
    dtype = np.float64
```

The downside is that it pollutes the code with a parameter used only in one specific case. The second approach would check an environment variable called "XARRAY_FORCE_PROMOTE"; if it exists and is set to true, it would force promoting the type to float64. Please tell us which approach best suits your vision of xarray. Regards.
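The environment-variable variant proposed above might look like the following. A minimal sketch; XARRAY_FORCE_PROMOTE and promoted_float_dtype are hypothetical names taken from or invented for this proposal, not an existing xarray API:

```python
import os
import numpy as np

def promoted_float_dtype(dtype):
    # Honour the proposed XARRAY_FORCE_PROMOTE environment variable
    # inside maybe_promote-style logic (hypothetical, per the comment above).
    force_promote = os.environ.get("XARRAY_FORCE_PROMOTE", "").lower() in ("1", "true")
    if dtype.itemsize <= 2 and not force_promote:
        return np.dtype(np.float32)
    return np.dtype(np.float64)

print(promoted_float_dtype(np.dtype(np.int16)))  # float32 unless the env var is set
```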
412492776 · DevDaoud (971382) · NONE · 2018-08-13T11:51:15Z
https://github.com/pydata/xarray/issues/2304#issuecomment-412492776
Reactions: none

Any updates on this?
410678021 · DevDaoud (971382) · NONE · 2018-08-06T11:31:00Z
https://github.com/pydata/xarray/issues/2304#issuecomment-410678021
Reactions: none

As mentioned in the original issue, the modification is straightforward. Any idea whether this could be integrated into xarray anytime soon?
407092265 · DevDaoud (971382) · NONE · 2018-07-23T15:10:13Z
https://github.com/pydata/xarray/issues/2304#issuecomment-407092265
Reactions: none

Thank you for your quick answer. In our case we may compute standard deviations or sums of squares over long lists of values, and the accumulated error from float32 arithmetic can create considerable differences.
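The accumulation effect described in this comment is easy to demonstrate. A small illustration (exact numbers depend on the platform; cumsum is used here because it accumulates sequentially, unlike NumPy's pairwise sum):

```python
import numpy as np

vals = np.full(10_000_000, 0.1, dtype=np.float32)

# Sequential float32 accumulation drifts visibly away from the true 1e6,
# because adding 0.1 to a large running sum loses low-order bits.
print(vals.cumsum(dtype=np.float32)[-1])

# The same accumulation in float64 stays accurate (~1000000.0).
print(vals.cumsum(dtype=np.float64)[-1])
```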
```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
```