issues: 233992696

id	node_id	number	title	user	state	locked	assignee	milestone	comments	created_at	updated_at	closed_at	author_association	active_lock_reason	draft	pull_request	body	reactions	performed_via_github_app	state_reason	repo	type
233992696	MDU6SXNzdWUyMzM5OTI2OTY=	1444	Best practice when the _Unsigned attribute is present in NetCDF files	1325771	closed	0			5	2017-06-06T19:05:07Z	2017-07-28T17:39:04Z	2017-07-28T17:39:04Z	CONTRIBUTOR				Some (large) data providers are writing NetCDF-4-extended files but using an `_Unsigned` attribute to indicate that a signed data type should be interpreted as unsigned bytes. Background: https://github.com/Unidata/netcdf4-python/issues/656 From the background discussion above, it is my understanding that xarray does not honor the attribute because it’s not a part of the CF spec, is only mentioned as a proposed attribute in the NetCDF Best Practices, and because "xarray wants the `Variable` dtype to be the same as the dtype of the data returned." Taking the above as a given, it is necessary for xarray users encountering such variables to do the following after reading the data: `dtype = data.encoding['dtype'].str.replace('i', 'u'); scale_factor = data.encoding['scale_factor']; add_offset = data.encoding['add_offset']; unscale = ((data - add_offset)/scale_factor).data.astype(dtype).astype('float64'); fixed = unscale * scale_factor + add_offset` The un-scaling step can be saved by turning off auto mask and scale. In order to automate the above process while still being able to use the functionality of `Dataset`, one approach might be to automatically perform the above steps on some known list of variables, and then reassign those variables to the `Dataset`. The downside is the need to read all variables up front, which could be expensive when processing large datasets where not all variables are needed. Is there another approach that would preserve lazy data loading, for instance by providing pre/post hooks for transformation functions at the `__getitem__` stage? Is there something I could do to help document that as a best practice?	
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1444/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		completed	13221727	issue
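
The reinterpretation described in the issue body can be sketched with NumPy alone. This is a minimal illustration, not xarray's API: the helper name `reinterpret_unsigned` is hypothetical, and the sketch assumes the raw stored integers are already in hand (for example, read with automatic mask and scale turned off, as the issue suggests), so the un-scaling round trip is not needed.

```python
import numpy as np


def reinterpret_unsigned(raw, scale_factor=1.0, add_offset=0.0):
    """Treat signed integer storage as unsigned, then apply scale and offset.

    Hypothetical helper for illustration only; it is not part of xarray.
    """
    raw = np.asarray(raw)
    if raw.dtype.kind != "i":
        raise TypeError("expected a signed integer array")
    # Same trick as in the issue: swap the 'i' type code for 'u' (e.g. '|i1' -> '|u1').
    unsigned_dtype = np.dtype(raw.dtype.str.replace("i", "u"))
    # view() reinterprets the same bytes without copying or changing them.
    unsigned = raw.view(unsigned_dtype)
    return unsigned.astype("float64") * scale_factor + add_offset


# Stored as int8, but meant to be read as uint8 (0, 127, 128, 255).
raw = np.array([0, 127, -128, -1], dtype="int8")
fixed = reinterpret_unsigned(raw, scale_factor=0.5, add_offset=10.0)
print(fixed)
```

Because `view` only changes how the existing bytes are interpreted, the conversion itself is cheap; whether it can be applied lazily inside a `Dataset` is the open question the issue raises.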
