id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
348081353,MDU6SXNzdWUzNDgwODEzNTM=,2348,Should the transform attribute be a six-element or nine-element tuple when reading from rasterio?,296686,closed,0,,,2,2018-08-06T21:03:43Z,2018-08-13T22:33:54Z,2018-08-13T22:33:54Z,CONTRIBUTOR,,,,"My basic question is whether xarray should store the rasterio transform as a six-element or a nine-element tuple, as there seems to be a mismatch between the documentation and the actual code.

The documentation at https://github.com/pydata/xarray/blob/7cd3442fc61e94601c3bfb20377f4f795cde584d/xarray/backends/rasterio_.py#L164-L170 says you can run the following code:

``` python
from affine import Affine

da = xr.open_rasterio('path_to_file.tif')
transform = Affine(*da.attrs['transform'])
```

This takes the tuple stored in the `transform` attribute and uses it as the arguments to the `Affine` class. However, running it gives an error: `TypeError: Expected 6 coefficients, found 9`.

Looking at the code, this line in the `open_rasterio` function sets the `transform` attribute to a six-element tuple, the first six elements of the full Affine tuple: https://github.com/pydata/xarray/blob/7cd3442fc61e94601c3bfb20377f4f795cde584d/xarray/backends/rasterio_.py#L249. However, about twenty lines later, another chunk of code checks whether the rasterio dataset has a transform attribute and, if so, sets the `transform` attribute to the full Affine tuple (that is, a nine-element tuple): https://github.com/pydata/xarray/blob/7cd3442fc61e94601c3bfb20377f4f795cde584d/xarray/backends/rasterio_.py#L262-L268

So there seems to be confusion, both within the code and between the code and the documentation, as to whether the transform should be a six-element or a nine-element tuple. Which is the intended behaviour? I am happy to submit a PR to fix either the code or the docs, or both.
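In the meantime, a workaround sketch for anyone hitting this (my own, relying on the fact that the nine-element tuple is the full 3x3 affine matrix in row-major order, whose last row is always `(0, 0, 1)`, so the first six elements are exactly the coefficients `Affine` expects):

``` python
from affine import Affine

da = xr.open_rasterio('path_to_file.tif')
# keep only the six coefficients of the top two rows; the trailing
# (0, 0, 1) row of the 3x3 matrix is implicit in an Affine
transform = Affine(*da.attrs['transform'][:6])
```

#### Output of ``xr.show_versions()``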
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

xarray: 0.10.8
pandas: 0.23.4
numpy: 1.14.2
scipy: 1.1.0
netCDF4: 1.4.0
h5netcdf: 0.6.1
h5py: 2.8.0
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.18.2
distributed: 1.22.1
matplotlib: 2.2.2
cartopy: None
seaborn: None
setuptools: 40.0.0
pip: 18.0
conda: None
pytest: None
IPython: 6.5.0
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2348/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
165151235,MDU6SXNzdWUxNjUxNTEyMzU=,897,Allow specification of figsize in plot methods,296686,closed,0,,,1,2016-07-12T18:45:26Z,2016-12-18T22:43:19Z,2016-12-18T22:43:19Z,CONTRIBUTOR,,,,"Pandas allows a call to `plot` like:

``` python
df.plot(x='var1', y='var2', figsize=(10, 6))
```

but this doesn't seem to be possible in xarray, and it would be handy if it were. It looks like fixing this would require modifying the `@_plot2d` decorator, specifically around https://github.com/pydata/xarray/blob/master/xarray/plot/plot.py#L376 (although I can't find a way to do the equivalent of `gca()` while also setting a `figsize`). Any thoughts or ideas?
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/897/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
166511736,MDU6SXNzdWUxNjY1MTE3MzY=,911,KeyError on saving to NetCDF - due to objects in attrs?,296686,closed,0,,,3,2016-07-20T07:09:38Z,2016-09-02T22:52:04Z,2016-09-02T22:52:04Z,CONTRIBUTOR,,,,"I have an xarray.Dataset that I'm trying to save out to a NetCDF file. The dataset looks like this:

``` python
Out[97]:
<xarray.Dataset>
Dimensions:  (x: 1240, y: 1162)
Coordinates:
  * x        (x) float64 -9.476e+05 -9.464e+05 -9.451e+05 -9.439e+05 ...
  * y        (y) float64 1.429e+06 1.428e+06 1.427e+06 1.426e+06 1.424e+06 ...
Data variables:
    data     (y, x) float32 nan nan nan nan nan nan nan nan nan nan nan nan ...
```

It has two attributes, both with string keys, whose values are objects (in this case, instances of classes from the `rasterio` library):
``` python
OrderedDict([('affine', Affine(1256.5430440955893, 0.0, -947639.6305106478,
                               0.0, -1256.5430440955893, 1429277.8120091767)),
             ('crs', CRS({'init': 'epsg:27700'}))])
```

When I try to save out the NetCDF using this code:

``` python
ds.to_netcdf('test.nc')
```

I get the following error:

```
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
 in ()
----> 1 ds.to_netcdf('blah3.nc')

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding)
    789         from ..backends.api import to_netcdf
    790         return to_netcdf(self, path, mode, format=format, group=group,
--> 791                          engine=engine, encoding=encoding)
    792
    793     dump = utils.function_alias(to_netcdf, 'dump')

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/api.py in to_netcdf(dataset, path, mode, format, group, engine, writer, encoding)
    354     store = store_cls(path, mode, format, group, writer)
    355     try:
--> 356         dataset.dump_to_store(store, sync=sync, encoding=encoding)
    357         if isinstance(path, BytesIO):
    358             return path.getvalue()

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/core/dataset.py in dump_to_store(self, store, encoder, sync, encoding)
    735             variables, attrs = encoder(variables, attrs)
    736
--> 737         store.store(variables, attrs, check_encoding)
    738         if sync:
    739             store.sync()

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set)
    226         cf_variables, cf_attrs = cf_encoder(variables, attributes)
    227         AbstractWritableDataStore.store(self, cf_variables, cf_attrs,
--> 228                                         check_encoding_set)

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set)
    201                          if not (k in neccesary_dims and
    202                                  is_trivial_index(v)))
--> 203         self.set_variables(variables, check_encoding_set)
    204
    205     def set_attributes(self, attributes):

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/common.py in set_variables(self, variables, check_encoding_set)
    211             name = _encode_variable_name(vn)
    212             check = vn in check_encoding_set
--> 213             target, source = self.prepare_variable(name, v, check)
    214             self.writer.add(source, target)
    215

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/netCDF4_.py in prepare_variable(self, name, variable, check_encoding)
    277         # set attributes one-by-one since netCDF4<1.0.10 can't handle
    278         # OrderedDict as the input to setncatts
--> 279         nc4_var.setncattr(k, v)
    280         return nc4_var, variable.data
    281

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.setncattr (netCDF4/_netCDF4.c:33460)()

netCDF4/_netCDF4.pyx in netCDF4._netCDF4._set_att (netCDF4/_netCDF4.c:6171)()

/Users/robin/anaconda3/lib/python3.5/collections/__init__.py in __getitem__(self, key)
    967         if hasattr(self.__class__, ""__missing__""):
    968             return self.__class__.__missing__(self, key)
--> 969         raise KeyError(key)
    970     def __setitem__(self, key, item): self.data[key] = item
    971     def __delitem__(self, key): del self.data[key]

KeyError: 0
```

The error seems slightly strange to me, but it appears to be related to saving attributes. If I change the attributes so that all of the values are strings (for example, using `ds['data'].attrs = {k: repr(v) for k, v in ds['data'].attrs.items()}`), then it saves out fine.

Is there a restriction on what sort of values can be stored in `attrs` and saved out to NetCDF? If so, should this be enforced somehow?
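For what it's worth, the `repr()` workaround above in a slightly more general form (just a sketch; the helper name and the whitelist of types are my own invention, not an xarray API):

``` python
SERIALIZABLE = (str, bytes, int, float)

def stringify_attrs(da):
    # hypothetical helper: netCDF attributes can only hold strings,
    # numbers and arrays of numbers, so fall back to repr() for
    # anything else (e.g. rasterio's Affine and CRS objects)
    da.attrs = {k: v if isinstance(v, SERIALIZABLE) else repr(v)
                for k, v in da.attrs.items()}

stringify_attrs(ds['data'])
ds.to_netcdf('test.nc')
```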
It would be ideal if any object could be stored as an attr and saved out (e.g. as a pickle), but this may be difficult (for example, pickles are not necessarily portable across Python versions). Any thoughts?
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/911/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
166642852,MDU6SXNzdWUxNjY2NDI4NTI=,913,dtype changes after .load(),296686,closed,0,,,4,2016-07-20T17:56:35Z,2016-07-21T00:49:02Z,2016-07-21T00:49:02Z,CONTRIBUTOR,,,,"I've found that in some situations a `DataArray` using dask as the storage backend will report its `dtype` as `float32`, but once the data has been loaded (e.g. with `load()`) the `dtype` changes to `float64`. This surprised me, and actually caught me out in a few situations where I was writing code to export a DataArray to a custom file format (the metadata specification for the custom format needed to know the `dtype`, but then complained when the actual `dtype` was different). Is this desired behaviour, or a bug? (Or somewhere in between...?) This only seems to occur with dask-backed DataArrays, not 'normal' DataArrays.

**Example:**

Create the example netCDF file like this:

``` python
xa = xr.DataArray(data=np.random.rand(10, 10).astype(np.float32))
xa.to_dataset(name='data').to_netcdf('test.nc')
```

Then doing some simple operations with normal DataArrays:

``` python
normal_data = xr.open_dataset('test.nc')['data']

normal_data.dtype                     # => float32
normal_data.mean(dim='dim_0').dtype   # => float32
```

But doing the same thing with dask:

``` python
dask_data = xr.open_dataset('test.nc', chunks={'dim_0': 2})['data']

dask_data.mean(dim='dim_0').dtype         # => float32
dask_data.mean(dim='dim_0').load().dtype  # => float64
```
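In case it is useful to anyone hitting the same export problem, a possible workaround sketch (my own suggestion, not xarray behaviour): cast the computed result back to the dtype the lazy array advertised, so the metadata written out stays consistent:

``` python
result = dask_data.mean(dim='dim_0').load()
# guard against the dtype changing during computation (assuming a
# float64 -> float32 narrowing cast is acceptable for this export)
if result.dtype != dask_data.dtype:
    result = result.astype(dask_data.dtype)
```
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/913/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue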