id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
348081353,MDU6SXNzdWUzNDgwODEzNTM=,2348,Should the transform attribute be a six-element or nine-element tuple when reading from rasterio?,296686,closed,0,,,2,2018-08-06T21:03:43Z,2018-08-13T22:33:54Z,2018-08-13T22:33:54Z,CONTRIBUTOR,,,,"My basic question is whether xarray should store the rasterio transform as a 6-element tuple or a 9-element tuple, as there seems to be a mismatch between the documentation and the actual code.
The documentation at https://github.com/pydata/xarray/blob/7cd3442fc61e94601c3bfb20377f4f795cde584d/xarray/backends/rasterio_.py#L164-L170 says you can run the following code:
``` python
from affine import Affine
da = xr.open_rasterio('path_to_file.tif')
transform = Affine(*da.attrs['transform'])
```
This takes the tuple stored in the `transform` attribute and uses it as the arguments to the `Affine` class. However, running this gives an error: `TypeError: Expected 6 coefficients, found 9`.
If you look at the code, this line in the `open_rasterio` function sets the `transform` attribute to a 6-element tuple (the first 6 elements of the full Affine tuple): https://github.com/pydata/xarray/blob/7cd3442fc61e94601c3bfb20377f4f795cde584d/xarray/backends/rasterio_.py#L249. However, about twenty lines later, another chunk of code checks whether the rasterio dataset has a transform attribute and, if so, overwrites `transform` with the full Affine tuple (that is, a 9-element tuple): https://github.com/pydata/xarray/blob/7cd3442fc61e94601c3bfb20377f4f795cde584d/xarray/backends/rasterio_.py#L262-L268
Thus there seems to be confusion both within the code and the documentation as to whether the transform should be a six-element or nine-element tuple.
Which is the intended behaviour? I am happy to submit a PR to fix either the code or the docs or both.
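(For reference, a minimal sketch of a defensive workaround, assuming the stored attribute may hold either form — the coefficient values below are made up for illustration:)

``` python
# Hypothetical coefficients: the first six are the affine parameters
# (a, b, c, d, e, f); a nine-element tuple appends the constant last
# row (0, 0, 1) of the full 3x3 matrix.
stored = (300.0, 0.0, 101985.0, 0.0, -300.0, 2826915.0, 0.0, 0.0, 1.0)

# Taking the first six coefficients works for either form, so
# Affine(*stored[:6]) would succeed whichever tuple xarray stored.
coeffs = stored[:6]
assert len(coeffs) == 6
assert stored[6:] in ((), (0.0, 0.0, 1.0))
```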
#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
xarray: 0.10.8
pandas: 0.23.4
numpy: 1.14.2
scipy: 1.1.0
netCDF4: 1.4.0
h5netcdf: 0.6.1
h5py: 2.8.0
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.18.2
distributed: 1.22.1
matplotlib: 2.2.2
cartopy: None
seaborn: None
setuptools: 40.0.0
pip: 18.0
conda: None
pytest: None
IPython: 6.5.0
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2348/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
165151235,MDU6SXNzdWUxNjUxNTEyMzU=,897,Allow specification of figsize in plot methods,296686,closed,0,,,1,2016-07-12T18:45:26Z,2016-12-18T22:43:19Z,2016-12-18T22:43:19Z,CONTRIBUTOR,,,,"Pandas allows a call to `plot` like:
``` python
df.plot(x='var1', y='var2', figsize=(10, 6))
```
but this doesn't seem to be possible in xarray, and it would be handy if it were.
It looks like fixing this would require modifying the `@_plot2d` decorator, specifically around https://github.com/pydata/xarray/blob/master/xarray/plot/plot.py#L376 (although I can't seem to find how to do the equivalent of `gca()` while also setting a `figsize`).
Any thoughts or ideas?
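In case it helps, here is a minimal sketch (plain matplotlib, no xarray) of how I'd expect an explicit `figsize` to combine with `gca()`-style access — create the figure with the requested size first, then `gca()` returns axes belonging to it:

``` python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, just for this sketch
import matplotlib.pyplot as plt

# Creating the figure first with the desired size, then grabbing its
# current axes, gives the same axes object a plotting helper would use.
fig = plt.figure(figsize=(10, 6))
ax = plt.gca()
```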
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/897/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
166511736,MDU6SXNzdWUxNjY1MTE3MzY=,911,KeyError on saving to NetCDF - due to objects in attrs?,296686,closed,0,,,3,2016-07-20T07:09:38Z,2016-09-02T22:52:04Z,2016-09-02T22:52:04Z,CONTRIBUTOR,,,,"I have an xarray.Dataset that I'm trying to save out to a NetCDF file. The dataset looks like this:
``` python
Out[97]:
Dimensions: (x: 1240, y: 1162)
Coordinates:
* x (x) float64 -9.476e+05 -9.464e+05 -9.451e+05 -9.439e+05 ...
* y (y) float64 1.429e+06 1.428e+06 1.427e+06 1.426e+06 1.424e+06 ...
Data variables:
data (y, x) float32 nan nan nan nan nan nan nan nan nan nan nan nan ...
```
It has two attributes, each with a string key and an object value (in this case, instances of classes from the `rasterio` library).
``` python
OrderedDict([('affine',
Affine(1256.5430440955893, 0.0, -947639.6305106478,
0.0, -1256.5430440955893, 1429277.8120091767)),
('crs', CRS({'init': 'epsg:27700'}))])
```
When I try to save out the NetCDF using this code:
``` python
ds.to_netcdf('test.nc')
```
I get the following error:
```
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
in ()
----> 1 ds.to_netcdf('blah3.nc')
/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding)
789 from ..backends.api import to_netcdf
790 return to_netcdf(self, path, mode, format=format, group=group,
--> 791 engine=engine, encoding=encoding)
792
793 dump = utils.function_alias(to_netcdf, 'dump')
/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/api.py in to_netcdf(dataset, path, mode, format, group, engine, writer, encoding)
354 store = store_cls(path, mode, format, group, writer)
355 try:
--> 356 dataset.dump_to_store(store, sync=sync, encoding=encoding)
357 if isinstance(path, BytesIO):
358 return path.getvalue()
/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/core/dataset.py in dump_to_store(self, store, encoder, sync, encoding)
735 variables, attrs = encoder(variables, attrs)
736
--> 737 store.store(variables, attrs, check_encoding)
738 if sync:
739 store.sync()
/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set)
226 cf_variables, cf_attrs = cf_encoder(variables, attributes)
227 AbstractWritableDataStore.store(self, cf_variables, cf_attrs,
--> 228 check_encoding_set)
/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set)
201 if not (k in neccesary_dims and
202 is_trivial_index(v)))
--> 203 self.set_variables(variables, check_encoding_set)
204
205 def set_attributes(self, attributes):
/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/common.py in set_variables(self, variables, check_encoding_set)
211 name = _encode_variable_name(vn)
212 check = vn in check_encoding_set
--> 213 target, source = self.prepare_variable(name, v, check)
214 self.writer.add(source, target)
215
/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/netCDF4_.py in prepare_variable(self, name, variable, check_encoding)
277 # set attributes one-by-one since netCDF4<1.0.10 can't handle
278 # OrderedDict as the input to setncatts
--> 279 nc4_var.setncattr(k, v)
280 return nc4_var, variable.data
281
netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.setncattr (netCDF4/_netCDF4.c:33460)()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4._set_att (netCDF4/_netCDF4.c:6171)()
/Users/robin/anaconda3/lib/python3.5/collections/__init__.py in __getitem__(self, key)
967 if hasattr(self.__class__, ""__missing__""):
968 return self.__class__.__missing__(self, key)
--> 969 raise KeyError(key)
970 def __setitem__(self, key, item): self.data[key] = item
971 def __delitem__(self, key): del self.data[key]
KeyError: 0
```
The error seems slightly strange to me, but it appears to be related to saving attributes. If I change the attributes so that all of the values are strings (for example, using `ds['data'].attrs = {k: repr(v) for k, v in ds['data'].attrs.items()}`) then it saves out fine.
Is there a restriction on what sort of values can be stored in `attrs` and saved out to NetCDF? If so, should this be enforced somehow? It would be ideal if any object could be stored as an attr and saved out (e.g. pickled), but this may be difficult to do portably (for example, across multiple Python versions, if using pickle).
Any thoughts?
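For completeness, here is a sketch of the conversion that works for me — flattening the rasterio objects into plain serialisable values before calling `to_netcdf()`. The values mirror the attrs shown above; the string form chosen for the CRS is just one possibility:

``` python
# netCDF attributes must be strings, numbers, or 1-D numeric sequences,
# so convert the rasterio objects into those forms before saving.
attrs = {
    'affine': (1256.5430440955893, 0.0, -947639.6305106478,
               0.0, -1256.5430440955893, 1429277.8120091767),
    'crs': 'epsg:27700',
}
# Every value is now something netCDF can store directly:
assert all(isinstance(v, (str, tuple)) for v in attrs.values())
```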
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/911/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
166642852,MDU6SXNzdWUxNjY2NDI4NTI=,913,dtype changes after .load(),296686,closed,0,,,4,2016-07-20T17:56:35Z,2016-07-21T00:49:02Z,2016-07-21T00:49:02Z,CONTRIBUTOR,,,,"I've found that in some situations a `DataArray` using dask as the storage backend will report its `dtype` as `float32`, but then once the data has been loaded (e.g. with `load()`) the `dtype` changes to `float64`.
This surprised me, and actually caught me out in a few situations where I was writing code to export a DataArray to a custom file format (where the metadata specification for the custom format needed to know the `dtype`, but then complained when the actual `dtype` was different). Is this desired behaviour, or a bug? (Or somewhere in between...?)
This only seems to occur with dask-backed DataArrays, and not 'normal' DataArrays.
**Example:**
Create the example netCDF file like this:
``` python
xa = xr.DataArray(data=np.random.rand(10, 10).astype(np.float32))
xa.to_dataset(name='data').to_netcdf('test.nc')
```
Then doing some simple operations with normal DataArrays:
``` python
normal_data = xr.open_dataset('test.nc')['data']
normal_data.dtype
# => float32
normal_data.mean(dim='dim_0').dtype
# => float32
```
But doing the same thing in dask:
``` python
dask_data = xr.open_dataset('test.nc', chunks={'dim_0': 2})['data']
dask_data.mean(dim='dim_0').dtype
# => float32
dask_data.mean(dim='dim_0').load().dtype
# => float64
```
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/913/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue