id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 348462356,MDExOlB1bGxSZXF1ZXN0MjA2ODA3Mjkz,2351,Remove redundant code from open_rasterio and ensure all transform tuples are six elements long,296686,closed,0,,,2,2018-08-07T19:48:39Z,2018-08-13T22:34:18Z,2018-08-13T22:33:54Z,CONTRIBUTOR,,0,pydata/xarray/pulls/2351," - [x] Closes #2348 - [x] Tests added (for all bug fixes or enhancements) - [x] Tests passed (for all non-documentation changes) - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later) This removes the redundant code that ended up with the `transform` attribute being set twice - and being set to a nine-element long tuple rather than the correct six-element long tuple. It also adds tests to ensure that all `transform` attributes are six-element-long tuples. I haven't made any changes to the documentation, as I wasn't sure if it was needed. This could potentially affect users as the documentation and the code differed and people may have written other interface code (as, in my case, code to export a DataArray to a GeoTIFF using rasterio) which relies on the transform element having 9 elements rather than the 6 it is meant to have. 
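For reference, the shape of the fix is simple: the last row of the 3x3 affine matrix is always (0, 0, 1), so only the first six coefficients carry information and the stored attribute is just a truncation. A minimal sketch, with made-up coefficient values:

``` python
# Hypothetical nine-element tuple of the kind previously stored by
# open_rasterio; the trailing (0.0, 0.0, 1.0) is the constant bottom
# row of the 3x3 affine matrix.
full = (300.0, 0.0, 101985.0, 0.0, -300.0, 2826915.0, 0.0, 0.0, 1.0)
transform = full[:6]  # the six-element form the docs describe
assert len(transform) == 6
assert full[6:] == (0.0, 0.0, 1.0)
```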
Any thoughts?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2351/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 348081353,MDU6SXNzdWUzNDgwODEzNTM=,2348,Should the transform attribute be a six-element or nine-element tuple when reading from rasterio?,296686,closed,0,,,2,2018-08-06T21:03:43Z,2018-08-13T22:33:54Z,2018-08-13T22:33:54Z,CONTRIBUTOR,,,,"My basic question is whether XArray should be storing the rasterio transform as a 6-element tuple or a 9-element tuple - as there seems to be a mismatch between the documentation and the actual code. The documentation at https://github.com/pydata/xarray/blob/7cd3442fc61e94601c3bfb20377f4f795cde584d/xarray/backends/rasterio_.py#L164-L170 says you can run the following code:

```
from affine import Affine
da = xr.open_rasterio('path_to_file.tif')
transform = Affine(*da.attrs['transform'])
```

This takes the tuple stored in the `transform` attribute and uses it as the arguments to the `Affine` class. However, running this gives an error: `TypeError: Expected 6 coefficients, found 9`. If you look at the code, then this line in the `open_rasterio` function sets the `transform` attribute to be a 6-element tuple - the first 6 elements of the full Affine tuple: https://github.com/pydata/xarray/blob/7cd3442fc61e94601c3bfb20377f4f795cde584d/xarray/backends/rasterio_.py#L249. However, about twenty lines later, another chunk of code looks to see if there is a transform attribute on the rasterio dataset and if so, sets the `transform` attribute to be the full Affine tuple (that is, a 9-element tuple): https://github.com/pydata/xarray/blob/7cd3442fc61e94601c3bfb20377f4f795cde584d/xarray/backends/rasterio_.py#L262-L268 Thus there seems to be confusion both within the code and the documentation as to whether the transform should be a six-element or nine-element tuple. Which is the intended behaviour? 
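To make the mismatch concrete: `Affine` accepts exactly six coefficients and nothing else. Here is a stand-in sketch of that check (the real class lives in the `affine` package; this only mirrors its arity validation, and the coefficient values are made up):

``` python
def make_affine(*coeffs):
    # Mirrors affine.Affine's argument check: exactly six coefficients,
    # with the bottom row (0, 0, 1) of the 3x3 matrix implied.
    if len(coeffs) != 6:
        raise TypeError('Expected 6 coefficients, found %d' % len(coeffs))
    return coeffs + (0.0, 0.0, 1.0)

six = (300.0, 0.0, 101985.0, 0.0, -300.0, 2826915.0)
make_affine(*six)                          # fine
try:
    make_affine(*(six + (0.0, 0.0, 1.0)))  # the nine-element attribute
except TypeError as err:
    print(err)                             # Expected 6 coefficients, found 9
```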
I am happy to submit a PR to fix either the code or the docs or both. #### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
xarray: 0.10.8
pandas: 0.23.4
numpy: 1.14.2
scipy: 1.1.0
netCDF4: 1.4.0
h5netcdf: 0.6.1
h5py: 2.8.0
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.18.2
distributed: 1.22.1
matplotlib: 2.2.2
cartopy: None
seaborn: None
setuptools: 40.0.0
pip: 18.0
conda: None
pytest: None
IPython: 6.5.0
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2348/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 165151235,MDU6SXNzdWUxNjUxNTEyMzU=,897,Allow specification of figsize in plot methods,296686,closed,0,,,1,2016-07-12T18:45:26Z,2016-12-18T22:43:19Z,2016-12-18T22:43:19Z,CONTRIBUTOR,,,,"Pandas allows a call to `plot` like: ``` python df.plot(x='var1', y='var2', figsize=(10, 6)) ``` but this doesn't seem to be possible in xarray - but it would be handy if it were. It looks like fixing this would require modifying the `@_plot2d` decorator, specifically around https://github.com/pydata/xarray/blob/master/xarray/plot/plot.py#L376 (although I can't seem to find how to do the equivalent of `gca()` but set a `figsize` too). Any thoughts or ideas? ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/897/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 173632183,MDExOlB1bGxSZXF1ZXN0ODMwMDc3MzY=,990,Added convenience method for saving DataArray to netCDF file,296686,closed,0,,,17,2016-08-28T06:30:32Z,2016-09-06T04:00:25Z,2016-09-06T04:00:06Z,CONTRIBUTOR,,0,pydata/xarray/pulls/990,"Added a simple function to DataArray that creates a dataset with one variable called 'data' and then saves it to a netCDF file. All parameters are passed through to to_netcdf(). Added an equivalent function called `open_dataarray` to be used to load from these files. Fixes #915. 
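For anyone reading along, the wrapping is equivalent to an explicit `to_dataset` call - a minimal sketch (the file round trip is commented out, as it needs a netCDF backend installed):

``` python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(6).reshape(2, 3), dims=('y', 'x'))
# to_netcdf on a DataArray wraps it in a one-variable Dataset internally;
# the equivalent explicit spelling:
ds = da.to_dataset(name='data')
assert list(ds.data_vars) == ['data']
# Round trip through a file (needs a netCDF backend such as netCDF4):
# da.to_netcdf('da.nc')
# back = xr.open_dataarray('da.nc')
```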
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/990/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 173640823,MDExOlB1bGxSZXF1ZXN0ODMwMTI0MzE=,991,Added validation of attrs before saving to netCDF files,296686,closed,0,,,6,2016-08-28T11:01:18Z,2016-09-02T22:52:09Z,2016-09-02T22:52:04Z,CONTRIBUTOR,,0,pydata/xarray/pulls/991,"This allows us to give nice errors if users try to save a Dataset with attr values that can't be written to a netCDF file. Fixes #911. I've added tests to `test_backends.py` as I can't see a better place to put them. I've also made the tests fairly extensive, but also used some helper functions to stop too much repetition. Please let me know if any of this doesn't fit within the xarray style. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/991/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 166511736,MDU6SXNzdWUxNjY1MTE3MzY=,911,KeyError on saving to NetCDF - due to objects in attrs?,296686,closed,0,,,3,2016-07-20T07:09:38Z,2016-09-02T22:52:04Z,2016-09-02T22:52:04Z,CONTRIBUTOR,,,,"I have an xarray.Dataset that I'm trying to save out to a NetCDF file. The dataset looks like this: ``` python Out[97]: Dimensions: (x: 1240, y: 1162) Coordinates: * x (x) float64 -9.476e+05 -9.464e+05 -9.451e+05 -9.439e+05 ... * y (y) float64 1.429e+06 1.428e+06 1.427e+06 1.426e+06 1.424e+06 ... Data variables: data (y, x) float32 nan nan nan nan nan nan nan nan nan nan nan nan ... ``` It has two attributes, both of which have a string key, and a value which is an object (in this case, instances of classes from the `rasterio` library). 
``` python
OrderedDict([('affine', Affine(1256.5430440955893, 0.0, -947639.6305106478, 0.0, -1256.5430440955893, 1429277.8120091767)), ('crs', CRS({'init': 'epsg:27700'}))])
```

When I try to save out the NetCDF using this code:

``` python
ds.to_netcdf('test.nc')
```

I get the following error:

```
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
 in ()
----> 1 ds.to_netcdf('blah3.nc')

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding)
    789         from ..backends.api import to_netcdf
    790         return to_netcdf(self, path, mode, format=format, group=group,
--> 791                          engine=engine, encoding=encoding)
    792
    793     dump = utils.function_alias(to_netcdf, 'dump')

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/api.py in to_netcdf(dataset, path, mode, format, group, engine, writer, encoding)
    354     store = store_cls(path, mode, format, group, writer)
    355     try:
--> 356         dataset.dump_to_store(store, sync=sync, encoding=encoding)
    357         if isinstance(path, BytesIO):
    358             return path.getvalue()

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/core/dataset.py in dump_to_store(self, store, encoder, sync, encoding)
    735             variables, attrs = encoder(variables, attrs)
    736
--> 737         store.store(variables, attrs, check_encoding)
    738         if sync:
    739             store.sync()

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set)
    226         cf_variables, cf_attrs = cf_encoder(variables, attributes)
    227         AbstractWritableDataStore.store(self, cf_variables, cf_attrs,
--> 228                                         check_encoding_set)

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set)
    201                          if not (k in neccesary_dims and
    202                                  is_trivial_index(v)))
--> 203         self.set_variables(variables, check_encoding_set)
    204
    205     def set_attributes(self, attributes):

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/common.py in set_variables(self, variables, check_encoding_set)
    211             name = _encode_variable_name(vn)
    212             check = vn in check_encoding_set
--> 213             target, source = self.prepare_variable(name, v, check)
    214             self.writer.add(source, target)
    215

/Users/robin/anaconda3/lib/python3.5/site-packages/xarray/backends/netCDF4_.py in prepare_variable(self, name, variable, check_encoding)
    277             # set attributes one-by-one since netCDF4<1.0.10 can't handle
    278             # OrderedDict as the input to setncatts
--> 279             nc4_var.setncattr(k, v)
    280         return nc4_var, variable.data
    281

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.setncattr (netCDF4/_netCDF4.c:33460)()

netCDF4/_netCDF4.pyx in netCDF4._netCDF4._set_att (netCDF4/_netCDF4.c:6171)()

/Users/robin/anaconda3/lib/python3.5/collections/__init__.py in __getitem__(self, key)
    967         if hasattr(self.__class__, ""__missing__""):
    968             return self.__class__.__missing__(self, key)
--> 969         raise KeyError(key)
    970     def __setitem__(self, key, item): self.data[key] = item
    971     def __delitem__(self, key): del self.data[key]

KeyError: 0
```

The error seems slightly strange to me, but it seems to be related to saving attributes. If I change the attributes to make all of the values strings (for example, using `ds['data'].attrs = {k: repr(v) for k, v in ds['data'].attrs.items()}`) then it saves out fine. Is there a restriction on what sort of values can be stored in `attrs` and saved out to NetCDF? If so, should this be enforced somehow? It would be ideal if any object could be stored as an attr and saved out (eg. as a pickle) - but this may be difficult (for example, for multiple python versions, if using pickle). Any thoughts? 
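The `repr` workaround can be generalised into a small helper - a sketch only (the set of netCDF-writable types below is a rough approximation, not xarray's actual validation logic, and `sanitize_attrs`/`Fake` are hypothetical names):

``` python
def sanitize_attrs(attrs):
    # Keep values netCDF can store (strings, numbers, sequences of
    # numbers) and fall back to repr() for anything else, e.g. the
    # Affine and CRS objects attached by rasterio-based code.
    writable = (str, bytes, int, float, list, tuple)
    return {k: (v if isinstance(v, writable) else repr(v))
            for k, v in attrs.items()}

class Fake(object):
    def __repr__(self):
        return 'Fake()'

clean = sanitize_attrs({'units': 'm', 'count': 3, 'crs': Fake()})
print(clean)  # {'units': 'm', 'count': 3, 'crs': 'Fake()'}
```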
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/911/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 166642852,MDU6SXNzdWUxNjY2NDI4NTI=,913,dtype changes after .load(),296686,closed,0,,,4,2016-07-20T17:56:35Z,2016-07-21T00:49:02Z,2016-07-21T00:49:02Z,CONTRIBUTOR,,,,"I've found that in some situations a `DataArray` using dask as the storage backend will report its `dtype` as `float32`, but then once the data has been loaded (eg. with `load()`) the `dtype` changes to `float64`. This surprised me, and actually caught me out in a few situations where I was writing code to export a DataArray to a custom file format (where the metadata specification for the custom format needed to know the `dtype` but then complained when the actual `dtype` was different). Is this desired behaviour, or a bug? (Or somewhere in between...?). This only seems to occur with dask-backed DataArrays, and not 'normal' DataArrays. **Example:** Create the example netCDF file like this:

``` python
xa = xr.DataArray(data=np.random.rand(10, 10).astype(np.float32))
xa.to_dataset(name='data').to_netcdf('test.nc')
```

Then doing some simple operations with normal DataArrays:

``` python
normal_data = xr.open_dataset('test.nc')['data']
normal_data.dtype  # => float32
normal_data.mean(dim='dim_0').dtype  # => float32
```

But doing the same thing in dask:

``` python
dask_data = xr.open_dataset('test.nc', chunks={'dim_0': 2})['data']
dask_data.mean(dim='dim_0').dtype  # => float32
dask_data.mean(dim='dim_0').load().dtype  # => float64
```

","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/913/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue