id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1452118523,I_kwDOAMm_X85WjZH7,7293,Clarify that `chunks={}` in `.open_dataset` reproduces the default behavior of deprecated `.open_zarr`,14314623,closed,0,,,1,2022-11-16T18:58:47Z,2023-01-13T20:50:34Z,2023-01-13T20:50:34Z,CONTRIBUTOR,,,,"### What is your issue?
I was wondering if we could add some language to the docstring of `xr.open_dataset` regarding the `chunks` kwarg to ease the transition for folks who have used `xr.open_zarr` a lot in the past.
the [current text](https://docs.xarray.dev/en/stable/generated/xarray.open_dataset.html?highlight=open_dataset) is:
>chunks ([int](https://docs.python.org/3/library/functions.html#int), [dict](https://docs.python.org/3/library/stdtypes.html#dict), 'auto' or [None](https://docs.python.org/3/library/constants.html#None), optional) – If chunks is provided, it is used to load the new dataset into dask arrays. chunks=-1 loads the dataset with dask using a single chunk for all arrays. chunks={} loads the dataset with dask using engine preferred chunks if exposed by the backend, otherwise with a single chunk for all arrays. chunks='auto' will use dask auto chunking taking into account the engine preferred chunks. See dask chunking for more details.
I found that for opening large zarr stores, setting `chunks={}` reproduces the behavior of `xr.open_zarr()`. If this is true, I think it would be great to include something like
> chunks={} loads the dataset with dask using engine preferred chunks if exposed by the backend, otherwise with a single chunk for all arrays. **In order to reproduce the default behavior of `xr.open_zarr(...)`, use `xr.open_dataset(..., engine='zarr', chunks={})`.**
to make this clear for users who have been using `xr.open_zarr` in the past.
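A minimal sketch of the suggested equivalence, assuming it holds as described (`store.zarr` is a placeholder path):
```python
import xarray as xr

# these two calls should produce identically chunked, dask-backed datasets
ds_old = xr.open_zarr('store.zarr')                               # deprecated
ds_new = xr.open_dataset('store.zarr', engine='zarr', chunks={})  # replacement
```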
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7293/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1197655735,I_kwDOAMm_X85HYsa3,6459,Support **kwargs form in `.chunk()`,14314623,closed,0,,,0,2022-04-08T18:21:22Z,2022-04-11T19:36:40Z,2022-04-11T19:36:40Z,CONTRIBUTOR,,,,"### Is your feature request related to a problem?
Take a simple example
```python
import xarray as xr
da = xr.DataArray([1,2,4], dims=['x'])
```
If I want to chunk the array I can do:
```python
da.chunk({'x':1})
```
but I can't do this:
```python
da.chunk(x=1)
```
```
TypeError: chunk() got an unexpected keyword argument 'x'
```
### Describe the solution you'd like
I would like to be able to use the `.chunk()` method for dataarrays in both ways, since it is common for many xarray methods to work either way (e.g. `isel()`/`coarsen()`/etc).
@TomNicholas
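For reference, a minimal sketch of how the dual signature could look (`either_dict_or_kwargs` is modeled on xarray's internal helper of that name; `_chunk_impl` is a hypothetical stand-in for the existing logic):
```python
def either_dict_or_kwargs(pos, kw, func_name):
    # merge the positional-dict and keyword forms, rejecting ambiguous calls
    if pos is not None and kw:
        raise ValueError(f'cannot specify both positional and keyword '
                         f'arguments to .{func_name}')
    return pos if pos is not None else kw

def chunk(self, chunks=None, **chunks_kwargs):
    # accepts both da.chunk({'x': 1}) and da.chunk(x=1)
    chunks = either_dict_or_kwargs(chunks, chunks_kwargs, 'chunk')
    return self._chunk_impl(chunks)
```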
### Describe alternatives you've considered
_No response_
### Additional context
_No response_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6459/reactions"", ""total_count"": 4, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
904000857,MDExOlB1bGxSZXF1ZXN0NjU1MTkzNzg0,5388,Inconsistent docstring for isel etc.,14314623,closed,0,,,1,2021-05-27T17:10:17Z,2021-05-27T19:37:40Z,2021-05-27T19:37:06Z,CONTRIBUTOR,,0,pydata/xarray/pulls/5388,"I found that the options for the `missing_dims` parameter were described inconsistently in the docstring (`""warning""` vs `""warn""`).
Not sure I found all occurrences of it here.
- [x] Passes `pre-commit run --all-files`
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5388/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
753517739,MDU6SXNzdWU3NTM1MTc3Mzk=,4625,Non lazy behavior for weighted average when using resampled data,14314623,closed,0,,,13,2020-11-30T14:19:48Z,2020-12-16T19:05:30Z,2020-12-16T19:05:30Z,CONTRIBUTOR,,,,"
I am trying to apply an averaging function to multi-year chunks of monthly model data. At its core the function performs a weighted average (and then some coordinate manipulations). I am using `resample(time='1AS')` and then try to `map` my custom function onto the data (see example below). Even without actually loading the data, this step is prohibitively long in my workflow (20-30 min depending on the model).
Is there a way to apply this step completely lazily, like in the case where a simple non-weighted `.mean()` is used?
```python
from dask.diagnostics import ProgressBar
import xarray as xr
import numpy as np
# simple customized weighted mean function
def mean_func(ds):
    return ds.weighted(ds.weights).mean('time')
# example dataset
t = xr.cftime_range(start='2000', periods=1000, freq='1AS')
weights = xr.DataArray(np.random.rand(len(t)),dims=['time'], coords={'time':t})
data = xr.DataArray(np.random.rand(len(t)),dims=['time'], coords={'time':t, 'weights':weights})
ds = xr.Dataset({'data':data}).chunk({'time':1})
ds
```

Using resample with a simple mean works without any computation being triggered:
```python
with ProgressBar():
    ds.resample(time='3AS').mean('time')
```
But when I do the same step with my custom function, some computations show up:
```python
with ProgressBar():
    ds.resample(time='3AS').map(mean_func)
```
```
[########################################] | 100% Completed | 0.1s
[########################################] | 100% Completed | 0.1s
[########################################] | 100% Completed | 0.1s
[########################################] | 100% Completed | 0.1s
```
I am quite sure these are the same kind of computations that make my real-world workflow so slow.
I also confirmed that this is not happening when I do not use resample first:
```
with ProgressBar():
    mean_func(ds)
```
this does not trigger a computation either. So this must be somehow related to `resample`? I would be happy to dig deeper into this, if somebody with more knowledge could point me to the right place.
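For what it's worth, a fully lazy workaround for my use case (a sketch, valid only when there are no missing values to mask) is to express the weighted mean as two resampled sums, which build the dask graph without computing anything:
```python
# weighted mean = sum(w * x) / sum(w), evaluated per 3-year bin
num = (ds['data'] * ds['weights']).resample(time='3AS').sum('time')
den = ds['weights'].resample(time='3AS').sum('time')
lazy_weighted_mean = num / den  # nothing computes until .compute()
```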
**Environment**:
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.6 | packaged by conda-forge | (default, Oct 7 2020, 19:08:05)
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.2.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.16.2.dev77+g1a4f7bd
pandas: 1.1.3
numpy: 1.19.2
scipy: 1.5.2
netCDF4: 1.5.4
pydap: None
h5netcdf: 0.8.1
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: 1.2.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: 1.1.3
cfgrib: None
iris: None
bottleneck: None
dask: 2.30.0
distributed: 2.30.0
matplotlib: 3.3.2
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: 0.16.1
setuptools: 49.6.0.post20201009
pip: 20.2.4
conda: None
pytest: 6.1.2
IPython: 7.18.1
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4625/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
760375642,MDExOlB1bGxSZXF1ZXN0NTM1MjE3ODQ5,4668,Fixing non-lazy behavior of sampled+weighted,14314623,closed,0,,,6,2020-12-09T14:26:08Z,2020-12-16T19:05:30Z,2020-12-16T19:05:30Z,CONTRIBUTOR,,0,pydata/xarray/pulls/4668,"
- [x] Closes #4625
- [x] Tests added
- [x] Passes `isort . && black . && mypy . && flake8`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4668/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
754558237,MDU6SXNzdWU3NTQ1NTgyMzc=,4635,Unexpected error when using `weighted`,14314623,closed,0,,,2,2020-12-01T16:49:39Z,2020-12-01T20:13:24Z,2020-12-01T20:13:24Z,CONTRIBUTOR,,,,"
**What happened**:
I just updated to the newest upstream master of xarray to branch off a pull request for #4625 and noticed a strange error in my regular workflow.
I am working with a dataset `transformed_ds` like this:

and when I try to apply a weighted mean with
```python
transformed_ds.o2.weighted(transformed_ds.dz_t).mean('time')
```
I get the following error, which I do not understand:
```
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in
----> 1 test = transformed_ds.o2.weighted(transformed_ds.dz_t).mean('time')
~/code/xarray/xarray/core/common.py in weighted(self, weights)
788 """"""
789
--> 790 return self._weighted_cls(self, weights)
791
792 def rolling(
~/code/xarray/xarray/core/weighted.py in __init__(self, obj, weights)
121 _weight_check(weights.data)
122
--> 123 self.obj = obj
124 self.weights = weights
125
TypeError: descriptor 'obj' for 'Weighted' objects doesn't apply to a 'DataArrayWeighted' object
```
Using the synthetic example from #4625 this does not show up. I am wondering if anybody has an idea what could be wrong with my dataset that would cause this error?
**Anything else we need to know?**:
**Environment**:
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.6 | packaged by conda-forge | (default, Oct 7 2020, 19:08:05)
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.2.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.16.3.dev2+ga41edc7.d20201201
pandas: 1.1.3
numpy: 1.19.2
scipy: 1.5.2
netCDF4: 1.5.4
pydap: None
h5netcdf: 0.8.1
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: 1.2.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: 1.1.3
cfgrib: None
iris: None
bottleneck: None
dask: 2.30.0
distributed: 2.30.0
matplotlib: 3.3.2
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: 0.16.1
setuptools: 49.6.0.post20201009
pip: 20.2.4
conda: None
pytest: 6.1.2
IPython: 7.18.1
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4635/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
733789095,MDExOlB1bGxSZXF1ZXN0NTEzNDg0MzM0,4559,Dask friendly check in `.weighted()`,14314623,closed,0,,,15,2020-10-31T19:11:37Z,2020-11-09T16:22:51Z,2020-11-09T16:22:45Z,CONTRIBUTOR,,0,pydata/xarray/pulls/4559,"
- [x] Closes #4541
- [x] Tests added
- [x] Passes `isort . && black . && mypy . && flake8`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4559/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
729980097,MDU6SXNzdWU3Mjk5ODAwOTc=,4541,Option to skip tests in `weighted()`,14314623,closed,0,,,16,2020-10-26T23:32:36Z,2020-11-09T16:22:45Z,2020-11-09T16:22:45Z,CONTRIBUTOR,,,,"When working with large dask-array weights, [this check](https://github.com/pydata/xarray/blob/adc55ac4d2883e0c6647f3983c3322ca2c690514/xarray/core/weighted.py#L103-L107) triggers computation of the array. This affects [xgcm's ability to layer operations lazily](https://github.com/xgcm/xgcm/blob/f0a0afd666184dd7d15bcff6f22ac50716b6c78a/xgcm/grid.py#L1737).
Would you be open to implementing an option to skip this check, maybe with a warning displayed? Happy to submit a PR.
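A sketch of the kind of dask-friendly variant I have in mind (moving the check into the chunks via `map_blocks` so it only runs at compute time; illustration only, not a final API):
```python
import numpy as np
import dask.array as dsa

def _weight_check(w):
    # same check as before, but now evaluated lazily per chunk
    if np.isnan(w).any():
        raise ValueError('`weights` cannot contain missing values.')
    return w

# inside Weighted.__init__, where `weights` is the weights DataArray:
if isinstance(weights.data, dsa.Array):
    weights = weights.copy(
        data=weights.data.map_blocks(_weight_check, dtype=weights.dtype))
else:
    _weight_check(weights.data)
```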
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4541/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
290084668,MDU6SXNzdWUyOTAwODQ2Njg=,1845,speed up opening multiple files with changing data variables,14314623,closed,0,,,1,2018-01-19T19:38:14Z,2020-09-23T16:47:37Z,2020-09-23T16:47:37Z,CONTRIBUTOR,,,,"#### Code Sample, a copy-pastable example if possible
I am trying to open several ocean model data files. During the model run additional variables were written to the files. So for instance the first file will look like this:
```
Dimensions: (st_edges_ocean: 51, st_ocean: 50, time: 1, xt_ocean: 3600, yt_ocean: 2700)
Coordinates:
* xt_ocean (xt_ocean) float64 -279.9 -279.8 -279.7 -279.6 -279.5 ...
* yt_ocean (yt_ocean) float64 -81.11 -81.07 -81.02 -80.98 -80.94 ...
* time (time) float64 4.401e+04
* st_ocean (st_ocean) float64 5.034 15.1 25.22 35.36 45.58 55.85 ...
* st_edges_ocean (st_edges_ocean) float64 0.0 10.07 20.16 30.29 40.47 ...
Data variables:
jp_recycle (time, st_ocean, yt_ocean, xt_ocean) float64 dask.array
jp_reminp (time, st_ocean, yt_ocean, xt_ocean) float64 dask.array
jp_uptake (time, st_ocean, yt_ocean, xt_ocean) float64 dask.array
jo2 (time, st_ocean, yt_ocean, xt_ocean) float64 dask.array
dic_stf (time, yt_ocean, xt_ocean) float64 dask.array
o2_stf (time, yt_ocean, xt_ocean) float64 dask.array
Attributes:
filename: 01210101.ocean_minibling_term_src.nc
title: CM2.6_miniBling
grid_type: mosaic
grid_tile: 1
```
and the last file will look like this (with additional data variables `o2_btf`, `dic_btf`, and `po4_btf`).
```
Dimensions: (st_edges_ocean: 51, st_ocean: 50, time: 1, xt_ocean: 3600, yt_ocean: 2700)
Coordinates:
* xt_ocean (xt_ocean) float64 -279.9 -279.8 -279.7 -279.6 -279.5 ...
* yt_ocean (yt_ocean) float64 -81.11 -81.07 -81.02 -80.98 -80.94 ...
* st_ocean (st_ocean) float64 5.034 15.1 25.22 35.36 45.58 55.85 ...
* st_edges_ocean (st_edges_ocean) float64 0.0 10.07 20.16 30.29 40.47 ...
* time (time) float64 7.25e+04
Data variables:
jp_recycle (time, st_ocean, yt_ocean, xt_ocean) float64 dask.array
jp_reminp (time, st_ocean, yt_ocean, xt_ocean) float64 dask.array
jp_uptake (time, st_ocean, yt_ocean, xt_ocean) float64 dask.array
jo2 (time, st_ocean, yt_ocean, xt_ocean) float64 dask.array
dic_stf (time, yt_ocean, xt_ocean) float64 dask.array
dic_btf (time, yt_ocean, xt_ocean) float64 dask.array
o2_stf (time, yt_ocean, xt_ocean) float64 dask.array
o2_btf (time, yt_ocean, xt_ocean) float64 dask.array
po4_btf (time, yt_ocean, xt_ocean) float64 dask.array
Attributes:
date: created 2014-01-08
program: time_average_netcdf.rb
history: Perform time-means on all variables in 01990101.ocean_minibli...
filename: 01990101.ocean_minibling_term_src.nc
title: CM2.6_miniBling
grid_type: mosaic
grid_tile: 1
```
If I specify the additional variables to be dropped, reading all files with `xarray.open_mfdataset` works like a charm.
But without specifying the variables to be dropped it takes an excruciating amount of time to load.
First of all, I was wondering if it would be possible to display a warning when this situation occurs, suggesting that these variables be passed via the `drop_variables` keyword. That would have saved me a ton of digging time.
Even better would be some way to read such datasets in a fast manner. If we could specify a `fastpath` option (as suggested in #1823), perhaps this could speed this task up (given that all dimensions stay the same)?
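For reference, the workaround that already works, using the extra variables from the listing above (the glob pattern is illustrative):
```python
import xarray as xr

extra_vars = ['dic_btf', 'o2_btf', 'po4_btf']  # absent from the early files
ds = xr.open_mfdataset('*.ocean_minibling_term_src.nc',
                       drop_variables=extra_vars)
```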
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-642.15.1.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US
LOCALE: en_US.ISO8859-1
xarray: 0.10.0rc2-2-g1a01208
pandas: 0.20.3
numpy: 1.13.3
scipy: 0.19.1
netCDF4: 1.3.0
h5netcdf: 0.4.2
Nio: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.16.0
matplotlib: 2.1.0
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 36.3.0
pip: 9.0.1
conda: None
pytest: 3.2.3
IPython: 6.2.1
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1845/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
577030502,MDU6SXNzdWU1NzcwMzA1MDI=,3841,Problems plotting long model control runs with gregorian calendar,14314623,closed,0,,,6,2020-03-06T16:11:50Z,2020-07-29T17:55:22Z,2020-07-29T17:55:22Z,CONTRIBUTOR,,,,"
I noticed a problem the other day, when I tried to plot some very long control run data from CMIP6 on ocean.pangeo.io.
The control run of this model (CSIRO's ACCESS-ESM1-5) starts in year 101 but runs for 900 years. If I try to plot the full run as a timeseries at any point in the grid:
```
import xarray as xr
import matplotlib.pyplot as plt
import intake
%matplotlib inline
col = intake.open_esm_datastore(""https://raw.githubusercontent.com/NCAR/intake-esm-datastore/master/catalogs/pangeo-cmip6.json"")
cat = col.search(variable_id='o2', source_id='ACCESS-ESM1-5', experiment_id='piControl', table_id='Omon')
data_dict = cat.to_dataset_dict(zarr_kwargs={'consolidated': True})
# This needs a good amount of dask workers!
ds = data_dict['CMIP.CSIRO.ACCESS-ESM1-5.piControl.Omon.gn']
ds.o2.isel(i=180, j=100, lev=0).plot()
```
I get the following error:
Error message
```
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/srv/conda/envs/notebook/lib/python3.7/site-packages/ipykernel/pylab/backend_inline.py in show(close, block)
41 display(
42 figure_manager.canvas.figure,
---> 43 metadata=_fetch_figure_metadata(figure_manager.canvas.figure)
44 )
45 finally:
/srv/conda/envs/notebook/lib/python3.7/site-packages/ipykernel/pylab/backend_inline.py in _fetch_figure_metadata(fig)
179 # the background is transparent
180 ticksLight = _is_light([label.get_color()
--> 181 for axes in fig.axes
182 for axis in (axes.xaxis, axes.yaxis)
183 for label in axis.get_ticklabels()])
/srv/conda/envs/notebook/lib/python3.7/site-packages/ipykernel/pylab/backend_inline.py in (.0)
181 for axes in fig.axes
182 for axis in (axes.xaxis, axes.yaxis)
--> 183 for label in axis.get_ticklabels()])
184 if ticksLight.size and (ticksLight == ticksLight[0]).all():
185 # there are one or more tick labels, all with the same lightness
/srv/conda/envs/notebook/lib/python3.7/site-packages/matplotlib/axis.py in get_ticklabels(self, minor, which)
1294 if minor:
1295 return self.get_minorticklabels()
-> 1296 return self.get_majorticklabels()
1297
1298 def get_majorticklines(self):
/srv/conda/envs/notebook/lib/python3.7/site-packages/matplotlib/axis.py in get_majorticklabels(self)
1250 def get_majorticklabels(self):
1251 'Return a list of Text instances for the major ticklabels.'
-> 1252 ticks = self.get_major_ticks()
1253 labels1 = [tick.label1 for tick in ticks if tick.label1.get_visible()]
1254 labels2 = [tick.label2 for tick in ticks if tick.label2.get_visible()]
/srv/conda/envs/notebook/lib/python3.7/site-packages/matplotlib/axis.py in get_major_ticks(self, numticks)
1405 'Get the tick instances; grow as necessary.'
1406 if numticks is None:
-> 1407 numticks = len(self.get_majorticklocs())
1408
1409 while len(self.majorTicks) < numticks:
/srv/conda/envs/notebook/lib/python3.7/site-packages/matplotlib/axis.py in get_majorticklocs(self)
1322 def get_majorticklocs(self):
1323 """"""Get the array of major tick locations in data coordinates.""""""
-> 1324 return self.major.locator()
1325
1326 def get_minorticklocs(self):
/srv/conda/envs/notebook/lib/python3.7/site-packages/nc_time_axis/__init__.py in __call__(self)
136 def __call__(self):
137 vmin, vmax = self.axis.get_view_interval()
--> 138 return self.tick_values(vmin, vmax)
139
140 def tick_values(self, vmin, vmax):
/srv/conda/envs/notebook/lib/python3.7/site-packages/nc_time_axis/__init__.py in tick_values(self, vmin, vmax)
192 raise ValueError(msg)
193
--> 194 return utime.date2num(ticks)
195
196
cftime/_cftime.pyx in cftime._cftime.utime.date2num()
cftime/_cftime.pyx in cftime._cftime.JulianDayFromDate()
cftime/_cftime.pyx in cftime._cftime._IntJulianDayFromDate()
ValueError: year zero does not exist in the proleptic_gregorian calendar
```
When I restrict the timeseries to a shorter time it works fine:
```
ds = data_dict['CMIP.CSIRO.ACCESS-ESM1-5.piControl.Omon.gn']
ds.o2.isel(i=180, j=100, lev=0, time=slice(0,500)).plot()
```

I assume that the internal logic tries to set the left x-limit to some negative year if the full range is large. Is there a way to suppress that behavior? Or could the plotting routine default to a leftmost date of year 1 for any cftime data with a Gregorian calendar?
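As a stopgap I can sidestep the calendar-aware tick locator entirely by plotting against decimal years derived from the cftime coordinate (a sketch; the 365.25 divisor is only approximate):
```python
da = ds.o2.isel(i=180, j=100, lev=0)
years = da.time.dt.year + (da.time.dt.dayofyear - 1) / 365.25
plt.plot(years, da)  # plain numeric axis, no cftime locator involved
plt.xlabel('model year')
```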
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3841/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
279909699,MDU6SXNzdWUyNzk5MDk2OTk=,1765,Error when using .apply_ufunc with .groupby_bins,14314623,closed,0,,,7,2017-12-06T21:17:34Z,2020-03-25T15:31:02Z,2020-03-25T15:31:01Z,CONTRIBUTOR,,,,"I am trying to create a function that applies a .groupby_bins operation over specified dimensions of an xarray dataset. E.g. I want to be able to sum temperature, salinity and other values grouped by oceanic oxygen concentrations.
I want to be able to be flexible about which dimensions I apply the groupby_bins operation over. For instance, I would like to apply it in every depth column (resulting in an array of (x, y, time)), but also over all spatial dimensions, resulting in a timeseries.
I currently run into a strange error when I try the following.
#### Code Sample, a copy-pastable example if possible
```python
import xarray as xr
import numpy as np
import dask.array as dsa
from dask.diagnostics import ProgressBar
```
```python
def _func(data, bin_data, bins):
    """"""Group unlabeled array 'data' according to values in 'bin_data' using
    bins defined in 'bins' and sum all values""""""
    labels = bins[1:]
    da_data = xr.DataArray(data, name='data')
    da_bin_data = xr.DataArray(bin_data, name='bin_data')
    binned = da_data.groupby_bins(da_bin_data, bins, labels=labels,
                                  include_lowest=True).sum()
    return binned

def wrapper(obj, bin_obj, bins, dims):
    obj = obj.copy()
    bin_obj = bin_obj.copy()
    n_bins = len(bins) - 1
    binned = xr.apply_ufunc(_func, obj, bin_obj, bins,
                            vectorize=True,
                            input_core_dims=[dims, dims, ['dummy']],
                            output_core_dims=[['bin_data_bins']],
                            output_dtypes=[np.float],
                            output_sizes={'bin_data_bins': n_bins},
                            dask='parallelized')
    return binned
```
I am showing the problem here on a synthetic example, since my current working dataset is quite big.
The problem is exactly the same.
```python
# Groupby bins problem with small bins?
x_raw = np.arange(20)
y_raw = np.arange(10)
z_raw = np.arange(15)
x = xr.DataArray(dsa.from_array(x_raw, chunks=(-1)), dims=['x'], coords={'x':('x', x_raw)})
y = xr.DataArray(dsa.from_array(y_raw, chunks=(-1)), dims=['y'], coords={'y':('y', y_raw)})
z = xr.DataArray(dsa.from_array(z_raw, chunks=(-1)), dims=['z'], coords={'z':('z', z_raw)})
da = xr.DataArray(dsa.ones([20, 10, 15], chunks=[-1, -1, -1]), dims=['x', 'y', 'z'], coords={
    'x':x, 'y':y, 'z':z
})
da
```
```
dask.array
Coordinates:
* x (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
* y (y) int64 0 1 2 3 4 5 6 7 8 9
* z (z) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
```
Now I define some bins and apply the private `_func` to the unlabeled array. This works as expected. Note that the array just contains ones, hence we see 3000 in the first bin.
```python
bins = np.arange(0,30,1)
# apply private function on unlabeled array
binned_data = _func(da.data, da.data, bins)
print(binned_data.compute())
```
```
array([ 3000., nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan])
Coordinates:
* bin_data_bins (bin_data_bins) int64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ...
```
This would e.g. be an operation on a single time step. But when I now try to apply the function over the full array (core dimensions set to all available dimensions), I get a very strange error:
```python
binned_full = wrapper(da, da, bins, dims=['x','y','z'])
print(binned_full.data.compute())
```
```
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
1 binned_full = wrapper(da, da, bins, dims=['x','y','z'])
----> 2 print(binned_full.data.compute())
~/miniconda/envs/standard/lib/python3.6/site-packages/dask/base.py in compute(self, **kwargs)
133 dask.base.compute
134 """"""
--> 135 (result,) = compute(self, traverse=False, **kwargs)
136 return result
137
~/miniconda/envs/standard/lib/python3.6/site-packages/dask/base.py in compute(*args, **kwargs)
331 postcomputes = [a.__dask_postcompute__() if is_dask_collection(a)
332 else (None, a) for a in args]
--> 333 results = get(dsk, keys, **kwargs)
334 results_iter = iter(results)
335 return tuple(a if f is None else f(next(results_iter), *a)
~/miniconda/envs/standard/lib/python3.6/site-packages/dask/threaded.py in get(dsk, result, cache, num_workers, **kwargs)
73 results = get_async(pool.apply_async, len(pool._pool), dsk, result,
74 cache=cache, get_id=_thread_get_id,
---> 75 pack_exception=pack_exception, **kwargs)
76
77 # Cleanup pools associated to dead threads
~/miniconda/envs/standard/lib/python3.6/site-packages/dask/local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs)
519 _execute_task(task, data) # Re-execute locally
520 else:
--> 521 raise_exception(exc, tb)
522 res, worker_id = loads(res_info)
523 state['cache'][key] = res
~/miniconda/envs/standard/lib/python3.6/site-packages/dask/compatibility.py in reraise(exc, tb)
58 if exc.__traceback__ is not tb:
59 raise exc.with_traceback(tb)
---> 60 raise exc
61
62 else:
~/miniconda/envs/standard/lib/python3.6/site-packages/dask/local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
288 try:
289 task, data = loads(task_info)
--> 290 result = _execute_task(task, data)
291 id = get_id()
292 result = dumps((result, id))
~/miniconda/envs/standard/lib/python3.6/site-packages/dask/local.py in _execute_task(arg, cache, dsk)
269 func, args = arg[0], arg[1:]
270 args2 = [_execute_task(a, cache) for a in args]
--> 271 return func(*args2)
272 elif not ishashable(arg):
273 return arg
~/miniconda/envs/standard/lib/python3.6/site-packages/numpy/lib/function_base.py in __call__(self, *args, **kwargs)
2737 vargs.extend([kwargs[_n] for _n in names])
2738
-> 2739 return self._vectorize_call(func=func, args=vargs)
2740
2741 def _get_ufunc_and_otypes(self, func, args):
~/miniconda/envs/standard/lib/python3.6/site-packages/numpy/lib/function_base.py in _vectorize_call(self, func, args)
2803 """"""Vectorized call to `func` over positional `args`.""""""
2804 if self.signature is not None:
-> 2805 res = self._vectorize_call_with_signature(func, args)
2806 elif not args:
2807 res = func()
~/miniconda/envs/standard/lib/python3.6/site-packages/numpy/lib/function_base.py in _vectorize_call_with_signature(self, func, args)
2844
2845 for index in np.ndindex(*broadcast_shape):
-> 2846 results = func(*(arg[index] for arg in args))
2847
2848 n_results = len(results) if isinstance(results, tuple) else 1
in _func(data, bin_data, bins)
7
8 binned = da_data.groupby_bins(da_bin_data, bins, labels=labels,
----> 9 include_lowest=True).sum()
10 return binned
11
~/Work/CODE/PYTHON/xarray/xarray/core/common.py in groupby_bins(self, group, bins, right, labels, precision, include_lowest, squeeze)
466 cut_kwargs={'right': right, 'labels': labels,
467 'precision': precision,
--> 468 'include_lowest': include_lowest})
469
470 def rolling(self, min_periods=None, center=False, **windows):
~/Work/CODE/PYTHON/xarray/xarray/core/groupby.py in __init__(self, obj, group, squeeze, grouper, bins, cut_kwargs)
225
226 if bins is not None:
--> 227 binned = pd.cut(group.values, bins, **cut_kwargs)
228 new_dim_name = group.name + '_bins'
229 group = DataArray(binned, group.coords, name=new_dim_name)
~/miniconda/envs/standard/lib/python3.6/site-packages/pandas/core/reshape/tile.py in cut(x, bins, right, labels, retbins, precision, include_lowest)
134 precision=precision,
135 include_lowest=include_lowest,
--> 136 dtype=dtype)
137
138 return _postprocess_for_cut(fac, bins, retbins, x_is_series,
~/miniconda/envs/standard/lib/python3.6/site-packages/pandas/core/reshape/tile.py in _bins_to_cuts(x, bins, right, labels, precision, include_lowest, dtype, duplicates)
227 return result, bins
228
--> 229 unique_bins = algos.unique(bins)
230 if len(unique_bins) < len(bins) and len(bins) != 2:
231 if duplicates == 'raise':
~/miniconda/envs/standard/lib/python3.6/site-packages/pandas/core/algorithms.py in unique(values)
362
363 table = htable(len(values))
--> 364 uniques = table.unique(values)
365 uniques = _reconstruct_data(uniques, dtype, original)
366
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.unique()
~/miniconda/envs/standard/lib/python3.6/site-packages/pandas/_libs/hashtable.cpython-36m-darwin.so in View.MemoryView.memoryview_cwrapper()
~/miniconda/envs/standard/lib/python3.6/site-packages/pandas/_libs/hashtable.cpython-36m-darwin.so in View.MemoryView.memoryview.__cinit__()
ValueError: buffer source array is read-only
```
This error only gets triggered upon computation.
#### Problem description
I am not sure if this is a bug or a user error on my side. I am still trying to get used to `.apply_ufunc`.
If anybody has an idea for a workaround I would greatly appreciate it.
I am not sure if the rewrapping in xr.DataArrays in `_func` is actually necessary. I tried to find an equivalent function that operates directly on dask arrays but was not successful.
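One thing that may be worth trying (an assumption on my part, not a confirmed fix): pandas' `cut`/`unique` reject the read-only buffers that `np.vectorize` can pass in, so making defensive copies inside `_func` might sidestep the error:
```python
def _func(data, bin_data, bins):
    # same as above, but copy the incoming buffers so they are writable
    data = np.asarray(data).copy()
    bin_data = np.asarray(bin_data).copy()
    bins = np.asarray(bins).copy()
    labels = bins[1:]
    da_data = xr.DataArray(data, name='data')
    da_bin_data = xr.DataArray(bin_data, name='bin_data')
    return da_data.groupby_bins(da_bin_data, bins, labels=labels,
                                include_lowest=True).sum()
```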
#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.10.0rc1-9-gdbf7b01
pandas: 0.21.0
numpy: 1.13.3
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: 0.5.0
Nio: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.16.0
matplotlib: 2.1.0
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 38.2.3
pip: 9.0.1
conda: None
pytest: 3.3.1
IPython: 6.2.1
sphinx: 1.6.5
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1765/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
429511994,MDU6SXNzdWU0Mjk1MTE5OTQ=,2867,Very slow coordinate assignment with dask array,14314623,closed,0,,,7,2019-04-04T22:36:57Z,2019-12-19T17:28:10Z,2019-12-17T16:21:23Z,CONTRIBUTOR,,,,"I am trying to reconstruct vertical cell depth from a z-star ocean model. This involves a few operations on both the dimensions and coordinates of a dataset like this:
```
Dimensions: (nv: 2, run: 2, st_edges_ocean: 51, st_ocean: 50, st_ocean_sub02: 10, sw_edges_ocean: 51, sw_ocean: 50, time: 240, xt_ocean: 360, xu_ocean: 360, yt_ocean: 200, yu_ocean: 200)
Coordinates:
area_e (yt_ocean, xu_ocean) float64 dask.array
area_n (yu_ocean, xt_ocean) float64 dask.array
area_t (yt_ocean, xt_ocean) float64 dask.array
area_u (yu_ocean, xu_ocean) float64 dask.array
dxt (yt_ocean, xt_ocean) float64 dask.array
dxtn (yu_ocean, xt_ocean) float64 dask.array
dxu (yu_ocean, xu_ocean) float64 dask.array
dyt (yt_ocean, xt_ocean) float64 dask.array
dyte (yt_ocean, xu_ocean) float64 dask.array
dyu (yu_ocean, xu_ocean) float64 dask.array
geolat_c (yu_ocean, xu_ocean) float32 dask.array
geolat_e (yt_ocean, xu_ocean) float64 dask.array
geolat_n (yu_ocean, xt_ocean) float64 dask.array
geolat_t (yt_ocean, xt_ocean) float64 dask.array
geolat_u (yu_ocean, xu_ocean) float64 dask.array
geolon_c (yu_ocean, xu_ocean) float32 dask.array
geolon_e (yt_ocean, xu_ocean) float64 dask.array
geolon_n (yu_ocean, xt_ocean) float64 dask.array
geolon_t (yt_ocean, xt_ocean) float64 dask.array
geolon_u (yu_ocean, xu_ocean) float64 dask.array
ht (yt_ocean, xt_ocean) float64 dask.array
kmt (yt_ocean, xt_ocean) float64 dask.array
* nv (nv) float64 1.0 2.0
* run (run) object 'control' 'forced'
* st_edges_ocean (st_edges_ocean) float64 0.0 ... 5.5e+03
* st_ocean (st_ocean) float64 5.034 ... 5.395e+03
* st_ocean_sub02 (st_ocean_sub02) float64 5.034 ... 98.62
* sw_edges_ocean (sw_edges_ocean) float64 5.034 ... 5.5e+03
* sw_ocean (sw_ocean) float64 10.07 ... 5.5e+03
* time (time) object 2181-01-16 12:00:00 ... 2200-12-16 12:00:00
tmask (yt_ocean, xt_ocean) float64 dask.array
tmask_region (yt_ocean, xt_ocean) float64 dask.array
umask (yu_ocean, xu_ocean) float64 dask.array
umask_region (yu_ocean, xu_ocean) float64 dask.array
wet_t (yt_ocean, xt_ocean) float64 dask.array
* xt_ocean (xt_ocean) float64 -279.5 ... 79.5
* xu_ocean (xu_ocean) float64 -279.0 ... 80.0
* yt_ocean (yt_ocean) float64 -81.5 -80.5 ... 89.5
* yu_ocean (yu_ocean) float64 -81.0 -80.0 ... 90.0
dst (st_ocean, yt_ocean, xt_ocean) float64 dask.array
dswt (sw_ocean, yt_ocean, xt_ocean) float64 dask.array
dxte (yt_ocean, xu_ocean) float64 dask.array
dytn (yu_ocean, xt_ocean) float64 dask.array
```
The problematic step is when I assign the calculated dask.arrays to the original dataset.
This happens in a function like this.
```python
def add_vertical_spacing(ds):
    grid = Grid(ds)
    ds.coords['dst'] = calculate_ds(ds, dim='st')
    ds.coords['dswt'] = calculate_ds(ds, dim='sw')
    ds.coords['dzt'] = calculate_dz(ds['eta_t'], ds['ht'], ds['dst'])
    ds.coords['dzu'] = grid.min(grid.min(ds['dzt'], 'X'), 'Y')
    return ds
```
This takes very long compared to a version where I assign the values as data variables:
```python
def add_vertical_spacing(ds):
    grid = Grid(ds)
    ds.coords['dst'] = calculate_ds(ds, dim='st')
    ds.coords['dswt'] = calculate_ds(ds, dim='sw')
    ds['dzt'] = calculate_dz(ds['eta_t'], ds['ht'], ds['dst'])
    ds['dzu'] = grid.min(grid.min(ds['dzt'], 'X'), 'Y')
    return ds
```
I have not been able to reproduce this problem in a smaller example yet, and I realize that my example is quite complex (e.g. it has functions that are not shown).
But I suspect that something triggers the computation of the array when assigning a coordinate.
I have profiled my more complex code involving this function and it seems like there is a substantial increase in calls to `{method 'acquire' of '_thread.lock' objects}`.
Profile output of the first version (assigning coordinates):
```
27662983 function calls (26798524 primitive calls) in 71.940 seconds

Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
   268632   46.914    0.000   46.914    0.000  {method 'acquire' of '_thread.lock' objects}
      438    4.296    0.010    4.296    0.010  {method 'read' of '_io.BufferedReader' objects}
    76883    1.909    0.000    1.939    0.000  local.py:240(release_data)
      144    1.489    0.010    4.519    0.031  rechunk.py:514(_compute_rechunk)
...
```
For the second version (assigning data variables):
```
12928834 function calls (12489174 primitive calls) in 16.554 seconds

Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
      438    3.841    0.009    3.841    0.009  {method 'read' of '_io.BufferedReader' objects}
     9492    3.675    0.000    3.675    0.000  {method 'acquire' of '_thread.lock' objects}
      144    1.673    0.012    4.712    0.033  rechunk.py:514(_compute_rechunk)
...
```
Does anyone have a feel for why this could happen or how I could refine my testing to get to the bottom of this?
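One cheap way to narrow this down: wrap the assignment itself in dask's `ProgressBar` diagnostic (a sketch reusing the functions from above); any task bars printed during the assignment would confirm an eager compute:
```python
from dask.diagnostics import ProgressBar

with ProgressBar():
    ds.coords['dzt'] = calculate_dz(ds['eta_t'], ds['ht'], ds['dst'])
```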
#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.7 | packaged by conda-forge | (default, Feb 28 2019, 09:07:38)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 2.6.32-696.30.1.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US
LOCALE: en_US.ISO8859-1
libhdf5: 1.10.4
libnetcdf: 4.6.2
xarray: 0.12.0
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.5.0.1
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.3.1
cftime: 1.0.3.4
nc_time_axis: 1.2.0
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 1.1.5
distributed: 1.26.1
matplotlib: 3.0.3
cartopy: 0.17.0
seaborn: 0.9.0
setuptools: 40.8.0
pip: 19.0.3
conda: None
pytest: 4.4.0
IPython: 7.1.1
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2867/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
513916063,MDU6SXNzdWU1MTM5MTYwNjM=,3454,Large coordinate arrays trigger computation,14314623,closed,0,,,2,2019-10-29T13:27:00Z,2019-10-29T15:07:43Z,2019-10-29T15:07:43Z,CONTRIBUTOR,,,,"I want to bring up an issue that has tripped up my workflow with large climate models many times. I am dealing with large data arrays of vertical cell thickness. These are 4d arrays (x, y, z, time), but I would define them as coordinates, not data_variables, in xarray's data model (e.g. they should not be multiplied by a value if a dataset is multiplied).
These sorts of coordinates might become more prevalent with newer ocean models like [MOM6](http://www.cesm.ucar.edu/events/wg-meetings/2017/presentations/omwg/adcroft.pdf).
Whenever I assign these arrays as coordinates, operations on the arrays seem to trigger computation, whereas they don't if I set them up as data_variables. The example below shows this behavior.
Is this a bug or done on purpose? Is there a workaround to keep these vertical thicknesses as coordinates?
```
import xarray as xr
import numpy as np
import dask.array as dsa
# create dataset with with vertical thickness `dz` as data variable
data = xr.DataArray(dsa.random.random([30, 50, 200, 1000]), dims=['x','y', 'z', 't'])
dz = xr.DataArray(dsa.random.random([30, 50, 200, 1000]), dims=['x','y', 'z', 't'])
ds = xr.Dataset({'data':data, 'dz':dz})
#another dataset with `dz` as coordinate
ds_new = xr.Dataset({'data':data})
ds_new.coords['dz'] = dz
```
```
%%time
test = ds['data'] * ds['dz']
```
`CPU times: user 1.94 ms, sys: 19.1 ms, total: 21 ms
Wall time: 21.6 ms`
```
%%time
test = ds_new['data'] * ds_new['dz']
```
`CPU times: user 17.4 s, sys: 1.98 s, total: 19.4 s
Wall time: 12.5 s`
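A workaround for the meantime (a sketch): temporarily demote `dz` to a data variable for the arithmetic:
```python
tmp = ds_new.reset_coords('dz')  # dz becomes a data variable again
test = tmp['data'] * tmp['dz']   # stays lazy, like the first case
```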
#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.4 (default, Aug 13 2019, 15:17:50)
[Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None
xarray: 0.13.0+24.g4254b4af
pandas: 0.25.1
numpy: 1.17.2
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.5.0
distributed: 2.5.1
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
setuptools: 41.2.0
pip: 19.2.3
conda: None
pytest: 5.2.0
IPython: 7.8.0
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3454/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
279883145,MDU6SXNzdWUyNzk4ODMxNDU=,1764,.groupby_bins fails when data is not contained in bins,14314623,closed,0,,3801867,2,2017-12-06T19:48:30Z,2019-10-22T14:53:31Z,2019-10-22T14:53:30Z,CONTRIBUTOR,,,,"Consider the following example.
```python
import xarray as xr
import numpy as np
import dask.array as dsa
from dask.diagnostics import ProgressBar
```
```
# Groupby bins problem with small bins?
x_raw = np.arange(20)
y_raw = np.arange(10)
z_raw = np.arange(15)
x = xr.DataArray(dsa.from_array(x_raw, chunks=(-1)), dims=['x'], coords={'x':('x', x_raw)})
y = xr.DataArray(dsa.from_array(y_raw, chunks=(-1)), dims=['y'], coords={'y':('y', y_raw)})
z = xr.DataArray(dsa.from_array(z_raw, chunks=(-1)), dims=['z'], coords={'z':('z', z_raw)})
data = xr.DataArray(dsa.ones([20, 10, 15], chunks=[-1, -1, -1]), dims=['x', 'y', 'z'], coords={
    'x':x, 'y':y, 'z':z
})
data
```
```
dask.array
Coordinates:
* x (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
* y (y) int64 0 1 2 3 4 5 6 7 8 9
* z (z) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
```
This dask array only contains ones. If I now try to apply `groupby_bins` with a specified array of bins (which are all below 1), it fails with a rather cryptic error:
```
# This doesn't work
bins = np.array([0, 20, 40, 60 , 80, 100])*1e-6
binned = data.groupby_bins(data, bins).sum()
binned
```
```
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
in ()
2 bins = np.array([0, 20, 40, 60 , 80, 100])*1e-6
3
----> 4 binned = data.groupby_bins(data, bins).sum()
5 binned
~/Work/CODE/PYTHON/xarray/xarray/core/common.py in wrapped_func(self, dim, axis, skipna, keep_attrs, **kwargs)
20 keep_attrs=False, **kwargs):
21 return self.reduce(func, dim, axis, keep_attrs=keep_attrs,
---> 22 skipna=skipna, allow_lazy=True, **kwargs)
23 else:
24 def wrapped_func(self, dim=None, axis=None, keep_attrs=False,
~/Work/CODE/PYTHON/xarray/xarray/core/groupby.py in reduce(self, func, dim, axis, keep_attrs, shortcut, **kwargs)
572 def reduce_array(ar):
573 return ar.reduce(func, dim, axis, keep_attrs=keep_attrs, **kwargs)
--> 574 return self.apply(reduce_array, shortcut=shortcut)
575
576 ops.inject_reduce_methods(DataArrayGroupBy)
~/Work/CODE/PYTHON/xarray/xarray/core/groupby.py in apply(self, func, shortcut, **kwargs)
516 applied = (maybe_wrap_array(arr, func(arr, **kwargs))
517 for arr in grouped)
--> 518 return self._combine(applied, shortcut=shortcut)
519
520 def _combine(self, applied, shortcut=False):
~/Work/CODE/PYTHON/xarray/xarray/core/groupby.py in _combine(self, applied, shortcut)
520 def _combine(self, applied, shortcut=False):
521 """"""Recombine the applied objects like the original.""""""
--> 522 applied_example, applied = peek_at(applied)
523 coord, dim, positions = self._infer_concat_args(applied_example)
524 if shortcut:
~/Work/CODE/PYTHON/xarray/xarray/core/utils.py in peek_at(iterable)
114 """"""
115 gen = iter(iterable)
--> 116 peek = next(gen)
117 return peek, itertools.chain([peek], gen)
118
StopIteration:
```
If, however, the last bin includes the value 1, it runs as expected:
```
# If I include a larger value at the end it works
bins = np.array([0, 20, 40, 60 , 80, 100, 1e7])*1e-6
binned = data.groupby_bins(data, bins).sum()
binned
```
```
dask.array
Coordinates:
* wrapped-bb05d395159047b749ca855110244cb7_bins (wrapped-bb05d395159047b749ca855110244cb7_bins) object (0.0, 2e-05] ...
```
#### Problem description
Is this expected behaviour? I would prefer it if it returned nan values for the bins that capture no values.
It took me a while to find out why my script using this was failing; if this is expected behavior, could a more helpful error message be considered?
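Until then, a guard one could run up front (a sketch using `dask.array.histogram`, so the check itself stays out-of-core; `dsa` and `bins` as defined above):
```python
hist, _ = dsa.histogram(data.data, bins=bins)
if int(hist.sum().compute()) == 0:
    raise ValueError('no values fall into any of the supplied bins; '
                     'groupby_bins would currently fail on this input')
```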
#### Expected Output
#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.10.0rc1-9-gdbf7b01
pandas: 0.20.3
numpy: 1.13.1
scipy: 0.19.1
netCDF4: 1.2.9
h5netcdf: 0.4.1
Nio: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.15.4
matplotlib: 2.0.2
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 36.3.0
pip: 9.0.1
conda: None
pytest: 3.2.2
IPython: 6.1.0
sphinx: 1.6.5
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1764/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
503562032,MDU6SXNzdWU1MDM1NjIwMzI=,3377,Changed behavior for replacing coordinates on dataset.,14314623,closed,0,,,5,2019-10-07T16:32:33Z,2019-10-11T15:47:57Z,2019-10-11T15:47:57Z,CONTRIBUTOR,,,,"#### MCVE Code Sample
We noticed a change in behavior in xarray that broke a test in [xgcm](https://github.com/xgcm/xgcm/pull/130#issuecomment-538834037).
Consider this code example:
```python
import xarray as xr
import numpy as np
x = np.arange(5)
data = xr.DataArray(np.random.rand(5), coords={'x': x}, dims=['x'])
x_c = np.arange(5) + 0.5
data_c = xr.DataArray(np.random.rand(5), coords={'x_c': x_c}, dims=['x_c'])
ds = xr.Dataset({'data':data, 'data_c':data_c})
del ds['data_c']
ds['x_c'] = ds['x_c'][:3]
ds
```
#### Expected Output
In previous versions of xarray this resulted in
```
Dimensions: (x: 5, x_c: 3)
Coordinates:
* x_c (x_c) float64 0.5 1.5 2.5
* x (x) int64 0 1 2 3 4
Data variables:
data (x) float64 0.3828 0.6016 0.3603 0.414 0.35
```
(I tested just now with an older `v0.11.3`, but this works with `v0.13` as well.)
In the current master branch, the coordinate instead gets padded with NaNs:
```
Dimensions: (x: 5, x_c: 5)
Coordinates:
* x_c (x_c) float64 0.5 1.5 2.5 nan nan
* x (x) int64 0 1 2 3 4
Data variables:
data (x) float64 0.8369 0.4197 0.3058 0.7419 0.8126
```
#### Problem Description
We fixed the test in a new PR, but @dcherian encouraged me to submit this.
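For anyone else whose code relied on the old trimming, an explicit selection reproduces it under the new behavior (a sketch for the example above):
```python
ds = ds.isel(x_c=slice(3))  # trims x_c to its first 3 entries instead of padding
```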
#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.4 (default, Aug 13 2019, 15:17:50)
[Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 18.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None
xarray: 0.13.0+24.g4254b4af
pandas: 0.25.1
numpy: 1.17.2
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.5.0
distributed: 2.5.1
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
setuptools: 41.2.0
pip: 19.2.3
conda: None
pytest: 5.2.0
IPython: 7.8.0
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3377/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
431584027,MDU6SXNzdWU0MzE1ODQwMjc=,2884,drop all but specified data_variables/coordinates as a convenience function,14314623,closed,0,,,5,2019-04-10T15:57:03Z,2019-04-16T12:44:13Z,2019-04-16T12:44:13Z,CONTRIBUTOR,,,,"I often work with datasets that consist of a lot of data_variables and coordinates.
Often I am only concerned about a subset of variables, and for convenience drop all but a selected list of variables with a little snippet like this:
#### Code Sample, a copy-pastable example if possible
```python
def xr_keep(obj, varlist):
    """"""drop all data_vars except the ones provided in `varlist`""""""
    obj = obj.copy()
    drop_vars = [a for a in obj.data_vars if a not in varlist]
    return obj.drop(drop_vars)
```
I would love to have this functionality available as a DataArray/Dataset method. It could look something like `da_slim = da.drop_all_but(['var1', 'var3'])`. Would this be of interest to people here? Then I could try to put in a PR.
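For reference, the closest built-in spelling today is list indexing, which (along with `Dataset.get`) is what the follow-up docs example ended up highlighting:
```python
ds_slim = ds[['var1', 'var3']]  # keeps only the listed data variables
```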
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2884/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
433410125,MDExOlB1bGxSZXF1ZXN0MjcwNjE3NjQ5,2894,Added docs example for `xarray.Dataset.get()`,14314623,closed,0,,,7,2019-04-15T18:03:20Z,2019-04-16T12:44:13Z,2019-04-16T12:44:13Z,CONTRIBUTOR,,0,pydata/xarray/pulls/2894,"
Added example of `xarray.Dataset.get` method and alternative `ds[var_list]` syntax.
- [ ] Closes #2884 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2894/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
433522846,MDExOlB1bGxSZXF1ZXN0MjcwNzA3ODYw,2897,Bugfix for docs build instructions,14314623,closed,0,,,1,2019-04-15T23:42:36Z,2019-04-16T04:22:43Z,2019-04-16T03:57:28Z,CONTRIBUTOR,,0,pydata/xarray/pulls/2897,"
Added clearer instructions on how to build the xarray documentation to the contributing guide
- [x] Closes #2893","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2897/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
433406684,MDU6SXNzdWU0MzM0MDY2ODQ=,2893,Dependency issue when building docs according to instructions,14314623,closed,0,,,1,2019-04-15T17:54:35Z,2019-04-16T03:57:28Z,2019-04-16T03:57:28Z,CONTRIBUTOR,,,,"#### Code Sample, a copy-pastable example if possible
I was following the instructions to build the docs [locally](https://xarray.pydata.org/en/latest/contributing.html#how-to-build-the-xarray-documentation), using `xarray/ci/requirements-py36` as the test environment.
When running `make html` in the `xarray/doc` folder, I get this:
```
xarray: 0.12.1+9.gfad6d624.dirty, /Users/juliusbusecke/Work/CODE/PYTHON/xarray/xarray/__init__.py
Extension error:
Could not import extension IPython.sphinxext.ipython_directive (exception: No module named 'IPython')
make: *** [html] Error 2
(test_env)
```
After installing ipython with `conda install ipython`, the build works properly.
Should this line `conda install -c conda-forge sphinx sphinx_rtd_theme sphinx-gallery numpydoc` be updated to:
`conda install -c conda-forge sphinx sphinx_rtd_theme sphinx-gallery numpydoc ipython` ?
Or am I overlooking something?
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2893/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
258913450,MDU6SXNzdWUyNTg5MTM0NTA=,1581,Projection issue with plot.imshow and cartopy projection,14314623,closed,0,,,4,2017-09-19T18:10:06Z,2019-03-07T22:09:41Z,2019-03-07T22:09:41Z,CONTRIBUTOR,,,,"I am experiencing some trouble when using cartopy transformations in the plot module.
I am concerned about plotting speed (plotting very high resolution maps of ocean model output takes forever).
Following #657 I tried to use plot.imshow, but I am getting unexpected results for the map projection.
The longitude wrapping does not seem to work, and as seen from the mismatch between the ocean mask data and the coastline in the example below, the projection seems to be inaccurate.
This seems to be related to the cartopy module (see the last plot which was done 'outside' of xarray).
My question is twofold, I guess
1) Is there any other 'high speed' alternative for plotting high resolution maps?
2) Since this error might not appear as drastically in all mapping scenarios, should plot.imshow display a warning (or even an error) when invoked with a transformation argument?
```python
import xarray as xr
%matplotlib inline
import numpy as np
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt
ds = xr.open_dataset('ocean_mask.nc')
plt.figure()
ax_i = plt.gca(projection=ccrs.Robinson())
ds.wet.plot.imshow(x='lonh',y='lath',ax=ax_i,transform=ccrs.PlateCarree())
ax_i.coastlines()
plt.title('imshow')
plt.figure()
ax_p = plt.gca(projection=ccrs.Robinson())
ds.wet.plot(x='lonh',y='lath',ax=ax_p,transform=ccrs.PlateCarree())
ax_p.coastlines()
plt.title('standard plot')
plt.figure()
ax = plt.gca(projection=ccrs.Robinson())
ax.imshow(ds.wet.data,
          transform=ccrs.PlateCarree(),
          extent=[ds.lonh.min().data, ds.lonh.max().data,
                  ds.lath.min().data, ds.lath.max().data])
ax.coastlines()
plt.title('cartopy imshow')
```



[ocean_mask.nc.zip](https://github.com/pydata/xarray/files/1315373/ocean_mask.nc.zip)
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1581/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
398041758,MDExOlB1bGxSZXF1ZXN0MjQzODUwNjU1,2665,enable internal plotting with cftime datetime,14314623,closed,0,,,28,2019-01-10T22:23:31Z,2019-02-08T18:15:26Z,2019-02-08T00:11:14Z,CONTRIBUTOR,,0,pydata/xarray/pulls/2665,"This PR is meant to restore the internal plotting capabilities for objects with cftime.datetime dimensions.
Based mostly on the discussions in #2164
- [x] Closes #2164
- [x] Tests added
- [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2665/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
272415954,MDU6SXNzdWUyNzI0MTU5NTQ=,1704,Error when using engine='scipy' reading CM2.6 ocean output,14314623,closed,0,,,7,2017-11-09T02:06:00Z,2019-01-22T22:48:21Z,2019-01-22T22:48:21Z,CONTRIBUTOR,,,,"#### Code Sample, a copy-pastable example if possible
```python
path = '/work/Julius.Busecke/CM2.6_staged/CM2.6_A_V03_1PctTo2X/annual_averages'
ds_ocean = xr.open_mfdataset(os.path.join(path,'ocean.*.ann.nc'), chunks={'time':1},
decode_times=False, engine='scipy')
ds_ocean
```
gives
```
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
1 path = '/work/Julius.Busecke/CM2.6_staged/CM2.6_A_V03_1PctTo2X/annual_averages'
----> 2 ds_ocean = xr.open_mfdataset(os.path.join(path,'ocean.*.ann.nc'), chunks={'time':1}, decode_times=False, engine='scipy')
3 ds_ocean
~/code/miniconda/envs/standard/lib/python3.6/site-packages/xarray/backends/api.py in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, lock, **kwargs)
503 lock = _default_lock(paths[0], engine)
504 datasets = [open_dataset(p, engine=engine, chunks=chunks or {}, lock=lock,
--> 505 **kwargs) for p in paths]
506 file_objs = [ds._file_obj for ds in datasets]
507
~/code/miniconda/envs/standard/lib/python3.6/site-packages/xarray/backends/api.py in (.0)
503 lock = _default_lock(paths[0], engine)
504 datasets = [open_dataset(p, engine=engine, chunks=chunks or {}, lock=lock,
--> 505 **kwargs) for p in paths]
506 file_objs = [ds._file_obj for ds in datasets]
507
~/code/miniconda/envs/standard/lib/python3.6/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, group, decode_cf, mask_and_scale, decode_times, autoclose, concat_characters, decode_coords, engine, chunks, lock, cache, drop_variables)
283 elif engine == 'scipy':
284 store = backends.ScipyDataStore(filename_or_obj,
--> 285 autoclose=autoclose)
286 elif engine == 'pydap':
287 store = backends.PydapDataStore(filename_or_obj)
~/code/miniconda/envs/standard/lib/python3.6/site-packages/xarray/backends/scipy_.py in __init__(self, filename_or_obj, mode, format, group, writer, mmap, autoclose)
133 filename=filename_or_obj,
134 mode=mode, mmap=mmap, version=version)
--> 135 self.ds = opener()
136 self._autoclose = autoclose
137 self._isopen = True
~/code/miniconda/envs/standard/lib/python3.6/site-packages/xarray/backends/scipy_.py in _open_scipy_netcdf(filename, mode, mmap, version)
81 try:
82 return scipy.io.netcdf_file(filename, mode=mode, mmap=mmap,
---> 83 version=version)
84 except TypeError as e: # netcdf3 message is obscure in this case
85 errmsg = e.args[0]
~/code/miniconda/envs/standard/lib/python3.6/site-packages/scipy/io/netcdf.py in __init__(self, filename, mode, mmap, version, maskandscale)
264
265 if mode in 'ra':
--> 266 self._read()
267
268 def __setattr__(self, attr, value):
~/code/miniconda/envs/standard/lib/python3.6/site-packages/scipy/io/netcdf.py in _read(self)
591 self._read_dim_array()
592 self._read_gatt_array()
--> 593 self._read_var_array()
594
595 def _read_numrecs(self):
~/code/miniconda/envs/standard/lib/python3.6/site-packages/scipy/io/netcdf.py in _read_var_array(self)
696 # Build rec array.
697 if self.use_mmap:
--> 698 rec_array = self._mm_buf[begin:begin+self._recs*self._recsize].view(dtype=dtypes)
699 rec_array.shape = (self._recs,)
700 else:
ValueError: new type not compatible with array.
```
xarray version: '0.9.6'
#### Problem description
I am trying to lazily read in a large number of high resolution ocean model output files. If I omit `engine='scipy'`, it works but takes *forever*.
Is there a known reason why this would fail with the 'scipy' option?
I found #1313, and checked my conda environment:
```
$ conda list hdf
# packages in environment at /home/Julius.Busecke/code/miniconda/envs/standard:
#
hdf4 4.2.12 0 conda-forge
hdf5 1.8.18 1 conda-forge
```
```
$ conda list netcdf
# packages in environment at /home/Julius.Busecke/code/miniconda/envs/standard:
#
h5netcdf 0.4.2 py_0 conda-forge
libnetcdf 4.4.1.1 6 conda-forge
netcdf4 1.3.0 py36_0 conda-forge
```
I can `import netCDF4` and also load a single file using netCDF4, so I am unsure if this is the same error as in #1313.
I keep getting this error with some of the files for this particular model but not with others.
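To narrow down which files are affected, a diagnostic loop like this (a sketch using the same path and glob pattern as above) should work:
```python
import glob
import os

import xarray as xr

path = '/work/Julius.Busecke/CM2.6_staged/CM2.6_A_V03_1PctTo2X/annual_averages'
# Open each file individually with the scipy backend and report failures,
# to check whether the error is tied to specific files.
for f in sorted(glob.glob(os.path.join(path, 'ocean.*.ann.nc'))):
    try:
        xr.open_dataset(f, engine='scipy', decode_times=False).close()
    except ValueError as e:
        print(f, '->', e)
```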
Any help would be greatly appreciated.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1704/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
356067160,MDU6SXNzdWUzNTYwNjcxNjA=,2394,Change default colormaps,14314623,closed,0,,,3,2018-08-31T17:38:08Z,2018-09-05T15:17:23Z,2018-09-05T15:17:23Z,CONTRIBUTOR,,,,"#### Problem description
xarray's plotting module is awesome because it automatically detects whether the data is divergent and adjusts the colormap accordingly.
But what if I do not particularly like the default colormap choice for either data type (e.g. my current boss is not a fan of matplotlib's [viridis](https://matplotlib.org/examples/color/colormaps_reference.html))?
@shoyer mentioned on [stackexchange](https://stackoverflow.com/questions/52114364/change-the-default-colorbar-in-xarray-for-non-diverging-data?noredirect=1#comment91188566_52114364) that this is not currently possible.
I would be happy to submit a PR, if I can get some guidance on where to implement this.
What does everyone think the syntax for this should look like?
Something like:
```python
with xr.set_options(diverging_cmap='RdYlBu'):
    ...
with xr.set_options(nondiverging_cmap='magma'):
    ...
```
Or is there a better way to do this?
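For illustration, here is a minimal sketch of a global variant of the same idea, reusing the hypothetical option names from the block above (none of this is an existing xarray API):
```python
import xarray as xr

# Hypothetical: set the defaults once, analogous to mpl.rcParams, so that
# every subsequent .plot() call picks them up.
xr.set_options(diverging_cmap='RdYlBu', nondiverging_cmap='magma')
```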
I would personally prefer something like that last form: set it once at the beginning of a notebook, so that it is applied to all following plots. Maybe something like `mpl.rcParams` is the way to go here?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2394/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
356546301,MDExOlB1bGxSZXF1ZXN0MjEyNzgwMjUy,2397,add options for nondivergent and divergent cmap,14314623,closed,0,,,6,2018-09-03T15:31:26Z,2018-09-05T15:17:23Z,2018-09-05T15:17:23Z,CONTRIBUTOR,,0,pydata/xarray/pulls/2397," - [ ] Closes #2394 (remove if there is no corresponding issue, which should only be the case for minor changes)
- [ ] Tests passed (for all non-documentation changes)
- [ ] Tests added (for all bug fixes or enhancements)
- [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2397/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
265013667,MDU6SXNzdWUyNjUwMTM2Njc=,1630,Option to include units in .plot(),14314623,closed,0,,,2,2017-10-12T16:53:57Z,2018-06-12T00:33:49Z,2018-06-12T00:33:23Z,CONTRIBUTOR,,,,"I often find myself editing the colorbar label of xarray plots to include the units of the plotted variable.
Would it be possible to extract the value of DataArray.attrs['units'] (if present) and add it to the colorbar label?
Given an example DataArray with name='temperature' and attrs['units']='deg C', the current colorbar label says 'temperature'.
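(For reference, a sketch of the manual workaround I use today, via the existing `cbar_kwargs` plot keyword:)
```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.random.rand(4, 5),
    name='temperature',
    attrs={'units': 'deg C'},
)
# Build the label from the attrs by hand and pass it through to the colorbar.
da.plot(cbar_kwargs={'label': '{} [{}]'.format(da.name, da.attrs['units'])})
```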
A 'plot_units' keyword could, for example, be used like this:
```python
da.plot(plot_units=True)
```
Then the colorbar label should be 'temperature [deg C]'.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1630/reactions"", ""total_count"": 4, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
271957479,MDU6SXNzdWUyNzE5NTc0Nzk=,1695,Diagnose groupby/groupby_bins issues,14314623,closed,0,,,3,2017-11-07T19:39:38Z,2017-11-09T16:36:26Z,2017-11-09T16:36:19Z,CONTRIBUTOR,,,,"#### Code Sample, a copy-pastable example if possible
```python
import numpy as np
import xarray as xr
xr.__version__
>>> '0.9.6'
ds = xr.open_dataset('../testing/Bianchi_o2.nc',chunks={'TIME':1})
ds
>>>
>>> Dimensions: (DEPTH: 33, LATITUDE: 180, LONGITUDE: 360, TIME: 12, bnds: 2)
>>> Coordinates:
>>> * LONGITUDE (LONGITUDE) float64 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5 ...
>>> * LATITUDE (LATITUDE) float64 -89.5 -88.5 -87.5 -86.5 -85.5 -84.5 -83.5 ...
>>> * DEPTH (DEPTH) float64 0.0 10.0 20.0 30.0 50.0 75.0 100.0 125.0 ...
>>> * TIME (TIME) float64 15.0 44.0 73.5 104.0 134.5 165.0 195.5 226.5 ...
>>> Dimensions without coordinates: bnds
>>> Data variables:
>>> DEPTH_bnds (DEPTH, bnds) float64 -5.0 5.0 5.0 15.0 15.0 25.0 25.0 40.0 ...
>>> TIME_bnds (TIME, bnds) float64 0.5 29.5 29.5 58.75 58.75 88.75 88.75 ...
>>> O2_LINEAR (TIME, DEPTH, LATITUDE, LONGITUDE) float64 nan nan nan nan ...
>>> Attributes:
>>> history: FERRET V5.70 (alpha) 29-Sep-11
# This runs as expected
ds.isel(TIME=0).groupby_bins('O2_LINEAR', np.array([0,20,40,60,100])).max()
# This crashes the kernel
ds.groupby_bins('O2_LINEAR', np.array([0,20,40,60,100])).max()
```
#### Problem description
I am working on ocean oxygen data and would like to compute the volume of the ocean contained within a range of concentration values.
I am trying to use groupby_bins, but even with this modestly sized dataset (1-degree global resolution, 33 depth levels, 12 time steps) my kernel crashes every time without any error message.
I eventually want to perform this step on several TB of ocean model output, so this is concerning.
First of all, I would like to ask if there is an easy way to diagnose the problem further. And secondly, are there recommendations for computing the sum over groupby_bins for very large datasets (consisting of dask arrays)?
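The only workaround I can think of so far is to bin one time step at a time, since the single-step call above runs as expected; a sketch (untested at scale):
```python
import numpy as np
import xarray as xr

ds = xr.open_dataset('../testing/Bianchi_o2.nc', chunks={'TIME': 1})
bins = np.array([0, 20, 40, 60, 100])

# Apply groupby_bins per time step and stitch the results back together
# along a new TIME dimension.
results = [
    ds.isel(TIME=t).groupby_bins('O2_LINEAR', bins).max()
    for t in range(ds.dims['TIME'])
]
combined = xr.concat(results, dim='TIME')
```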
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1695/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
199218465,MDU6SXNzdWUxOTkyMTg0NjU=,1195,Bug in dateconversion?,14314623,closed,0,,,4,2017-01-06T15:25:03Z,2017-01-09T14:27:18Z,2017-01-09T14:27:18Z,CONTRIBUTOR,,,,"I noticed an undesired behavior in xarray when using xarray.open_dataset:
running the following in version 0.8.2-90-g2c7730d:
```
import xarray as xr
fid = 'dt_global_allsat_msla_uv_20140101_20140829.nc'
ds = xr.open_dataset(fid)
ds.time
```
gives
```
array(['2013-12-31T19:00:00.000000000-0500'], dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 2014-01-01
Attributes:
long_name: Time
standard_name: time
axis: T
```
Note the hour is 19; I have also encountered files where it is 20.
Since the time in the .nc file is given in 'days since', the expected output would be 00.
Indeed when running version 0.8.2 the output is:
```
array(['2014-01-01T00:00:00.000000000'], dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 2014-01-01
Attributes:
long_name: Time
standard_name: time
axis: T
```
Any idea what could cause this?
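One way to narrow this down is to skip the time decoding entirely and inspect the raw values and units (a sketch with the same file as above):
```python
import xarray as xr

fid = 'dt_global_allsat_msla_uv_20140101_20140829.nc'
# decode_times=False leaves the time axis as stored on disk, which separates
# a storage problem from a decoding problem.
ds_raw = xr.open_dataset(fid, decode_times=False)
print(ds_raw.time.values, ds_raw.time.attrs.get('units'))
```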
Sample file used in the example can be found here: https://ufile.io/077da
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1195/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
194370559,MDU6SXNzdWUxOTQzNzA1NTk=,1159,Problem passing 'norm' when plotting a faceted figure,14314623,closed,0,,,3,2016-12-08T15:49:00Z,2016-12-09T14:08:49Z,2016-12-09T09:46:58Z,CONTRIBUTOR,,,,"I am not sure if this is a bug or an error on my part.
I am basically trying to pass a SymLogNorm to all plots in a faceted plot, but it has no effect.
It works when I just plot a single axis.
Below I reproduced the effect with an example dataset.
Matplotlib version: 1.5.1
xarray version: 0.8.2
```python
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import xarray as xr
# Here the passed 'norm=' keyword has the desired effect
da1 = xr.DataArray(np.random.exponential(size=[10,10]))
plt.figure()
da1.plot()
plt.figure()
da1.plot(norm=mpl.colors.SymLogNorm(0.1))
# In the faceted plot this has no effect
da2 = xr.DataArray(np.random.exponential(size=[10,10,4]))
plt.figure()
da2.plot(x='dim_0',y='dim_1',col='dim_2',col_wrap=2)
plt.figure()
da2.plot(x='dim_0',y='dim_1',col='dim_2',col_wrap=3,norm=mpl.colors.SymLogNorm(0.1))
```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1159/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue