id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
196541604,MDU6SXNzdWUxOTY1NDE2MDQ=,1173,Some queries,7300413,closed,0,,,11,2016-12-19T22:53:32Z,2019-01-13T06:27:38Z,2019-01-13T06:00:22Z,NONE,,,,"Hello @shoyer @pwolfram @mrocklin @rabernat ,

I was trying to write a design/requirements doc with ref. to the Columbia meetup,
and I had a few queries, on which I wanted your inputs (basically to ask whether
they make sense or not!)

1. If you serialize a labeled n-d data array using netCDF or HDF5, it gets written into
a single file, which is not really a good option if you want to eventually do distributed
processing of the data. Things like HDFS/lustreFS can split files, but that is not really
what we want. How do you think this issue could be solved within the xarray+dask
framework? 
   * Is it a matter of adding some code to the dataset.to_netcdf() method or
     adding a new method that would split the DataArray (based on some user guidelines) into multiple files?
   * Or does it make more sense to add a new serialization format like Zarr?
2. Continuing along similar lines, how does xarray+dask currently decide how to distribute the workload between dask workers? Are there any heuristics to handle data locality? Or does experience say that network I/O is fast enough that this is not an issue? I'm asking because of this article by Matt: http://blaze.pydata.org/blog/2015/10/28/distributed-hdfs/
   * If this is desirable, how would one go about implementing it?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1173/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
274233261,MDU6SXNzdWUyNzQyMzMyNjE=,1717,colorbars in facet grids,7300413,closed,0,,,6,2017-11-15T17:06:15Z,2018-10-25T16:06:53Z,2018-10-25T16:06:53Z,NONE,,,,"Hello,

In the 0.9.6 version, it does not appear to be possible to pass any arguments to the colorbar plotting
routine.

https://github.com/pydata/xarray/blob/8267fdb1093bba3934a172cf71128470698279cd/xarray/plot/facetgrid.py#L239

explicitly sets add_colorbar=False, which makes sense.

However, if we want horizontal colorbars, or any way of adjusting the plotted colorbar (it is huge and unwieldy), it would be good if the plotting routine checked for and passed suitable arguments to
https://github.com/pydata/xarray/blob/8267fdb1093bba3934a172cf71128470698279cd/xarray/plot/facetgrid.py#L256

I tried hacking something together, and I can now do something like the following:

```python
import xarray
import matplotlib.pyplot as plt

data = xarray.open_dataset('/data/ERSST/sst.mnmean.old.nc').sst

data = data.loc[dict(time=slice('1999-1', '1999-4'))]
data.plot.contourf(col='time', col_wrap=2, levels=12,
                   cbar_kwargs=dict(orientation='horizontal', pad=0.1, aspect=30,
                                    shrink=0.6, ticks=[0, 10, 20, 30]))
```

which produces:

![figure_1](https://user-images.githubusercontent.com/7300413/32849428-673c0aec-ca2f-11e7-8330-62872d96148a.png)

Is something like this available in the development version? If not, and it seems like a useful feature, I can create a PR.

Joy
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1717/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
219184224,MDU6SXNzdWUyMTkxODQyMjQ=,1351,Creating a 2D DataArray,7300413,closed,0,,,5,2017-04-04T09:04:37Z,2017-04-04T16:19:12Z,2017-04-04T15:52:31Z,NONE,,,,"Hello,

I think I'm missing something simple here. I tried looking at the documentation, but no luck.
I'm trying to create DataArrays whose coordinates are two-dimensional, as follows:

```python
from xarray import DataArray
import numpy as np

x_physical = DataArray(np.ones((2,2)),
                       dims = ['x_logical', 'y_logical'])

y_physical = DataArray(np.ones((2,2)),
                       dims = ['x_logical', 'y_logical'])

new_array = DataArray(np.zeros((2,2)), dims=['x_logical','y_logical'], coords=[x_physical, y_physical])
```
trying to follow the multidimensional example in the docs.

This gives me a 
ValueError: 'x_logical' has more than 1-dimension and the same name as one of its dimensions ('x_logical', 'y_logical'). xarray disallows such variables because they conflict with the coordinates used to label dimensions.

I have tried multiple variants of this:

* replace ``x/y_logical`` in the arguments to creating ``new_array`` with something else, which gives me a
ValueError: dimensions ('x',) must have the same length as the number of data dimensions, ndim=2

and some other variants which in hindsight make no sense.

Is there something I'm missing, and is creating multidimensional DataArrays documented somewhere?
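
One more variant, sketched under the assumption that `coords` can be given as a dict mapping a coordinate name to a `(dims, data)` tuple. Here `x_physical` and `y_physical` are used as coordinate names, kept distinct from the dimension names, so the 2D coordinates no longer share a name with a dimension:

```python
import numpy as np
from xarray import DataArray

# Logical (index) dimensions shared by the data and both 2D coordinates.
dims = ('x_logical', 'y_logical')

new_array = DataArray(
    np.zeros((2, 2)),
    dims=dims,
    # Each coordinate is (dims, values); its name differs from every dimension name.
    coords={
        'x_physical': (dims, np.ones((2, 2))),
        'y_physical': (dims, np.ones((2, 2))),
    },
)

print(new_array.coords['x_physical'].dims)
```

If this is the intended pattern, a pointer to where it is documented would still be appreciated.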

TIA,
Joy","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1351/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
214088387,MDU6SXNzdWUyMTQwODgzODc=,1308,Using groupby with custom index,7300413,closed,0,,,8,2017-03-14T14:24:11Z,2017-03-15T15:32:34Z,2017-03-15T15:32:34Z,NONE,,,,"Hello,

I have 6-hourly data (ERA Interim) for around 10 years. I want to calculate the annual 6-hourly climatology, i.e., 366*4 values, with each value corresponding to a 6-hourly interval. I am chunking the data along longitude.
I'm using xarray 0.9.1 with Python 3.6 (Anaconda).

For a daily climatology on this data, I do the usual:
```python
mean = data.groupby('time.dayofyear').mean(dim='time').compute()
```
For the 6 hourly version, I am trying the following:
```python
test = (data['time.hour']/24 + data['time.dayofyear'])
test.name = 'dayHourly'
new_test = data.groupby(test).mean(dim='time').compute()
```
The first one (daily climatology) takes around 15 minutes for my data, whereas the second one ran for almost 30 minutes after which I gave up and killed the process.

Is there some obvious reason why the first is much faster than the second? ```data``` in both cases is the 6 hourly dataset. And is there an alternative way of expressing this computation which would make it faster?
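
For reference, the grouping key can be checked independently of xarray. This is a minimal sketch using only the standard library; `day_hourly_key` is a hypothetical helper, and the leap year 2000 is chosen only so that all 366 days appear:

```python
from datetime import datetime, timedelta

def day_hourly_key(t):
    # Same arithmetic as the custom index above: day of year plus fraction of day.
    return t.timetuple().tm_yday + t.hour / 24

# One leap year of 6-hourly timestamps (366 days * 4 steps per day).
start = datetime(2000, 1, 1)
times = [start + timedelta(hours=6 * i) for i in range(366 * 4)]

keys = {day_hourly_key(t) for t in times}
print(len(keys))  # 1464
```

So the custom key produces 366*4 = 1464 distinct groups, compared with 366 groups for `time.dayofyear`.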

TIA,
Joy","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1308/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
158212793,MDU6SXNzdWUxNTgyMTI3OTM=,866,Drawing only one contour,7300413,closed,0,,,8,2016-06-02T18:50:36Z,2016-07-20T17:16:27Z,2016-07-20T17:16:27Z,NONE,,,,"Hello,

I was trying to draw only a single contour by passing levels=[0], and nothing gets
plotted.

I checked utils.py, and the logic used to calculate **n_colors** in **_build_discrete_cmap**
gives **n_colors** = 0: it first sets **extend** to 'neither', so **ext_n** = 0, and then
n_colors = len(levels) + ext_n - 1 = 1 + 0 - 1 = 0

I'm not sure, but this might be the issue.
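
The arithmetic can be reproduced in isolation. This is a simplified stand-in, not the actual xarray code: `n_colors_for` is a hypothetical name, and the values of `ext_n` for the other `extend` settings are my assumption about how the real function behaves:

```python
def n_colors_for(levels, extend='neither'):
    # Assumed mapping from extend to the number of extra colors.
    if extend == 'both':
        ext_n = 2
    elif extend in ('min', 'max'):
        ext_n = 1
    else:
        ext_n = 0
    # The formula quoted above.
    return len(levels) + ext_n - 1

print(n_colors_for([0]))     # 0
print(n_colors_for([0, 1]))  # 1
```

With a single level and no extension there are zero colors, which would be consistent with nothing being drawn.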

Another issue, which might be unrelated, occurs when I try to draw two contours. Only
one contour is plotted; both are plotted only if I set **norm**=None.

Any suggestions?

TIA,
Joy
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/866/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
143422096,MDU6SXNzdWUxNDM0MjIwOTY=,803,Unable to reference variable,7300413,closed,0,,,2,2016-03-25T04:40:35Z,2016-03-26T03:59:24Z,2016-03-26T03:59:24Z,NONE,,,,"Hello,

I was trying to use xarray to access the MERRA-2 monthly dataset. This dataset provides nc4 files, one for each month, with multiple analyzed variables (U, V, T, etc.) in each file.

If I open one file and try to access the zonal wind (U) as

``` python
import xarray

data = xarray.open_dataset('/data/MERRA2/MERRA2_100.instM_3d_ana_Np.198001.nc4')
data = data.U
```

This works just fine.

However, if I use 

``` python
data = data.T
```

I get back the entire dataset, i.e., I then have data.T.T, data.T.T.T, etc.
I can open a single file using netCDF4, and the values seem ok. 

This seems to happen only with the temperature (T) field. All other fields seem to
behave ok.

Any ideas as to what is happening?

TIA,
Joy
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/803/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
120681918,MDU6SXNzdWUxMjA2ODE5MTg=,672,Making xray use multiple cores,7300413,closed,0,,,5,2015-12-07T01:41:17Z,2015-12-07T09:33:18Z,2015-12-07T09:33:17Z,NONE,,,,"Hello,

I was trying out the 'chunks' argument to open_dataset so that I could use
the out-of-core functionality. It works very well, but when I run top I see only
one core being utilised. Is there some argument I need to pass to make
it use more cores?

TIA,
Joy
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/672/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
69141510,MDU6SXNzdWU2OTE0MTUxMA==,393,JJAS?,7300413,closed,0,,,2,2015-04-17T13:41:35Z,2015-04-19T00:00:56Z,2015-04-18T06:02:04Z,NONE,,,,"Hello,

I noticed that you have added the 'time.season' attribute in the latest version of xray. Thanks!

Those of us who study monsoons, especially the South Asian one, define the monsoon
season as JJAS, which is not a valid value for time.season.

How can one implement this sort of selection?
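
To make the request concrete, here is a minimal sketch in plain Python of the selection I have in mind. It does not use xray at all; `JJAS_MONTHS` and `is_jjas` are illustrative names, not part of any library:

```python
from datetime import date

# Months that make up the JJAS (June to September) monsoon season.
JJAS_MONTHS = {6, 7, 8, 9}

def is_jjas(day):
    # True if the date falls in June, July, August or September.
    return day.month in JJAS_MONTHS

# One timestamp per month of an arbitrary year.
monthly = [date(2014, month, 1) for month in range(1, 13)]
jjas = [d for d in monthly if is_jjas(d)]

print([d.month for d in jjas])  # [6, 7, 8, 9]
```

What I am looking for is the xray equivalent of this month filter.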

TIA,
Joy
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/393/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
59467251,MDU6SXNzdWU1OTQ2NzI1MQ==,349,Query about concat,7300413,closed,0,,,11,2015-03-02T11:07:58Z,2015-04-10T06:16:02Z,2015-03-03T05:45:50Z,NONE,,,,"Hello,

I have multiple nc files, and I want to pick one variable from all of them to
write to a separate file, and if possible pick one vertical level. The issue
is that it has no aggregation dimension, so MFDataset does not work.

The idea is to get all data about one variable from one vertical level into
a single file.

When I use the example on the netCDF4-python website, concat merges
all variables along all dimensions, making the in-memory size really large.

I'm new to xray, and I was hoping something of this sort can be done.
In fact, I don't really need to write it to a new file. Even if I can get
one ""descriptor"" (instead of an array of Dataset objects) to access my data, I will be quite
happy!

TIA,
Joy
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/349/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue