html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2314#issuecomment-666422864,https://api.github.com/repos/pydata/xarray/issues/2314,666422864,MDEyOklzc3VlQ29tbWVudDY2NjQyMjg2NA==,4992424,2020-07-30T14:52:50Z,2020-07-30T14:52:50Z,NONE,"Hi @shaprann, I haven't re-visited this exact workflow recently, but one really good option (if you can manage the intermediate storage cost) would be to try to use new tools like http://github.com/pangeo-data/rechunker to pre-process and prepare your data archive prior to analysis. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,344621749
https://github.com/pydata/xarray/issues/1086#issuecomment-661953980,https://api.github.com/repos/pydata/xarray/issues/1086,661953980,MDEyOklzc3VlQ29tbWVudDY2MTk1Mzk4MA==,4992424,2020-07-21T16:09:25Z,2020-07-21T16:09:52Z,NONE,"Hi @andreall, I'll leave @dcherian or another maintainer to comment on internals of `xarray` that might be pertinent for optimization here. However, just to throw it out there, for workflows like this, it can sometimes be a bit easier to process each NetCDF file (subsetting your locations and whatnot) and convert it to CSV individually, then merge/concatenate those CSV files together at the end. This sort of workflow can be parallelized a few different ways, but is nice because you can parallelize across the number of files you need to process. A simple example based on your MRE:

``` python
import xarray as xr
from pathlib import Path
from joblib import delayed, Parallel

dir_input = Path('.')
fns = list(sorted(dir_input.glob('**/' + 'WW3_EUR-11_CCCma-CanESM2_r1i1p1_CLMcom-CCLM4-8-17_v1_6hr_*.nc')))

# Helper function to convert NetCDF to CSV with our processing
def _nc_to_csv(fn):
    data_ww3 = xr.open_dataset(fn)
    data_ww3 = data_ww3.isel(latitude=74, longitude=18)
    df_ww3 = data_ww3[['hs', 't02', 't0m1', 't01', 'fp', 'dir', 'spr', 'dp']].to_dataframe()

    out_fn = fn.replace("".nc"", "".csv"")
    df_ww3.to_csv(out_fn)

    return out_fn

# Using joblib.Parallel to distribute my work across whatever resources i have
out_fns = Parallel(
    n_jobs=-1,  # Use all cores available here
    delayed(_nc_to_csv)(fn) for fn in fns
)

# Read the CSV files and merge them
dfs = [
    pd.read_csv(fn) for fn in out_fns
]
df_ww3_all = pd.concat(dfs, ignore_index=True)
```

YMMV but this pattern often works for many types of processing applications.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187608079
https://github.com/pydata/xarray/issues/3349#issuecomment-536079602,https://api.github.com/repos/pydata/xarray/issues/3349,536079602,MDEyOklzc3VlQ29tbWVudDUzNjA3OTYwMg==,4992424,2019-09-27T20:07:13Z,2019-09-27T20:07:13Z,NONE,I second @TomNicholas' point... functionality like this would be wonderful to have but where would be the best place for it to live?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,499477363
https://github.com/pydata/xarray/issues/3213#issuecomment-524104485,https://api.github.com/repos/pydata/xarray/issues/3213,524104485,MDEyOklzc3VlQ29tbWVudDUyNDEwNDQ4NQ==,4992424,2019-08-22T22:39:21Z,2019-08-22T22:39:21Z,NONE,Tagging @jeliashi for visibility/collaboration,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,479942077
https://github.com/pydata/xarray/issues/2911#issuecomment-485272748,https://api.github.com/repos/pydata/xarray/issues/2911,485272748,MDEyOklzc3VlQ29tbWVudDQ4NTI3Mjc0OA==,4992424,2019-04-21T18:32:56Z,2019-04-21T18:32:56Z,NONE,"Hi @tomchor, it's not too difficult to take the readers that you already have and two wrap them in such a way that you can interact with them via xarray; you can check out the packages [xgcm](https://github.com/xgcm/xgcm) or [xbpch](https://github.com/darothen/xbpch) for examples of this can work in practice. I'm not sure if a more generic reader is within or beyond the scope of the core xarray project, though... although example implementations and writeups would make a great contribution to the community!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,435532136
https://github.com/pydata/xarray/issues/2314#issuecomment-417175383,https://api.github.com/repos/pydata/xarray/issues/2314,417175383,MDEyOklzc3VlQ29tbWVudDQxNzE3NTM4Mw==,4992424,2018-08-30T03:09:41Z,2018-08-30T03:09:41Z,NONE,"Can you provide a `gdalinfo` of one of the GeoTiffs? I'm still working on some documentation for use-cases with cloud-optimized GeoTiffs to supplement @scottyhq's fantastic example notebook. One of the wrinkles I'm tracking down and trying to document is when exactly the GDAL->rasterio->dask->xarray pipeline eagerly load the entire file versus when it defers reading or reads subsets of files. So far, it seems that if the GeoTiff is appropriately chunked ahead of time (when it's written to disk), things basically work ""automagically.""","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,344621749
https://github.com/pydata/xarray/issues/1970#issuecomment-372475210,https://api.github.com/repos/pydata/xarray/issues/1970,372475210,MDEyOklzc3VlQ29tbWVudDM3MjQ3NTIxMA==,4992424,2018-03-12T21:52:22Z,2018-03-12T21:52:22Z,NONE,@jhamman What do you think would be involved in fleshing out the integration between xarray and rasterio in order to output cloud-optimized GeoTiffs? I,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302806158
https://github.com/pydata/xarray/issues/1631#issuecomment-336634555,https://api.github.com/repos/pydata/xarray/issues/1631,336634555,MDEyOklzc3VlQ29tbWVudDMzNjYzNDU1NQ==,4992424,2017-10-14T13:19:58Z,2017-10-14T13:19:58Z,NONE,Thanks for documenting this @jhamman. I think all the logic is in `.resample(...).interpolate()` to build out true interpolation or really imputation/infilling. I can jump in if there's any confusion in the code.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,265056503
https://github.com/pydata/xarray/issues/1627#issuecomment-336001921,https://api.github.com/repos/pydata/xarray/issues/1627,336001921,MDEyOklzc3VlQ29tbWVudDMzNjAwMTkyMQ==,4992424,2017-10-12T02:26:05Z,2017-10-12T02:26:05Z,NONE,"Wow, great job @benbovy!

With the upcoming move towards Jupyter Lab and a better infrastructure for custom plugins, could this serve as the basis for a ""NetCDF Extension"" for Jupyter Lab? It would be great if double clicking on a NetCDF file in the JLab file explorer could open up this sort of information, or even a quick and dirty ncview-like plotter.","{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,264747372
https://github.com/pydata/xarray/pull/1608#issuecomment-334526971,https://api.github.com/repos/pydata/xarray/issues/1608,334526971,MDEyOklzc3VlQ29tbWVudDMzNDUyNjk3MQ==,4992424,2017-10-05T16:57:03Z,2017-10-05T16:57:03Z,NONE,"I'm a bit slow on the uptake here, but big 👍  from me. Thanks for catching this bug!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,262874270
https://github.com/pydata/xarray/pull/1608#issuecomment-334453965,https://api.github.com/repos/pydata/xarray/issues/1608,334453965,MDEyOklzc3VlQ29tbWVudDMzNDQ1Mzk2NQ==,4992424,2017-10-05T12:46:54Z,2017-10-05T12:46:54Z,NONE,Great catch; do you need any input from me @jhamman ?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,262874270
https://github.com/pydata/xarray/issues/1605#issuecomment-334224596,https://api.github.com/repos/pydata/xarray/issues/1605,334224596,MDEyOklzc3VlQ29tbWVudDMzNDIyNDU5Ng==,4992424,2017-10-04T17:10:02Z,2017-10-04T17:10:02Z,NONE,"(sorry, originally commented from my work account)

The tutorial dataset is ~6-hourly, so your operation is a downsampling operation.  We don't actually support interpolation on downsampling operations - just aggregations/reductions. Upsampling supports interpolation since there is no implicit way to estimate data between the gaps at the lower temporal frequency. If you just want to estimate a given field at 15-day intervals, for 00Z on those days, then I think you should use `ds.reindex()`, but at the moment I do not think it will work with timeseries. That would be a critical feature to implement.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,262847801
https://github.com/pydata/xarray/issues/1596#issuecomment-332619692,https://api.github.com/repos/pydata/xarray/issues/1596,332619692,MDEyOklzc3VlQ29tbWVudDMzMjYxOTY5Mg==,4992424,2017-09-27T18:49:34Z,2017-09-27T18:49:34Z,NONE,@willirath Never hurts! ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,260912521
https://github.com/pydata/xarray/issues/1596#issuecomment-332519089,https://api.github.com/repos/pydata/xarray/issues/1596,332519089,MDEyOklzc3VlQ29tbWVudDMzMjUxOTA4OQ==,4992424,2017-09-27T13:23:38Z,2017-09-27T13:23:38Z,NONE,"@willirath is your time data equally spaced? If so, you should be able to use the new version of `DataArray.resample()` available on the master (and scheduled for the 0.10.0 release) which supports upsampling/infilling. 

Should work something like this, assuming each timestep is a daily value on the **time** axis:

``` python
ds = xr.open_mfdataset(""paths/to/my/data.nc"")

ds_infilled = ds.resample(time='1D').asfreq()
```

That should get you nans wherever your data is missing.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,260912521
https://github.com/pydata/xarray/pull/1272#issuecomment-331281120,https://api.github.com/repos/pydata/xarray/issues/1272,331281120,MDEyOklzc3VlQ29tbWVudDMzMTI4MTEyMA==,4992424,2017-09-21T21:02:39Z,2017-09-21T21:10:51Z,NONE,"@jhamman Ohhh i totally misunderstood the last readout from travis-ci. Dealing with the scipy dependency is easy enough. ~However, another test fails because it uses [`np.flip()`](https://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.flip.html) which wasn't added to numpy until v1.12.0. Do we want to bump the numpy version in the dependencies? Or is there another aproach to take here?~


Nevermind, easy solution is just to use other axis-reversal methods :)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/pull/1272#issuecomment-330910590,https://api.github.com/repos/pydata/xarray/issues/1272,330910590,MDEyOklzc3VlQ29tbWVudDMzMDkxMDU5MA==,4992424,2017-09-20T16:41:01Z,2017-09-20T16:41:01Z,NONE,"@jhamman done - caught me right while I was compiling GEOS-Chem, and the merge conflicts were very simple.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/pull/1272#issuecomment-330840457,https://api.github.com/repos/pydata/xarray/issues/1272,330840457,MDEyOklzc3VlQ29tbWVudDMzMDg0MDQ1Nw==,4992424,2017-09-20T12:47:08Z,2017-09-20T12:47:08Z,NONE,"@jhamman Think we're good. I deferred 4 small pep8 issues because they're in parts of the codebase which I don't think I ever touched, and i'm worried they're going to screw up the merge.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/pull/1272#issuecomment-330530760,https://api.github.com/repos/pydata/xarray/issues/1272,330530760,MDEyOklzc3VlQ29tbWVudDMzMDUzMDc2MA==,4992424,2017-09-19T12:58:34Z,2017-09-19T12:58:34Z,NONE,"@jhamman Gotcha, I'll clean everything up by the end of the week. If that's going to block 0.10.0, let me know and I'll shuffle some things around to prioritize this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/pull/1272#issuecomment-329227114,https://api.github.com/repos/pydata/xarray/issues/1272,329227114,MDEyOklzc3VlQ29tbWVudDMyOTIyNzExNA==,4992424,2017-09-13T16:43:32Z,2017-09-13T16:43:32Z,NONE,@shoyer fixed.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/pull/1272#issuecomment-329162517,https://api.github.com/repos/pydata/xarray/issues/1272,329162517,MDEyOklzc3VlQ29tbWVudDMyOTE2MjUxNw==,4992424,2017-09-13T13:10:04Z,2017-09-13T13:10:04Z,NONE,Hmmm. Something is really screwy with my feature branch and making the task of cleaning up the merge difficult. I'll work on fixing this.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/pull/1272#issuecomment-329039697,https://api.github.com/repos/pydata/xarray/issues/1272,329039697,MDEyOklzc3VlQ29tbWVudDMyOTAzOTY5Nw==,4992424,2017-09-13T02:34:21Z,2017-09-13T02:34:21Z,NONE,"Try refreshing? Latest commit is 7a767d8 and has all these changes plus
some more tweaks.

*Daniel Rothenberg *
Postdoctoral Research Associate
Center for Global Change Science
Massachusetts Institute of Technology
*A: *77 Massachusetts Ave | E18-402A
     Cambridge, MA 02139
*T: *(502) 648-7513; *T: *(617) 258-0407
*E: *darothen@mit.edu
*H: *www.danielrothenberg.com
<http://github.com/darothen> [image:
http://www.linkedin.com/in/rothenbergdaniel/]
<http://www.linkedin.com/in/rothenbergdaniel/>
<http://www.twitter.com/danrothenberg>

On Tue, Sep 12, 2017 at 12:02 PM, Stephan Hoyer <notifications@github.com>
wrote:

> *@shoyer* commented on this pull request.
> ------------------------------
>
> In xarray/core/resample.py
> <https://github.com/pydata/xarray/pull/1272#discussion_r138390791>:
>
> >
> -            f = interp1d(x, y, kind=kind, axis=axis, bounds_error=True,
> -                         assume_sorted=True)
> +        # Prepare coordinates by dropping non-dimension coordinates along the
> +        # resampling dimension.
> +        # note: the lower-level Variable API could be used to speed this up
> +        coords = OrderedDict()
> +        if self._obj.data_vars:
> +            var = list(self._obj.data_vars.keys())[0]
> +            da = self._obj[var].copy()
>
> did you push those commits yet? I'm not seeing it yet
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <https://github.com/pydata/xarray/pull/1272#discussion_r138390791>, or mute
> the thread
> <https://github.com/notifications/unsubscribe-auth/AEwtqKtNFFYYlmBgLHyDLPVZNJe7T18Gks5shqqdgaJpZM4MDalB>
> .
>
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/issues/1279#issuecomment-328724595,https://api.github.com/repos/pydata/xarray/issues/1279,328724595,MDEyOklzc3VlQ29tbWVudDMyODcyNDU5NQ==,4992424,2017-09-12T03:29:29Z,2017-09-12T03:29:29Z,NONE,"@shoyer - This output is usually provided as a sequence of daily netCDF files, each on a ~2 degree global grid with 24 timesteps per file (so shape 24 x 96 x 144). For convenience, I usually concatenate these files into yearly datasets, so they'll have a shape (8736 x 96 x 144). I haven't played too much with how to chunk the data, but it's not uncommon for me to load 20-50 of these files simultaneously (each holding a years worth of data) and treat each year as an ""ensemble member dimension, so my data has shape (50 x 8736 x 96 x 144). Yes, keeping everything in dask array land is preferable, I suppose.

@jhamman - Wow, that worked pretty much perfectly! There's a handful of typos (you switch from ""a"" to ""x"" halfway through), and there's a lot of room for optimization by chunksize. But it just *works*, which is absolutely ridiculous. I just pushed a ~200 GB dataset on my cluster with ~50 cores and it screamed through the calculation.

Is there anyway this could be pushed before 0.10.0? It's a killer enhancement. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208903781
https://github.com/pydata/xarray/issues/1279#issuecomment-328314676,https://api.github.com/repos/pydata/xarray/issues/1279,328314676,MDEyOklzc3VlQ29tbWVudDMyODMxNDY3Ng==,4992424,2017-09-10T02:04:33Z,2017-09-10T02:04:33Z,NONE,"In light of #1489 is there a way to move forward here with `rolling` on `dask`-backed data structures? 

In soliciting the atmospheric chemistry community for a few illustrative examples for [gcpy](http://danielrothenberg.com/gcpy/), it's become apparent that indices computed from re-sampled timeseries would be killer, attention-grabbing functionality. For instance, the EPA air quality standard we use for ozone involves taking hourly data, computing 8-hour rolling means for each day of your dataset, and then picking the maximum of those means for each day (""MDA8 ozone""). Similar metrics exist for other pollutants.

With traditional xarray data-structures, it's *trivial* to compute this quantity (assuming we have hourly data and using the new resample API from #1272):

``` python
ds = xr.open_dataset(""hourly_ozone_data.nc"")
mda8_o3 = (
    ds['O3']
    .rolling(time=8, min_periods=6)
    .mean('time')
    .resample(time='D').max()
)
```
There's one quirk relating to timestamp the rolling data (by default `rolling` uses the *last* timestamp in a dataset, where in my application I want to label data with the *first* one) which makes that chained method a bit impractical, but it only adds like one line of code and it is totally dask-friendly. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208903781
https://github.com/pydata/xarray/pull/1272#issuecomment-326689304,https://api.github.com/repos/pydata/xarray/issues/1272,326689304,MDEyOklzc3VlQ29tbWVudDMyNjY4OTMwNA==,4992424,2017-09-01T21:38:18Z,2017-09-01T21:38:18Z,NONE,"Resolved to drop auxiliary coordinates which are defined along the dimension to be re-sampled. This makes sense; if someone wants them to be interpolated or manipulated in some way, then they should promote them from coordinates to variables before doing the resampling.

In response to #1328, `count()` works just fine if you call it from a `Resample` object. Works for both resampling and up-sampling, but it will preserve the shape of the non-resampled dimensions. I think that's fine, because `count()` treats NaN as missing by default, so you can immediately know in which grid cells you have missing data :)

Final review, @shoyer, before merging in anticipation of 0.10.0?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/issues/486#issuecomment-325974604,https://api.github.com/repos/pydata/xarray/issues/486,325974604,MDEyOklzc3VlQ29tbWVudDMyNTk3NDYwNA==,4992424,2017-08-30T12:26:07Z,2017-08-30T12:26:07Z,NONE,"@ocefpaf Awesome, good to know that hurdle has already been leaped :)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,96211612
https://github.com/pydata/xarray/issues/486#issuecomment-325969302,https://api.github.com/repos/pydata/xarray/issues/486,325969302,MDEyOklzc3VlQ29tbWVudDMyNTk2OTMwMg==,4992424,2017-08-30T12:01:29Z,2017-08-30T12:01:29Z,NONE,"If ESMF is the way to go, then some effort needs to be made to build conda recipes and other infrastructure for distributing and building the platform. It's a heavy dependency to haul around.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,96211612
https://github.com/pydata/xarray/issues/1534#issuecomment-325777712,https://api.github.com/repos/pydata/xarray/issues/1534,325777712,MDEyOklzc3VlQ29tbWVudDMyNTc3NzcxMg==,4992424,2017-08-29T19:42:24Z,2017-08-29T19:42:24Z,NONE,"@mmartini-usgs, an entire netCDF file (as long as it only has 1 group, which it most likely does if we're talking about standard atmospheric/oceanic data) would be the equivalent of an `xarray.Dataset`. Each variable could be represented as a `pandas.DataFrame`, but with a `MultiIndex` - an index with multiple levels, but which are consist across each level. 

To start with, you should read in your data using the **chunks** keyword to `open_dataset()`; this turns all of the data you read into dask arrays. Then, you use xarray Dataset and DataArray operations to manipulate them. So you can start, instead, by opening your data:

``` python
ds = xr.open_dataset('hugefile.nc', chunks={<fill me in>})
ds_lp = ds.resample('H','time','mean')
```

You'd have to choose chunks based on the dimensions of your data. Like @rabernat previously mentioned, it's very likely you can perform your entire workflow within xarray without every having to drop down to pandas; let us know if you can share more details","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253407851
https://github.com/pydata/xarray/issues/1535#issuecomment-325494110,https://api.github.com/repos/pydata/xarray/issues/1535,325494110,MDEyOklzc3VlQ29tbWVudDMyNTQ5NDExMA==,4992424,2017-08-28T21:52:54Z,2017-08-28T21:52:54Z,NONE,"Great; there's only a single action item left on #1272, so I'll try to get to that later this week.","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 1, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,253463226
https://github.com/pydata/xarray/pull/1272#issuecomment-323539716,https://api.github.com/repos/pydata/xarray/issues/1272,323539716,MDEyOklzc3VlQ29tbWVudDMyMzUzOTcxNg==,4992424,2017-08-19T18:24:29Z,2017-08-19T18:24:29Z,NONE,"All set except for my one question to @shoyer above. I've opted not to include a chart outlining the various upsampling options... couldn't really think of a nice and clean way to do so, because adding it to the time series doc page ends up being really ugly and there isn't quite enough substance for its own worked example page.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/pull/1272#issuecomment-320297159,https://api.github.com/repos/pydata/xarray/issues/1272,320297159,MDEyOklzc3VlQ29tbWVudDMyMDI5NzE1OQ==,4992424,2017-08-04T16:45:56Z,2017-08-19T18:23:06Z,NONE,"Okay, it was a bit of effort but I implemented upsampling. For the padding methods I just re-index the Dataset or DataArray using the re-sampled time frequencies. I also added interpolation, but that was a bit tricky; we have to sort of break the split-apply-combine idiom to do that, so I created a `Resampler` mix-in which could contain the logic for the up-sampling. The `DatasetResampler` and `DataArrayResampler` each then implement similar logic for doing the interpolation. The up-sampling is designed to work with n-dimensional data.

The padding methods work 100% with dask arrays - since we're just calling xarray methods which themselves work with dask arrays! There are some eager computations (just the calculation of the up-sampled time frequencies) but I don't think that's a major issue; the actual re-indexing/padding is deferred. Interpolation works with dask arrays too, but eagerly does the computations. 

Could use a review from @shoyer  or @jhamman.

New **TODO** list:

- [ ] Add example chart to the timeseries doc page comparing the different upsampling options
- [x] Additional up-sampling test cases for both `DataArray`s and `Dataset`s
- [x] Code clean-up
- [x] What's new","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/issues/1509#issuecomment-323105006,https://api.github.com/repos/pydata/xarray/issues/1509,323105006,MDEyOklzc3VlQ29tbWVudDMyMzEwNTAwNg==,4992424,2017-08-17T15:20:22Z,2017-08-17T15:20:22Z,NONE,@betaplane a re-factoring of the `resample` API to match pandas' is currently being wrapped up and slated for 0.10.0; see #1272,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,250751931
https://github.com/pydata/xarray/issues/1505#issuecomment-321245721,https://api.github.com/repos/pydata/xarray/issues/1505,321245721,MDEyOklzc3VlQ29tbWVudDMyMTI0NTcyMQ==,4992424,2017-08-09T12:50:40Z,2017-08-09T12:50:40Z,NONE,"How exactly is your WRF output split? It's not clear exactly what you want to do... is it split along different tiles such that indices [1, ..., m] are in `ds_col_0`, [m+1, ..., p] are in `ds_col_1`, and [p+1, ..., n] are in `ds_col_2`? Or is each dataset a different vertical level? Or a different timestep?

I'm not sure that `xr.concat` will even work if you pass **dim** a list of dimensions. It's only designed to concatenate along one dimension at a time; if you pass a pandas Index or a DataArray as the argument for **dim**, then it will create a new dimension in the dataset and use the values in that argument as the coordinates - so you have to exactly match the number Datasets or DataArrays in the first argument. ","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",,248942085
https://github.com/pydata/xarray/pull/1272#issuecomment-316404161,https://api.github.com/repos/pydata/xarray/issues/1272,316404161,MDEyOklzc3VlQ29tbWVudDMxNjQwNDE2MQ==,4992424,2017-07-19T14:24:38Z,2017-08-04T16:39:53Z,NONE,"### TODO

- [x] ensure that `count()` works on `Data{Array,set}Resample` objects
- [x] refactor `Data{Array,set}Resample` objects into a stand-alone file **core/resample.py** alongside **core/groupby.py**
- [x] wrap `pytest.warns` around tests targeting old API 
- [x] move old API tests into stand-alone
- [x] Crude up-sampling. Copy/pasting Stephan's earlier comment from Feb 20:

> I think we need to fix this before merging this PR, since it suggests the existing functionality would only exist in deprecated form. Pandas does this with a method called .asfreq, though this is basically pure sugar since in practice I think it works exactly the same as .first (or .mean if only doing pure upsampling).

---

Alright @jhamman, here's the complete list of work left here. I'll tackle some of it during my commutes this week.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/pull/1272#issuecomment-319988645,https://api.github.com/repos/pydata/xarray/issues/1272,319988645,MDEyOklzc3VlQ29tbWVudDMxOTk4ODY0NQ==,4992424,2017-08-03T14:39:04Z,2017-08-03T14:39:04Z,NONE,"Finished off everything except upsampling. In pandas, all [upsampling](https://github.com/pandas-dev/pandas/blob/d02ef6f04466e4a74f67ad584cf38cdc6df56e42/pandas/core/resample.py#L890) works by constructing a new time index (which we already do) and then filling in the NaNs that result in the dataset with one of a few different rules. Not sure how involved this will be, but I anticipate this can all be implemented in core/resample.py","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/issues/1490#issuecomment-318079611,https://api.github.com/repos/pydata/xarray/issues/1490,318079611,MDEyOklzc3VlQ29tbWVudDMxODA3OTYxMQ==,4992424,2017-07-26T14:57:58Z,2017-07-26T14:57:58Z,NONE,"Did some digging. 

Note here that the **dtype**s of `time1` and `time2` are different; the first is a **datetime64[ns]** but the second is a **datetime64[ns, UTC]**. For the sake of illustration, I'm going to change the timezone to EST. If we print `time2`, we get something that looks like this:

``` python
>>> time2
DatetimeIndex(['2000-01-01 00:00:00-05:00', '2000-01-01 01:00:00-05:00',
               '2000-01-01 02:00:00-05:00', '2000-01-01 03:00:00-05:00',
               '2000-01-01 04:00:00-05:00', '2000-01-01 05:00:00-05:00',
               '2000-01-01 06:00:00-05:00', '2000-01-01 07:00:00-05:00',
               '2000-01-01 08:00:00-05:00', '2000-01-01 09:00:00-05:00',
               ...
               '2000-12-30 14:00:00-05:00', '2000-12-30 15:00:00-05:00',
               '2000-12-30 16:00:00-05:00', '2000-12-30 17:00:00-05:00',
               '2000-12-30 18:00:00-05:00', '2000-12-30 19:00:00-05:00',
               '2000-12-30 20:00:00-05:00', '2000-12-30 21:00:00-05:00',
               '2000-12-30 22:00:00-05:00', '2000-12-30 23:00:00-05:00'],
              dtype='datetime64[ns, EST]', length=8760, freq='H')
```

But, if we directly print its *values*, we get something slightly different:

``` python
>>> time2.values
array(['2000-01-01T05:00:00.000000000', '2000-01-01T06:00:00.000000000',
       '2000-01-01T07:00:00.000000000', ...,
       '2000-12-31T02:00:00.000000000', '2000-12-31T03:00:00.000000000',
       '2000-12-31T04:00:00.000000000'], dtype='datetime64[ns]')
``` 

The difference is that the timezone delta has been automatically added in terms of hours to each value in `time2`. This brings up something to note: if you construct your `Dataset` using `time1.values` and `time2.values`, there is no problem:

``` python
import pandas as pd
import xarray as xr
time1 = pd.date_range('2000-01-01', freq='H', periods=365 * 24)  #timezone naïve
time2 = pd.date_range('2000-01-01', freq='H', periods=365 * 24, tz='UTC')  #timezone aware
ds1 = xr.Dataset({'foo': ('time', np.arange(365 * 24)), 'time': time1.values})
ds2 = xr.Dataset({'foo': ('time', np.arange(365 * 24)), 'time': time2.values})
ds1.resample('3H', 'time', how='mean')  # works fine
ds2.resample('3H', 'time', how='mean')  # works fine
```

Both `time1` and `time2` are instances of `pd.DatetimeIndex` which are subclasses of `pd.Index`. When xarray tries to turn them into `Variable`s, it ultimately uses a `PandasIndexAdapter` to decode the contents of `time1` and `time2`, and this is where the trouble happens. The `PandasIndexAdapter` tries to safely cast the dtype of the array it is passed, which works just fine for `time1`. But for some weird reason, numpy doesn't recognize its own datetime dtypes when they have timezone information. That is, this will work:

``` python
>>> np.dtype('datetime64[ns]')
dtype('<M8[ns]')
```
But this won't:

``` python
>>> np.dtype('datetime64[ns, UTC]')
TypeError: Invalid datetime unit in metadata string ""[ns, UC]""
```

But also, the type of `time2.dtype` is a `pandas.types.dtypes.DatetimeTZDtype`, which NumPy doesn't know what to do with (it doesn't know how to map that type to its own `datetime64`).

So what happens is that the resulting `Variable` which defines the **time** coordinate on your `ds2` has an array with the correct values, but is explicitly told to have the dtype `object`. When the array is decoded, then, bad things happen. 

One solution would be to catch this potential glitch in either [`is_valid_numpy_dtype()`](https://github.com/pydata/xarray/blob/a943419b86c1c952d07cef6acf0b10ea5784a4cc/xarray/core/utils.py#L193) or the [`PandasIndexAdapter` constructor](https://github.com/pydata/xarray/blob/c2588dadff82f2e56b9ec9c10d6d57661dbcce15/xarray/core/indexing.py#L506). Alternatively, we could eagerly coerce arrays with type `pandas.types.dtypes.DatetimeTZDtype` into numpy-compliant types at some earlier point.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,245649333
https://github.com/pydata/xarray/pull/1272#issuecomment-316398830,https://api.github.com/repos/pydata/xarray/issues/1272,316398830,MDEyOklzc3VlQ29tbWVudDMxNjM5ODgzMA==,4992424,2017-07-19T14:07:00Z,2017-07-19T14:07:00Z,NONE,I did my best to re-base everything to master... plan on spending an hour or so figuring out what's broken and at least restoring the status quo.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/issues/1483#issuecomment-316377854,https://api.github.com/repos/pydata/xarray/issues/1483,316377854,MDEyOklzc3VlQ29tbWVudDMxNjM3Nzg1NA==,4992424,2017-07-19T12:59:04Z,2017-07-19T12:59:04Z,NONE,"Instead of computing the mean over your non-stacked dimension by

``` python
dsg = dst.groupby('allpoints').mean()
```

why not just instead call

``` python
dsg = dst.mean('time', keep_attrs=True)
```

so that you just collapse the **time** dimension and preserve the attributes on your data? Then you can `unstack()` and everything should still be there. The idiom of stacking/applying/unstacking is really useful to fit your data to the interface of a numpy or scipy function that will do all the heavy lifting with a vectorized routine for you - isn't using `groupby` in this way really slow? 
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,244016361
https://github.com/pydata/xarray/issues/1482#issuecomment-316376598,https://api.github.com/repos/pydata/xarray/issues/1482,316376598,MDEyOklzc3VlQ29tbWVudDMxNjM3NjU5OA==,4992424,2017-07-19T12:54:30Z,2017-07-19T12:54:30Z,NONE,"@mitar it depends on your data/application, right? But that information would also be helpful in figuring out alternative pathways. If you're always going to process the images individually or sequentially, then what advantage is there (aside from convenience) of dumping them in some giant array with forced dimensions/shape per slice?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,243964948
https://github.com/pydata/xarray/issues/1482#issuecomment-316371416,https://api.github.com/repos/pydata/xarray/issues/1482,316371416,MDEyOklzc3VlQ29tbWVudDMxNjM3MTQxNg==,4992424,2017-07-19T12:34:32Z,2017-07-19T12:34:32Z,NONE,"The problem is that these sorts of arrays break the [common data model](http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/CDM/) on top of which xarray (and NetCDF) is built. 

> If I understand correctly, I could batch all images of the same size into its own dimension? That might be also acceptable.

Yes, if you can pre-process all the images and align them on some common set of dimensions (maybe just **xi** and **yi**, denoting integer index in the x and y directions), and pad unused space for each image with NaNs, then you could concatenate everything into a `Dataset`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,243964948
https://github.com/pydata/xarray/pull/1272#issuecomment-315355743,https://api.github.com/repos/pydata/xarray/issues/1272,315355743,MDEyOklzc3VlQ29tbWVudDMxNTM1NTc0Mw==,4992424,2017-07-14T13:10:22Z,2017-07-14T13:10:22Z,NONE,"I think a pull against the new releases is critical to see what breaks. Beyond that, just code clean up and testing. I can try to bump this higher on my priority list.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/issues/1354#issuecomment-313106392,https://api.github.com/repos/pydata/xarray/issues/1354,313106392,MDEyOklzc3VlQ29tbWVudDMxMzEwNjM5Mg==,4992424,2017-07-05T13:41:56Z,2017-07-05T13:41:56Z,NONE,"@wqshen, a workaround until a more complete modification to `align` is available would be to explicitly copy/set the coordinate values on your arrays before using `xr.concat()`. Alternatively, if it's as simple as stacking along a new tailing axis, you could stack via dask/numpy and then construct a new `DataArray` passing the coordinates explicitly.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,219692578
https://github.com/pydata/xarray/issues/1447#issuecomment-308914123,https://api.github.com/repos/pydata/xarray/issues/1447,308914123,MDEyOklzc3VlQ29tbWVudDMwODkxNDEyMw==,4992424,2017-06-16T02:14:31Z,2017-06-16T02:14:31Z,NONE,"For [xbpch](https://github.com/darothen/xbpch) I followed a similar naming convention based on @rabernat's [xmitgcm](https://github.com/rabernat/xmitgcm). Brewing on the horizon is an xarray-powered toolkit for [GEOS-Chem](http://acmg.seas.harvard.edu/geos/) and while it'll be a stand-alone library, I imagine it'll belong to this confederation of toolkits and provide an accessor or two for computing model grid geometries and related things on-the-fly. I'd also +1 for an `xarray` prefix (so, `xbpch` -> `xarray-bpch` and `xmitgcm` -> `xarray-mitgcm`?)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,234658224
https://github.com/pydata/xarray/issues/1192#issuecomment-305178905,https://api.github.com/repos/pydata/xarray/issues/1192,305178905,MDEyOklzc3VlQ29tbWVudDMwNTE3ODkwNQ==,4992424,2017-05-31T12:59:52Z,2017-05-31T12:59:52Z,NONE,"Not to hijack the thread, but @PeterDSteinberg - this is the first I've heard of earthio and I think there would be a lot of interest from the broader atmospheric/oceanic sciences community to hear about what your all's plans are. Could your team do a blog post on Continuum sometime outlining the goals of the project?","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,198742089
https://github.com/pydata/xarray/issues/470#issuecomment-304107683,https://api.github.com/repos/pydata/xarray/issues/470,304107683,MDEyOklzc3VlQ29tbWVudDMwNDEwNzY4Mw==,4992424,2017-05-25T19:57:22Z,2017-05-25T19:57:22Z,NONE,"This certainly could be useful, but since this is essentially plotting a vector of data, why not just drop into pandas?

```
df = da.to_dataframe()
# Could reset coordinates if you really wanted
# df = df.reset_index()
df.plot.scatter('longitude', 'latitude', c=da.name)
```

Patching in this rough functionality into the plotting module should be really straightforward, maybe @jhamman has some tips?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,94787306
https://github.com/pydata/xarray/issues/1279#issuecomment-301489242,https://api.github.com/repos/pydata/xarray/issues/1279,301489242,MDEyOklzc3VlQ29tbWVudDMwMTQ4OTI0Mg==,4992424,2017-05-15T14:18:55Z,2017-05-15T14:18:55Z,NONE,Dask dataframes have recently been updated so that rolling operations work (dask/dask#2198). Does this open a pathway to enable rolling on dask arrays within xarray?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208903781
https://github.com/pydata/xarray/issues/1391#issuecomment-300462962,https://api.github.com/repos/pydata/xarray/issues/1391,300462962,MDEyOklzc3VlQ29tbWVudDMwMDQ2Mjk2Mg==,4992424,2017-05-10T12:11:56Z,2017-05-10T12:11:56Z,NONE,"@klapo! Great to see you here!

Happy to iterate with you on documenting this functionality. For reference, I wrote [a package](https://github.com/darothen/experiment) for my dissertation work to help automate the task of constructing multi-dimensional Datasets which include dimensions corresponding to experimental/ensemble factors. One of my on-going projects is to actually fully abstract this (I have a not-uploaded branch of the project which tries to build the notion of an ""EnsembleDataset"", which has the same relationship to a Dataset that an pandas Panel used to have to a DataFrame).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,225536793
https://github.com/pydata/xarray/issues/1397#issuecomment-299194997,https://api.github.com/repos/pydata/xarray/issues/1397,299194997,MDEyOklzc3VlQ29tbWVudDI5OTE5NDk5Nw==,4992424,2017-05-04T14:05:48Z,2017-05-04T14:05:48Z,NONE,"Cool; please keep me in the loop if you don't mind, because I also have an application which I'd really like to just be able use the built-in faceting for rather than building my plot grids manually. 

A good comparison case is to perform the same plots (with the same set aspect/size/ratio at both the figure and subplot level) but just don't use the Cartopy transformations. In these cases, I have all the control that I would expect. There are also important differences between `pcolor`ing and `imshow`ing which would be useful to understand. At a minimum, we should deliver back to **xarray** some improved documentation discussing handling subplot geometry during faceting.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,225846258
https://github.com/pydata/xarray/issues/1397#issuecomment-299191499,https://api.github.com/repos/pydata/xarray/issues/1397,299191499,MDEyOklzc3VlQ29tbWVudDI5OTE5MTQ5OQ==,4992424,2017-05-04T13:53:09Z,2017-05-04T13:53:09Z,NONE,"@fmaussion What happens if you add `aspect=""auto""` to **subplot_kws**? 

I'm tempted to have us move this discussion to StackOverflow (for heightened visibility), but I suspect there might actually be a bug somewhere in the finalization of the faceting that undoes the specifications you pass to the initial subplot constructor.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,225846258
https://github.com/pydata/xarray/issues/1397#issuecomment-299056235,https://api.github.com/repos/pydata/xarray/issues/1397,299056235,MDEyOklzc3VlQ29tbWVudDI5OTA1NjIzNQ==,4992424,2017-05-03T22:43:55Z,2017-05-03T22:43:55Z,NONE,"> The biggest trouble I have is with tightening the space between the map and the colorbar at the bottom, but this looks like a cartopy/mpl question, not an xarray question, so I should quit pestering you guys.

You just need to pass the ""pad"" argument to `cbar_kwargs`. 

The trickier problem is that sometimes cartopy can be a bit unpredictable in controlling the size and aspect ratio of axes after you've plotted maps on them. You can force a plot to respect the aspect ratio you use when you construct an axis by using the keyword `aspect=""auto""`, but it can be a bit difficult to get this to work in xarray sometimes. But at the end of the day, it's not a big deal to hand-craft a publication-quality figure once you know the rough gist of what you want to go on it - and xarray's tools are already great for that.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,225846258
https://github.com/pydata/xarray/pull/1356#issuecomment-294829429,https://api.github.com/repos/pydata/xarray/issues/1356,294829429,MDEyOklzc3VlQ29tbWVudDI5NDgyOTQyOQ==,4992424,2017-04-18T12:53:01Z,2017-04-18T12:53:01Z,NONE,"Alrighty, patched and ready for a final look-over! I appreciate the help and patience, @shoyer!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,220011864
https://github.com/pydata/xarray/pull/1356#issuecomment-294628295,https://api.github.com/repos/pydata/xarray/issues/1356,294628295,MDEyOklzc3VlQ29tbWVudDI5NDYyODI5NQ==,4992424,2017-04-17T23:44:08Z,2017-04-17T23:44:08Z,NONE,Turns out it was easy enough to add an accessor for `ds['time.time']`; that's already provided via pandas.,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,220011864
https://github.com/pydata/xarray/pull/1356#issuecomment-294520064,https://api.github.com/repos/pydata/xarray/issues/1356,294520064,MDEyOklzc3VlQ29tbWVudDI5NDUyMDA2NA==,4992424,2017-04-17T16:21:19Z,2017-04-17T16:21:19Z,NONE,"There's a test-case relating to #367 (**test_virtual_variable_same_name**) which is causing me a bit of grief as I re-factor the virtual variable logic. Should we really be able to access variables like `ds['time.time']`? This seems to break the logic of what a virtual variable *does*, and was implemented to help out with time GroupBys and resampling (something I'll eventually get around to finishing up a refactor for - #1272). 

Two options for fixing:

1. add a ""time"" field to `DateTimeAccessor`.
2. add some additional if-then-else logic to `_get_virtual_variable`.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,220011864
https://github.com/pydata/xarray/pull/1356#issuecomment-293881177,https://api.github.com/repos/pydata/xarray/issues/1356,293881177,MDEyOklzc3VlQ29tbWVudDI5Mzg4MTE3Nw==,4992424,2017-04-13T12:26:24Z,2017-04-13T12:26:24Z,NONE,"Finished clean-up, added some documentation, etc. I mangled resolving a merge conflict with my update to `whats-new.rst` (5ae4e08) in terms of the commit text, but other than that I think we're getting closer to finishing this.

wrt to the virtual variables, I think some more thinking is necessary so we can come up with a plan of approach. Do we want to deprecate this feature entirely? Do we just want to wrap the datetime component virtual variables to the `.dt` accessor if they're datetime-like? We could very easily do the latter for 0.9.3, but maybe we should target a future major release to deprecate the virtual variables and instead encourage adding a few specialized (but commonly-used) accessors to xarray?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,220011864
https://github.com/pydata/xarray/pull/1356#issuecomment-293280073,https://api.github.com/repos/pydata/xarray/issues/1356,293280073,MDEyOklzc3VlQ29tbWVudDI5MzI4MDA3Mw==,4992424,2017-04-11T14:25:27Z,2017-04-11T14:25:27Z,NONE,Updated with support for multi-dimensional time data stored as dask array.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,220011864
https://github.com/pydata/xarray/issues/1352#issuecomment-292930254,https://api.github.com/repos/pydata/xarray/issues/1352,292930254,MDEyOklzc3VlQ29tbWVudDI5MjkzMDI1NA==,4992424,2017-04-10T12:06:52Z,2017-04-10T12:07:03Z,NONE,"Yeah, I tend to agree, there should be some sort of auto-magic happening. But, I can think of at least two options:

1. Coerce to array-like, like you do manually in your first comment here. That makes sense if the dimension is important, i.e. it carries useful metadata or encodes something important.

2. Coerce to an attribute on the Dataset.

I use workflows where I concatenate things like multiple ensemble members into a single file, and I wind up with this pattern all the time. I usually just `drop()` the offending coordinate, and save it as part of the output filename. This is because tools like `cdo` really, really don't like non lat-lon-time dimensions, so that can interrupt my workflow sometimes. Saving as an attribute bypasses this issue, but then you lose the ability to retain any metadata that was associated with that coordinate.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,219321876
https://github.com/pydata/xarray/issues/1352#issuecomment-292926691,https://api.github.com/repos/pydata/xarray/issues/1352,292926691,MDEyOklzc3VlQ29tbWVudDI5MjkyNjY5MQ==,4992424,2017-04-10T11:48:37Z,2017-04-10T11:48:37Z,NONE,"@andreas-h you can drop the 0D dimensions:

``` python
d_ = d_.drop(['category', 'species'])
d_.to_netcdf(...)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,219321876
https://github.com/pydata/xarray/pull/1356#issuecomment-292569100,https://api.github.com/repos/pydata/xarray/issues/1356,292569100,MDEyOklzc3VlQ29tbWVudDI5MjU2OTEwMA==,4992424,2017-04-07T15:30:43Z,2017-04-07T15:30:43Z,NONE,"@shoyer I corrected things based on your comments. The last commit is an attempt to refactor things to match the way that methods like rolling/groupby functions are injected into the class; this might be totally superfluous here, but I thought it was worth trying.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,220011864
https://github.com/pydata/xarray/issues/358#issuecomment-291568964,https://api.github.com/repos/pydata/xarray/issues/358,291568964,MDEyOklzc3VlQ29tbWVudDI5MTU2ODk2NA==,4992424,2017-04-04T17:14:18Z,2017-04-04T17:14:18Z,NONE,"[Proof of concept, borrowing liberally from pandas](https://gist.github.com/anonymous/6254b1f682e05d10aaf51d93ed343534). I think this will be pretty straightforward to hook up into xarray. I wonder, is there any way to register such an accessor with `DataArray`s  that have a specific dtype? Ideally we'd only want to expose this accessor if a DataArray was a numpy.datetime64 type under the hood.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,59720901
https://github.com/pydata/xarray/issues/358#issuecomment-291228898,https://api.github.com/repos/pydata/xarray/issues/358,291228898,MDEyOklzc3VlQ29tbWVudDI5MTIyODg5OA==,4992424,2017-04-03T18:20:32Z,2017-04-03T18:20:32Z,NONE,"Working on a project today which would greatly benefit from having the .dt accessors. Given that this issue is nearly two years old, any thoughts on what it would take to resolve in the present codebase? Still as straightforward as wrappers on the pandas time series methods?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,59720901
https://github.com/pydata/xarray/issues/1092#issuecomment-290133148,https://api.github.com/repos/pydata/xarray/issues/1092,290133148,MDEyOklzc3VlQ29tbWVudDI5MDEzMzE0OA==,4992424,2017-03-29T15:47:57Z,2017-03-29T15:48:17Z,NONE,"Ah, thanks for the heads-up @benbovy! I see the difference now, and I agree
both approaches could co-exist. I may play around with building some of
your proposed `DatasetNode` functionality into my `Experiment` tool.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187859705
https://github.com/pydata/xarray/issues/1092#issuecomment-290106782,https://api.github.com/repos/pydata/xarray/issues/1092,290106782,MDEyOklzc3VlQ29tbWVudDI5MDEwNjc4Mg==,4992424,2017-03-29T14:26:15Z,2017-03-29T14:26:15Z,NONE,"Would the domain for this just be to simulate the tree-like structure that NetCDF permits, or could it extend to multiple datasets on disk? One of the ideas that we had [during the aospy hackathon](https://aospy.hackpad.com/Data-StorageDiscovery-Design-Document-fM6LgfwrJ2K) involved some sort of idiom based on xarray for packing multiple, similar datasets together. For instance, it's very common in climate science to re-run a model multiple times nearly identically, but changing a parameter or boundary condition. So you end up with large archives of data on disk which are identical in shape and metadata, and you want to be able to quickly analyze across them.

As an example, I built [a helper tool](https://github.com/darothen/experiment/blob/master/experiment/experiment.py) during my dissertation to automate much of this, allowing you to dump your processed output in some sort of directory structure and consistent naming scheme, and then easily ingest what you need for a given analysis. It's actually working great for a much larger, Monte Carlo set of model simulations right now (3 factor levels with 3-5 values at each level, for a total of 1500 years of simulation). My tool works by concatenating each experimental factor as a new dimension, which lets you use xarray's selection tools to perform analyses across the ensemble. You can pre-process things before concatenating too, if the data ends up being too big to fit in memory (e.g. for every simulation in the experiment, compute time-zonal averages before concatenation). 

Going back to @shoyer's [comment](https://github.com/pydata/xarray/issues/1092#issuecomment-259206339), it still seems as though there is room to build some sort of collection of `Dataset`s, in the same way that a `Dataset` is a collection of `DataArray`s. Maybe this is different than @lamorton's grouping example, but it would be really, really cool if you could use the same sort of syntactic sugar to select across multiple `Dataset`s with like-dimensions just as you could slice into groups inside a `Dataset` as proposed here. It would certainly make things much more manageable than concatenating huge combinations of `Dataset`s in memory!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187859705
https://github.com/pydata/xarray/issues/1327#issuecomment-289106737,https://api.github.com/repos/pydata/xarray/issues/1327,289106737,MDEyOklzc3VlQ29tbWVudDI4OTEwNjczNw==,4992424,2017-03-24T18:25:40Z,2017-03-24T18:25:40Z,NONE,"I saw your PR #1328 on this, but just a heads up that there is an open issue #1269 and pull-request #1272 to re-factor the resampling API to match the GroupBy-like API used by pandas. `count()` works without any issues on my feature branch.

I've been extremely busy but can try to carve out some more time in the near future to settle some remaining issues on that PR, which would resolve this issue too.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216833414
https://github.com/pydata/xarray/pull/1272#issuecomment-283448614,https://api.github.com/repos/pydata/xarray/issues/1272,283448614,MDEyOklzc3VlQ29tbWVudDI4MzQ0ODYxNA==,4992424,2017-03-01T19:46:46Z,2017-03-01T19:46:46Z,NONE,"Should `.apply()` really work on non-aggregation functions? Based on the [pandas documentation](http://pandas.pydata.org/pandas-docs/stable/timeseries.html#resampling) it seems like ""resample"" is truly just a synonym for a transformation of the time dimension. I can't really find many examples of people using this as a substitute for time group-bys... it seems that's what the `pd.TimeGrouper` is for, in conjunction with a normal `.groupby()`.

As written, non-aggregation (""transformation""?) doesn't work because the call in `_combine()` to `_maybe_reorder()` messes things up (it drops all of the data along the resampled dimension). It shouldn't be too hard to fix this, although I'm leaning more and more to making stand-alone `Data{Array,set}Resample` classes in a separate file which only loosely inherit from their `Data{Array,set}GroupBy` cousins, since they need to re-write some really critical parts of the underlying machinery.  ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/pull/1272#issuecomment-281208031,https://api.github.com/repos/pydata/xarray/issues/1272,281208031,MDEyOklzc3VlQ29tbWVudDI4MTIwODAzMQ==,4992424,2017-02-20T23:51:01Z,2017-02-20T23:51:01Z,NONE,"Thanks for the feedback, @shoyer! Will circle back around to continue
working on this in a few days when I have some free time.

- Daniel
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/pull/1272#issuecomment-281186680,https://api.github.com/repos/pydata/xarray/issues/1272,281186680,MDEyOklzc3VlQ29tbWVudDI4MTE4NjY4MA==,4992424,2017-02-20T21:36:06Z,2017-02-20T21:36:06Z,NONE,"Smoothed out most of the problems from earlier and missing details. Still not sure if it's wise to refactor most of the resampling logic into a new **resample.py**, like what was done with **rolling**, but it still makes some sense to keep things in **groupby.py** because we're just subclassing existing machinery from there. 

The only issue now is the signature for **__init__()** in `Data{set,Array}Resample`, where we have to add in two keyword arguments. Python 2.x doesn't like named arguments after *\*args*. There are a few options here, mostly just playing with *\*\*kwargs* as in [this StackOverflow thread](http://stackoverflow.com/questions/15301999/python-2-x-default-arguments-with-args-and-kwargs). ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208215185
https://github.com/pydata/xarray/issues/1273#issuecomment-280663975,https://api.github.com/repos/pydata/xarray/issues/1273,280663975,MDEyOklzc3VlQ29tbWVudDI4MDY2Mzk3NQ==,4992424,2017-02-17T14:28:21Z,2017-02-17T14:28:21Z,NONE,+1 from me; adding this as a method on `Dataset` and `DataArray` would be great. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,208312826
https://github.com/pydata/xarray/issues/1269#issuecomment-280104546,https://api.github.com/repos/pydata/xarray/issues/1269,280104546,MDEyOklzc3VlQ29tbWVudDI4MDEwNDU0Ng==,4992424,2017-02-15T18:59:17Z,2017-02-15T18:59:17Z,NONE,"@MaximilianR Oh, the interface is easy enough to do, even maintaining backwards-compatibility (already have that working). I was considering going the route done with [GroupBy](https://github.com/pydata/xarray/blob/93d6963315026f87841c7cf39cc39bb78f555345/xarray/core/groupby.py#L165) and the classes that compose it, like [DatasetGroupBy](https://github.com/pydata/xarray/blob/93d6963315026f87841c7cf39cc39bb78f555345/xarray/core/groupby.py#L586)... basically, we just record the wanted resampling dimension and inject the grouping/resampling operations we want. Also adds the ability to specialize methods like `.first()` and `.last()`, which is done under the current implementation.

*But*.... if there's a simpler way, that might be preferable!","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,207587161
https://github.com/pydata/xarray/issues/1269#issuecomment-279845588,https://api.github.com/repos/pydata/xarray/issues/1269,279845588,MDEyOklzc3VlQ29tbWVudDI3OTg0NTU4OA==,4992424,2017-02-14T21:44:11Z,2017-02-14T21:44:11Z,NONE,"Assuming we want to stick with `pd.TimeGrouper` under the hood, the only sticking point I've come across so far is how to have the resulting `Data{Array,set}GroupBy` object ""remember"" the resampling dimension, e.g. if you have multi-dimensional data and want to compute time means you have to call

``` python
ds.resample(time='24H').mean('time')
```

or else `mean` will operate across all dimensions. Any thoughts, @shoyer?

","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,207587161
https://github.com/pydata/xarray/issues/1269#issuecomment-279810604,https://api.github.com/repos/pydata/xarray/issues/1269,279810604,MDEyOklzc3VlQ29tbWVudDI3OTgxMDYwNA==,4992424,2017-02-14T19:32:01Z,2017-02-14T19:32:01Z,NONE,Let me dig into this a bit right now. My analysis project for this afternoon was already going to require digging into pandas' resampling in more depth anyways.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,207587161
https://github.com/pydata/xarray/issues/988#issuecomment-243124532,https://api.github.com/repos/pydata/xarray/issues/988,243124532,MDEyOklzc3VlQ29tbWVudDI0MzEyNDUzMg==,4992424,2016-08-29T13:32:11Z,2016-08-29T13:32:11Z,NONE,"I definitely see the logic with regards to encouraging users to use a context manager, and from the perspective of someone building a third-party library on top of xarray it would be fine. However, I think that from the perspective of an end-user (for example, a scientist) crunching numbers and analyzing data with xarray simply as a convenience library, this produces much too obfuscated code - a standard library import (`contextlib`, which isn't something many scientific coders would regularly use or necessarily know about) and a lot of boiler-plate ""enabling"" the extra features they want in their calculation.

I think your earlier proposal of an `xarray.set_options` is a cleaner and simpler way forward, even if it does have thorns. Do you have any estimate of the performance penalty checking hooks on all xarray objects would incur? 
","{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,173612265
https://github.com/pydata/xarray/issues/987#issuecomment-242912131,https://api.github.com/repos/pydata/xarray/issues/987,242912131,MDEyOklzc3VlQ29tbWVudDI0MjkxMjEzMQ==,4992424,2016-08-27T11:34:28Z,2016-08-27T11:34:28Z,NONE,"@joonro, I think there's a strong case to be made about returning a `DataArray` with some metadata appended. Referring to the latest [draft of the CF Metadata Conventions](http://cfconventions.org/cf-conventions/cf-conventions.html), there is a clear way to indicate when operations such as `mean`, `max`, or `min` have been applied to a variable by using the [**cell_methods**](http://cfconventions.org/cf-conventions/cf-conventions.html#cell-methods) attribute. 

It might be more prudent to add this attribute whenever we apply these operations to a `DataArray` (or perhaps variable-wise when applied to a `Dataset`). That way, there is a clear reason to not return a scalar - the documentation of what operations were applied to produce that final result.

I can whip up a working example/pull request if people think this is a direction to go. I'd probably build a decorator which handles inspection of the operator name and arguments and uses that to add the **cell_methods** attribute, that way people can add the same functionality to homegrown methods/operators.
","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,173494017
https://github.com/pydata/xarray/issues/463#issuecomment-224049602,https://api.github.com/repos/pydata/xarray/issues/463,224049602,MDEyOklzc3VlQ29tbWVudDIyNDA0OTYwMg==,4992424,2016-06-06T18:42:06Z,2016-06-06T18:42:06Z,NONE,"@mangecoeur, although it's not an xarray-based solution, I've found that by far the best solution to this problem is to transform your dataset from the ""timeslice"" format (which is convenient for models to write out - all the data at a given point in time, often in separate files for each time step) to ""timeseries"" format - a continuous format, where you have all the data for a single variable in a single (or much smaller collection of) files. 

NCAR published a great utility for converting batches of NetCDF output from timeslice to timeseries format [here](https://github.com/NCAR/PyReshaper); it's significantly faster than any shell-script/CDO/NCO solution I've ever encountered, and it parallelizes extremely easily.

Adding a simple post-processing step to convert my simulation output to timeseries format dramatically reduced my overall work time. Before, I had a separate handler which re-implemented open_mfdataset(), performed an intermediate reduction (usually extracting a variable), and then concatenated within xarray. This could get around the open file limit, but it wasn't fast. My pre-processed data is often still big - barely fitting within memory - but it's far easier to handle, and you can throw dask at it no problem to get huge speedups in analysis.
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,94328498
https://github.com/pydata/xarray/issues/851#issuecomment-220334426,https://api.github.com/repos/pydata/xarray/issues/851,220334426,MDEyOklzc3VlQ29tbWVudDIyMDMzNDQyNg==,4992424,2016-05-19T14:05:34Z,2016-05-19T14:05:34Z,NONE,"@byersiiasa, what happens if you just concatenate them using the NCO command `ncrcat`? 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,155741762
https://github.com/pydata/xarray/issues/784#issuecomment-192357422,https://api.github.com/repos/pydata/xarray/issues/784,192357422,MDEyOklzc3VlQ29tbWVudDE5MjM1NzQyMg==,4992424,2016-03-04T16:58:59Z,2016-03-04T16:58:59Z,NONE,"The `reindex_like()` approach works super well in my case. Since only my latitudes are screwed up (and they're spaced by a tad more than a degree), a low tolerance 1e-2-1e-3 worked perfectly. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,138443211
https://github.com/pydata/xarray/issues/784#issuecomment-192332830,https://api.github.com/repos/pydata/xarray/issues/784,192332830,MDEyOklzc3VlQ29tbWVudDE5MjMzMjgzMA==,4992424,2016-03-04T15:56:58Z,2016-03-04T15:56:58Z,NONE,"Hi @mathause, I actually just ran into a very similar problem to your second bullet point. I had some limited success by manually re-building the re-gridded dataset onto the CESM coordinate system, swapping out the not-exactly-but-actually-close-enough coordinates for the CESM reference data's coordinates. In my case, I was re-gridding with CDO, but even when I explicitly pull out the CESM grid definition it wouldn't match precisely.

Since there was a lot of boilerplate code to do this in xarray (although I had a lot of success defining a callback to pass in with open_dataset), it was far easier just to use NCO to copy the correct coordinate variables into the re-gridded data.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,138443211
https://github.com/pydata/xarray/issues/768#issuecomment-187245860,https://api.github.com/repos/pydata/xarray/issues/768,187245860,MDEyOklzc3VlQ29tbWVudDE4NzI0NTg2MA==,4992424,2016-02-22T16:04:39Z,2016-02-22T16:04:39Z,NONE,"Hi @jonathanstrong,

Just thought it would be useful to point out that the people who maintain NetCDF is [Unidata](http://www.unidata.ucar.edu/), a branch of the University Corporation for Atmospheric Research. In fact, netCDF-4 is essentially built on top of HDF5 - a much more widely-known file format, with [first-class support](http://www.pytables.org/) including an I/O layer in pandas. While it would certainly be great to ""sell"" netCDF as a format in the documentation, those of us who still have to write netCDF-based I/O modules for our Fortran models might have to throw up a little in our mouths when we do so...
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,134376872
https://github.com/pydata/xarray/issues/704#issuecomment-169057010,https://api.github.com/repos/pydata/xarray/issues/704,169057010,MDEyOklzc3VlQ29tbWVudDE2OTA1NzAxMA==,4992424,2016-01-05T16:44:41Z,2016-01-05T16:44:41Z,NONE,"I also like `import xarray as xr`.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,124867009
https://github.com/pydata/xarray/issues/624#issuecomment-148376642,https://api.github.com/repos/pydata/xarray/issues/624,148376642,MDEyOklzc3VlQ29tbWVudDE0ODM3NjY0Mg==,4992424,2015-10-15T12:57:04Z,2015-10-15T12:57:04Z,NONE,"Is there another easy way to add a constant offset to all the values of a dimension (e.g. add, say, 10 meters to every value in the dimension)? I don't typically use operations like that, but I can see where they might be useful.

If not, then rolling in integer space is the way to go.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,111471076
https://github.com/pydata/xarray/issues/624#issuecomment-148206569,https://api.github.com/repos/pydata/xarray/issues/624,148206569,MDEyOklzc3VlQ29tbWVudDE0ODIwNjU2OQ==,4992424,2015-10-14T21:24:35Z,2015-10-14T21:24:35Z,NONE,"Using an API like `ds.roll(time=100)` would be more consistent with other aggregation/manipulation routines, and there's nothing in @rabernat 's code that forbids that call signature.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,111471076
https://github.com/pydata/xarray/issues/531#issuecomment-131214583,https://api.github.com/repos/pydata/xarray/issues/531,131214583,MDEyOklzc3VlQ29tbWVudDEzMTIxNDU4Mw==,4992424,2015-08-14T19:26:18Z,2015-08-14T19:26:18Z,NONE,"Hi @jsbj,

The fancy indexing notation you're trying to use only works when xray successfully decodes the time dimension. As discussed in the documentation [here](http://xray.readthedocs.org/en/stable/time-series.html#creating-datetime64-data), this only works when the year of record falls between 1678 and 2262. Since you have years 2262-2300 in your dataset, this is a feature - xray is failing gracefully.

There are a few current open discussions on this behavior, which is an issue higher up the python chain with numpy:
1. [time decoding error with ""days since""](https://github.com/xray/xray/issues/521)
2. [Fix datetime decoding when time units are 'days since 0000-01-01 00:00:00'](https://github.com/xray/xray/pull/522)
3. [ocefpaf - Loading non-standard dates with cf_units](https://ocefpaf.github.io/python4oceanographers/blog/2015/08/10/cf_units_and_time/)
4. [numpy - Non-standard Calendar Support](https://github.com/numpy/numpy/issues/6207)

For now, a very simple hack would be to re-compute your time units so that they're re-based, say, with units 'days since 1700-01-01 00:00:00'. That way all of them would fit within the permissible range to use the decoding routine built into xray. You could simply pass the **decode_cf=False** flag when you open the dataset, modify the non-decoded time array and units, then run **xray.decode_cf()** on the modified dataset.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,100980878