id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 39264845,MDU6SXNzdWUzOTI2NDg0NQ==,197,We need some way to identify non-index coordinates,1217238,closed,0,,740776,3,2014-08-01T06:36:13Z,2014-12-19T07:16:14Z,2014-09-10T06:07:15Z,MEMBER,,,,"I am currently working with station data. In order to keep around latitude and longitude (I use station_id as the coordinate variable), I need to resort to some ridiculous contortions: ``` python residuals = results['y'] - observations['y'] residuals.dataset.update(results.select_vars('longitude', 'latitude')) ``` There has got to be an easier way to handle this. --- I don't want to revert to some primitive guessing strategy (e.g., looking at `attrs['coordinates']`) to figure out which extra variables can be safely kept after mathematical operations. Another approach would be to try to preserve _everything_ in the dataset linked to a DataArray when doing math. But I don't really like this option, either, because it would lead to serious propagation of ""linked dataset variables"", which are rather surprising and can have unexpected performance consequences (though at least they appear in repr as of #128). --- This leads me to a final alternative: restructuring xray's internals to provide first-class support for coordinates that are not indexes. For example, this would mean promoting `ds.coordinates` to an actual dictionary stored on a dataset, and allowing it to hold objects that aren't an `xray.Coordinate`. Making this change transparent to users would likely require changing the `Dataset` signature to something like `Dataset(variables, coords, attrs)`. We might (yet again) want to rename `Coordinate`, to something like `IndexVar`, to emphasize the notion of ""index"" and ""non-index"" coordinates.
And we could get rid of the terrible ""linked dataset variable"". Once we have non-index coordinates, we need a policy for what to do when adding two DataArrays for which they differ. I think my preferred approach is to not enforce that they be found on both arrays, but to raise an exception if there are any conflicting values -- unless they are scalar valued, in which case they are dropped, turned into a tuple, or given different names. (Otherwise there would be cases where you couldn't calculate `x[1] - x[0]`.) We might even be able to keep around multi-dimensional coordinates this way (e.g., 2D lat/lon arrays for projected data).... I'll need to think about that one some more. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/197/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 43350752,MDExOlB1bGxSZXF1ZXN0MjE1NTQ5Mjk=,235,"Better formatting for coordinates, getting rid of ""index coordinates"" (and assorted doc improvements)",1217238,closed,0,,740776,0,2014-09-22T00:56:12Z,2014-09-22T02:17:33Z,2014-09-22T02:17:31Z,MEMBER,,0,pydata/xarray/pulls/235,"Fixes #234 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/235/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 43098072,MDU6SXNzdWU0MzA5ODA3Mg==,234,"remove the notion of ""Index coordinates"", especially from the Dataset repr?",1217238,closed,0,,740776,1,2014-09-18T06:16:38Z,2014-09-22T02:17:31Z,2014-09-22T02:17:31Z,MEMBER,,,,"@perrette mentioned that he found the distinction between ""index"" and ""other"" coordinates in the dev version of xray confusing (see the [dev build of the docs](http://xray.readthedocs.org/en/latest/data-structures.html#coordinates)). I agree -- the differences are subtle, and difficult to convey.
On the whole, they're mostly both just ""Coordinates"", although coordinates with the same name as a dimension are special because they're also used like indexes. So I would like to visit `repr(Dataset)` to make this less confusing. Here are 5 options: 1. Current implementation (on master): ``` Dimensions: (time: 3, x: 2, y: 2) Index Coordinates: time (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08 x (x) int64 0 1 y (y) int64 0 1 Other Coordinates: lat (x, y) float64 42.25 42.21 42.63 42.59 lon (x, y) float64 -99.83 -99.32 -99.79 -99.23 reference_time datetime64[ns] 2014-09-05 Variables: temp (x, y, time) float64 11.04 23.57 20.77 9.346 6.683 17.17 11.6 19.54 ... precip (x, y, time) float64 5.904 2.453 3.404 9.847 9.195 0.3777 8.615 7.536 ... ``` 2. Switch ""Index Coordinates"" to ""Coordinates/Indexes"" (to emphasize ""Coordinates"") ``` Dimensions: (time: 3, x: 2, y: 2) Coordinates/Indexes: time (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08 x (x) int64 0 1 y (y) int64 0 1 Coordinates/Other: lat (x, y) float64 42.25 42.21 42.63 42.59 lon (x, y) float64 -99.83 -99.32 -99.79 -99.23 reference_time datetime64[ns] 2014-09-05 Variables: temp (x, y, time) float64 11.04 23.57 20.77 9.346 6.683 17.17 11.6 19.54 ... precip (x, y, time) float64 5.904 2.453 3.404 9.847 9.195 0.3777 8.615 7.536 ... ``` 3. Rename ""Other Coordinates"" to ""Non-index Coordinates"": ``` Dimensions: (time: 3, x: 2, y: 2) Index Coordinates: time (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08 x (x) int64 0 1 y (y) int64 0 1 Non-index Coordinates: lat (x, y) float64 42.25 42.21 42.63 42.59 lon (x, y) float64 -99.83 -99.32 -99.79 -99.23 reference_time datetime64[ns] 2014-09-05 Variables: temp (x, y, time) float64 11.04 23.57 20.77 9.346 6.683 17.17 11.6 19.54 ... precip (x, y, time) float64 5.904 2.453 3.404 9.847 9.195 0.3777 8.615 7.536 ... ``` 4. 
Consolidate ""Index"" and ""Other"" coordinates (the info about indexing is implicit in the dimension names): ``` Dimensions: (time: 3, x: 2, y: 2) Coordinates: time (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08 x (x) int64 0 1 y (y) int64 0 1 lat (x, y) float64 42.25 42.21 42.63 42.59 lon (x, y) float64 -99.83 -99.32 -99.79 -99.23 reference_time datetime64[ns] 2014-09-05 Variables: temp (x, y, time) float64 11.04 23.57 20.77 9.346 6.683 17.17 11.6 19.54 ... precip (x, y, time) float64 5.904 2.453 3.404 9.847 9.195 0.3777 8.615 7.536 ... ``` 5. Consolidate coordinates, but mark indexes with `*` (indexes could still be all grouped at the top, but wouldn't need to be): ``` Dimensions: (time: 3, x: 2, y: 2) Coordinates: * time (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08 * x (x) int64 0 1 * y (y) int64 0 1 lat (x, y) float64 42.25 42.21 42.63 42.59 lon (x, y) float64 -99.83 -99.32 -99.79 -99.23 reference_time datetime64[ns] 2014-09-05 Variables: temp (x, y, time) float64 11.04 23.57 20.77 9.346 6.683 17.17 11.6 19.54 ... precip (x, y, time) float64 5.904 2.453 3.404 9.847 9.195 0.3777 8.615 7.536 ... ``` I am leaning towards option (5). It introduces less terminology and is easier to scan / count at a glance than separate categories of coordinates. The asterisk is still there as a reminder that these coordinates are special, and the distinctions will be highlighted under ""Coordinates"" in the docs for anyone who wants more details. @ToddSmall @akleeman @jhamman Any opinions? (by the way, it's worth checking out @perrette's [dimarray](https://github.com/perrette/dimarray) project... 
lots of nice ideas and overlap with xray) ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/234/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 40225000,MDU6SXNzdWU0MDIyNTAwMA==,212,"Get ride of ""noncoordinates"" as a name?",1217238,closed,0,,740776,8,2014-08-14T05:52:30Z,2014-09-22T00:55:22Z,2014-09-22T00:55:22Z,MEMBER,,,,"As @ToddSmall has pointed out (in #202), ""noncoordinates"" is a confusing name -- it's something defined by what it isn't, not what it is. Unfortunately, our best alternative is ""variables"", which already has a lot of meaning from the netCDF world (and which we already use). Related: #211 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/212/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 42973685,MDExOlB1bGxSZXF1ZXN0MjEzNDkxOTE=,233,Revised documentation in preparation for v0.3,1217238,closed,0,,740776,0,2014-09-17T07:32:00Z,2014-09-17T07:46:09Z,2014-09-17T07:46:05Z,MEMBER,,0,pydata/xarray/pulls/233,"The ""tutorial"" has been split out into a number of separate chapters. This should significantly enhance readability and findability. You can preview these docs at http://xray.readthedocs.org/en/docs-v0.3/ (but I also intend to merge this into master shortly) ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/233/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 42380133,MDExOlB1bGxSZXF1ZXN0MjEwMDA1NTc=,229,Support math with GroupBy objects,1217238,closed,0,,740776,0,2014-09-10T05:50:08Z,2014-09-12T01:18:05Z,2014-09-12T01:18:04Z,MEMBER,,0,pydata/xarray/pulls/229,"Fixes #203. 
You can now calculate anomalies with something like: ``` python grouped = ds.groupby('time.month') anom = grouped - grouped.mean('time') ``` Still needs documentation (that will go in my current major refactor of the docs for v0.3). ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/229/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 39385095,MDU6SXNzdWUzOTM4NTA5NQ==,203,"Support mathematical operators (+-*/, etc) for GroupBy objects",1217238,closed,0,,740776,0,2014-08-04T01:40:11Z,2014-09-12T01:18:04Z,2014-09-12T01:18:04Z,MEMBER,,,,"Building on #200, we could add support for mathematical operations to GroupBy objects. Math with groupby objects should automatically ""broadcast"" across group labels, so we can write something like: ``` climatology = ds.groupby('time.month').mean('time') anomalies = ds.groupby('time.month') - climatology ``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/203/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 42170560,MDExOlB1bGxSZXF1ZXN0MjA4NzM3NTA=,228,BUG: Fix datetime components on DataArrays,1217238,closed,0,,740776,0,2014-09-08T07:24:28Z,2014-09-08T08:46:14Z,2014-09-08T08:46:11Z,MEMBER,,0,pydata/xarray/pulls/228,"We didn't have any tests, so they were broken. 
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/228/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 36625519,MDU6SXNzdWUzNjYyNTUxOQ==,176,Proposal: Dataset transpose or order_dims method,1217238,closed,0,,740776,0,2014-06-26T23:49:43Z,2014-09-07T04:18:06Z,2014-09-07T04:18:06Z,MEMBER,,,,"It should transpose all variables so that they have dimensions in the same given order, ignoring any dimensions that are not used by variable. E.g., `ds.transpose('x', 'y', 'z')` should give me a dataset with all data in the same order. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/176/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 42112479,MDExOlB1bGxSZXF1ZXN0MjA4NDcwNDE=,227,ndarray methods and arithmetic operators for xray.Dataset,1217238,closed,0,,740776,0,2014-09-06T09:02:23Z,2014-09-07T04:18:05Z,2014-09-07T04:18:05Z,MEMBER,,0,pydata/xarray/pulls/227,"Fixes #200. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/227/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 42018203,MDExOlB1bGxSZXF1ZXN0MjA3OTAyNzA=,225,Cleanup storage of Dataset internal state,1217238,closed,0,,740776,0,2014-09-05T06:30:57Z,2014-09-05T06:46:19Z,2014-09-05T06:46:16Z,MEMBER,,0,pydata/xarray/pulls/225,"`_variables` and `_dims` are now stored as OrderedDict and dict, not my funny dict subclasses VariablesDict (which I'm pleased to say is gone) and SortedKeysDict (which is now created on demand when accessing the `dims` property). This speeds things up a bit and makes the internal state more obvious. 
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/225/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 42004314,MDExOlB1bGxSZXF1ZXN0MjA3ODM0OTg=,224,"progress towards removing ""non-coordinates"" as a concept",1217238,closed,0,,740776,0,2014-09-05T01:28:41Z,2014-09-05T03:16:55Z,2014-09-05T03:16:53Z,MEMBER,,0,pydata/xarray/pulls/224,"Fixes related to #211 and #212. I haven't renamed `Variable` or `Dataset.variables` yet, though, pending the resolution of that discussion. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/224/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 39919261,MDU6SXNzdWUzOTkxOTI2MQ==,211,Should iterating over a Dataset include coordinates?,1217238,closed,0,,740776,0,2014-08-10T23:13:47Z,2014-09-05T03:16:53Z,2014-09-05T03:16:53Z,MEMBER,,,,"My inclination is **no**: the contents of a Dataset (e.g., `list(ds)`, `ds.keys()` and `ds.values()`) should only include non-coordinates. `__contains__` checks for a coordinate (e.g., `'time'`) would need to look in `ds.dimensions` or `ds.coordinates` instead of `ds`, but I see no need to `__getitem__`: `ds['time']` can still work. Pluses: 1. This change would more closely align `xray.Dataset` with `pandas.DataFrame`, which also does not include any elements of the index in the contents of the frame. 2. It would eliminate the need for using `ds.noncoordinates` -- which, as @ToddSmall has pointed out, is not very intuitive. 3. In my experience, I have been using `ds.noncoordinates.items()` more often than `ds.items()` (which contains redundant information, as coordinates are repeated). The only time I really want to iterate over all variables in a dataset is when I'm using the lower level `Variable` API. Negatives: 1. This would break the existing API. 
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/211/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 41899496,MDExOlB1bGxSZXF1ZXN0MjA3MjQ0NDk=,223,Miscellaneous fixes,1217238,closed,0,,740776,0,2014-09-04T06:28:28Z,2014-09-04T06:33:31Z,2014-09-04T06:33:29Z,MEMBER,,0,pydata/xarray/pulls/223,,"{""url"": ""https://api.github.com/repos/pydata/xarray/issues/223/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 39875825,MDU6SXNzdWUzOTg3NTgyNQ==,208,Don't require variable dimensions in Dataset.__init__ for scalar or 1d arrays,1217238,closed,0,,740776,1,2014-08-09T01:55:45Z,2014-09-03T18:17:12Z,2014-09-03T18:17:12Z,MEMBER,,,,"The coerce to variable logic should only be performed if the argument is a tuple. For scalars, there is no ambiguity since their dimensions are empty. For 1-d arrays, we should default to creating a new coordinate variable. e.g., I should be able to write `xray.Dataset({'x': np.arange(10), 'y': 0})` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/208/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 41672867,MDExOlB1bGxSZXF1ZXN0MjA1OTA4ODc=,221,Nonindex coords,1217238,closed,0,,740776,0,2014-09-02T03:36:10Z,2014-09-03T18:17:12Z,2014-09-03T05:24:24Z,MEMBER,,0,pydata/xarray/pulls/221,"a number of changes related to #197. 
still needs doc updates and full tests for coordinates (especially merging coordinates) ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/221/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 40760695,MDU6SXNzdWU0MDc2MDY5NQ==,218,Support apply with DatasetGroupby returning a DataArray (and vice-versa),1217238,closed,0,,740776,0,2014-08-21T00:28:57Z,2014-09-03T05:24:26Z,2014-09-03T05:24:26Z,MEMBER,,,,"e.g., I should be able to write: ``` dataset.groupby('state').apply(lambda ds: (ds['tmin'] > ds['tmax']).mean('station')) ``` This will be very simple once we write a generic `xray.concat` function which can handle either type of argument. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/218/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 40773097,MDExOlB1bGxSZXF1ZXN0MjAwOTAwNTc=,219,Fix concat str truncation,1217238,closed,0,,740776,0,2014-08-21T05:13:12Z,2014-08-21T05:17:30Z,2014-08-21T05:17:28Z,MEMBER,,0,pydata/xarray/pulls/219,"Fixes #217. I also took the opportunity to add two small optimizations, which add up to make `Variable.concat` about 35% faster. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/219/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 40536963,MDU6SXNzdWU0MDUzNjk2Mw==,217,Strings are truncated when concatenating Datasets.,2002703,closed,0,,740776,0,2014-08-18T21:58:36Z,2014-08-21T05:17:28Z,2014-08-21T05:17:28Z,CONTRIBUTOR,,,,"When concatenating Datasets, a variable's string length is limited to the length in the first of the Datasets being concatenated.
``` >>> import xray >>> first = xray.Dataset({'animal': ('animal', ['horse'])}) >>> second = xray.Dataset( {'animal': ('animal', ['aardvark_0'])}) >>> xray.Dataset.concat([first, second], dimension='animal')['animal'] array(['horse', 'aardv'], dtype='|S5') Coordinates: animal: Index([u'horse', u'aardv'], dtype='object') Attributes: Empty ``` (Note the `|S5` dtype and the truncated `aardv`) I think this is the offending line: https://github.com/xray/xray/blob/master/xray/core/variable.py#L623 May want to use `dtype=object` for strings to avoid this issue. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/217/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 40424011,MDExOlB1bGxSZXF1ZXN0MTk4ODUxMDE=,216,Internal code reorganization; creates xray.core module,1217238,closed,0,,740776,0,2014-08-17T00:46:02Z,2014-08-17T00:52:41Z,2014-08-17T00:52:38Z,MEMBER,,0,pydata/xarray/pulls/216,"Fixes #114 (close enough) ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/216/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 32926274,MDU6SXNzdWUzMjkyNjI3NA==,114,Fix circular imports,1217238,closed,0,,740776,0,2014-05-06T19:47:12Z,2014-08-17T00:52:38Z,2014-08-17T00:52:38Z,MEMBER,,,,"Thanks @takluyver for pointing this out in #113. We really should have resolved this some time ago. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/114/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue