
issues


10 rows where milestone = 740776 and type = "issue" sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
39264845 MDU6SXNzdWUzOTI2NDg0NQ== 197 We need some way to identify non-index coordinates shoyer 1217238 closed 0   0.3 740776 3 2014-08-01T06:36:13Z 2014-12-19T07:16:14Z 2014-09-10T06:07:15Z MEMBER      

I am currently working with station data. In order to keep around latitude and longitude (I use station_id as the coordinate variable), I need to resort to some ridiculous contortions:

    residuals = results['y'] - observations['y']
    residuals.dataset.update(results.select_vars('longitude', 'latitude'))

There has got to be an easier way to handle this.


I don't want to revert to some primitive guessing strategy (e.g, looking at attrs['coordinates']) to figure out which extra variables can be safely kept after mathematical operations.

Another approach would be to try to preserve everything in the dataset linked to a DataArray when doing math. But I don't really like this option, either, because it would lead to serious propagation of "linked dataset variables", which are rather surprising and can have unexpected performance consequences (though at least they appear in repr as of #128).


This leads me to a final alternative: restructuring xray's internals to provide first-class support for coordinates that are not indexes. For example, this would mean promoting ds.coordinates to an actual dictionary stored on a dataset, and allowing it to hold objects that aren't an xray.Coordinate.

Making this change transparent to users would likely require changing the Dataset signature to something like Dataset(variables, coords, attrs). We might (yet again) want to rename Coordinate, to something like IndexVar, to emphasize the notion of "index" and "non-index" coordinates. And we could get rid of the terrible "linked dataset variable".

Once we have non-index coordinates, we need a policy for what to do when adding two DataArrays whose non-index coordinates differ. I think my preferred approach is not to require that they be found on both arrays, but to raise an exception if there are any conflicting values, unless they are scalar valued, in which case they could be dropped, turned into a tuple, or given different names. (Otherwise there would be cases where you couldn't calculate x[1] - x[0].)

We might even be able to keep around multi-dimensional coordinates this way (e.g., 2D lat/lon arrays for projected data)... I'll need to think about that one some more.
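The conflict policy described above can be sketched in plain Python. This is a toy model under stated assumptions, not xray's implementation: coordinates are a dict from name to value, a list stands in for an array-valued coordinate, anything else counts as a scalar, and `merge_coords` is a hypothetical helper name. Of the three scalar options mentioned, it picks "drop".

```python
def merge_coords(left, right):
    """Combine non-index coordinates from the two operands of a binary op.

    Toy policy: keep a coordinate found on only one side, keep it when both
    sides agree, drop it when scalar values conflict, and raise otherwise.
    """
    merged = {}
    for name in sorted(set(left) | set(right)):
        if name not in right:
            merged[name] = left[name]
        elif name not in left:
            merged[name] = right[name]
        elif left[name] == right[name]:
            merged[name] = left[name]
        elif not isinstance(left[name], list) and not isinstance(right[name], list):
            # Conflicting scalars are dropped so that x[1] - x[0] still works.
            continue
        else:
            raise ValueError("conflicting values for coordinate %r" % name)
    return merged


a = {"station_id": ["a", "b"], "height": 10.0}
b = {"station_id": ["a", "b"], "height": 20.0, "latitude": [1.0, 2.0]}
merged = merge_coords(a, b)  # 'height' conflicts as a scalar and is dropped
```

Dropping is the simplest of the scalar options to implement; turning the values into a tuple or renaming them would need extra bookkeeping on the result.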

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/197/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
43098072 MDU6SXNzdWU0MzA5ODA3Mg== 234 remove the notion of "Index coordinates", especially from the Dataset repr? shoyer 1217238 closed 0   0.3 740776 1 2014-09-18T06:16:38Z 2014-09-22T02:17:31Z 2014-09-22T02:17:31Z MEMBER      

@perrette mentioned that he found the distinction between "index" and "other" coordinates in the dev version of xray confusing (see the dev build of the docs).

I agree -- the differences are subtle, and difficult to convey. On the whole, they're mostly both just "Coordinates", although coordinates with the same name as a dimension are special because they're also used like indexes.

So I would like to revisit repr(Dataset) to make this less confusing. Here are 5 options:

1. Current implementation (on master):

    <xray.Dataset>
    Dimensions:  (time: 3, x: 2, y: 2)
    Index Coordinates:
        time            (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08
        x               (x) int64 0 1
        y               (y) int64 0 1
    Other Coordinates:
        lat             (x, y) float64 42.25 42.21 42.63 42.59
        lon             (x, y) float64 -99.83 -99.32 -99.79 -99.23
        reference_time  datetime64[ns] 2014-09-05
    Variables:
        temp            (x, y, time) float64 11.04 23.57 20.77 9.346 6.683 17.17 11.6 19.54 ...
        precip          (x, y, time) float64 5.904 2.453 3.404 9.847 9.195 0.3777 8.615 7.536 ...

2. Switch "Index Coordinates" to "Coordinates/Indexes" (to emphasize "Coordinates"):

    <xray.Dataset>
    Dimensions:  (time: 3, x: 2, y: 2)
    Coordinates/Indexes:
        time            (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08
        x               (x) int64 0 1
        y               (y) int64 0 1
    Coordinates/Other:
        lat             (x, y) float64 42.25 42.21 42.63 42.59
        lon             (x, y) float64 -99.83 -99.32 -99.79 -99.23
        reference_time  datetime64[ns] 2014-09-05
    Variables:
        temp            (x, y, time) float64 11.04 23.57 20.77 9.346 6.683 17.17 11.6 19.54 ...
        precip          (x, y, time) float64 5.904 2.453 3.404 9.847 9.195 0.3777 8.615 7.536 ...

3. Rename "Other Coordinates" to "Non-index Coordinates":

    <xray.Dataset>
    Dimensions:  (time: 3, x: 2, y: 2)
    Index Coordinates:
        time            (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08
        x               (x) int64 0 1
        y               (y) int64 0 1
    Non-index Coordinates:
        lat             (x, y) float64 42.25 42.21 42.63 42.59
        lon             (x, y) float64 -99.83 -99.32 -99.79 -99.23
        reference_time  datetime64[ns] 2014-09-05
    Variables:
        temp            (x, y, time) float64 11.04 23.57 20.77 9.346 6.683 17.17 11.6 19.54 ...
        precip          (x, y, time) float64 5.904 2.453 3.404 9.847 9.195 0.3777 8.615 7.536 ...

4. Consolidate "Index" and "Other" coordinates (the info about indexing is implicit in the dimension names):

    <xray.Dataset>
    Dimensions:  (time: 3, x: 2, y: 2)
    Coordinates:
        time            (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08
        x               (x) int64 0 1
        y               (y) int64 0 1
        lat             (x, y) float64 42.25 42.21 42.63 42.59
        lon             (x, y) float64 -99.83 -99.32 -99.79 -99.23
        reference_time  datetime64[ns] 2014-09-05
    Variables:
        temp            (x, y, time) float64 11.04 23.57 20.77 9.346 6.683 17.17 11.6 19.54 ...
        precip          (x, y, time) float64 5.904 2.453 3.404 9.847 9.195 0.3777 8.615 7.536 ...

5. Consolidate coordinates, but mark indexes with * (indexes could still be all grouped at the top, but wouldn't need to be):

    <xray.Dataset>
    Dimensions:  (time: 3, x: 2, y: 2)
    Coordinates:
      * time            (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08
      * x               (x) int64 0 1
      * y               (y) int64 0 1
        lat             (x, y) float64 42.25 42.21 42.63 42.59
        lon             (x, y) float64 -99.83 -99.32 -99.79 -99.23
        reference_time  datetime64[ns] 2014-09-05
    Variables:
        temp            (x, y, time) float64 11.04 23.57 20.77 9.346 6.683 17.17 11.6 19.54 ...
        precip          (x, y, time) float64 5.904 2.453 3.404 9.847 9.195 0.3777 8.615 7.536 ...

I am leaning towards option (5). It introduces less terminology and is easier to scan / count at a glance than separate categories of coordinates. The asterisk is still there as a reminder that these coordinates are special, and the distinctions will be highlighted under "Coordinates" in the docs for anyone who wants more details.

@ToddSmall @akleeman @jhamman Any opinions?

(by the way, it's worth checking out @perrette's dimarray project... lots of nice ideas and overlap with xray)
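A minimal sketch of the asterisk convention from option (5), assuming a coordinate counts as an index exactly when its name is also a dimension name. `format_coords` is a hypothetical helper written for illustration; it is not xray's repr code and it omits dtypes and values.

```python
def format_coords(coords, dims):
    """Render coordinate lines, prefixing index coordinates with '*'.

    `coords` maps a coordinate name to the tuple of its dimension names.
    """
    width = max(len(name) for name in coords)
    lines = ["Coordinates:"]
    for name, coord_dims in coords.items():
        # An index coordinate shares its name with one of the dimensions.
        marker = "*" if name in dims else " "
        dims_text = "(%s)" % ", ".join(coord_dims)
        lines.append("  %s %s  %s" % (marker, name.ljust(width), dims_text))
    return "\n".join(lines)


text = format_coords(
    {"time": ("time",), "x": ("x",), "lat": ("x", "y"), "reference_time": ()},
    dims=("time", "x", "y"),
)
```

Because the marker is computed per line, indexes do not have to be grouped at the top for the output to stay readable.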

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/234/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
40225000 MDU6SXNzdWU0MDIyNTAwMA== 212 Get rid of "noncoordinates" as a name? shoyer 1217238 closed 0   0.3 740776 8 2014-08-14T05:52:30Z 2014-09-22T00:55:22Z 2014-09-22T00:55:22Z MEMBER      

As @ToddSmall has pointed out (in #202), "noncoordinates" is a confusing name -- it's something defined by what it isn't, not what it is.

Unfortunately, our best alternative is "variables", which already has a lot of meaning from the netCDF world (and which we already use).

Related: #211

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/212/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
39385095 MDU6SXNzdWUzOTM4NTA5NQ== 203 Support mathematical operators (+-*/, etc) for GroupBy objects shoyer 1217238 closed 0   0.3 740776 0 2014-08-04T01:40:11Z 2014-09-12T01:18:04Z 2014-09-12T01:18:04Z MEMBER      

Building on #200, we could add support for mathematical operations to GroupBy objects.

Math with groupby objects should automatically "broadcast" across group labels, so we can write something like:

    climatology = ds.groupby('time.month').mean('time')
    anomalies = ds.groupby('time.month') - climatology
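What "broadcasting across group labels" means can be shown without xray at all. The sketch below is a plain-Python illustration under simplifying assumptions (one value per time step, months given as integers); `monthly_climatology` and `subtract_by_group` are hypothetical names, not part of any library.

```python
from collections import defaultdict


def monthly_climatology(months, values):
    """Mean of `values` for each distinct month label."""
    groups = defaultdict(list)
    for month, value in zip(months, values):
        groups[month].append(value)
    return {month: sum(vals) / len(vals) for month, vals in groups.items()}


def subtract_by_group(months, values, climatology):
    """Subtract each group's climatological value from every member."""
    return [value - climatology[month] for month, value in zip(months, values)]


months = [1, 1, 2, 2]
temps = [10.0, 14.0, 20.0, 22.0]
climatology = monthly_climatology(months, temps)
anomalies = subtract_by_group(months, temps, climatology)
```

The subtraction looks up the right climatology entry for each element by its group label, which is the step the proposed operator support would perform automatically.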

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/203/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
36625519 MDU6SXNzdWUzNjYyNTUxOQ== 176 Proposal: Dataset transpose or order_dims method shoyer 1217238 closed 0   0.3 740776 0 2014-06-26T23:49:43Z 2014-09-07T04:18:06Z 2014-09-07T04:18:06Z MEMBER      

It should transpose all variables so that they have dimensions in the same given order, ignoring any dimensions that are not used by a given variable.

E.g., ds.transpose('x', 'y', 'z') should give me a dataset with all data in the same order.
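The dimension bookkeeping behind such a method might look like the sketch below. It only computes the new dimension order for each variable and does not move any data; `transpose_dims` is a hypothetical helper, and raising on dimensions missing from the requested order is an assumption rather than something the proposal specifies.

```python
def transpose_dims(variables, *order):
    """Return the reordered dimension tuple for every variable.

    `variables` maps a variable name to its tuple of dimension names.
    Dimensions in `order` that a variable does not use are skipped.
    """
    result = {}
    for name, dims in variables.items():
        missing = set(dims) - set(order)
        if missing:
            raise ValueError(
                "%r has dimensions not listed in order: %s" % (name, sorted(missing))
            )
        result[name] = tuple(dim for dim in order if dim in dims)
    return result


variables = {"temp": ("z", "x", "y"), "elevation": ("y", "x")}
new_dims = transpose_dims(variables, "x", "y", "z")
```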

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/176/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
39919261 MDU6SXNzdWUzOTkxOTI2MQ== 211 Should iterating over a Dataset include coordinates? shoyer 1217238 closed 0   0.3 740776 0 2014-08-10T23:13:47Z 2014-09-05T03:16:53Z 2014-09-05T03:16:53Z MEMBER      

My inclination is no: the contents of a Dataset (e.g., list(ds), ds.keys() and ds.values()) should only include non-coordinates.

__contains__ checks for a coordinate (e.g., 'time') would need to look in ds.dimensions or ds.coordinates instead of ds, but I see no need to change __getitem__: ds['time'] can still work.

Pluses:

1. This change would more closely align xray.Dataset with pandas.DataFrame, which also does not include any elements of the index in the contents of the frame.
2. It would eliminate the need for using ds.noncoordinates -- which, as @ToddSmall has pointed out, is not very intuitive.
3. In my experience, I have been using ds.noncoordinates.items() more often than ds.items() (which contains redundant information, as coordinates are repeated). The only time I really want to iterate over all variables in a dataset is when I'm using the lower level Variable API.

Negatives:

1. This would break the existing API.
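The proposed behavior can be modeled with a small mapping class. This is a toy sketch, not xray's Dataset: `ToyDataset` and its constructor arguments are hypothetical, and it only demonstrates that iteration and membership can skip coordinates while item lookup still finds them.

```python
from collections.abc import Mapping


class ToyDataset(Mapping):
    """Mapping whose iteration skips coordinates but whose lookup does not."""

    def __init__(self, variables, coord_names):
        self._variables = dict(variables)
        self._coord_names = set(coord_names)

    def __getitem__(self, key):
        # Lookup still works for coordinates, e.g. ds['time'].
        return self._variables[key]

    def __iter__(self):
        return (k for k in self._variables if k not in self._coord_names)

    def __len__(self):
        return sum(1 for _ in self)

    def __contains__(self, key):
        # Membership follows iteration, so coordinates are not "in" ds.
        return key in self._variables and key not in self._coord_names


ds = ToyDataset({"time": [0, 1], "temp": [11.0, 23.5]}, coord_names=["time"])
contents = list(ds)
```

Since `keys()`, `values()` and `items()` are derived from `__iter__` by `Mapping`, they exclude coordinates as well without further code.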

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/211/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
39875825 MDU6SXNzdWUzOTg3NTgyNQ== 208 Don't require variable dimensions in Dataset.__init__ for scalar or 1d arrays shoyer 1217238 closed 0   0.3 740776 1 2014-08-09T01:55:45Z 2014-09-03T18:17:12Z 2014-09-03T18:17:12Z MEMBER      

The coerce-to-variable logic should only be performed if the argument is a tuple.

For scalars, there is no ambiguity since their dimensions are empty.

For 1-d arrays, we should default to creating a new coordinate variable.

e.g., I should be able to write xray.Dataset({'x': np.arange(10), 'y': 0})
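The three cases above can be sketched as a small dispatch function. This is an illustration in plain Python, not the constructor's actual code: `infer_dims` is a hypothetical helper, lists and ranges stand in for 1-d arrays, and rejecting nested sequences is an assumption about how ambiguous input would be handled.

```python
def infer_dims(name, value):
    """Return (dims, data) for one entry of a Dataset-style constructor."""
    if isinstance(value, tuple):
        # Explicit (dims, data) pair: use the dimensions as given.
        dims, data = value
        if isinstance(dims, str):
            dims = (dims,)
        return tuple(dims), data
    if isinstance(value, (list, range)):
        if any(isinstance(item, (list, tuple, range)) for item in value):
            raise ValueError("dimensions are ambiguous for %r" % name)
        # A 1-d sequence becomes a coordinate along a dimension of its own name.
        return (name,), list(value)
    # Anything else is treated as a scalar, which has no dimensions.
    return (), value


x_entry = infer_dims("x", range(10))
y_entry = infer_dims("y", 0)
```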

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/208/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
40760695 MDU6SXNzdWU0MDc2MDY5NQ== 218 Support apply with DatasetGroupby returning a DataArray (and vice-versa) shoyer 1217238 closed 0   0.3 740776 0 2014-08-21T00:28:57Z 2014-09-03T05:24:26Z 2014-09-03T05:24:26Z MEMBER      

e.g., I should be able to write:

dataset.groupby('state').apply(lambda ds: (ds['tmin'] > ds['tmax']).mean('station'))

This will be very simple once we write a generic xray.concat function which can handle either type of argument.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/218/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
40536963 MDU6SXNzdWU0MDUzNjk2Mw== 217 Strings are truncated when concatenating Datasets. IamJeffG 2002703 closed 0   0.3 740776 0 2014-08-18T21:58:36Z 2014-08-21T05:17:28Z 2014-08-21T05:17:28Z CONTRIBUTOR      

When concatenating Datasets, a variable's string length is limited to the length in the first of the Datasets being concatenated.

```
import xray
first = xray.Dataset({'animal': ('animal', ['horse'])})
second = xray.Dataset({'animal': ('animal', ['aardvark_0'])})
xray.Dataset.concat([first, second], dimension='animal')['animal']
<xray.DataArray 'animal' (animal: 2)>
array(['horse', 'aardv'], dtype='|S5')
Coordinates:
    animal: Index([u'horse', u'aardv'], dtype='object')
Attributes:
    Empty
```

(Note the |S5 dtype and the truncated aardv)

I think this is the offending line: https://github.com/xray/xray/blob/master/xray/core/variable.py#L623

It may be better to use dtype=object for strings to avoid this issue.
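The truncation is consistent with how NumPy fixed-width string arrays behave, and it can be reproduced without xray. The sketch below assumes the result array is allocated with the first input's dtype, which is a guess at the mechanism rather than a reading of the linked line. On Python 3 the fixed-width dtype is a unicode one instead of the `|S5` shown above.

```python
import numpy as np

first = np.array(['horse'])
second = np.array(['aardvark_0'])

# Allocating the result with the first array's fixed-width dtype means
# longer strings are silently cut to that width on assignment.
fixed = np.empty(2, dtype=first.dtype)
fixed[0] = first[0]
fixed[1] = second[0]

# An object array stores references to whole strings, so nothing is cut off.
flexible = np.empty(2, dtype=object)
flexible[0] = first[0]
flexible[1] = second[0]
```

Object arrays trade the truncation problem for slower element-wise operations, so the choice is a compromise rather than a free fix.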

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/217/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
32926274 MDU6SXNzdWUzMjkyNjI3NA== 114 Fix circular imports shoyer 1217238 closed 0   0.3 740776 0 2014-05-06T19:47:12Z 2014-08-17T00:52:38Z 2014-08-17T00:52:38Z MEMBER      

Thanks @takluyver for pointing this out in #113. We really should have resolved this some time ago.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/114/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
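
The filter and sort used for this page can be reproduced against a table of this shape with Python's built-in sqlite3 module. The sketch below uses a reduced, in-memory copy of the table with only the columns the query touches, and loads three of the rows listed above. It is an illustration, not the query Datasette actually ran.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced copy of the issues table: only the columns the query needs.
conn.execute(
    "CREATE TABLE issues ("
    " id INTEGER PRIMARY KEY, number INTEGER, title TEXT,"
    " milestone INTEGER, updated_at TEXT, type TEXT)"
)
conn.execute("CREATE INDEX idx_issues_milestone ON issues (milestone)")

rows = [
    (32926274, 114, "Fix circular imports",
     740776, "2014-08-17T00:52:38Z", "issue"),
    (39264845, 197, "We need some way to identify non-index coordinates",
     740776, "2014-12-19T07:16:14Z", "issue"),
    (43098072, 234,
     'remove the notion of "Index coordinates", especially from the Dataset repr?',
     740776, "2014-09-22T02:17:31Z", "issue"),
]
conn.executemany("INSERT INTO issues VALUES (?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps sort correctly as plain text.
query = (
    "SELECT number, title FROM issues"
    " WHERE milestone = ? AND type = ?"
    " ORDER BY updated_at DESC"
)
numbers = [number for number, _ in conn.execute(query, (740776, "issue"))]
```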
Powered by Datasette · Queries took 1043.39ms · About: xarray-datasette