issues

3 rows where comments = 13, state = "closed" and user = 1217238 sorted by updated_at descending

number: 1251 · title: Consistent naming for xarray's methods that apply functions
id: 205455788 · node_id: MDU6SXNzdWUyMDU0NTU3ODg= · user: shoyer (1217238) · state: closed · locked: 0 · comments: 13 · author_association: MEMBER
created_at: 2017-02-05T21:27:24Z · updated_at: 2022-04-27T20:06:25Z · closed_at: 2022-04-27T20:06:25Z

We currently have two types of methods that take a function to apply to xarray objects:

- pipe (on DataArray and Dataset): apply a function to this entire object (array.pipe(func) -> func(array))
- apply (on Dataset and GroupBy): apply a function to each labeled object contained in this object (e.g., ds.apply(func) -> Dataset({k: func(v) for k, v in ds.data_vars.items()}))

And one more method that we want to add but that isn't finalized yet, currently named apply_ufunc:

- Apply a function that acts on unlabeled (i.e., numpy) arrays to each array in the object

I'd like to have three distinct names that make it clear what these methods do and how they differ. This has come up a few times recently, e.g., https://github.com/pydata/xarray/issues/1130

One proposal: rename apply to map, and then use apply only for methods that act on unlabeled arrays. This would require a deprecation cycle, but eventually it would let us add .apply methods for handling raw arrays to both Dataset and DataArray. (We could use a separate apply method from apply_ufunc to convert dim arguments to axis and not do automatic broadcasting.)
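The three calling conventions under this proposal can be sketched with a toy object. Everything here is illustrative: ToyDataset stands in for xarray's Dataset, .map is the proposed rename, and .apply is the proposed new raw-array method, not the then-current API.

```python
class ToyDataset:
    """Hypothetical stand-in for xarray's Dataset: each data variable is a
    (dims, values) pair, with a plain list of numbers in place of a numpy array."""

    def __init__(self, data_vars):
        self.data_vars = dict(data_vars)

    def pipe(self, func, *args, **kwargs):
        # pipe: func receives the entire object -> func(ds)
        return func(self, *args, **kwargs)

    def map(self, func):
        # map (proposed rename of the old .apply): func receives each
        # labeled variable, i.e. the full (dims, values) pair
        return ToyDataset({k: func(v) for k, v in self.data_vars.items()})

    def apply(self, func):
        # proposed new .apply: func sees only the raw, unlabeled values;
        # the labels (dims) are carried through unchanged
        return ToyDataset(
            {k: (dims, func(vals)) for k, (dims, vals) in self.data_vars.items()}
        )


ds = ToyDataset({"a": ("x", [1, 2]), "b": ("x", [3, 4])})

# whole object -> one result
totals = ds.pipe(lambda d: sum(sum(vals) for _, vals in d.data_vars.values()))

# raw values only; dims preserved automatically
doubled = ds.apply(lambda vals: [2 * v for v in vals])
```

The point of the distinct names is visible in the signatures: pipe's function takes the dataset, map's takes labeled variables, and apply's takes bare arrays.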

reactions (all zero): https://api.github.com/repos/pydata/xarray/issues/1251/reactions
state_reason: completed · repo: xarray (13221727) · type: issue
number: 25 · title: Consistent rules for handling merges between variables with different attributes
id: 28376794 · node_id: MDU6SXNzdWUyODM3Njc5NA== · user: shoyer (1217238) · state: closed · locked: 0 · comments: 13 · author_association: MEMBER
created_at: 2014-02-26T22:37:01Z · updated_at: 2020-04-05T19:13:13Z · closed_at: 2014-09-04T06:50:49Z

Currently, variable attributes are checked for equality before allowing for a merge via a call to xarray_equal. It should be possible to merge datasets even if some of the variable metadata disagrees (conflicting attributes should be dropped). This is already the behavior for global attributes.

The right design of this feature should probably include an optional argument to Dataset.merge indicating how strict the merge should be. I can see at least three useful versions:

1. Drop conflicting metadata silently.
2. Don't allow conflicting values, but drop non-matching keys.
3. Require all keys and values to match.

We can argue about which of these should be the default option. My inclination is to be as flexible as possible by using 1 or 2 in most cases.
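The three strictness levels can be sketched for a pair of attribute dicts. The helper name merge_attrs and the mode strings are hypothetical, not an existing xarray API:

```python
def merge_attrs(a, b, mode="drop"):
    """Toy sketch of three merge strictness levels for attribute dicts.
    Modes (all names illustrative):
      "identical"    -> 3. require all keys and values to match
      "no_conflicts" -> 2. shared keys must agree; keys present in only
                           one input are dropped
      "drop"         -> 1. keep everything, silently dropping keys whose
                           values conflict
    """
    if mode == "identical":
        if a != b:
            raise ValueError("attributes are not identical")
        return dict(a)
    shared = a.keys() & b.keys()
    conflicts = {k for k in shared if a[k] != b[k]}
    if mode == "no_conflicts":
        if conflicts:
            raise ValueError("conflicting values for: %s" % sorted(conflicts))
        return {k: a[k] for k in shared}
    merged = {**a, **b}
    for k in conflicts:
        del merged[k]
    return merged
```

Under this sketch, mode="drop" roughly matches the existing behavior for global attributes described above.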

reactions (all zero): https://api.github.com/repos/pydata/xarray/issues/25/reactions
state_reason: completed · repo: xarray (13221727) · type: issue
number: 702 · title: Basic multiIndex support and stack/unstack methods
id: 124700322 · node_id: MDExOlB1bGxSZXF1ZXN0NTQ5NDUxNzE= · user: shoyer (1217238) · state: closed · locked: 0 · comments: 13 · author_association: MEMBER · draft: 0 · pull_request: pydata/xarray/pulls/702
created_at: 2016-01-04T05:48:49Z · updated_at: 2016-06-01T16:48:54Z · closed_at: 2016-01-18T00:11:11Z

Fixes #164, #700

Example usage:

```
In [3]: df = pd.DataFrame({'foo': range(3),
   ...:                    'x': ['a', 'b', 'b'],
   ...:                    'y': [0, 0, 1]})

In [4]: s = df.set_index(['x', 'y'])['foo']

In [5]: arr = xray.DataArray(s, dims='z')

In [6]: arr
Out[6]:
<xray.DataArray 'foo' (z: 3)>
array([0, 1, 2])
Coordinates:
  * z        (z) object ('a', 0) ('b', 0) ('b', 1)

In [7]: arr.indexes['z']
Out[7]:
MultiIndex(levels=[[u'a', u'b'], [0, 1]],
           labels=[[0, 1, 1], [0, 0, 1]],
           names=[u'x', u'y'])

In [8]: arr.unstack('z')
Out[8]:
<xray.DataArray 'foo' (x: 2, y: 2)>
array([[  0.,  nan],
       [  1.,   2.]])
Coordinates:
  * x        (x) object 'a' 'b'
  * y        (y) int64 0 1

In [9]: arr.unstack('z').stack(z=('x', 'y'))
Out[9]:
<xray.DataArray 'foo' (z: 4)>
array([  0.,  nan,   1.,   2.])
Coordinates:
  * z        (z) object ('a', 0) ('a', 1) ('b', 0) ('b', 1)
```

TODO (maybe not necessary yet, but eventually):

- [x] Multi-index support working with .loc and .sel()
- [x] Multi-dimensional stack/unstack
- [ ] Serialization to NetCDF
- [ ] Better repr, showing level names/dtypes?
- [ ] Make levels accessible as coordinate variables (e.g., ds['time'] can pull out the 'time' level of a multi-index)
- [ ] Make isel_points/sel_points return objects with a MultiIndex? (probably after the previous TODO, so we can preserve basic backwards compatibility)
- [ ] Add set_index/reset_index/swaplevel to make it easier to create and manipulate multi-indexes

It would be nice to eventually build a full example showing how stack can be combined with lazy loading / dask to do out-of-core PCA on a large geophysical dataset (e.g., identify El Nino).

cc @MaximilianR @jreback @jhamman

reactions (all zero): https://api.github.com/repos/pydata/xarray/issues/702/reactions
repo: xarray (13221727) · type: pull

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
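The filter described at the top of the page (comments = 13, state = "closed", user = 1217238, sorted by updated_at descending) corresponds to a straightforward query against this schema. A minimal sqlite3 sketch, using an abbreviated issues table (only the filtered columns) populated with the three rows shown above:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Abbreviated schema: just the columns the page's filter touches.
con.execute(
    "CREATE TABLE issues ("
    " id INTEGER PRIMARY KEY, comments INTEGER,"
    " state TEXT, user INTEGER, updated_at TEXT)"
)
con.executemany(
    "INSERT INTO issues VALUES (?, ?, ?, ?, ?)",
    [
        (205455788, 13, "closed", 1217238, "2022-04-27T20:06:25Z"),
        (28376794, 13, "closed", 1217238, "2020-04-05T19:13:13Z"),
        (124700322, 13, "closed", 1217238, "2016-06-01T16:48:54Z"),
    ],
)

# ISO-8601 timestamps stored as TEXT sort correctly with plain string order.
rows = con.execute(
    "SELECT id FROM issues"
    " WHERE comments = 13 AND state = 'closed' AND user = 1217238"
    " ORDER BY updated_at DESC"
).fetchall()
# rows -> [(205455788,), (28376794,), (124700322,)]
```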
Powered by Datasette · About: xarray-datasette