home / github

Menu
  • Search all tables
  • GraphQL API

issues

Table actions
  • GraphQL API for issues

2 rows where comments = 14 and user = 5635139 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date), closed_at (date)

state 2

  • closed 1
  • open 1

type 1

  • issue 2

repo 1

  • xarray 2
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
124154674 MDU6SXNzdWUxMjQxNTQ2NzQ= 688 Keep attrs & Add a 'keep_coords' argument to Dataset.apply max-sixty 5635139 closed 0     14 2015-12-29T02:42:48Z 2023-09-30T18:47:07Z 2023-09-30T18:47:07Z MEMBER      

Generally this isn't a problem, since the coords are carried over by the resulting DataArrays:

``` python In [11]:

ds = xray.Dataset({ 'a':pd.DataFrame(pd.np.random.rand(10,3)), 'b':pd.Series(pd.np.random.rand(10)) }) ds.coords['c'] = pd.Series(pd.np.random.rand(10)) ds Out[11]: <xray.Dataset> Dimensions: (dim_0: 10, dim_1: 3) Coordinates: * dim_0 (dim_0) int64 0 1 2 3 4 5 6 7 8 9 * dim_1 (dim_1) int64 0 1 2 c (dim_0) float64 0.9318 0.2899 0.3853 0.6235 0.9436 0.7928 ... Data variables: a (dim_0, dim_1) float64 0.5707 0.9485 0.3541 0.5987 0.406 0.7992 ... b (dim_0) float64 0.4106 0.2316 0.5804 0.6393 0.5715 0.6463 ... In [12]:

ds.apply(lambda x: x*2) Out[12]: <xray.Dataset> Dimensions: (dim_0: 10, dim_1: 3) Coordinates: c (dim_0) float64 0.9318 0.2899 0.3853 0.6235 0.9436 0.7928 ... * dim_0 (dim_0) int64 0 1 2 3 4 5 6 7 8 9 * dim_1 (dim_1) int64 0 1 2 Data variables: a (dim_0, dim_1) float64 1.141 1.897 0.7081 1.197 0.812 1.598 ... b (dim_0) float64 0.8212 0.4631 1.161 1.279 1.143 1.293 0.3507 ... ```

But if there's an operation that removes the coords from the DataArrays, the coords are not there on the result (notice c below). Should the Dataset retain them? Either always or with a keep_coords argument, similar to keep_attrs.

``` python In [13]:

ds = xray.Dataset({ 'a':pd.DataFrame(pd.np.random.rand(10,3)), 'b':pd.Series(pd.np.random.rand(10)) }) ds.coords['c'] = pd.Series(pd.np.random.rand(10)) ds Out[13]: <xray.Dataset> Dimensions: (dim_0: 10, dim_1: 3) Coordinates: * dim_0 (dim_0) int64 0 1 2 3 4 5 6 7 8 9 * dim_1 (dim_1) int64 0 1 2 c (dim_0) float64 0.4121 0.2507 0.6326 0.4031 0.6169 0.441 0.1146 ... Data variables: a (dim_0, dim_1) float64 0.4813 0.2479 0.5158 0.2787 0.06672 ... b (dim_0) float64 0.2638 0.5788 0.6591 0.7174 0.3645 0.5655 ... In [14]:

ds.apply(lambda x: x.to_pandas()*2) Out[14]: <xray.Dataset> Dimensions: (dim_0: 10, dim_1: 3) Coordinates: * dim_0 (dim_0) int64 0 1 2 3 4 5 6 7 8 9 * dim_1 (dim_1) int64 0 1 2 Data variables: a (dim_0, dim_1) float64 0.9627 0.4957 1.032 0.5574 0.1334 0.8289 ... b (dim_0) float64 0.5275 1.158 1.318 1.435 0.7291 1.131 0.1903 ... ```

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/688/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
587895591 MDU6SXNzdWU1ODc4OTU1OTE= 3891 Keep attrs by default? (keep_attrs) max-sixty 5635139 open 0     14 2020-03-25T18:17:35Z 2023-09-22T02:27:50Z   MEMBER      

I've held this view in low confidence for a while and wanted to socialize it to see whether there's something to it: Should we keep attrs in operations by default?

Advantages: - I think most of the time people want to keep attrs after operations - Is that right? Are there cases where it wouldn't be a reasonable default? e.g. good points here for not always keeping coords around - It's easy to remove them with a (currently unimplemented) drop_attrs method when people do want to remove them

Disadvantages: - Backward incompatible change with an expensive deprecate cycle (would be impractical to have a deprecation warning every time someone ran a function on an object with attrs I think? At least without adding a once filter warning) - ?

Here are some existing relevant discussions: - https://github.com/pydata/xarray/issues/3815#issuecomment-603974527 - https://github.com/pydata/xarray/issues/688 - https://github.com/pydata/xarray/pull/2482 - https://github.com/pydata/xarray/issues/3304

I think this is an easy situation to get into: - We make an incorrect-but-insignificant design decision; e.g. some methods don't keep attrs - We want to change that, but avoid breaking backward-compatibility - So we add kwargs and eventually a global config - But now we have a global config that requires global context and lots of kwargs! :(

I'm up for leaning towards breaking changes if it makes the library better: I think xarray will grow immensely, and so the narrow immediate pain is worth the broader future positive impact. Clearly if the immediate pain stops xarray growing, then it's not a good tradeoff.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3891/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 2005.702ms · About: xarray-datasette