issue_comments


5 rows where issue = 323839238 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1073493914 https://github.com/pydata/xarray/issues/2145#issuecomment-1073493914 https://api.github.com/repos/pydata/xarray/issues/2145 IC_kwDOAMm_X84__Dea dcherian 2448579 2022-03-21T05:15:52Z 2022-03-21T05:15:52Z MEMBER

There is compatibility code in GroupBy._binary_op that could be removed when this is fixed. (See #6160)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset.resample() adds time dimension to independant variables 323839238
391114129 https://github.com/pydata/xarray/issues/2145#issuecomment-391114129 https://api.github.com/repos/pydata/xarray/issues/2145 MDEyOklzc3VlQ29tbWVudDM5MTExNDEyOQ== shoyer 1217238 2018-05-22T19:34:48Z 2018-05-22T19:47:09Z MEMBER

This is not really desirable behavior, but it's an implication of how xarray implements ds.resample(time='1M').mean():
- Resample is converted into a groupby call, e.g., ds.groupby(time_starts).mean('time')
- .mean('time') for each grouped dataset averages over the 'time' dimension, resulting in a dataset with only a 'space' dimension, e.g.,

```
>>> list(ds.resample(time='1M'))[0][1].mean('time')
<xarray.Dataset>
Dimensions:        (space: 10)
Coordinates:
  * space          (space) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
    var_withtime1  (space) float64 0.008982 -0.09879 0.1361 -0.2485 -0.023 ...
    var_withtime2  (space) float64 0.2621 0.06009 -0.1686 0.07397 0.1095 ...
    var_timeless1  (space) float64 0.8519 -0.4253 -0.8581 0.9085 -0.4797 ...
    var_timeless2  (space) float64 0.8006 1.954 -0.5349 0.3317 1.778 -0.7954 ...
```

- concat() is used to combine grouped datasets into the final result, but it doesn't know anything about which variables were aggregated, so every data variable gets the "time" dimension added.

To fix this I would suggest three steps:
1. Add a keep_dims argument to xarray reductions like mean(), indicating that a dimension should be preserved with length 1, like keepdims=True for numpy reductions (https://github.com/pydata/xarray/issues/2170).
2. Fix concat to only concatenate variables that already have the concatenated dimension, as discussed in https://github.com/pydata/xarray/issues/2064.
3. Use keep_dims=True in groupby reductions. Then the result should automatically only include aggregated dimensions. This would conveniently allow us to remove existing logic in groupby() for restoring the original order of aggregated dimensions (see _restore_dim_order()).
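[Editor's note] The keep_dims idea in step 1 exists in xarray's reductions as the `keepdims` argument. A minimal sketch of the behavior the steps above rely on, assuming a reasonably recent xarray; the dataset and variable names here are invented for illustration:

```python
import numpy as np
import xarray as xr

# Toy dataset mirroring the discussion: one variable with a 'time'
# dimension, one without (names are illustrative only).
ds = xr.Dataset(
    {
        "var_withtime": (("time", "space"), np.random.randn(4, 3)),
        "var_timeless": (("space",), np.random.randn(3)),
    }
)

# A plain reduction drops the reduced dimension entirely ...
reduced = ds.mean("time")

# ... while keepdims=True preserves it with length 1, which is what
# step 3 above would rely on inside groupby reductions. Variables that
# never had 'time' are passed through unchanged.
kept = ds.mean("time", keepdims=True)
```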

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset.resample() adds time dimension to independant variables 323839238
390267025 https://github.com/pydata/xarray/issues/2145#issuecomment-390267025 https://api.github.com/repos/pydata/xarray/issues/2145 MDEyOklzc3VlQ29tbWVudDM5MDI2NzAyNQ== malmans2 22245117 2018-05-18T16:50:47Z 2018-05-22T19:18:34Z CONTRIBUTOR

In my previous comment I said that this would be useful for staggered grids, but then I realized that resample only operates on the time dimension. Anyway, here is my example:

```python
import xarray as xr
import pandas as pd
import numpy as np

# Create coordinates
time = pd.date_range('1/1/2018', periods=365, freq='D')
space = np.arange(10)

# Create random variables
var_withtime1 = np.random.randn(len(time), len(space))
var_withtime2 = np.random.randn(len(time), len(space))
var_timeless1 = np.random.randn(len(space))
var_timeless2 = np.random.randn(len(space))

# Create dataset
ds = xr.Dataset({'var_withtime1': (['time', 'space'], var_withtime1),
                 'var_withtime2': (['time', 'space'], var_withtime2),
                 'var_timeless1': (['space'], var_timeless1),
                 'var_timeless2': (['space'], var_timeless2)},
                coords={'time': (['time'], time), 'space': (['space'], space)})

# Standard resample: this adds the time dimension to the timeless variables
ds_resampled = ds.resample(time='1M').mean()

# My workaround: this does not add the time dimension to the timeless variables
ds_withtime = ds.drop([var for var in ds.variables if 'time' not in ds[var].dims])
ds_timeless = ds.drop([var for var in ds.variables if 'time' in ds[var].dims])
ds_workaround = xr.merge([ds_timeless, ds_withtime.resample(time='1M').mean()])
```

Datasets:

```
>>> ds
<xarray.Dataset>
Dimensions:        (space: 10, time: 365)
Coordinates:
  * time           (time) datetime64[ns] 2018-01-01 2018-01-02 2018-01-03 ...
  * space          (space) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
    var_withtime1  (time, space) float64 -1.137 -0.5727 -1.287 0.8102 ...
    var_withtime2  (time, space) float64 1.406 0.8448 1.276 0.02579 0.5684 ...
    var_timeless1  (space) float64 0.02073 -2.117 -0.2891 1.735 -1.535 0.209 ...
    var_timeless2  (space) float64 0.4357 -0.3257 -0.8321 0.8409 0.1454 ...

>>> ds_resampled
<xarray.Dataset>
Dimensions:        (space: 10, time: 12)
Coordinates:
  * time           (time) datetime64[ns] 2018-01-31 2018-02-28 2018-03-31 ...
  * space          (space) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
    var_withtime1  (time, space) float64 0.08149 0.02121 -0.05635 0.1788 ...
    var_withtime2  (time, space) float64 0.08991 0.5728 0.05394 0.214 0.3523 ...
    var_timeless1  (time, space) float64 0.02073 -2.117 -0.2891 1.735 -1.535 ...
    var_timeless2  (time, space) float64 0.4357 -0.3257 -0.8321 0.8409 ...

>>> ds_workaround
<xarray.Dataset>
Dimensions:        (space: 10, time: 12)
Coordinates:
  * space          (space) int64 0 1 2 3 4 5 6 7 8 9
  * time           (time) datetime64[ns] 2018-01-31 2018-02-28 2018-03-31 ...
Data variables:
    var_timeless1  (space) float64 0.4582 -0.6946 -0.3451 1.183 -1.14 0.1849 ...
    var_timeless2  (space) float64 1.658 -0.1719 -0.2202 -0.1789 -1.247 ...
    var_withtime1  (time, space) float64 -0.3901 0.3725 0.02935 -0.1315 ...
    var_withtime2  (time, space) float64 0.07145 -0.08536 0.07049 0.1025 ...
```
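[Editor's note] The drop-and-merge workaround above generalizes to a small helper; `resample_skip_timeless` is a hypothetical name (not part of xarray), and this is only a sketch of the same idea:

```python
import numpy as np
import pandas as pd
import xarray as xr

def resample_skip_timeless(ds, dim="time", **resample_kwargs):
    """Hypothetical helper: resample only variables carrying `dim`,
    then merge the untouched timeless variables back in."""
    withdim = ds[[v for v in ds.data_vars if dim in ds[v].dims]]
    without = ds[[v for v in ds.data_vars if dim not in ds[v].dims]]
    return xr.merge([without, withdim.resample(**resample_kwargs).mean()])

# Small example dataset in the spirit of the one above
time = pd.date_range("2018-01-01", periods=10, freq="D")
ds = xr.Dataset(
    {
        "var_withtime": (("time", "space"), np.random.randn(10, 3)),
        "var_timeless": (("space",), np.random.randn(3)),
    },
    coords={"time": time, "space": np.arange(3)},
)

# Ten daily steps resampled into 5-day bins; the timeless variable
# keeps its original dimensions.
out = resample_skip_timeless(ds, time="5D")
```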

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset.resample() adds time dimension to independant variables 323839238
390275158 https://github.com/pydata/xarray/issues/2145#issuecomment-390275158 https://api.github.com/repos/pydata/xarray/issues/2145 MDEyOklzc3VlQ29tbWVudDM5MDI3NTE1OA== fmaussion 10050469 2018-05-18T17:21:11Z 2018-05-18T17:21:11Z MEMBER

I see. Note that groupby does the same. I don't know what the rationale is behind that decision, but there might be a reason...
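[Editor's note] A quick sketch of the groupby parallel mentioned here (variable names invented; whether the timeless variable also gains the group dimension depends on the xarray version, which is exactly the behavior under discussion):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Four timestamps spanning January and February
time = pd.date_range("2018-01-01", periods=4, freq="15D")
ds = xr.Dataset(
    {
        "var_withtime": (("time", "space"), np.random.randn(4, 3)),
        "var_timeless": (("space",), np.random.randn(3)),
    },
    coords={"time": time, "space": np.arange(3)},
)

# Group by month and reduce over time, analogous to what resample does
# internally. The aggregated variable picks up the new 'month' dimension;
# at the time of this discussion, var_timeless did too (the surprising part).
monthly = ds.groupby("time.month").mean("time")
```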

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset.resample() adds time dimension to independant variables 323839238
389886807 https://github.com/pydata/xarray/issues/2145#issuecomment-389886807 https://api.github.com/repos/pydata/xarray/issues/2145 MDEyOklzc3VlQ29tbWVudDM4OTg4NjgwNw== fmaussion 10050469 2018-05-17T14:29:46Z 2018-05-17T14:29:46Z MEMBER

Thanks for the report! Do you think you can craft a minimal working example?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset.resample() adds time dimension to independant variables 323839238

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 313.718ms · About: xarray-datasette