issue_comments

5 comments by dcherian (user 2448579) on pydata/xarray issue #4625, "Non lazy behavior for weighted average when using resampled data" (issue 753517739), sorted by updated_at descending.

Comment 736713989 · dcherian (MEMBER) · 2020-12-01T17:47:44Z
https://github.com/pydata/xarray/issues/4625#issuecomment-736713989

Yes, something like what you have, with

``` python
with raise_if_dask_computes():
    ds.resample(time='3AS').map(mean_func)
```

BUT something is wrong with my explanation above. The error is only triggered when the number of timesteps is not divisible by the resampling frequency. If you set periods=3 when creating t, the old version works fine; if you change it to 4, it computes. But setting deep=False fixes it in all cases. I am very confused!
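For reference, a minimal sketch of the setup this comment is describing. The exact construction is a guess based on the issue title and the `periods`/`t`/`mean_func` references in this thread; only the divisible-vs-remainder distinction matters:

``` python
import numpy as np
import pandas as pd
import xarray as xr

# hypothetical reproducer: 4 annual timesteps, so resampling to '3AS'
# leaves a remainder group -- the case said to trigger the compute
# (with periods=3 the groups divide evenly and the old version works)
t = pd.date_range('2000-01-01', periods=4, freq='AS')
ds = xr.Dataset(
    {'data': ('time', np.arange(4.0))},
    coords={'time': t, 'weights': ('time', np.ones(4))},
).chunk({'time': 1})

def mean_func(ds):
    return ds.weighted(ds.weights).mean('time')
```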

Reactions: none
Comment 736221445 · dcherian (MEMBER) · 2020-12-01T05:08:12Z
https://github.com/pydata/xarray/issues/4625#issuecomment-736221445

Untested, but specifying `deep=False` in the call to `copy` should fix it.
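A sketch of what that suggestion amounts to in `xarray/core/weighted.py`, based on the `weights.data.map_blocks(_weight_check, ...)` snippet quoted later in this thread (`_weight_check` is the existing helper there):

``` python
# with deep=True (the default), .copy() also deep-copies the coordinates
# of the weights DataArray, giving the non-dim 'weights' coord new dask
# keys; deep=False leaves the coordinate graphs intact
weights = weights.copy(
    data=weights.data.map_blocks(_weight_check, dtype=weights.dtype),
    deep=False,
)
```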

Reactions: none
Comment 736195480 · dcherian (MEMBER) · 2020-12-01T03:34:29Z
https://github.com/pydata/xarray/issues/4625#issuecomment-736195480

PRs are always welcome!

Reactions: 1 laugh
Comment 736131299 · dcherian (MEMBER) · 2020-12-01T00:12:41Z
https://github.com/pydata/xarray/issues/4625#issuecomment-736131299

Ah, this works (but we lose `weights` as a coord var).

``` python
# simple customized weighted mean function
def mean_func(ds):
    return ds.weighted(ds.weights.reset_coords(drop=True)).mean('time')
```

Adding `reset_coords` fixes this because it gets rid of the non-dim coord `weights`.

https://github.com/pydata/xarray/blob/180e76d106c697b1dd94b814a49dc2d7e58c8551/xarray/core/weighted.py#L149

`dot` compares the `weights` coord var on `ds` and `weights` to decide whether it should keep it.
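To illustrate with a toy example (not from the thread): `reset_coords(drop=True)` strips non-dimension coordinates, so the copied `weights` coord never reaches that comparison in `dot`:

``` python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.ones(3),
    dims='time',
    coords={'weights': ('time', np.array([0.5, 1.0, 1.5]))},
)
print(da.coords)                          # contains the non-dim coord 'weights'
print(da.reset_coords(drop=True).coords)  # empty: the coord is gone
```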

The new call to `.copy` ends up making a copy of the `weights` coord on the weights DataArray, so the lazy equality check fails. One solution is to avoid the call to `copy` and create the `DataArray` directly:

``` python
enc = weights.encoding
weights = DataArray(
    weights.data.map_blocks(_weight_check, dtype=weights.dtype),
    dims=weights.dims,
    coords=weights.coords,
    attrs=weights.attrs,
)
weights.encoding = enc
```

This works locally.

Reactions: none
Comment 736101365 · dcherian (MEMBER) · created 2020-11-30T22:46:27Z, edited 2020-11-30T22:51:34Z
https://github.com/pydata/xarray/issues/4625#issuecomment-736101365

The weighted fix in #4559 is correct; that's why

``` python
with ProgressBar():
    mean_func(ds)
```

does not compute.

This is more instructive:

``` python
from xarray.tests import raise_if_dask_computes

with raise_if_dask_computes():
    ds.resample(time='3AS').map(mean_func)
```

``` python
....
    150
    151     def _sum_of_weights(

~/work/python/xarray/xarray/core/computation.py in dot(dims, *arrays, **kwargs)
   1483         output_core_dims=output_core_dims,
   1484         join=join,
-> 1485         dask="allowed",
   1486     )
   1487     return result.transpose([d for d in all_dims if d in result.dims])

~/work/python/xarray/xarray/core/computation.py in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, meta, dask_gufunc_kwargs, *args)
   1132         join=join,
   1133         exclude_dims=exclude_dims,
-> 1134         keep_attrs=keep_attrs,
   1135     )
   1136     # feed Variables directly through apply_variable_ufunc

~/work/python/xarray/xarray/core/computation.py in apply_dataarray_vfunc(func, signature, join, exclude_dims, keep_attrs, *args)
    266     else:
    267         name = result_name(args)
--> 268     result_coords = build_output_coords(args, signature, exclude_dims)
    269
    270     data_vars = [getattr(a, "variable", a) for a in args]

~/work/python/xarray/xarray/core/computation.py in build_output_coords(args, signature, exclude_dims)
    231     # TODO: save these merged indexes, instead of re-computing them later
    232     merged_vars, unused_indexes = merge_coordinates_without_align(
--> 233         coords_list, exclude_dims=exclude_dims
    234     )
    235

~/work/python/xarray/xarray/core/merge.py in merge_coordinates_without_align(objects, prioritized, exclude_dims)
    327         filtered = collected
    328
--> 329     return merge_collected(filtered, prioritized)
    330
    331

~/work/python/xarray/xarray/core/merge.py in merge_collected(grouped, prioritized, compat)
    227             variables = [variable for variable, _ in elements_list]
    228             try:
--> 229                 merged_vars[name] = unique_variable(name, variables, compat)
    230             except MergeError:
    231                 if compat != "minimal":

~/work/python/xarray/xarray/core/merge.py in unique_variable(name, variables, compat, equals)
    132     if equals is None:
    133         # now compare values with minimum number of computes
--> 134         out = out.compute()
    135         for var in variables[1:]:
    136             equals = getattr(out, compat)(var)

~/work/python/xarray/xarray/core/variable.py in compute(self, **kwargs)
    459         """
    460         new = self.copy(deep=False)
--> 461         return new.load(**kwargs)
    462
    463     def __dask_tokenize__(self):

~/work/python/xarray/xarray/core/variable.py in load(self, **kwargs)
    435         """
    436         if is_duck_dask_array(self._data):
--> 437             self._data = as_compatible_data(self._data.compute(**kwargs))
    438         elif not is_duck_array(self._data):
    439             self._data = np.asarray(self._data)

~/miniconda3/envs/dcpy/lib/python3.7/site-packages/dask/base.py in compute(self, **kwargs)
    165         dask.base.compute
    166         """
--> 167         (result,) = compute(self, traverse=False, **kwargs)
    168         return result
    169

~/miniconda3/envs/dcpy/lib/python3.7/site-packages/dask/base.py in compute(*args, **kwargs)
    450             postcomputes.append(x.__dask_postcompute__())
    451
--> 452     results = schedule(dsk, keys, **kwargs)
    453     return repack([f(r, a) for r, (f, a) in zip(results, postcomputes)])
    454

~/work/python/xarray/xarray/tests/__init__.py in __call__(self, dsk, keys, **kwargs)
    112             raise RuntimeError(
    113                 "Too many computes. Total: %d > max: %d."
--> 114                 % (self.total_computes, self.max_computes)
    115             )
    116         return dask.get(dsk, keys, **kwargs)

RuntimeError: Too many computes. Total: 1 > max: 0.
```

It looks like we're repeatedly checking `weights` for equality (if you navigate to `merge_collected` in the stack, `name = "weights"`). The `lazy_array_equiv` check is failing because a copy is made somewhere.
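As a toy illustration (not from the thread) of why that check falls back to computing: dask identifies arrays by graph key, and a copied array gets a new key even though its values are identical:

``` python
import numpy as np
import dask.array as da

x = da.from_array(np.arange(4), chunks=2)
y = da.map_blocks(np.copy, x)  # same values, but a new 'copy-*' graph key

print(x.name)  # e.g. 'array-<token>'
print(y.name)  # e.g. 'copy-<token>' -- different key, so a key-based
               # lazy equality check cannot prove x == y without computing
```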

``` python
ipdb> up
> /home/deepak/work/python/xarray/xarray/core/merge.py(229)merge_collected()
    227             variables = [variable for variable, _ in elements_list]
    228             try:
--> 229                 merged_vars[name] = unique_variable(name, variables, compat)
    230             except MergeError:
    231                 if compat != "minimal":

ipdb> name
'weights'

ipdb> variables
[<xarray.Variable (time: 1)> dask.array<getitem, shape=(1,), dtype=float64, chunksize=(1,), chunktype=numpy.ndarray>,
 <xarray.Variable (time: 1)> dask.array<copy, shape=(1,), dtype=float64, chunksize=(1,), chunktype=numpy.ndarray>]

ipdb> variables[0].data.name
'getitem-2a74b8ca20ae20100597e397404ba17b'

ipdb> variables[1].data.name
'copy-fff901a87f4a2293c750766c554aa68d'
```

Reactions: none
