
issue_comments


20 rows where issue = 396102183 sorted by updated_at descending




id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
633652366 https://github.com/pydata/xarray/pull/2652#issuecomment-633652366 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDYzMzY1MjM2Ng== max-sixty 5635139 2020-05-25T16:57:28Z 2020-05-25T16:57:28Z MEMBER

@hrishikeshac in case you come back to see this: thank you for taking it so far; your code was helpful to eventually getting this feature in. And we'd of course appreciate any additional contributions.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() 396102183
556046073 https://github.com/pydata/xarray/pull/2652#issuecomment-556046073 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDU1NjA0NjA3Mw== max-sixty 5635139 2019-11-20T15:11:27Z 2019-11-20T15:11:27Z MEMBER

@r-beer great—you were right to start your own PR

555915343 https://github.com/pydata/xarray/pull/2652#issuecomment-555915343 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDU1NTkxNTM0Mw== r-beer 45787861 2019-11-20T09:20:39Z 2019-11-20T09:30:31Z NONE

Alright, I have done so and changed basestrings into str; now the tests (mostly) run through locally:

```bash
1 failed, 6539 passed, 1952 skipped, 37 xfailed, 31 warnings in 86.34s
```

A general question concerning collaboration on existing PRs: should I push the changes to my fork and then create a new PR that replaces @hrishikeshac's PR, or shall I push to @hrishikeshac's fork and see whether it works to update the existing PR?

Or is there another option?

PS: Permission to push to hrishikeshac:corr is denied for me.

555756460 https://github.com/pydata/xarray/pull/2652#issuecomment-555756460 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDU1NTc1NjQ2MA== max-sixty 5635139 2019-11-19T23:02:37Z 2019-11-19T23:02:37Z MEMBER

Yeah, those are pretty normal. Your suggestions look right: you can keep both in both sets and eliminate any duplicate imports in the first.

(FYI, I edited your comment so it displayed properly; it seems you need a line break after `<details>`)

555745623 https://github.com/pydata/xarray/pull/2652#issuecomment-555745623 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDU1NTc0NTYyMw== r-beer 45787861 2019-11-19T22:27:10Z 2019-11-19T23:00:31Z NONE

Alright, I only got two merge conflicts in dataarray.py:

A minor merge conflict concerning imports:

1. accessors -> accessor_dt
2. broadcast has been dropped in master?

```python
<<<<<<< HEAD
from . import (
    computation,
    dtypes,
    groupby,
    indexing,
    ops,
    pdcompat,
    resample,
    rolling,
    utils,
)
from .accessor_dt import DatetimeAccessor
from .accessor_str import StringAccessor
from .alignment import (
    _broadcast_helper,
    _get_broadcast_dims_map_common_coords,
    align,
    reindex_like_indexers,
)
=======
from .accessors import DatetimeAccessor
from .alignment import align, reindex_like_indexers, broadcast
>>>>>>> added da.corr() and da.cov() to dataarray.py. Test added in test_dataarray.py, and tested using pytest.
```

Secondly, there are some bigger merge conflicts concerning some of DataArray's methods, but they seem not to be in conflict with each other:

1. integrate(), unify_chunks() and map_blocks() added in master
2. cov() and corr() added in the corr branch
3. It seems like str = property(StringAccessor) should be at the end of DataArray's definition, for mypy reasons...

```python
<<<<<<< HEAD
    def integrate(
        self, dim: Union[Hashable, Sequence[Hashable]], datetime_unit: str = None
    ) -> "DataArray":
        """Integrate the array with the trapezoidal rule.

        .. note::
            This feature is limited to simple cartesian geometry, i.e. dim
            must be one dimensional.

        Parameters
        ----------
        dim: hashable, or a sequence of hashable
            Coordinate(s) used for the integration.
        datetime_unit: str, optional
            Can be used to specify the unit if datetime coordinate is used.
            One of {'Y', 'M', 'W', 'D', 'h', 'm', 's', 'ms', 'us', 'ns',
            'ps', 'fs', 'as'}

        Returns
        -------
        integrated: DataArray

        See also
        --------
        numpy.trapz: corresponding numpy function

        Examples
        --------
        >>> da = xr.DataArray(np.arange(12).reshape(4, 3), dims=['x', 'y'],
        ...                   coords={'x': [0, 0.1, 1.1, 1.2]})
        >>> da
        <xarray.DataArray (x: 4, y: 3)>
        array([[ 0,  1,  2],
               [ 3,  4,  5],
               [ 6,  7,  8],
               [ 9, 10, 11]])
        Coordinates:
          * x        (x) float64 0.0 0.1 1.1 1.2
        Dimensions without coordinates: y
        >>>
        >>> da.integrate('x')
        <xarray.DataArray (y: 3)>
        array([5.4, 6.6, 7.8])
        Dimensions without coordinates: y
        """
        ds = self._to_temp_dataset().integrate(dim, datetime_unit)
        return self._from_temp_dataset(ds)

    def unify_chunks(self) -> "DataArray":
        """Unify chunk size along all chunked dimensions of this DataArray.

        Returns
        -------
        DataArray with consistent chunk sizes for all dask-array variables

        See Also
        --------
        dask.array.core.unify_chunks
        """
        ds = self._to_temp_dataset().unify_chunks()
        return self._from_temp_dataset(ds)

    def map_blocks(
        self,
        func: "Callable[..., T_DSorDA]",
        args: Sequence[Any] = (),
        kwargs: Mapping[str, Any] = None,
    ) -> "T_DSorDA":
        """
        Apply a function to each chunk of this DataArray. This method is
        experimental and its signature may change.

        Parameters
        ----------
        func: callable
            User-provided function that accepts a DataArray as its first
            parameter. The function will receive a subset of this DataArray,
            corresponding to one chunk along each chunked dimension. ``func``
            will be executed as ``func(obj_subset, *args, **kwargs)``.

            The function will be first run on mocked-up data, that looks like
            this array but has sizes 0, to determine properties of the
            returned object such as dtype, variable names, new dimensions and
            new indexes (if any). This function must return either a single
            DataArray or a single Dataset.

            This function cannot change size of existing dimensions, or add
            new chunked dimensions.
        args: Sequence
            Passed verbatim to func after unpacking, after the sliced
            DataArray. xarray objects, if any, will not be split by chunks.
            Passing dask collections is not allowed.
        kwargs: Mapping
            Passed verbatim to func after unpacking. xarray objects, if any,
            will not be split by chunks. Passing dask collections is not
            allowed.

        Returns
        -------
        A single DataArray or Dataset with dask backend, reassembled from the
        outputs of the function.

        Notes
        -----
        This method is designed for when one needs to manipulate a whole
        xarray object within each chunk. In the more common case where one
        can work on numpy arrays, it is recommended to use apply_ufunc.

        If none of the variables in this DataArray is backed by dask, calling
        this method is equivalent to calling ``func(self, *args, **kwargs)``.

        See Also
        --------
        dask.array.map_blocks, xarray.apply_ufunc, xarray.map_blocks,
        xarray.Dataset.map_blocks
        """
        from .parallel import map_blocks

        return map_blocks(func, self, args, kwargs)

    # this needs to be at the end, or mypy will confuse with `str`
    # https://mypy.readthedocs.io/en/latest/common_issues.html#dealing-with-conflicting-names
    str = property(StringAccessor)
=======
    def cov(self, other, dim=None):
        """Compute covariance between two DataArray objects along a shared dimension.

        Parameters
        ----------
        other: DataArray
            The other array with which the covariance will be computed
        dim: The dimension along which the covariance will be computed

        Returns
        -------
        covariance: DataArray
        """
        # 1. Broadcast the two arrays
        self, other = broadcast(self, other)

        # 2. Ignore the nans
        valid_values = self.notnull() & other.notnull()
        self = self.where(valid_values, drop=True)
        other = other.where(valid_values, drop=True)
        valid_count = valid_values.sum(dim)

        # 3. Compute mean and standard deviation along the given dim
        demeaned_self = self - self.mean(dim=dim)
        demeaned_other = other - other.mean(dim=dim)

        # 4. Compute covariance along the given dim
        cov = (demeaned_self * demeaned_other).sum(dim=dim) / valid_count

        return cov

    def corr(self, other, dim=None):
        """Compute correlation between two DataArray objects along a shared dimension.

        Parameters
        ----------
        other: DataArray
            The other array with which the correlation will be computed
        dim: The dimension along which the correlation will be computed

        Returns
        -------
        correlation: DataArray
        """
        # 1. Broadcast the two arrays
        self, other = broadcast(self, other)

        # 2. Ignore the nans
        valid_values = self.notnull() & other.notnull()
        self = self.where(valid_values, drop=True)
        other = other.where(valid_values, drop=True)

        # 3. Compute correlation based on standard deviations and cov()
        self_std = self.std(dim=dim)
        other_std = other.std(dim=dim)

        return self.cov(other, dim=dim) / (self_std * other_std)
>>>>>>> added da.corr() and da.cov() to dataarray.py. Test added in test_dataarray.py, and tested using pytest.
```
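The cov() logic in that conflict can be exercised outside xarray. Below is a minimal NumPy sketch of the same steps (broadcast, mask NaNs, demean, divide by the count of valid pairs); `pairwise_cov` is a hypothetical helper written for illustration, not code from the PR:

```python
import numpy as np

def pairwise_cov(x, y, ddof=0):
    """NaN-aware covariance along the last axis, mirroring the PR's steps:
    1. broadcast, 2. mask positions where either input is NaN,
    3. demean using only valid values, 4. sum of products / valid count."""
    x, y = np.broadcast_arrays(x, y)
    valid = ~np.isnan(x) & ~np.isnan(y)
    n = valid.sum(axis=-1)
    x_mean = np.where(valid, x, 0.0).sum(axis=-1) / n
    y_mean = np.where(valid, y, 0.0).sum(axis=-1) / n
    prod = np.where(valid, (x - x_mean[..., None]) * (y - y_mean[..., None]), 0.0)
    return prod.sum(axis=-1) / (n - ddof)

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])
# with no NaNs this matches numpy's population covariance
print(pairwise_cov(x, y))  # same as np.cov(x, y, bias=True)[0, 1], i.e. 2.5
```

With NaNs present, the result differs from simply dropping whole rows only in that each pair is kept or dropped elementwise, which is exactly what the `where(valid_values)` step above achieves.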

Could you please comment on my suggested changes (accepting either the changes from master, or both sides where there is no conflict)?

555738102 https://github.com/pydata/xarray/pull/2652#issuecomment-555738102 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDU1NTczODEwMg== max-sixty 5635139 2019-11-19T22:06:49Z 2019-11-19T22:06:49Z MEMBER

Yes 100%! Let me know if that doesn't work!

555737000 https://github.com/pydata/xarray/pull/2652#issuecomment-555737000 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDU1NTczNzAwMA== r-beer 45787861 2019-11-19T22:03:46Z 2019-11-19T22:03:46Z NONE

@max-sixty, thanks for the fast response!

Yeah, I get the traceback and already started diving into it. However, I assumed that @hrishikeshac's branch "corr" wasn't up-to-date.

Shall I merge changes from master or develop into corr, before looking further into the tests?

I read http://xarray.pydata.org/en/stable/contributing.html; is this identical to contributing.rst? Following those guidelines, I would use the following commands to "retrieve the changes from the master branch":

```bash
git fetch upstream
git rebase upstream/master
```

Where upstream = https://github.com/pydata/xarray.git?

555733366 https://github.com/pydata/xarray/pull/2652#issuecomment-555733366 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDU1NTczMzM2Ng== max-sixty 5635139 2019-11-19T21:54:26Z 2019-11-19T21:54:26Z MEMBER

Great @r-beer , we can be helpful in getting you up & running

Given this branch has diverged from master, I would make your own fork and merge in master; looks like you'll have some minor conflicts. (more details in our contributing.rst docs, or post here if confused). You can then open up your own draft PR.

Re the tests: pytest should print a list of the tests that failed and their stack traces, do you not see anything?

555730897 https://github.com/pydata/xarray/pull/2652#issuecomment-555730897 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDU1NTczMDg5Nw== r-beer 45787861 2019-11-19T21:48:07Z 2019-11-19T21:48:07Z NONE

Dear @Hoeze, I will (try to) finalize this merge request, as I am also very interested in this functionality.

I am new to xarray and contribution. I downloaded @hrishikeshac's code and ran the pytest tests locally.

I get 17 failed, 2088 passed, 2896 skipped, 7 xfailed, 754 warnings in 49.95s.

Is there an elegant way to share "which tests failed where", in order to avoid trying to fix tests that might already have been fixed in other branches?

In the meantime, I will start to get a better understanding of why the tests fail and try to fix them.

511165269 https://github.com/pydata/xarray/pull/2652#issuecomment-511165269 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDUxMTE2NTI2OQ== Hoeze 1200058 2019-07-14T01:17:54Z 2019-07-14T01:17:54Z NONE

Is this pull request still up to date?

452782113 https://github.com/pydata/xarray/pull/2652#issuecomment-452782113 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDQ1Mjc4MjExMw== hrishikeshac 6334793 2019-01-09T17:32:12Z 2019-01-09T17:32:12Z NONE

I also think making this a function is probably a good idea, even though it's different from pandas.

One question: how should these functions align their arguments? Recall that xarray does an inner join for arithmetic (though there's an option to control this), and an outer join in most other cases. It's not entirely obvious to me what the right choice is here (or if it really even matters).

I always assumed an inner join is the way to go. I had initially just implemented align, but later changed to broadcast, since align doesn't add dimensions/labels (if missing in one of the inputs) to the output, but broadcast does. Without this, the where(valid_values) doesn't work if one input is 1-D and the other is N-D.
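The 1-D vs N-D point can be seen with plain NumPy: without broadcasting, a 1-D input has no counterpart for the extra dimension, so a joint validity mask can't be formed elementwise. A small sketch under that analogy (not xarray code; xarray's `broadcast()` additionally carries labels, which `np.broadcast_arrays` does not):

```python
import numpy as np

# x is 1-D (time,), y is 2-D (time, space), each with a missing value
x = np.array([1.0, np.nan, 3.0])        # shape (3,)
y = np.arange(6.0).reshape(3, 2)        # shape (3, 2)
y[0, 1] = np.nan

# Broadcasting first (as the PR does with xarray's broadcast()) gives both
# inputs the full (time, space) shape, so one validity flag exists per pair:
xb, yb = np.broadcast_arrays(x[:, None], y)  # both now (3, 2)
valid = ~np.isnan(xb) & ~np.isnan(yb)

print(valid.shape)   # (3, 2): one flag per (time, space) pair
print(valid.sum())   # 3 valid pairs out of 6
```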

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
452014148 https://github.com/pydata/xarray/pull/2652#issuecomment-452014148 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDQ1MjAxNDE0OA== shoyer 1217238 2019-01-07T17:29:45Z 2019-01-07T17:29:45Z MEMBER

I agree that the case for DataArray.dot is questionable. It sort of makes sense because numpy and pandas both have it as a method, but the @ operator is really a cleaner way to express this now that we're Python 3 only. (Speaking of which, why don't we support @ in xarray yet? :).)


451877705 https://github.com/pydata/xarray/pull/2652#issuecomment-451877705 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDQ1MTg3NzcwNQ== fujiisoup 6815844 2019-01-07T09:43:17Z 2019-01-07T09:43:17Z MEMBER

@max-sixty I am not sure whether DataArray.dot is the right choice. But in the cov case, I wonder whether a method sounds like computing a covariance of the DataArray itself rather than the cross-covariance with another DataArray.

451840629 https://github.com/pydata/xarray/pull/2652#issuecomment-451840629 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDQ1MTg0MDYyOQ== max-sixty 5635139 2019-01-07T07:04:46Z 2019-01-07T07:04:46Z MEMBER

Probably a function like xr.cov(x, y) is better than method?

We should be more concerned with correctness than consistency - but is having DataArray.dot consistent with DataArray.cov? If not, what's the salient difference?

451703221 https://github.com/pydata/xarray/pull/2652#issuecomment-451703221 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDQ1MTcwMzIyMQ== shoyer 1217238 2019-01-06T00:05:49Z 2019-01-06T00:05:49Z MEMBER

I also think making this a function is probably a good idea, even though it's different from pandas.

One question: how should these functions align their arguments? Recall that xarray does an inner join for arithmetic (though there's an option to control this), and an outer join in most other cases. It's not entirely obvious to me what the right choice is here (or if it really even matters).

451661466 https://github.com/pydata/xarray/pull/2652#issuecomment-451661466 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDQ1MTY2MTQ2Ng== dcherian 2448579 2019-01-05T14:44:28Z 2019-01-05T14:44:28Z MEMBER

I agree with @fujiisoup

451658613 https://github.com/pydata/xarray/pull/2652#issuecomment-451658613 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDQ1MTY1ODYxMw== fujiisoup 6815844 2019-01-05T14:07:30Z 2019-01-05T14:07:30Z MEMBER

I am a little worried that users could misunderstand cov as (auto-)covariance rather than the cross-covariance we are implementing here. Probably a function like xr.cov(x, y) is better than a method?
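The naming worry maps onto NumPy's own API: np.cov on a single stacked argument returns a full covariance matrix (auto-covariances on the diagonal, cross-covariances off it), while the quantity discussed in this PR is one off-diagonal entry. A quick illustration of that distinction, for the naming point only:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

# np.cov on stacked rows returns the full covariance matrix:
# diagonal = auto-covariances (variances), off-diagonal = cross-covariance.
C = np.cov(np.stack([x, y]))
print(C.shape)    # (2, 2)

# The pairwise cross-covariance this PR computes is one off-diagonal entry:
print(C[0, 1])    # cross-covariance of x and y
print(C[0, 0])    # auto-covariance (variance) of x
```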

451602256 https://github.com/pydata/xarray/pull/2652#issuecomment-451602256 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDQ1MTYwMjI1Ng== hrishikeshac 6334793 2019-01-04T23:44:10Z 2019-01-04T23:44:10Z NONE

Made the code PEP8 compatible. Apologies for not doing so earlier.

451600030 https://github.com/pydata/xarray/pull/2652#issuecomment-451600030 https://api.github.com/repos/pydata/xarray/issues/2652 MDEyOklzc3VlQ29tbWVudDQ1MTYwMDAzMA== pep8speaks 24736507 2019-01-04T23:30:48Z 2019-01-04T23:39:41Z NONE

Hello @hrishikeshac! Thanks for updating the PR.

  • In the file xarray/core/dataarray.py, following are the PEP8 issues :

Line 2417:80: E501 line too long (85 > 79 characters)
Line 2448:80: E501 line too long (86 > 79 characters)

  • In the file xarray/tests/test_dataarray.py, following are the PEP8 issues :

Line 3312:80: E501 line too long (92 > 79 characters)
Line 3313:80: E501 line too long (89 > 79 characters)
Line 3342:80: E501 line too long (100 > 79 characters)

Comment last updated on January 04, 2019 at 23:39 Hours UTC

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);