issue_comments


7 rows where author_association = "CONTRIBUTOR" and user = 9466648 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
977725089 https://github.com/pydata/xarray/pull/5731#issuecomment-977725089 https://api.github.com/repos/pydata/xarray/issues/5731 IC_kwDOAMm_X846Ruah Gijom 9466648 2021-11-24T10:08:04Z 2021-11-24T10:08:04Z CONTRIBUTOR

Sorry for the unresponsiveness. I confirm that using the asfloat conversion also works for me. Should I close the issue, or should I let you do it?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make xr.corr and xr.map_blocks work without dask 976790237
904403621 https://github.com/pydata/xarray/pull/5731#issuecomment-904403621 https://api.github.com/repos/pydata/xarray/issues/5731 IC_kwDOAMm_X8416Bql Gijom 9466648 2021-08-24T07:43:28Z 2021-08-31T14:41:01Z CONTRIBUTOR

I included the proposed changes, thanks for the help.

However, the new test I implemented does not pass and I fail to understand why. Would you be able to check? The idea of the test is to ensure that lazy computations give the same results as normal ones. I tracked down the problem to line 1377 of `computation.py`:

```python
demeaned_da_a = da_a - da_a.mean(dim=dim)  # <-- this one returns nan upon computation (.compute())
demeaned_da_b = da_b - da_b.mean(dim=dim)  # <-- this one returns the correct value although the input is masked with nans in the same way
```

The values before the `mean` call are:

```python
da_a.compute() = <xarray.DataArray <this-array> (x: 2, time: 2)>
array([[ 1.,  2.],
       [ 2., nan]])
da_b.compute() = <xarray.DataArray <this-array> (x: 2, time: 2)>
array([[ 1.,  2.],
       [ 1., nan]])
```

And for the means (`dim` is None):

```python
da_a.mean(dim=dim).compute() = <xarray.DataArray <this-array> ()>
array(nan)
da_b.mean(dim=dim).compute() = <xarray.DataArray <this-array> ()>
array(1.33333333)
```

For non-lazy computations everything seems fine.
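The behaviour described above mirrors plain NumPy semantics: a mean that does not skip NaNs propagates them, while a NaN-skipping mean recovers a finite value. A minimal NumPy sketch with the same array shape as `da_a` above (illustrative, not the failing test itself):

```python
import numpy as np

# Array shaped like da_a above, with one masked (NaN) entry.
a = np.array([[1., 2.],
              [2., np.nan]])

print(np.mean(a))     # plain mean: the NaN propagates, result is nan
print(np.nanmean(a))  # NaN-skipping mean over the three valid entries
```

Whether the lazy computation takes the skipping or the propagating path at each step is exactly what determines whether `nan` or a finite value comes back from `.compute()`.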

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make xr.corr and xr.map_blocks work without dask 976790237
909300955 https://github.com/pydata/xarray/pull/5731#issuecomment-909300955 https://api.github.com/repos/pydata/xarray/issues/5731 IC_kwDOAMm_X842MtTb Gijom 9466648 2021-08-31T14:40:24Z 2021-08-31T14:40:24Z CONTRIBUTOR

I was referring to: https://github.com/pydata/xarray/pull/5731#issuecomment-904403621

Otherwise I am working on the conflict with the main branch. Should be done soon.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make xr.corr and xr.map_blocks work without dask 976790237
909095076 https://github.com/pydata/xarray/pull/5731#issuecomment-909095076 https://api.github.com/repos/pydata/xarray/issues/5731 IC_kwDOAMm_X842L7Ck Gijom 9466648 2021-08-31T10:10:15Z 2021-08-31T10:10:15Z CONTRIBUTOR

@keewis indeed this was the problem. I removed it and corrected one test which did not have the `@requires_dask` decorator.

There is still a test error with the test arrays 5 and 6: the lazy array version does not return the same values as the usual `corr`. I consider this a different bug, so I just removed those tests, with a TODO in the code.

The last commit passes the tests on my machine, and I added the whats-new text to version 0.19.1.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make xr.corr and xr.map_blocks work without dask 976790237
903574187 https://github.com/pydata/xarray/pull/5731#issuecomment-903574187 https://api.github.com/repos/pydata/xarray/issues/5731 IC_kwDOAMm_X84123Kr Gijom 9466648 2021-08-23T08:57:36Z 2021-08-23T08:57:36Z CONTRIBUTOR

Concerning the tests, I added a test to check that lazy correlations give results identical to non-lazy correlations. But this is not directly related to the bug fix. Indeed, there is a need for a test that checks that the non-lazy correlation works without dask installed. The already existing test function `test_corr` does not have a `@requires_dask` decorator, yet it did not fail as it should have. Not sure why.
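For context, a minimal sketch of the kind of skip-when-missing decorator being discussed (an illustrative stand-in, not xarray's actual `requires_dask`, which is built on pytest's skip markers):

```python
import functools
import importlib.util

def requires_dask(test_func):
    """Skip a test when dask is not importable (illustrative sketch)."""
    @functools.wraps(test_func)
    def wrapper(*args, **kwargs):
        # find_spec probes importability without actually importing dask
        if importlib.util.find_spec("dask") is None:
            print(f"SKIPPED {test_func.__name__}: dask not installed")
            return None
        return test_func(*args, **kwargs)
    return wrapper

@requires_dask
def test_corr_lazy():
    # would exercise the dask code path here
    return "ran"
```

A test missing this decorator would try to run its dask code path unconditionally, which is why `test_corr` was expected to fail in a dask-free environment.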

About documenting the changes, I am not sure of the whats-new.rst format and thus did not add anything. Should I just add a bullet item at the top of the file?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Make xr.corr and xr.map_blocks work without dask 976790237
902657635 https://github.com/pydata/xarray/issues/5715#issuecomment-902657635 https://api.github.com/repos/pydata/xarray/issues/5715 IC_kwDOAMm_X841zXZj Gijom 9466648 2021-08-20T12:30:19Z 2021-08-20T12:30:33Z CONTRIBUTOR

I had a look at it this morning and I think I managed to solve the issue by replacing the calls to `dask.is_dask_collection` with `is_duck_dask_array` from the `pycompat` module.
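The advantage of the duck-typed check is that it needs no dask import at all. A rough sketch of the idea, assuming detection via the `__dask_graph__` protocol that dask collections implement (xarray's real `is_duck_dask_array` additionally checks for the full array interface):

```python
def is_duck_dask_array(value):
    # Duck-typed test: dask collections expose __dask_graph__, and arrays
    # expose shape, so we can detect a dask array without importing dask.
    return hasattr(value, "__dask_graph__") and hasattr(value, "shape")

class FakeDaskArray:
    """Stand-in object mimicking the dask collection protocol."""
    shape = (2, 2)

    def __dask_graph__(self):
        return {}

print(is_duck_dask_array(FakeDaskArray()))  # True
print(is_duck_dask_array([1, 2, 3]))        # False
```

Because the check never touches the `dask` module, it works identically whether or not dask is installed, which is precisely what the bug fix needs.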

For (successful) testing I used the same code as above plus the following:

```python
ds_dask = ds.chunk({"t": 10})

yy = xr.corr(ds['y'], ds['y']).to_numpy()
yy_dask = xr.corr(ds_dask['y'], ds_dask['y']).to_numpy()
yx = xr.corr(ds['y'], ds['x']).to_numpy()
yx_dask = xr.corr(ds_dask['y'], ds_dask['x']).to_numpy()
np.testing.assert_allclose(yy, yy_dask, err_msg="YY: {} is different from {}".format(yy, yy_dask))
np.testing.assert_allclose(yx, yx_dask, err_msg="YX: {} is different from {}".format(yx, yx_dask))
```

The results are not exactly identical, but almost, which is probably due to numerical approximations from the multiple computations in the dask case.

I also tested the correlation of simple DataArrays without dask installed, and the results seem coherent (close to 0 for uncorrelated data and very close to 1 when correlating identical variables).

Should I make a pull request? Should I implement this test? Any others?

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  Dask error on xarray.corr 974488736
902516397 https://github.com/pydata/xarray/issues/5715#issuecomment-902516397 https://api.github.com/repos/pydata/xarray/issues/5715 IC_kwDOAMm_X841y06t Gijom 9466648 2021-08-20T08:09:46Z 2021-08-20T08:10:50Z CONTRIBUTOR

The error originally comes from the call to `da_a = da_a.map_blocks(_get_valid_values, args=[da_b])`, whose aim is to remove nan values from both DataArrays. I am confused by this, given that the code lines below seem to accomplish something similar (despite the comment saying it should not):

```python
# 4. Compute covariance along the given dim
# N.B. `skipna=False` is required or there is a bug when computing
# auto-covariance. E.g. Try xr.cov(da, da) for
# da = xr.DataArray([[1, 2], [1, np.nan]], dims=["x", "time"])
cov = (demeaned_da_a * demeaned_da_b).sum(dim=dim, skipna=True, min_count=1) / (valid_count)
```
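As a plain-NumPy analogue of that covariance line, here is a sketch with made-up 1-D data, dividing by the valid count as the quoted line does (no ddof correction):

```python
import numpy as np

# Hypothetical inputs with one masked (NaN) position.
a = np.array([1., 2., np.nan])
b = np.array([2., 4., 6.])

# Keep only positions where both inputs are valid.
valid = ~np.isnan(a) & ~np.isnan(b)
valid_count = valid.sum()

# Demean over the valid entries, then sum the products while skipping NaNs.
demeaned_a = np.where(valid, a - a[valid].mean(), np.nan)
demeaned_b = np.where(valid, b - b[valid].mean(), np.nan)
cov = np.nansum(demeaned_a * demeaned_b) / valid_count

print(cov)  # 0.5
```

This is why the `skipna`/`min_count` arguments already drop the invalid positions during the reduction, which is what makes the separate `map_blocks` masking step look redundant.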

In any case, the parallel module imports dask in a try/except block to ignore the import error. So it is not a surprise that using dask later raises an error if the import failed. I can see two possibilities:
  • encapsulate all dask calls in a similar try/except block
  • set a boolean flag in the first place and run the dask-specific code only if dask was correctly imported

Now, I do not have the big picture here, so there are probably better solutions.
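The second option is the common optional-dependency pattern; a minimal sketch (the flag and function names are illustrative, not xarray's actual ones):

```python
# Guarded optional import: record availability instead of failing later.
try:
    import dask.array as da
    DASK_AVAILABLE = True
except ImportError:
    da = None
    DASK_AVAILABLE = False

def maybe_chunked_sum(values):
    """Use dask when available, otherwise fall back to a plain sum."""
    if DASK_AVAILABLE:
        return int(da.from_array(values, chunks=2).sum().compute())
    return sum(values)

print(maybe_chunked_sum([1, 2, 3]))  # 6
```

The flag is set exactly once at import time, so every later call site can branch on it without repeating the try/except.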

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dask error on xarray.corr 974488736

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
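The filter behind this page ("7 rows where author_association = 'CONTRIBUTOR' and user = 9466648 sorted by updated_at descending") can be reproduced against that schema; a self-contained Python sketch using an in-memory SQLite database with made-up rows (foreign-key references omitted for brevity):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER
);
""")

# Two made-up rows: only the first matches the page's filter.
conn.executemany(
    "INSERT INTO issue_comments (id, user, author_association, updated_at, body)"
    " VALUES (?, ?, ?, ?, ?)",
    [
        (1, 9466648, "CONTRIBUTOR", "2021-11-24T10:08:04Z", "comment a"),
        (2, 12345, "MEMBER", "2021-08-20T08:09:46Z", "comment b"),
    ],
)

rows = conn.execute(
    "SELECT id FROM issue_comments"
    " WHERE author_association = 'CONTRIBUTOR' AND user = 9466648"
    " ORDER BY updated_at DESC"
).fetchall()
print(rows)  # [(1,)]
```

The two indexes on `issue` and `user` exist precisely to make filters like this one fast.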
Powered by Datasette · Queries took 14.363ms · About: xarray-datasette