issue_comments

20 rows where author_association = "MEMBER" and issue = 777153550 sorted by updated_at descending


user (4)

  • max-sixty 16
  • shoyer 2
  • mrocklin 1
  • keewis 1

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
766462310 https://github.com/pydata/xarray/pull/4746#issuecomment-766462310 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc2NjQ2MjMxMA== max-sixty 5635139 2021-01-24T23:48:00Z 2021-01-24T23:48:00Z MEMBER

Let me know any post-merge feedback and I'll make the changes

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
763903671 https://github.com/pydata/xarray/pull/4746#issuecomment-763903671 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc2MzkwMzY3MQ== max-sixty 5635139 2021-01-20T20:12:36Z 2021-01-20T20:12:36Z MEMBER

Any final feedback before merging?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
760538940 https://github.com/pydata/xarray/pull/4746#issuecomment-760538940 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc2MDUzODk0MA== keewis 14808389 2021-01-14T23:18:16Z 2021-01-15T15:33:15Z MEMBER

it definitely is not: look at the builds of #4809: one build failed, and the next (about 12 minutes later) failed as well. I guess either sphinx, nbsphinx, or one of their dependencies got updated on conda-forge. nbformat (5.0.8 → 5.1.0), maybe?

Edit: nbformat=5.1.2 is out which should fix that issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
760532922 https://github.com/pydata/xarray/pull/4746#issuecomment-760532922 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc2MDUzMjkyMg== max-sixty 5635139 2021-01-14T23:07:16Z 2021-01-14T23:07:16Z MEMBER

Would anyone know whether the docs failure is related to this PR? I can't see anything in the log apart from matplotlib warnings: https://readthedocs.org/projects/xray/builds/12768566/

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
760532153 https://github.com/pydata/xarray/pull/4746#issuecomment-760532153 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc2MDUzMjE1Mw== max-sixty 5635139 2021-01-14T23:05:16Z 2021-01-14T23:05:16Z MEMBER

I double-checked the benchmarks and added a pandas comparison. That involved ensuring the missing value was handled correctly in them, and that the setup wasn't included in the benchmark.

I don't get the 100x speed-up that I thought I saw initially; it's now more like 8x. Still decent! I'm not sure whether that's because I misread the benchmark previously or because the benchmarks are slightly different; I'd guess the former.

Pasting the results below so we have something concrete.

Existing:

```
asv profile unstacking.Unstacking.time_unstack_slow master | head -n 20
···
unstacking.Unstacking.time_unstack_slow  861±20ms
```

Proposed:

```
asv profile unstacking.Unstacking.time_unstack_slow HEAD | head -n 20
···
unstacking.Unstacking.time_unstack_slow  108±3ms
```

Pandas:

```
asv profile unstacking.Unstacking.time_unstack_pandas_slow master | head -n 20
···
unstacking.Unstacking.time_unstack_pandas_slow  207±10ms
```

Are we OK with the claim vs pandas? I think it's important that we make accurate comparisons (both good and bad), but I'm open-minded if this one seems a bit aggressive. It's worth someone reviewing the code in the benchmark to ensure I haven't made a mistake.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
758143460 https://github.com/pydata/xarray/pull/4746#issuecomment-758143460 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1ODE0MzQ2MA== max-sixty 5635139 2021-01-11T18:37:48Z 2021-01-11T18:37:48Z MEMBER

Any final comments before merging?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
755421422 https://github.com/pydata/xarray/pull/4746#issuecomment-755421422 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1NTQyMTQyMg== mrocklin 306380 2021-01-06T16:50:12Z 2021-01-06T16:50:12Z MEMBER

If anyone here has time to review https://github.com/dask/dask/pull/7033 that would be greatly appreciated :)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
755420263 https://github.com/pydata/xarray/pull/4746#issuecomment-755420263 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1NTQyMDI2Mw== max-sixty 5635139 2021-01-06T16:48:30Z 2021-01-06T16:48:30Z MEMBER

As discussed in the dev meeting, https://github.com/dask/dask/pull/7033 would allow dask to use the fast path, and would likely allow our existing path to be removed eventually.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
754401859 https://github.com/pydata/xarray/pull/4746#issuecomment-754401859 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1NDQwMTg1OQ== max-sixty 5635139 2021-01-05T05:14:27Z 2021-01-05T05:14:27Z MEMBER

Tests now pass after merging master; I'm not sure whether the previous failures were flaky or something upstream changed...

Ready for a final review

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
753778517 https://github.com/pydata/xarray/pull/4746#issuecomment-753778517 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1Mzc3ODUxNw== max-sixty 5635139 2021-01-04T06:11:17Z 2021-01-04T06:11:17Z MEMBER

This still seems to be getting a bunch of pint failures, like this one: https://dev.azure.com/xarray/xarray/_build/results?buildId=4657&view=ms.vss-test-web.build-test-results-tab&runId=73796&resultId=110407&paneView=debug

I'm confused, since this PR now has no mention of pint, and I don't see any mention of unstacking in those test failures. I suspect I'm missing something. Any ideas for what could be happening?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
753683611 https://github.com/pydata/xarray/pull/4746#issuecomment-753683611 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1MzY4MzYxMQ== max-sixty 5635139 2021-01-03T22:14:17Z 2021-01-03T22:14:17Z MEMBER

Until https://github.com/pydata/xarray/pull/4751 is resolved, I've taken out the explicit pint check and replaced it with a numpy check.

The code is a bit messy now, with two levels of comments. But I've put references in, so it should be tractable. Let me know any feedback.

Otherwise this is ready to go from my end.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
753430518 https://github.com/pydata/xarray/pull/4746#issuecomment-753430518 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1MzQzMDUxOA== max-sixty 5635139 2021-01-02T04:37:26Z 2021-01-02T04:37:26Z MEMBER

I'm not sure whether I'm making some very basic mistake, but I'm seeing what seems like a very surprising error.

After the most recent commit, which seems to do very little apart from importing pint iff it's available (https://github.com/pydata/xarray/pull/4746/commits/b33adedbfbd92df0f4188568691c7e2915bf8c19), I'm getting a lot of pint errors, unrelated to unstack / stack.

Here's the results from the run prior: https://dev.azure.com/xarray/xarray/_build/results?buildId=4650&view=ms.vss-test-web.build-test-results-tab

And from this run: https://dev.azure.com/xarray/xarray/_build/results?buildId=4651&view=ms.vss-test-web.build-test-results-tab

Any ideas what's happening? As ever, there's a 30% chance I made an obvious typo that I can't see... Thanks in advance.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
753427050 https://github.com/pydata/xarray/pull/4746#issuecomment-753427050 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1MzQyNzA1MA== max-sixty 5635139 2021-01-02T03:56:19Z 2021-01-02T03:56:19Z MEMBER

Thank you @jthielen !

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
753425993 https://github.com/pydata/xarray/pull/4746#issuecomment-753425993 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1MzQyNTk5Mw== max-sixty 5635139 2021-01-02T03:43:06Z 2021-01-02T03:43:06Z MEMBER

I imagine I'm making some basic error here, but what's the best approach for evaluating whether an array is a pint array?

`isinstance(self.data, unit_registry.Quantity)` returns False, though that seems to be what we do in test_units.py?

```
(Pdb) unit_registry
<pint.registry.UnitRegistry object at 0x13d4574c0>
(Pdb) unit_registry.Quantity
<class 'pint.quantity.build_quantity_class.<locals>.Quantity'>
(Pdb) isinstance(self.data, unit_registry.Quantity)
False
(Pdb) self.data
<Quantity([ 0.          0.20408163  0.40816327  0.6122449   0.81632653
  1.02040816  1.2244898   1.42857143  1.63265306  1.83673469  2.04081633
  2.24489796  2.44897959  2.65306122  2.85714286  3.06122449  3.26530612
  3.46938776  3.67346939  3.87755102  4.08163265  4.28571429  4.48979592
  4.69387755  4.89795918  5.10204082  5.30612245  5.51020408  5.71428571
  5.91836735  6.12244898  6.32653061  6.53061224  6.73469388  6.93877551
  7.14285714  7.34693878  7.55102041  7.75510204  7.95918367  8.16326531
  8.36734694  8.57142857  8.7755102   8.97959184  9.18367347  9.3877551
  9.59183673  9.79591837 10.        ], 'meter')>
(Pdb) self.data.__class__
<class 'pint.quantity.build_quantity_class.<locals>.Quantity'>
```
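One likely explanation for the False result: each pint `UnitRegistry` dynamically builds its own `Quantity` class (note the `build_quantity_class.<locals>` in the repr), so an `isinstance` check against a *different* registry's `Quantity` fails even for a genuine pint array. A registry-independent check can match on the class itself. This is a sketch; `is_duck_pint_array` is a hypothetical helper, not xarray's actual API:

```python
def is_duck_pint_array(obj):
    # Match on the class's module and name rather than isinstance
    # against one registry's dynamically built Quantity class.
    cls = type(obj)
    return cls.__module__.startswith("pint") and cls.__name__ == "Quantity"

# Stand-in for a Quantity built by some other UnitRegistry:
FakeQuantity = type("Quantity", (), {})
FakeQuantity.__module__ = "pint.quantity"

print(is_duck_pint_array(FakeQuantity()))  # True
print(is_duck_pint_array([1, 2, 3]))       # False
```

This avoids importing pint (or holding a registry reference) just to do the check.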

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
753425173 https://github.com/pydata/xarray/pull/4746#issuecomment-753425173 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1MzQyNTE3Mw== max-sixty 5635139 2021-01-02T03:30:35Z 2021-01-02T03:30:35Z MEMBER

Thanks for responding @jthielen .

Yes, so full_like isn't creating a pint array:

```
(Pdb) self.data
<Quantity([ 0.          0.20408163  0.40816327  0.6122449   0.81632653
  1.02040816  1.2244898   1.42857143  1.63265306  1.83673469  2.04081633
  2.24489796  2.44897959  2.65306122  2.85714286  3.06122449  3.26530612
  3.46938776  3.67346939  3.87755102  4.08163265  4.28571429  4.48979592
  4.69387755  4.89795918  5.10204082  5.30612245  5.51020408  5.71428571
  5.91836735  6.12244898  6.32653061  6.53061224  6.73469388  6.93877551
  7.14285714  7.34693878  7.55102041  7.75510204  7.95918367  8.16326531
  8.36734694  8.57142857  8.7755102   8.97959184  9.18367347  9.3877551
  9.59183673  9.79591837 10.        ], 'meter')>
(Pdb) np.full_like(self.data, fill_value=fill_value)
array([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan])
(Pdb) fill_value
nan
```

I do think this is a bit surprising — while fill_value isn't typed, it's compatible with the existing type.

For the moment, I'll direct pint arrays to take the existing code path — I'm more confident that we don't want to special-case pint in the unstack routine.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
753416850 https://github.com/pydata/xarray/pull/4746#issuecomment-753416850 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1MzQxNjg1MA== shoyer 1217238 2021-01-02T02:00:19Z 2021-01-02T02:00:19Z MEMBER

@keewis any thoughts on the pint issue?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
753406775 https://github.com/pydata/xarray/pull/4746#issuecomment-753406775 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1MzQwNjc3NQ== max-sixty 5635139 2021-01-02T00:03:00Z 2021-01-02T00:03:00Z MEMBER

Is anyone familiar with this pint error? https://dev.azure.com/xarray/xarray/_build/results?buildId=4650&view=ms.vss-test-web.build-test-results-tab&runId=73654&resultId=111454&paneView=debug

It seems to be failing on the assignment `data[(..., *indexer)] = reordered`, rather than anything specific to unstacking.

Here's the stack trace from there:

```python
/home/vsts/work/1/s/xarray/core/variable.py:1627: in _unstack_once
    data[(..., *indexer)] = reordered
/home/vsts/work/1/s/xarray/core/common.py:131: in __array__
    return np.asarray(self.values, dtype=dtype)
/home/vsts/work/1/s/xarray/core/variable.py:543: in values
    return _as_array_or_item(self._data)
/home/vsts/work/1/s/xarray/core/variable.py:275: in _as_array_or_item
    data = np.asarray(data)
/usr/share/miniconda/envs/xarray-tests/lib/python3.8/site-packages/numpy/core/_asarray.py:83: in asarray
    return array(a, dtype, copy=False, order=order)

self = <Quantity([ 0  0  0  0  0  1  1  1  1  1  2  2  2  2  2  3  3  3  3  3  4
  4  4  4  4  5  5  5  5  5  6  6  6  6  6  7  7  7  7  7  8  8  8  8  8
  9  9  9  9 10], 'meter')>
t = None

    def __array__(self, t=None):
        warnings.warn(
            "The unit of the quantity is stripped when downcasting to ndarray.",
            UnitStrippedWarning,
            stacklevel=2,
        )

E   pint.errors.UnitStrippedWarning: The unit of the quantity is stripped when downcasting to ndarray.

/usr/share/miniconda/envs/xarray-tests/lib/python3.8/site-packages/pint/quantity.py:1683: UnitStrippedWarning
```

Worst case, I can direct pint arrays to the existing unstack path, but ideally this would work.
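The traceback shows the mechanism: assigning into a plain ndarray coerces the right-hand side through `np.asarray`, which calls the object's `__array__` and strips the units. A minimal stand-in class (hypothetical, not pint itself) reproduces the same coercion:

```python
import warnings
import numpy as np

class FakeQuantity:
    """Toy stand-in for a pint Quantity: warns when coerced to ndarray."""
    def __init__(self, values):
        self.values = np.asarray(values)

    def __array__(self, dtype=None, copy=None):
        warnings.warn("unit stripped when downcasting to ndarray")
        return self.values if dtype is None else self.values.astype(dtype)

out = np.zeros(3)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Item assignment coerces the RHS via np.asarray -> __array__:
    out[...] = FakeQuantity([1.0, 2.0, 3.0])

print(out.tolist(), len(caught) > 0)
```

So the warning isn't specific to unstacking at all: any item assignment of a pint array into a bare numpy array would trigger it.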

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
753391636 https://github.com/pydata/xarray/pull/4746#issuecomment-753391636 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1MzM5MTYzNg== max-sixty 5635139 2021-01-01T22:03:19Z 2021-01-01T22:03:19Z MEMBER

Great, thanks, I'm making that change.

Is there any need to keep the sparse kwarg? My inclination is to remove it and retain types — so to get a sparse array back, convert to sparse before unstacking?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
753282125 https://github.com/pydata/xarray/pull/4746#issuecomment-753282125 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1MzI4MjEyNQ== shoyer 1217238 2021-01-01T07:52:14Z 2021-01-01T07:52:14Z MEMBER

Very nice!

I think we can solve the issue with sparse by using `np.full_like`, which dispatches on NumPy 1.17+ via the NEP-18 `__array_function__` protocol (which I'm pretty sure sparse supports).

The bigger challenge I'm concerned about here is dask arrays, which don't support array assignment like this at all (https://github.com/dask/dask/issues/2000). We will probably need to keep around the slower option for dask, at least for now.
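For context, the NEP-18 dispatch being described works roughly like this: a duck array that implements `__array_function__` can intercept `np.full_like` and return its own type, the way sparse returns a sparse array. A minimal sketch (the `Duck` class is hypothetical, for illustration only):

```python
import numpy as np

class Duck:
    """Minimal duck array that intercepts np.full_like via NEP-18."""
    def __init__(self, arr):
        self.arr = np.asarray(arr)

    def __array_function__(self, func, types, args, kwargs):
        if func is not np.full_like:
            return NotImplemented
        a, fill_value = args[0], args[1]
        # Return a Duck instead of a bare ndarray, as sparse would
        # return a sparse array here.
        return Duck(np.full(a.arr.shape, fill_value, dtype=a.arr.dtype))

d = Duck([1.0, 2.0, 3.0])
filled = np.full_like(d, 0.0)
print(type(filled).__name__)  # Duck
```

This is why `np.full_like` is preferable to `np.full` here: the allocation step preserves the input's array type instead of always producing a bare ndarray.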

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550
753225789 https://github.com/pydata/xarray/pull/4746#issuecomment-753225789 https://api.github.com/repos/pydata/xarray/issues/4746 MDEyOklzc3VlQ29tbWVudDc1MzIyNTc4OQ== max-sixty 5635139 2020-12-31T23:38:33Z 2020-12-31T23:38:33Z MEMBER

Any ideas on how sparse arrays should be handled in unstack? Currently we use reindex, so this seems to pass through without much effort on our part.

In the new code, we're creating an array with np.full and then assigning to the appropriate locations. Can we do something similar that's not dependent on the underlying numpy / sparse / dask storage?
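The new fast path under discussion can be sketched roughly as follows: allocate the full output once, then scatter the stacked values into position. This is illustrative only; `unstack_fast` and its signature are a simplification, not xarray's actual `_unstack_once`:

```python
import numpy as np

def unstack_fast(reordered, indexer, new_shape, fill_value=np.nan):
    # Allocate the full output up front; positions not present in the
    # stacked input keep fill_value (NaN here).
    out = np.full(new_shape, fill_value)
    # Scatter the stacked values into their unstacked positions.
    out[(..., *indexer)] = reordered
    return out

# Three stacked values scattered into a 2x2 grid; position (1, 1) is missing.
idx = (np.array([0, 0, 1]), np.array([0, 1, 0]))
result = unstack_fast(np.array([1.0, 2.0, 3.0]), idx, (2, 2))
```

Note the assignment step is plain item assignment, which is exactly the operation that needs the underlying storage (numpy / sparse / dask) to cooperate.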

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Faster unstacking 777153550

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · About: xarray-datasette