
issue_comments


5 rows where user = 1270651 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
518349031 https://github.com/pydata/xarray/pull/3117#issuecomment-518349031 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxODM0OTAzMQ== nvictus 1270651 2019-08-05T18:35:43Z 2019-08-05T18:35:43Z CONTRIBUTOR

Sounds good!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
518060701 https://github.com/pydata/xarray/pull/3117#issuecomment-518060701 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxODA2MDcwMQ== nvictus 1270651 2019-08-05T02:10:58Z 2019-08-05T02:12:39Z CONTRIBUTOR

So, tests are passing now and I've documented the expected failures on sparse arrays. :)

As mentioned before, most fall into the categories of (1) implicit coercion to dense and (2) missing operations on sparse arrays.

Turns out a lot of the implicit coercion is due to the use of routines from bottleneck. A few other cases of using np.asarray remain; sometimes the coercion happens through the use of the .values attribute. At the moment, the behavior of Variables and DataArrays is such that .data provides the duck array and .values coerces to numpy, following the original behavior for dask arrays -- which made me realize, we never asked whether this behavior is desired in general.
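The `.data` / `.values` split described here can be illustrated with a minimal sketch. `DuckVariable` and `ToyDuck` below are hypothetical stand-ins, not xarray's actual `Variable` or any real duck-array library:

```python
import numpy as np

class DuckVariable:
    """Hypothetical stand-in for xarray's Variable, showing the split
    between .data (the wrapped duck array) and .values (coerced)."""

    def __init__(self, data):
        self._data = data

    @property
    def data(self):
        # Hand back the stored duck array untouched.
        return self._data

    @property
    def values(self):
        # Coerce to a plain ndarray, as xarray does via np.asarray.
        return np.asarray(self._data)

class ToyDuck:
    """Tiny duck array that numpy can coerce through __array__."""
    def __init__(self, arr):
        self.arr = np.asarray(arr)
    def __array__(self, dtype=None, copy=None):
        return self.arr.astype(dtype) if dtype else self.arr

v = DuckVariable(ToyDuck([1, 2, 3]))
```

Here `v.data` stays a `ToyDuck`, while `v.values` comes back as a plain `ndarray` -- the distinction the comment is asking about generalizing.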

I also modified Variable.load() to be a no-op for duck arrays (unless it is a dask array), so now compute() works. To make equality comparisons work between dense and sparse, I modified as_like_arrays() to coerce all arguments to sparse if at least one is sparse. As a side-effect, a.identical(b) can return True if one is dense and the other sparse. Again, here we're sort of mirroring the existing behavior when the duck array is a dask array, but this may seem questionable to extend to other duck arrays.
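The coercion rule described above can be sketched as follows. `ToyCOO` is a self-contained stand-in for `sparse.COO` (xarray's real `as_like_arrays` helper is more involved):

```python
import numpy as np

class ToyCOO:
    """Stand-in for sparse.COO: just enough to demonstrate the rule."""
    def __init__(self, dense):
        self._dense = np.asarray(dense)

    @classmethod
    def from_numpy(cls, arr):
        return cls(arr)

    def todense(self):
        return self._dense

def as_like_arrays(*arrays):
    # If at least one argument is sparse, coerce *all* arguments to
    # sparse so that elementwise comparisons are well-defined.
    if any(isinstance(a, ToyCOO) for a in arrays):
        return tuple(a if isinstance(a, ToyCOO) else ToyCOO.from_numpy(a)
                     for a in arrays)
    return arrays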

If bottleneck is turned off, most remaining issues are due to missing implementations in sparse. Some of these should be easy to add: PR pydata/sparse/pull/261 already exposes the result_type function.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
517367105 https://github.com/pydata/xarray/pull/3117#issuecomment-517367105 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxNzM2NzEwNQ== nvictus 1270651 2019-08-01T16:43:36Z 2019-08-01T16:43:36Z CONTRIBUTOR

Thanks for bumping this @mrocklin! I've put in some extra work on my free time, which hasn't been pushed yet. I'll try to write up a summary of my findings today. Briefly though, it seems like the two limiting factors for NEP18 duck array support are:

  1. Operations which ultimately coerce duck arrays to ndarrays (e.g. via np.asarray). Many, but maybe not all, of these operations can be fixed to dispatch to the duck array's implementation. But that leads to:

  2. Operations not supported by the duck type. This happens in a few cases with pydata/sparse, and would have to be solved upstream, unless it's a special case where it might be okay to coerce. e.g. what happens with binary operations that mix array types?

I think NEP18-backed xarray structures can be supported in principle, but it won't prevent some operations from simply failing in some contexts. So maybe xarray will need to define a minimum required implementation subset of the array API for duck arrays.
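The failure mode in point 2 is easy to reproduce with a toy NEP-18 implementer that only handles a small whitelist of functions. `PartialDuck` is hypothetical, not pydata/sparse:

```python
import numpy as np

HANDLED = {np.sum, np.mean}

class PartialDuck:
    """Toy __array_function__ implementer with a deliberately tiny API."""
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        if func not in HANDLED:
            # numpy turns this into a TypeError at the call site
            return NotImplemented
        unwrapped = [a.data if isinstance(a, PartialDuck) else a
                     for a in args]
        return func(*unwrapped, **kwargs)

d = PartialDuck([1, 2, 3])
np.sum(d)                   # dispatches to our implementation
# np.concatenate([d, d])    # raises TypeError: no implementation found
```

Any numpy function outside the whitelist fails with a TypeError rather than silently densifying, which is exactly why a documented minimum API subset would help.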

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
512577283 https://github.com/pydata/xarray/pull/3117#issuecomment-512577283 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxMjU3NzI4Mw== nvictus 1270651 2019-07-17T21:33:26Z 2019-07-17T21:33:26Z CONTRIBUTOR

Hmm, looks like the DataWithCoords base type might come in handy here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
512573950 https://github.com/pydata/xarray/pull/3117#issuecomment-512573950 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxMjU3Mzk1MA== nvictus 1270651 2019-07-17T21:22:47Z 2019-07-17T21:22:47Z CONTRIBUTOR

After writing more tests, it turns out sparse-backed xarrays were generating strange coordinates. On closer inspection, this is because sparse.COO objects have their own coords attribute, which stores the indices of their nonzeros.

With a serendipitous shape and density, a sparse array has exactly the right number of coords for them to get converted into xarray coords:

```
>>> S = sparse.random((100, 100))
>>> S.coords
array([[ 0,  0,  3,  3,  4,  5,  6,  7,  7,  7,  9, 10, 11, 14, 14, 16, 17,
        18, 19, 19, 21, 21, 21, 21, 22, 23, 23, 24, 24, 25, 26, 28, 29, 31,
        35, 35, 36, 36, 38, 39, 41, 42, 42, 43, 44, 46, 46, 47, 48, 48, 49,
        49, 49, 49, 50, 52, 53, 54, 55, 56, 57, 57, 58, 60, 60, 61, 61, 62,
        64, 64, 68, 70, 72, 73, 76, 77, 78, 79, 79, 80, 81, 83, 84, 85, 88,
        89, 90, 90, 90, 92, 92, 93, 93, 94, 94, 96, 97, 98, 98, 99],
       [14, 28, 58, 67, 66, 37, 50,  9, 66, 67,  6, 22, 44, 51, 64, 26, 53,
        91, 45, 81, 13, 25, 42, 86, 47, 22, 67, 77, 81, 22, 96, 18, 23, 96,
        34, 57,  6, 96, 91, 86, 75, 86, 88, 91, 30,  6, 65, 47, 61, 74, 14,
        73, 82, 83, 32, 42, 53, 68, 21, 38, 48, 50, 87,  8, 89, 22, 57, 60,
        92, 94, 19, 79, 38, 53, 32, 95, 69, 22, 46, 17, 17, 86, 36,  7, 71,
        35,  9, 58, 79, 22, 68, 10, 47, 48, 54, 72, 24, 47, 63, 86]])
>>> xr.DataArray(S)
<xarray.DataArray (dim_0: 100, dim_1: 100)>
<COO: shape=(100, 100), dtype=float64, nnz=100, fill_value=0.0>
Coordinates:
  * dim_0    (dim_0) int64 0 0 3 3 4 5 6 7 7 7 ... 92 93 93 94 94 96 97 98 98 99
  * dim_1    (dim_1) int64 14 28 58 67 66 37 50 9 66 ... 47 48 54 72 24 47 63 86
```

A simple fix is to special-case SparseArrays in the DataArray constructor and not extract the sparse coords attribute in that case, but that would require importing sparse.

Would it make sense to just assume that all non-DataArray NEP-18 compliant arrays do not contain an xarray-compliant coords attribute?
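One import-free version of that special case is to key off the NEP-18 protocol itself rather than the concrete sparse type. This is a hypothetical helper illustrating the question above, not the fix that landed:

```python
import numpy as np

def should_extract_coords(data):
    """Decide whether `data.coords` should be treated as xarray
    coordinates. Duck arrays (NEP-18 implementers other than ndarray)
    are assumed to use `coords` for internal state, the way sparse.COO
    does for its nonzero indices."""
    if not hasattr(data, "coords"):
        return False
    is_duck = (hasattr(data, "__array_function__")
               and not isinstance(data, np.ndarray))
    return not is_duck

class FakeCOO:
    """Stand-in for sparse.COO: has both a coords attribute and NEP-18."""
    coords = "internal nonzero indices"
    def __array_function__(self, func, types, args, kwargs):
        return NotImplemented

class Labeled:
    """Non-duck object whose coords really are coordinate labels."""
    coords = {"x": [0, 1]}
```

Under this rule `FakeCOO().coords` is ignored while `Labeled().coords` is extracted, which is the "assume duck arrays have no xarray-compliant coords" answer the question suggests.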

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
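For reference, the filtered view above ("5 rows where user = 1270651 sorted by updated_at descending") corresponds to a query along these lines, shown here against an in-memory copy of the schema with two abbreviated sample rows from this page:

```python
import sqlite3

# Recreate a trimmed copy of the issue_comments table in memory.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE issue_comments (
        html_url TEXT, issue_url TEXT, id INTEGER PRIMARY KEY,
        node_id TEXT, user INTEGER, created_at TEXT, updated_at TEXT,
        author_association TEXT, body TEXT, reactions TEXT,
        performed_via_github_app TEXT, issue INTEGER
    )
""")
con.execute("CREATE INDEX idx_issue_comments_user ON issue_comments (user)")

rows = [
    (None, None, 518349031, None, 1270651, "2019-08-05T18:35:43Z",
     "2019-08-05T18:35:43Z", "CONTRIBUTOR", "Sounds good!", None, None,
     467771005),
    (None, None, 512573950, None, 1270651, "2019-07-17T21:22:47Z",
     "2019-07-17T21:22:47Z", "CONTRIBUTOR", "After writing more tests...",
     None, None, 467771005),
]
con.executemany(
    "INSERT INTO issue_comments VALUES (?,?,?,?,?,?,?,?,?,?,?,?)", rows)

# The query behind the filtered, sorted view on this page.
ids = [r[0] for r in con.execute(
    "SELECT id FROM issue_comments WHERE user = 1270651 "
    "ORDER BY updated_at DESC")]
```

The newest comment (518349031) comes back first, matching the descending `updated_at` sort in the page header.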
Powered by Datasette · About: xarray-datasette