issue_comments


17 rows where issue = 467771005 sorted by updated_at descending



user 8

  • shoyer 5
  • nvictus 5
  • mrocklin 2
  • dcherian 1
  • max-sixty 1
  • andersy005 1
  • codecov[bot] 1
  • pep8speaks 1

author_association 3

  • MEMBER 10
  • CONTRIBUTOR 5
  • NONE 2

issue 1

  • Support for __array_function__ implementers (sparse arrays) [WIP] · 17
id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
519591292 https://github.com/pydata/xarray/pull/3117#issuecomment-519591292 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxOTU5MTI5Mg== max-sixty 5635139 2019-08-08T16:27:09Z 2019-08-08T16:27:09Z MEMBER

Thanks @nvictus !

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
518355147 https://github.com/pydata/xarray/pull/3117#issuecomment-518355147 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxODM1NTE0Nw== mrocklin 306380 2019-08-05T18:53:39Z 2019-08-05T18:53:39Z MEMBER

Woot! Thanks @nvictus !

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
518354323 https://github.com/pydata/xarray/pull/3117#issuecomment-518354323 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxODM1NDMyMw== dcherian 2448579 2019-08-05T18:51:12Z 2019-08-05T18:51:12Z MEMBER

Thanks a lot, @nvictus

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
518349031 https://github.com/pydata/xarray/pull/3117#issuecomment-518349031 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxODM0OTAzMQ== nvictus 1270651 2019-08-05T18:35:43Z 2019-08-05T18:35:43Z CONTRIBUTOR

Sounds good!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
518320289 https://github.com/pydata/xarray/pull/3117#issuecomment-518320289 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxODMyMDI4OQ== shoyer 1217238 2019-08-05T17:14:15Z 2019-08-05T17:14:15Z MEMBER

@nvictus we are good to go ahead and merge, and do follow-ups in other PRs?

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
511157136 https://github.com/pydata/xarray/pull/3117#issuecomment-511157136 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxMTE1NzEzNg== pep8speaks 24736507 2019-07-13T22:06:04Z 2019-08-05T15:49:20Z NONE

Hello @nvictus! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2019-08-05 15:49:19 UTC
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
511157703 https://github.com/pydata/xarray/pull/3117#issuecomment-511157703 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxMTE1NzcwMw== codecov[bot] 22429695 2019-07-13T22:17:54Z 2019-08-05T14:27:54Z NONE

Codecov Report

Merging #3117 into master will increase coverage by 0.03%. The diff coverage is 91.89%.

```diff
@@            Coverage Diff             @@
##           master    #3117      +/-   ##
==========================================
+ Coverage    95.7%   95.73%   +0.03%
==========================================
  Files          63       63
  Lines       12861    12870       +9
==========================================
+ Hits        12308    12321      +13
+ Misses        553      549       -4
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
518074867 https://github.com/pydata/xarray/pull/3117#issuecomment-518074867 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxODA3NDg2Nw== shoyer 1217238 2019-08-05T03:43:22Z 2019-08-05T03:43:22Z MEMBER

> At the moment, the behavior of Variables and DataArrays is such that .data provides the duck array and .values coerces to numpy, following the original behavior for dask arrays -- which made me realize, we never asked if this behavior is desired in general?

I think the right behavior is probably for .values to be implemented by calling np.asarray() on .data. That means it should raise on sparse arrays.
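A minimal sketch of that proposal, using a hypothetical `DuckBacked` stand-in for a Variable and a `RefusesDense` stand-in for sparse.COO (neither is xarray's actual code):

```python
import numpy as np

class DuckBacked:
    """Hypothetical stand-in for a Variable holding a duck array."""
    def __init__(self, data):
        self.data = data  # .data hands back the duck array untouched

    @property
    def values(self):
        # proposed behavior: always coerce through np.asarray, so duck
        # arrays that refuse dense conversion raise here
        return np.asarray(self.data)

class RefusesDense:
    """Mimics sparse.COO, which raises rather than silently densify."""
    def __array__(self, dtype=None, copy=None):
        raise RuntimeError("cannot convert a sparse array to dense")

print(DuckBacked(np.arange(3)).values)  # plain ndarrays pass through

try:
    DuckBacked(RefusesDense()).values
except RuntimeError as e:
    print("raised:", e)
```

With this rule, `.values` keeps its "always numpy" contract, and the failure mode is an explicit error rather than a silent densification.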

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
518060701 https://github.com/pydata/xarray/pull/3117#issuecomment-518060701 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxODA2MDcwMQ== nvictus 1270651 2019-08-05T02:10:58Z 2019-08-05T02:12:39Z CONTRIBUTOR

So, tests are passing now and I've documented the expected failures on sparse arrays. :)

As mentioned before, most fall into the categories of (1) implicit coercion to dense and (2) missing operations on sparse arrays.

Turns out a lot of the implicit coercion is due to the use of routines from bottleneck. A few other cases of using np.asarray remain. Sometimes this is through the use of the .values attribute. At the moment, the behavior of Variables and DataArrays is such that .data provides the duck array and .values coerces to numpy, following the original behavior for dask arrays -- which made me realize, we never asked if this behavior is desired in general?

I also modified Variable.load() to be a no-op for duck arrays (unless it is a dask array), so now compute() works. To make equality comparisons work between dense and sparse, I modified as_like_arrays() to coerce all arguments to sparse if at least one is sparse. As a side-effect, a.identical(b) can return True if one is dense and the other sparse. Again, here we're sort of mirroring the existing behavior when the duck array is a dask array, but this may seem questionable to extend to other duck arrays.
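A toy sketch of the coercion rule just described, with a hypothetical `ToySparse` standing in for sparse.COO (this is not the real `as_like_arrays` implementation):

```python
import numpy as np

class ToySparse:
    """Hypothetical stand-in for sparse.COO."""
    def __init__(self, dense):
        self._dense = np.asarray(dense)

    def todense(self):
        return self._dense

def as_like_arrays(*arrays):
    # the rule described above: if at least one argument is sparse,
    # coerce every argument to sparse before comparing
    if any(isinstance(a, ToySparse) for a in arrays):
        return tuple(a if isinstance(a, ToySparse) else ToySparse(a)
                     for a in arrays)
    return arrays

a, b = as_like_arrays(np.array([0.0, 1.0]), ToySparse([0.0, 1.0]))
print(type(a).__name__, type(b).__name__)        # both coerced to ToySparse
print(np.array_equal(a.todense(), b.todense()))  # equal contents -> True
```

This is also why `a.identical(b)` can return True across dense and sparse: the comparison happens only after both sides have been coerced to the same representation.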

If bottleneck is turned off, most remaining issues are due to missing implementations in sparse. Some of these should be easy to add: PR pydata/sparse/pull/261 already exposes the result_type function.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
517400198 https://github.com/pydata/xarray/pull/3117#issuecomment-517400198 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxNzQwMDE5OA== shoyer 1217238 2019-08-01T18:14:38Z 2019-08-01T18:14:38Z MEMBER

> 2. Operations not supported by the duck type. This happens in a few cases with pydata/sparse, and would have to be solved upstream, unless it's a special case where it might be okay to coerce. e.g. what happens with binary operations that mix array types?

This is totally fine for now, as long as there are clear errors when attempting to do an unsupported operation. We can write unit tests with expected failures, which should provide a clear roadmap for things to fix upstream in sparse.

We could attempt to define a minimum required implementation, but in practice I suspect this will be hard to nail down definitively. The ultimate determinant of what works will be xarray's implementation.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
517367105 https://github.com/pydata/xarray/pull/3117#issuecomment-517367105 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxNzM2NzEwNQ== nvictus 1270651 2019-08-01T16:43:36Z 2019-08-01T16:43:36Z CONTRIBUTOR

Thanks for bumping this @mrocklin! I've put in some extra work on my free time, which hasn't been pushed yet. I'll try to write up a summary of my findings today. Briefly though, it seems like the two limiting factors for NEP18 duck array support are:

  1. Operations which ultimately coerce duck arrays to ndarrays (e.g. via np.asarray). Many, but maybe not all, of these operations can be fixed to dispatch to the duck array's implementation. But that leads to:

  2. Operations not supported by the duck type. This happens in a few cases with pydata/sparse, and would have to be solved upstream, unless it's a special case where it might be okay to coerce. e.g. what happens with binary operations that mix array types?

I think NEP18-backed xarray structures can be supported in principle, but it won't prevent some operations from simply failing in some contexts. So maybe xarray will need to define a minimum required implementation subset of the array API for duck arrays.
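Both failure modes can be reproduced without xarray or sparse. Below, a toy NEP-18 array (hypothetical, for illustration only) implements np.mean and nothing else; supported functions dispatch to it, while unsupported ones raise a clear TypeError, which is exactly the situation category (2) describes:

```python
import numpy as np

class MiniDuck:
    """Toy NEP-18 duck array that only implements np.mean."""
    def __init__(self, arr):
        self._arr = np.asarray(arr)

    def __array_function__(self, func, types, args, kwargs):
        if func is np.mean:
            # unwrap any MiniDuck arguments and delegate to numpy
            args = tuple(a._arr if isinstance(a, MiniDuck) else a
                         for a in args)
            return np.mean(*args, **kwargs)
        # returning NotImplemented makes numpy raise a clear TypeError
        return NotImplemented

d = MiniDuck([1.0, 2.0, 3.0])
print(np.mean(d))     # dispatched to the duck implementation -> 2.0

try:
    np.nanmean(d)     # category (2): no implementation in the duck type
except TypeError as e:
    print(e)
```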

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
517311370 https://github.com/pydata/xarray/pull/3117#issuecomment-517311370 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxNzMxMTM3MA== mrocklin 306380 2019-08-01T14:27:13Z 2019-08-01T14:27:13Z MEMBER

Checking in here. This was a fun project during SciPy Sprints that both showed a lot of potential and generated a lot of excitement. But of course as we all returned home other things came up and this has lingered for a while.

How can we best preserve this work? Two specific questions:

  1. @nvictus can you summarize, from your perspective, what still needs to be done here? If you aren't likely to have time to finish this up (which would not be surprising) where should someone else start if they wanted to push this forward?
  2. Xarray devs, are there parts of this PR that could go in quickly, even without the full implementation, just so that this doesn't grow stale?
{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
512583158 https://github.com/pydata/xarray/pull/3117#issuecomment-512583158 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxMjU4MzE1OA== shoyer 1217238 2019-07-17T21:53:50Z 2019-07-17T21:53:50Z MEMBER

> Would it make sense to just assume that all non-DataArray NEP-18 compliant arrays do not contain an xarray-compliant coords attribute?

Yes, let's switch `coords = getattr(data, 'coords', None)` to `if isinstance(data, DataArray): coords = data.coords`
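A runnable illustration of the difference, using stand-in classes (names are hypothetical; the real change would live in the DataArray constructor):

```python
class DataArrayLike:
    """Stand-in for xarray.DataArray."""
    def __init__(self, coords=None):
        self.coords = coords

class SparseLike:
    """Stand-in for sparse.COO: `coords` holds nonzero indices, not labels."""
    coords = [[0, 3], [14, 58]]

def extract_coords_old(data):
    # duck-typing on the attribute name wrongly picks up sparse's indices
    return getattr(data, 'coords', None)

def extract_coords_new(data):
    # only genuine DataArrays contribute coordinates
    return data.coords if isinstance(data, DataArrayLike) else None

print(extract_coords_old(SparseLike()))  # the nonzero indices -- wrong
print(extract_coords_new(SparseLike()))  # None -- sparse data stays data
```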

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
512577283 https://github.com/pydata/xarray/pull/3117#issuecomment-512577283 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxMjU3NzI4Mw== nvictus 1270651 2019-07-17T21:33:26Z 2019-07-17T21:33:26Z CONTRIBUTOR

Hmm, looks like the DataWithCoords base type might come in handy here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
512573950 https://github.com/pydata/xarray/pull/3117#issuecomment-512573950 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxMjU3Mzk1MA== nvictus 1270651 2019-07-17T21:22:47Z 2019-07-17T21:22:47Z CONTRIBUTOR

After writing more tests, turns out sparse-backed xarrays were generating strange coordinates. On closer inspection, this is because sparse.COO objects have their own coords attribute which stores the indices of its nonzeros.

Given a serendipitous shape and density, a sparse array can have exactly the right number of coords for them to be converted into xarray coords:

```
>>> S = sparse.random((100, 100))
>>> S.coords
array([[ 0,  0,  3,  3,  4,  5,  6,  7,  7,  7,  9, 10, 11, 14, 14, 16, 17,
        18, 19, 19, 21, 21, 21, 21, 22, 23, 23, 24, 24, 25, 26, 28, 29, 31,
        35, 35, 36, 36, 38, 39, 41, 42, 42, 43, 44, 46, 46, 47, 48, 48, 49,
        49, 49, 49, 50, 52, 53, 54, 55, 56, 57, 57, 58, 60, 60, 61, 61, 62,
        64, 64, 68, 70, 72, 73, 76, 77, 78, 79, 79, 80, 81, 83, 84, 85, 88,
        89, 90, 90, 90, 92, 92, 93, 93, 94, 94, 96, 97, 98, 98, 99],
       [14, 28, 58, 67, 66, 37, 50,  9, 66, 67,  6, 22, 44, 51, 64, 26, 53,
        91, 45, 81, 13, 25, 42, 86, 47, 22, 67, 77, 81, 22, 96, 18, 23, 96,
        34, 57,  6, 96, 91, 86, 75, 86, 88, 91, 30,  6, 65, 47, 61, 74, 14,
        73, 82, 83, 32, 42, 53, 68, 21, 38, 48, 50, 87,  8, 89, 22, 57, 60,
        92, 94, 19, 79, 38, 53, 32, 95, 69, 22, 46, 17, 17, 86, 36,  7, 71,
        35,  9, 58, 79, 22, 68, 10, 47, 48, 54, 72, 24, 47, 63, 86]])
>>> xr.DataArray(S)
<xarray.DataArray (dim_0: 100, dim_1: 100)>
<COO: shape=(100, 100), dtype=float64, nnz=100, fill_value=0.0>
Coordinates:
  * dim_0    (dim_0) int64 0 0 3 3 4 5 6 7 7 7 ... 92 93 93 94 94 96 97 98 98 99
  * dim_1    (dim_1) int64 14 28 58 67 66 37 50 9 66 ... 47 48 54 72 24 47 63 86
```

A simple fix is to special-case SparseArrays in the DataArray constructor and not extract the sparse coords attribute in that case, but that would require importing sparse.

Would it make sense to just assume that all non-DataArray NEP-18 compliant arrays do not contain an xarray-compliant coords attribute?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
511180633 https://github.com/pydata/xarray/pull/3117#issuecomment-511180633 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxMTE4MDYzMw== shoyer 1217238 2019-07-14T07:35:45Z 2019-07-14T07:35:45Z MEMBER

> Even though it failed when I tried applying an operation on the dataset, this is still awesome!

Yes, it really is!

For this specific failure, we should think about adding an option for the default skipna value, or maybe making the semantics depend on the array type.

If someone is using xarray to wrap a computation-oriented library like CuPy, they probably almost always want to set skipna=False (along with join='exact'). I don't think I've seen any deep learning library that has bothered to implement nanmean.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005
511172148 https://github.com/pydata/xarray/pull/3117#issuecomment-511172148 https://api.github.com/repos/pydata/xarray/issues/3117 MDEyOklzc3VlQ29tbWVudDUxMTE3MjE0OA== andersy005 13301940 2019-07-14T04:25:07Z 2019-07-14T04:28:00Z MEMBER

@nvictus, thank you for your work!

I just tried this on CuPy arrays, and it seems to be working during array creation:

```python
In [1]: import cupy as cp

In [2]: import xarray as xr

In [3]: x = cp.arange(6).reshape(2, 3).astype('f')

In [4]: y = cp.ones((2, 3), dtype='int')

In [5]: x
Out[5]:
array([[0., 1., 2.],
       [3., 4., 5.]], dtype=float32)

In [6]: y
Out[6]:
array([[1, 1, 1],
       [1, 1, 1]])

In [7]: y.device
Out[7]: <CUDA Device 0>

In [8]: x.device
Out[8]: <CUDA Device 0>

In [9]: ds = xr.Dataset()

In [10]: ds['x'] = xr.DataArray(x, dims=['lat', 'lon'])

In [11]: ds['y'] = xr.DataArray(y, dims=['lat', 'lon'])

In [12]: ds
Out[12]:
<xarray.Dataset>
Dimensions:  (lat: 2, lon: 3)
Dimensions without coordinates: lat, lon
Data variables:
    x        (lat, lon) float32 ...
    y        (lat, lon) int64 ...

In [13]: ds.x.data.device
Out[13]: <CUDA Device 0>

In [14]: ds.y.data.device
Out[14]: <CUDA Device 0>
```

Even though it failed when I tried applying an operation on the dataset, this is still awesome! I am very excited and looking forward to seeing this feature in xarray:

```python
In [15]: m = ds.mean(dim='lat')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-8e4d5e7d5ee3> in <module>
----> 1 m = ds.mean(dim='lat')

/glade/work/abanihi/devel/pangeo/xarray/xarray/core/common.py in wrapped_func(self, dim, skipna, **kwargs)
     65                 return self.reduce(func, dim, skipna=skipna,
     66                                    numeric_only=numeric_only, allow_lazy=True,
---> 67                                    **kwargs)
     68         else:
     69             def wrapped_func(self, dim=None, **kwargs):  # type: ignore

/glade/work/abanihi/devel/pangeo/xarray/xarray/core/dataset.py in reduce(self, func, dim, keep_attrs, keepdims, numeric_only, allow_lazy, **kwargs)
   3532                                keepdims=keepdims,
   3533                                allow_lazy=allow_lazy,
-> 3534                                **kwargs)
   3535
   3536         coord_names = set(k for k in self.coords if k in variables)

/glade/work/abanihi/devel/pangeo/xarray/xarray/core/variable.py in reduce(self, func, dim, axis, keep_attrs, keepdims, allow_lazy, **kwargs)
   1392         input_data = self.data if allow_lazy else self.values
   1393         if axis is not None:
-> 1394             data = func(input_data, axis=axis, **kwargs)
   1395         else:
   1396             data = func(input_data, **kwargs)

/glade/work/abanihi/devel/pangeo/xarray/xarray/core/duck_array_ops.py in mean(array, axis, skipna, **kwargs)
    370         return _to_pytimedelta(mean_timedeltas, unit='us') + offset
    371     else:
--> 372         return _mean(array, axis=axis, skipna=skipna, **kwargs)
    373
    374

/glade/work/abanihi/devel/pangeo/xarray/xarray/core/duck_array_ops.py in f(values, axis, skipna, **kwargs)
    257
    258         try:
--> 259             return func(values, axis=axis, **kwargs)
    260         except AttributeError:
    261             if isinstance(values, dask_array_type):

/glade/work/abanihi/devel/pangeo/xarray/xarray/core/nanops.py in nanmean(a, axis, dtype, out)
    158         return dask_array.nanmean(a, axis=axis, dtype=dtype)
    159
--> 160     return np.nanmean(a, axis=axis, dtype=dtype)
    161
    162

/glade/work/abanihi/softwares/miniconda3/envs/xarray-tests/lib/python3.7/site-packages/numpy/core/overrides.py in public_api(*args, **kwargs)
    163             relevant_args = dispatcher(*args, **kwargs)
    164             return implement_array_function(
--> 165                 implementation, public_api, relevant_args, args, kwargs)
    166
    167     if module is not None:

TypeError: no implementation found for 'numpy.nanmean' on types that implement __array_function__: [<class 'cupy.core.core.ndarray'>]
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support for __array_function__ implementers (sparse arrays) [WIP] 467771005

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 19.248ms · About: xarray-datasette