home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

21 rows where author_association = "MEMBER", issue = 320275317 and user = 6815844 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 1

  • fujiisoup · 21 ✖

issue 1

  • implement interp() · 21 ✖

author_association 1

  • MEMBER · 21 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
395608335 https://github.com/pydata/xarray/pull/2104#issuecomment-395608335 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM5NTYwODMzNQ== fujiisoup 6815844 2018-06-08T00:33:56Z 2018-06-08T00:33:56Z MEMBER

Merged!

Thanks, @shoyer, @fmaussion, @jhamman, for the detailed review.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
395595931 https://github.com/pydata/xarray/pull/2104#issuecomment-395595931 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM5NTU5NTkzMQ== fujiisoup 6815844 2018-06-07T23:20:35Z 2018-06-07T23:20:35Z MEMBER

Thanks, @shoyer.

I think the last thing to do here is remove keep_attrs?

Done! (I thought I reverted this change already...)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
395243708 https://github.com/pydata/xarray/pull/2104#issuecomment-395243708 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM5NTI0MzcwOA== fujiisoup 6815844 2018-06-06T23:18:25Z 2018-06-06T23:18:25Z MEMBER

interp_like sounds a good idea. I like it in another PR as this is already huge.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
394932923 https://github.com/pydata/xarray/pull/2104#issuecomment-394932923 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM5NDkzMjkyMw== fujiisoup 6815844 2018-06-06T04:00:18Z 2018-06-06T04:00:18Z MEMBER

Thanks, @shoyer

We do the same thing for aggregations like mean(), behavior originally copied from pandas: http://pandas.pydata.org/pandas-docs/version/0.22/generated/pandas.DataFrame.mean.html

Thanks. I have not been surprised when mean drops the object type array. So it would be OK also for interp.

Maybe this PR is ready, then.

BTW, I noticed that our mean() does not accept the numeric_only keyword argument. Our docs do not say about numeric_only keyword either. Is it intended? (This behavior looks already accepted, and I think it is OK if we do not support this.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
394532951 https://github.com/pydata/xarray/pull/2104#issuecomment-394532951 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM5NDUzMjk1MQ== fujiisoup 6815844 2018-06-04T23:33:42Z 2018-06-05T00:56:41Z MEMBER

Done!

But, I am slightly worrying if silently dropping non-numeric array is surprising. Do we have other methods that behave similarly?

Or maybe we can raise a warning?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
393761310 https://github.com/pydata/xarray/pull/2104#issuecomment-393761310 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM5Mzc2MTMxMA== fujiisoup 6815844 2018-06-01T05:20:59Z 2018-06-01T05:20:59Z MEMBER
  1. I suppose it probably does make sense to interpolate numeric coordinates, too. So this probably doesn't need to change.

Agreed.

  1. I do think we should drop non-numeric variables. There just isn't any way to meaningfully interpolate them (especially strings). This means we will probably need to add a check in DataArray.interp to ensure that the variable dtype is numeric and otherwise raise an error.

Also agreed. In theory, datetime dtype could be safely interpolated. Maybe we can support this if there is a demand.

It might be necessary to consider the same design issue also for interpolate_na, bfill, ffill.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
393032031 https://github.com/pydata/xarray/pull/2104#issuecomment-393032031 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM5MzAzMjAzMQ== fujiisoup 6815844 2018-05-30T05:08:12Z 2018-05-30T05:08:12Z MEMBER

Thanks for the careful review. All done.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
388602254 https://github.com/pydata/xarray/pull/2104#issuecomment-388602254 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4ODYwMjI1NA== fujiisoup 6815844 2018-05-13T05:15:53Z 2018-05-13T05:15:53Z MEMBER

I think this is ready for final review

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
388217237 https://github.com/pydata/xarray/pull/2104#issuecomment-388217237 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4ODIxNzIzNw== fujiisoup 6815844 2018-05-10T23:37:26Z 2018-05-10T23:37:26Z MEMBER

I think it would be useful to have a table in the docs showing the differences between / when to use interp, interpolate_na, reindex, sel(..., method='nearest'), resample.

Agreed. Thanks for the nice suggestion :+1: We have discussed a cheat sheet #1552 previously, but no work has been done yet.

I added a simple illustration presenting the difference between advanced-sel and -interp for this addition, and I think we can enrich this kind of figure to clarify the differences among these set of methods.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
388052465 https://github.com/pydata/xarray/pull/2104#issuecomment-388052465 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4ODA1MjQ2NQ== fujiisoup 6815844 2018-05-10T13:26:06Z 2018-05-10T13:26:06Z MEMBER

@fmaussion , thanks for the example! It's a nicely simple but descriptive example. Updated docs.

After your suggestion about attrs, I found a bug in apply_ufunc (#2114). The tests are added but now failing because of this issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
388005058 https://github.com/pydata/xarray/pull/2104#issuecomment-388005058 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4ODAwNTA1OA== fujiisoup 6815844 2018-05-10T09:37:34Z 2018-05-10T09:37:34Z MEMBER

Thank you for the double checking :+1:

it would be great to implement the keep_attrs kwarg as well, because interpolation often conserves the units of data.

Agreed. Is there any problem if we always keep attrs in the interpolation? .sel always keeps attrs and I think it would be better if users have less choice in each method.

BTW, do you have any idea about an example for the new interp feature? It would be nice if we have a good-looking example on the doc.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
387914727 https://github.com/pydata/xarray/pull/2104#issuecomment-387914727 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4NzkxNDcyNw== fujiisoup 6815844 2018-05-10T00:29:12Z 2018-05-10T08:45:30Z MEMBER

Thanks, @fmaussion I didn't realize that scipy.interpolate.interpn does not sort the original coordinates (interp1d does). Thanks for pointing this out!

The implementation for you would be straightforward, you'd just have to pass np.arange(NN) to scipy instead of the actual coordinates.

As the interpolation routine can also be used for non-uniform gridded data, I don't think passing np.arange(NN) to scipy is a good idea (it will change the result if higher order method such as 'cubic' is used). Instead, I would like to call sortby in our interp routine, so that the array passed to scipy is always sorted in the ascending order.

The new API I would propose is python da.interp(x=[0, 1, 2], method='linear', assume_sorted=False, kwargs={'bounds_error': False, 'fill_value': 'extrapolate'}) with a new keyword assume_sorted.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
387583395 https://github.com/pydata/xarray/pull/2104#issuecomment-387583395 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4NzU4MzM5NQ== fujiisoup 6815844 2018-05-09T00:29:36Z 2018-05-09T00:29:36Z MEMBER

As I think I've mentioned before, I think we should save interpolation of unstructured grids for later -- if only to keep the complexity of this PR to the minimum.

I agreed that it would be better to concentrate on structured interpolation only in this PR. I concerned about the interpolation along 1-dimensional but non-dimensional coordinate, which @fmaussion suggested in a review comment.

It is actually the structured interpolation, but currently, I do not support this just because of the analogy to .sel(). If we want to support this, we may need to decide its behavior in particular the dimension or coordinate of the interpolated array. I prefer not to support this now for keeping the interpolation rule simple.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
387576123 https://github.com/pydata/xarray/pull/2104#issuecomment-387576123 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4NzU3NjEyMw== fujiisoup 6815844 2018-05-08T23:43:29Z 2018-05-08T23:46:17Z MEMBER

Thanks, @fmaussion

The current design question is

  • should we support the interpolation along the non-dimensional coordinate?
  • what should be the default value of 'bounds_error' and fill_value. scipy's interpolator assumes bounds_error=True by default, but I think bounds_error=False is more xarray-like, but it may be unintuitive for users very familiar with scipy's behavior.
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
387388988 https://github.com/pydata/xarray/pull/2104#issuecomment-387388988 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4NzM4ODk4OA== fujiisoup 6815844 2018-05-08T12:41:29Z 2018-05-08T12:41:57Z MEMBER

Thanks, @fmaussion. I think it is almost ready.

The remaining things are + I am looking for a good-looking example for our docs on rtd. + Just before the merge, I'm going to rename missing.py to interp.py so that the filename is more descriptive.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
387295942 https://github.com/pydata/xarray/pull/2104#issuecomment-387295942 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4NzI5NTk0Mg== fujiisoup 6815844 2018-05-08T06:13:49Z 2018-05-08T06:13:49Z MEMBER

So I think it might make sense to save skipna=True for interpolation on unstructured grids.

I agreed. I updated some of this PR.

The proposed API is python da.interp(x=[0, 1, 2], method='linear', kwargs={'bounds_error': False, 'fill_value': 'extrapolate'})

Honestly, I am not very happy with kwargs={} arguments (it would be more beautiful if they can be keyword arguments), but I think it is a necessary pain.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
386876701 https://github.com/pydata/xarray/pull/2104#issuecomment-386876701 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4Njg3NjcwMQ== fujiisoup 6815844 2018-05-06T12:41:07Z 2018-05-06T12:41:07Z MEMBER

Yes, I am thinking this for multidimensional interpolation of nan-including arrays. (For 1-dimensional interpolation, we can interpolate them row by row) Although this should be another work (PR), I am wondering what the best API is.

Is skipna=True option handy? If so, which should be the default?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
386874264 https://github.com/pydata/xarray/pull/2104#issuecomment-386874264 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4Njg3NDI2NA== fujiisoup 6815844 2018-05-06T12:00:33Z 2018-05-06T12:00:33Z MEMBER

RegularGridInterpolator handles NaNs pretty much the same way @shoyer describes it.

I did not certainly realize how RegularGridInterpolator works, but it seems to work with nan including array too as you pointed out. But higher order interpolations (RegularGridInterpolator is linear), such as RectBivariateSpline, would not work (return an all nan array) when just 1 nan is included in the target array. It is the same with higher order interpolations in scipy.interpolate.interp1d such as cubic.

Why don't leave the NaN handling part to the actual function doing the interpolation?

It is the original behavior of scipy.interpolate functions (except for linear or nearest methods), which do some matrix inversions in its algorithm.

I think the addition of the linear interpolation is still nice, but it would be nicer if we could also implement nan-skipping interpolations, which will enable us to use higher order interpolation with nan-including arrays.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
386854560 https://github.com/pydata/xarray/pull/2104#issuecomment-386854560 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4Njg1NDU2MA== fujiisoup 6815844 2018-05-06T05:13:09Z 2018-05-06T05:13:09Z MEMBER

@shoyer , Yes, linear and nearest can handle NaN intuitively, but other methods need to invert some matrices. For example, cubic spline can not be used with nan-including arrays, ```python In [1]: x = np.arange(4) ...: y = np.array([0, 1, np.nan, 3]) ...:

In [2]: # 'linear' works as expected ...: interp1d(x, y, kind='linear')([0.5, 1.5]) ...: Out[2]: array([0.5, nan])

In [3]: # 'cubic' returns an all nan array ...: interp1d(x, y, kind='cubic')([0.5, 1.5]) ...: Out[3]: array([nan, nan]) ```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
386796193 https://github.com/pydata/xarray/pull/2104#issuecomment-386796193 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4Njc5NjE5Mw== fujiisoup 6815844 2018-05-05T10:37:12Z 2018-05-05T10:37:12Z MEMBER

Maybe the possible API would be python da.interp(x=[0, 1, 2], skipna=True) for nan-skipping interpolation and python da.interp(x=[0, 1, 2], skipna=False) for faster interpolation.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317
386744768 https://github.com/pydata/xarray/pull/2104#issuecomment-386744768 https://api.github.com/repos/pydata/xarray/issues/2104 MDEyOklzc3VlQ29tbWVudDM4Njc0NDc2OA== fujiisoup 6815844 2018-05-04T21:53:05Z 2018-05-04T21:53:05Z MEMBER

@jhamman,

I just gave your implementation a quick skim and I think there is currently some duplication of efforts so maybe we look there to see what can be combined.

Agreed. I will try to generalize your interpolator-adaptor more.

@shoyer

perhaps we should give it a shorter name. Maybe interp?

Agreed. interp sounds nice.

Which NaN are you referring to? 1. data on the arrays being indexed 2. Coordinates on the arrays being indexed 3. Points at which to interpolate

I meant 1.

Scipy.interpolate.interp1d does not support arrays with NaN. So if we want to support this, we need to interpolate the array line by line as interpolate_na does, i.e. no vectorization. For the case of multidimensional interpolation of arrays with NaN, it is equivalent with the interpolation on non-grided data. It is possible, but the performance will significantly drop.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  implement interp() 320275317

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 32.288ms · About: xarray-datasette