issue_comments

42 rows where issue = 638909879 sorted by updated_at descending

user 8

  • pums974 22
  • fujiisoup 10
  • cyhsu 5
  • chrisroat 1
  • rabernat 1
  • lazyoracle 1
  • keewis 1
  • pep8speaks 1

author_association 3

  • CONTRIBUTOR 23
  • MEMBER 12
  • NONE 7

issue 1

  • Implement interp for interpolating between chunks of data (dask) · 42 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
687776559 https://github.com/pydata/xarray/pull/4155#issuecomment-687776559 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY4Nzc3NjU1OQ== lazyoracle 11018951 2020-09-06T12:27:15Z 2020-09-06T12:27:15Z NONE

@max-sixty Is there a timeline on when we can expect this feature in a stable release? Is it scheduled for the next minor release and to be made available on conda and pip?

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
674585106 https://github.com/pydata/xarray/pull/4155#issuecomment-674585106 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3NDU4NTEwNg== pums974 1005109 2020-08-16T22:20:37Z 2020-08-16T22:21:43Z CONTRIBUTOR

And I forgot to take into account that your interpolation only needs 48² points of the input array, so the input array will be reduced at the start of the process (you can replace every 100 with 48 in my previous answers).

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
674584614 https://github.com/pydata/xarray/pull/4155#issuecomment-674584614 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3NDU4NDYxNA== pums974 1005109 2020-08-16T22:14:55Z 2020-08-16T22:16:05Z CONTRIBUTOR

I forgot to take into account that the interpolations are orthogonal. So in the sequential case we do 2 interpolations: first x, then y. In parallel we do the same.

The first interpolation will have 20 000 tasks. Each task will have the entire input array and compute an interpolation at 5 points of the output (x), producing an array of 5x100 per task, or a 100 000x100 full result as an intermediate array.

The second interpolation will have 20 000² tasks. Each task will have a block of 5x100 points of the intermediate array and compute an interpolation at 5 points of the output (y), resulting in a 5² array per task and the 100 000² full result.

So plenty of room for overhead...
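The task counts quoted above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the variable names are made up for illustration, and the sizes (100 input points per dimension, 100 000 output points per dimension, output chunks of 5) are the ones used in this discussion.

```python
# Sizes taken from the discussion; names are illustrative only.
n_in = 100        # input points per dimension
n_out = 100_000   # output points per dimension
chunk = 5         # output chunk size per dimension

chunks_per_dim = n_out // chunk          # number of output chunks along one axis

# First pass: interpolate along x only, one task per x chunk.
first_pass_tasks = chunks_per_dim
first_pass_block = (chunk, n_in)         # shape produced by each task
intermediate_shape = (n_out, n_in)       # all first-pass blocks stacked

# Second pass: interpolate along y, one task per (x chunk, y chunk) pair.
second_pass_tasks = chunks_per_dim ** 2
second_pass_block = (chunk, chunk)
final_shape = (n_out, n_out)

print(first_pass_tasks, intermediate_shape)
print(second_pass_tasks, final_shape)
```

Each of the second-pass tasks produces only 25 values, which is why scheduling overhead can dominate the actual computation.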

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
674582930 https://github.com/pydata/xarray/pull/4155#issuecomment-674582930 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3NDU4MjkzMA== pums974 1005109 2020-08-16T21:56:45Z 2020-08-16T22:05:21Z CONTRIBUTOR

In your case, each task (20 000²) will have the entire input (100²), and interpolate a few points (5²).

Maybe the overhead comes from duplicating the input array 20 000² times, or maybe it comes from the fact that you are doing 20 000² small interpolations instead of 1 big interpolation.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
674579300 https://github.com/pydata/xarray/pull/4155#issuecomment-674579300 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3NDU3OTMwMA== cyhsu 5323645 2020-08-16T21:18:48Z 2020-08-16T21:48:06Z NONE

Gotcha! Yes, it does. If I have many points in lat, lon, depth, and time, I had better chunk my input arrays at this stage to speed things up. The reason I asked this question is that I thought chunking the input array before interpolating would be faster than not chunking it. But in my test case, it is not. Please see the attached.

The results I show here are that the parallel version is way slower than the normal case.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
674579280 https://github.com/pydata/xarray/pull/4155#issuecomment-674579280 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3NDU3OTI4MA== pums974 1005109 2020-08-16T21:18:36Z 2020-08-16T21:18:36Z CONTRIBUTOR

Does this answer your question?

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
674578943 https://github.com/pydata/xarray/pull/4155#issuecomment-674578943 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3NDU3ODk0Mw== pums974 1005109 2020-08-16T21:15:42Z 2020-08-16T21:16:41Z CONTRIBUTOR

If the input array is chunked in the interpolated dimension, the chunks will be merged during the interpolation.

This may induce a large memory cost at some point, but I do not know how to avoid it...

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
674578856 https://github.com/pydata/xarray/pull/4155#issuecomment-674578856 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3NDU3ODg1Ng== cyhsu 5323645 2020-08-16T21:14:46Z 2020-08-16T21:14:46Z NONE

@pums974 then what if we do the interpolation from a chunked input array onto a chunked interpolated dimension?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
674578524 https://github.com/pydata/xarray/pull/4155#issuecomment-674578524 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3NDU3ODUyNA== pums974 1005109 2020-08-16T21:11:49Z 2020-08-16T21:11:49Z CONTRIBUTOR

@cyhsu I can answer this question.

For best performance you should chunk the input array on the non-interpolated dimensions and chunk the destination. That is:

```
import dask.array as da
import numpy as np
import xarray as xr

# x is interpolated: keep the input unchunked along x
datax = xr.DataArray(data=np.arange(0, 4), coords={"x": np.linspace(0, 1, 4)}, dims="x")
# y is not interpolated: chunk the input along y
datay = xr.DataArray(data=da.from_array(np.arange(0, 4), chunks=2), coords={"y": np.linspace(0, 1, 4)}, dims="y")
data = datax * datay

# chunk the destination
x = xr.DataArray(data=da.from_array(np.linspace(0, 1), chunks=2), dims="x")

res = data.interp(x=x)
```

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
674577513 https://github.com/pydata/xarray/pull/4155#issuecomment-674577513 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3NDU3NzUxMw== cyhsu 5323645 2020-08-16T21:02:50Z 2020-08-16T21:02:50Z NONE

@fujiisoup Thanks for the response. Since I have not yet updated my xarray package to this beta version, I hope you can answer an additional question for me. For interpolation, which way is faster: a. chunk the dataset and then do the interpolation, or b. chunk the interpolation list and then do the interpolation?

a.

datax = xr.DataArray(data=da.from_array(np.arange(0, 4), chunks=2),
                     coords={"x": np.linspace(0, 1, 4)},
                     dims="x")
datay = xr.DataArray(data=da.from_array(np.arange(0, 4), chunks=2),
                     coords={"y": np.linspace(0, 1, 4)},
                     dims="y")
data = datax * datay

# both of these interp calls fail
res = datax.interp(x=np.linspace(0, 1))
print(res.load())

res = data.interp(x=np.linspace(0, 1), y=0.5)
print(res.load())

b.

datax = xr.DataArray(data=np.arange(0, 4),
                     coords={"x": np.linspace(0, 1, 4)},
                     dims="x")
datay = xr.DataArray(data=np.arange(0, 4),
                     coords={"y": np.linspace(0, 1, 4)},
                     dims="y")
data = datax * datay

x = xr.DataArray(data=da.from_array(np.linspace(0, 1), chunks=2), dims="x")
res = data.interp(x=x)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
674321185 https://github.com/pydata/xarray/pull/4155#issuecomment-674321185 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3NDMyMTE4NQ== fujiisoup 6815844 2020-08-15T00:30:21Z 2020-08-15T00:30:21Z MEMBER

@cyhsu Yes, because it is not yet released. (I'm not sure when the next release will be, but maybe in a few months.) If you run pip install git+https://github.com/pydata/xarray, the current master will be installed on your system and interpolation over the chunks can be used. But note that this means you will be installing (a kind of) beta version.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
674319860 https://github.com/pydata/xarray/pull/4155#issuecomment-674319860 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3NDMxOTg2MA== cyhsu 5323645 2020-08-15T00:22:07Z 2020-08-15T00:22:07Z NONE

@fujiisoup Thanks for letting me know. But I am still unable to do it, even though I have updated my xarray via "conda update xarray".

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
674305570 https://github.com/pydata/xarray/pull/4155#issuecomment-674305570 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3NDMwNTU3MA== fujiisoup 6815844 2020-08-14T23:07:03Z 2020-08-14T23:07:03Z MEMBER

@cyhsu Yes, in the current master.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
674288483 https://github.com/pydata/xarray/pull/4155#issuecomment-674288483 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3NDI4ODQ4Mw== cyhsu 5323645 2020-08-14T21:57:02Z 2020-08-14T21:57:02Z NONE

Hi, just curious about this. I have followed the discussion since this issue was raised. Is this chunk interpolation solved already?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
672880182 https://github.com/pydata/xarray/pull/4155#issuecomment-672880182 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3Mjg4MDE4Mg== pums974 1005109 2020-08-12T13:45:40Z 2020-08-12T13:45:40Z CONTRIBUTOR

You're welcome :)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
672348216 https://github.com/pydata/xarray/pull/4155#issuecomment-672348216 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY3MjM0ODIxNg== fujiisoup 6815844 2020-08-11T23:16:07Z 2020-08-11T23:16:07Z MEMBER

Thanks @pums974 :)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
667412134 https://github.com/pydata/xarray/pull/4155#issuecomment-667412134 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY2NzQxMjEzNA== fujiisoup 6815844 2020-07-31T22:28:07Z 2020-07-31T22:28:07Z MEMBER

This PR looks good to me. Maybe we can wait a few days in case anyone has comments on it. If there are no comments, I'll merge it then.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
644178631 https://github.com/pydata/xarray/pull/4155#issuecomment-644178631 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY0NDE3ODYzMQ== pep8speaks 24736507 2020-06-15T14:43:57Z 2020-07-31T18:56:12Z NONE

Hello @pums974! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2020-07-31 18:56:12 UTC
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
667299944 https://github.com/pydata/xarray/pull/4155#issuecomment-667299944 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY2NzI5OTk0NA== pums974 1005109 2020-07-31T18:55:48Z 2020-07-31T18:55:48Z CONTRIBUTOR

Hi.

I agree, part of this work might belong in dask. But I don't know dask internals enough to go there. In this case, everything was already in place.

Moreover, I do think that there is room for optimization. In particular, in this implementation, the work is distributed along chunks corresponding to the destination. This means that one may end up with a big intermediate array. For example, interpolating one value in a chunked vector will load the full vector in memory (first localization aside). In my previous (uglier) implementation, the interpolation was done with the chunks of the starting array. This might sometimes be a better choice.
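To make the "distributed along destination chunks" idea concrete, here is a minimal NumPy-only sketch. It is not the PR's implementation: the function name `interp_by_destination_chunks` is hypothetical, and reading "localization" as "keep only the source points that bracket each destination chunk" is an assumption. It assumes a sorted 1-d source coordinate and linear interpolation.

```python
import numpy as np

def interp_by_destination_chunks(x, y, new_x, chunk_size):
    """Linear interpolation computed one destination chunk at a time.

    Hypothetical helper: x must be sorted in ascending order.
    """
    pieces = []
    for start in range(0, len(new_x), chunk_size):
        dest = new_x[start:start + chunk_size]
        # Localization: find the slice of the source that brackets this chunk.
        lo = max(np.searchsorted(x, dest.min(), side="right") - 1, 0)
        hi = min(np.searchsorted(x, dest.max(), side="left") + 1, len(x))
        # Each "task" only needs x[lo:hi], not the whole source vector.
        pieces.append(np.interp(dest, x[lo:hi], y[lo:hi]))
    return np.concatenate(pieces)

x = np.linspace(0.0, 1.0, 11)
y = np.sin(x)
new_x = np.linspace(0.0, 1.0, 23)
chunked = interp_by_destination_chunks(x, y, new_x, chunk_size=5)
```

When the destination chunk spans a wide range of the source coordinate, the bracketing slice is still large, which is the memory concern raised above.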

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
667255046 https://github.com/pydata/xarray/pull/4155#issuecomment-667255046 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY2NzI1NTA0Ng== chrisroat 1053153 2020-07-31T17:56:15Z 2020-07-31T17:56:15Z CONTRIBUTOR

Hi! This work is interesting to me, as I was implementing in dask an image processing algo which needs an intermediate 1-d linear interpolation step. This bottlenecks the calculation through a single node. Your work here on distributed interpolation is intriguing, and I'm wondering if it would be useful in my work and if it could possibly become part of dask itself.

Here is the particular function, which you'll note has a dask.delayed wrapper around np.interp.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
667216168 https://github.com/pydata/xarray/pull/4155#issuecomment-667216168 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY2NzIxNjE2OA== pums974 1005109 2020-07-31T16:33:08Z 2020-07-31T16:33:08Z CONTRIBUTOR

OK, I'm happy with the results now (better than my first submission of course).

I did not add many tests, since the result replaces what was done before and thus the previous tests still apply.

I'm going for some holidays so I won't work that much for the time being. But I'll be able to answer any questions.

Thanks for reviewing and for pushing me into doing a much better job.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
666720655 https://github.com/pydata/xarray/pull/4155#issuecomment-666720655 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY2NjcyMDY1NQ== fujiisoup 6815844 2020-07-30T21:38:55Z 2020-07-30T21:38:55Z MEMBER

OK. If you have additional time, it would be nicer if you could add more comments on tests, like what is being tested there ;)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
666565278 https://github.com/pydata/xarray/pull/4155#issuecomment-666565278 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY2NjU2NTI3OA== pums974 1005109 2020-07-30T17:57:51Z 2020-07-30T17:57:51Z CONTRIBUTOR

FYI, don't merge yet. I fixed a bug today, but did not push it. And there is some work to do on the testing side.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
665751218 https://github.com/pydata/xarray/pull/4155#issuecomment-665751218 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY2NTc1MTIxOA== pums974 1005109 2020-07-29T15:59:36Z 2020-07-29T15:59:36Z CONTRIBUTOR

Since I was at it, I extended the decomposition of orthogonal interpolation. If you want, I can break this into two PRs.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
665665786 https://github.com/pydata/xarray/pull/4155#issuecomment-665665786 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY2NTY2NTc4Ng== pums974 1005109 2020-07-29T13:30:50Z 2020-07-29T13:30:50Z CONTRIBUTOR

Guys, I got it. I managed to use da.blockwise which allows me to overcome all the previous limitations.

The result is much simpler and much more reliable.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
664245609 https://github.com/pydata/xarray/pull/4155#issuecomment-664245609 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY2NDI0NTYwOQ== pums974 1005109 2020-07-27T09:45:40Z 2020-07-27T09:45:40Z CONTRIBUTOR

While at it, I added the missing bit to make it work with the cubic and quadratic methods. I'm not touching the code anymore; I'm waiting for review.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
663788117 https://github.com/pydata/xarray/pull/4155#issuecomment-663788117 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY2Mzc4ODExNw== fujiisoup 6815844 2020-07-25T01:08:52Z 2020-07-25T01:08:52Z MEMBER

Thanks @pums974 for this update and sorry for my late response. It looks good but I'll take a deeper look in the next week.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
663580990 https://github.com/pydata/xarray/pull/4155#issuecomment-663580990 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY2MzU4MDk5MA== pums974 1005109 2020-07-24T14:58:31Z 2020-07-24T15:00:12Z CONTRIBUTOR

@fujiisoup I managed to implement the support of unsorted interpolation.

Also, I reworked the tests; I now test many more situations.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
661958330 https://github.com/pydata/xarray/pull/4155#issuecomment-661958330 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY2MTk1ODMzMA== pums974 1005109 2020-07-21T16:17:08Z 2020-07-21T16:17:08Z CONTRIBUTOR

Thanks, @pums974 for this update. I left some comments.

Can you add some tests for more edge cases? Something we may want to check would be

* scalar interpolation
* interpolation into an unsorted dimension (e.g., `da.interp(x=[0, 3, 2])`)

You're welcome, thanks for the feedback.

  • scalar interpolation: you mean a test like test_interpolate_nd_scalar but between chunks?
  • Unsorted interpolation: As I said, I did not look into it. This needs some work, presumably an argsort at the beginning in order to interpolate in a sorted dimension and then reorder the result into the requested order. I'm trying to implement it, but this seems a bit more challenging than I thought.
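The argsort idea mentioned in the last bullet can be sketched in plain NumPy as follows. This is only an illustration of the sort, interpolate, unsort pattern on an in-memory 1-d array; the helper name `interp_unsorted` is hypothetical, and the chunked case in the PR is more involved.

```python
import numpy as np

def interp_unsorted(new_x, x, y):
    """Interpolate at unsorted destination points (hypothetical helper)."""
    new_x = np.asarray(new_x, dtype=float)
    order = np.argsort(new_x)                 # permutation that sorts the destination
    sorted_result = np.interp(new_x[order], x, y)
    result = np.empty_like(sorted_result)
    result[order] = sorted_result             # scatter back into the requested order
    return result

x = np.array([0.0, 1.0, 2.0, 3.0])
y = x ** 2
out = interp_unsorted([0, 3, 2], x, y)
print(out)
```

The scatter `result[order] = sorted_result` undoes the sort without computing the inverse permutation explicitly.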
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
661030314 https://github.com/pydata/xarray/pull/4155#issuecomment-661030314 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY2MTAzMDMxNA== pums974 1005109 2020-07-20T13:11:54Z 2020-07-20T13:11:54Z CONTRIBUTOR

@fujiisoup I managed to solve the issue you raised about AttributeError: 'memoryview' object has no attribute 'dtype'. This was due to the datetime index. Using an IndexVariable seems to solve it.

Also, I realized that for 1d interpolation the cubic and quadratic methods are allowed, which may not give the same result with chunked data (or may even crash if there is not enough data in the chunked direction). They are now forbidden.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
651694736 https://github.com/pydata/xarray/pull/4155#issuecomment-651694736 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY1MTY5NDczNg== pums974 1005109 2020-06-30T10:02:41Z 2020-06-30T10:02:41Z CONTRIBUTOR

I mean, in this case you have to interpolate in another direction. You cannot consider having a 1d function.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
650078484 https://github.com/pydata/xarray/pull/4155#issuecomment-650078484 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY1MDA3ODQ4NA== pums974 1005109 2020-06-26T09:15:05Z 2020-06-30T09:38:47Z CONTRIBUTOR

Thanks. That's weird, I have no problem on my side... What are your versions of dask and numpy?

As for implementing this in dask, you may be right, it probably belongs there, but I am even less used to their code base and have no clue where to put it.

And for an unsorted destination, that's something I didn't think about. Maybe we can add an argsort at the beginning.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
651682646 https://github.com/pydata/xarray/pull/4155#issuecomment-651682646 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY1MTY4MjY0Ng== pums974 1005109 2020-06-30T09:38:32Z 2020-06-30T09:38:32Z CONTRIBUTOR

ok, but what about `res = data.interp(y=0.5)`?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
651589183 https://github.com/pydata/xarray/pull/4155#issuecomment-651589183 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY1MTU4OTE4Mw== fujiisoup 6815844 2020-06-30T07:01:31Z 2020-06-30T07:01:31Z MEMBER

Hum, ok, but I don't see how it would work if all points are between chunks (see my second example)

Maybe we can support sequential interpolation only at this moment. In this case, `res = data.interp(x=np.linspace(0, 1), y=0.5)` can be interpreted as `res = data.interp(x=np.linspace(0, 1)).interp(y=0.5)`, which might not be too difficult.
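The equivalence suggested here relies on orthogonal linear interpolation being separable: interpolating along x and then along y gives the same values as a single 2-d call. The following sketch illustrates this with plain NumPy rather than xarray, so it is a stand-in and not the library's code path. The data mirror the `datax * datay` example from this thread, and the variable names are chosen only for illustration.

```python
import numpy as np

x = np.linspace(0, 1, 4)
y = np.linspace(0, 1, 4)
# data[i, j] = i * j, the same values as datax * datay in the thread
data = np.outer(np.arange(4), np.arange(4)).astype(float)

new_x = np.linspace(0, 1)   # 50 destination points along x
new_y = 0.5                 # scalar destination along y

# Step 1: interpolate along x for each fixed y column -> shape (50, 4)
along_x = np.stack(
    [np.interp(new_x, x, data[:, j]) for j in range(len(y))], axis=1
)
# Step 2: interpolate each resulting row along y -> shape (50,)
result = np.array([np.interp(new_y, y, row) for row in along_x])

# On this grid data equals 9 * x * y, so the expected value is 4.5 * new_x.
print(np.allclose(result, 4.5 * new_x))
```

Because each step only touches one dimension, the chunked implementation can treat the steps independently.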

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
651581831 https://github.com/pydata/xarray/pull/4155#issuecomment-651581831 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY1MTU4MTgzMQ== pums974 1005109 2020-06-30T06:47:51Z 2020-06-30T06:47:51Z CONTRIBUTOR

Hum, ok, but I don't see how it would work if all points are between chunks (see my second example)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
650428037 https://github.com/pydata/xarray/pull/4155#issuecomment-650428037 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY1MDQyODAzNw== fujiisoup 6815844 2020-06-26T22:17:22Z 2020-06-26T22:17:22Z MEMBER

As for implementing this in dask, you may be right, it probably belongs there, but I am even less used to their code base and have no clue where to put it.

OK. Even so, I would suggest restructuring the code base; maybe we can add an interp1d equivalent as core.dask_array_ops.interp1d, which works with dask arrays (non-xarray objects). It'll be easier to test. The API should be as close to scipy.interpolate.interp1d as possible.

In missing.py, we can call this function.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
650347250 https://github.com/pydata/xarray/pull/4155#issuecomment-650347250 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY1MDM0NzI1MA== keewis 14808389 2020-06-26T19:05:45Z 2020-06-26T19:05:45Z MEMBER

@pums974, the CI gets the same error (e.g. here), so you should be able to reproduce this by setting up an environment with something like `conda env create -f ci/requirements/py38.yml -n xarray-py38` followed by `conda activate xarray-py38`.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
649836609 https://github.com/pydata/xarray/pull/4155#issuecomment-649836609 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY0OTgzNjYwOQ== fujiisoup 6815844 2020-06-25T21:53:36Z 2020-06-25T21:53:36Z MEMBER

Also in my local environment, it gives AttributeError: 'memoryview' object has no attribute 'dtype'

The full stack trace is

```
_________ test_interpolate_1d[1-y-cubic] ____________

method = 'cubic', dim = 'y', case = 1

@pytest.mark.parametrize("method", ["linear", "cubic"])
@pytest.mark.parametrize("dim", ["x", "y"])
@pytest.mark.parametrize("case", [0, 1])
def test_interpolate_1d(method, dim, case):
    if not has_scipy:
        pytest.skip("scipy is not installed.")

    if not has_dask and case in [1]:
        pytest.skip("dask is not installed in the environment.")

    da = get_example_data(case)
    xdest = np.linspace(0.0, 0.9, 80)

    actual = da.interp(method=method, **{dim: xdest})

    # scipy interpolation for the reference
    def func(obj, new_x):
        return scipy.interpolate.interp1d(
            da[dim],
            obj.data,
            axis=obj.get_axis_num(dim),
            bounds_error=False,
            fill_value=np.nan,
            kind=method,
        )(new_x)

    if dim == "x":
        coords = {"x": xdest, "y": da["y"], "x2": ("x", func(da["x2"], xdest))}
    else:  # y
        coords = {"x": da["x"], "y": xdest, "x2": da["x2"]}

    expected = xr.DataArray(func(da, xdest), dims=["x", "y"], coords=coords)
  assert_allclose(actual, expected)

xarray/tests/test_interp.py:86:


xarray/testing.py:132: in compat_variable
    return a.dims == b.dims and (a._data is b._data or equiv(a.data, b.data))
xarray/testing.py:31: in _data_allclose_or_equiv
    return duck_array_ops.allclose_or_equiv(arr1, arr2, rtol=rtol, atol=atol)
xarray/core/duck_array_ops.py:221: in allclose_or_equiv
    arr1 = np.array(arr1)
../../../anaconda3/envs/xarray/lib/python3.7/site-packages/dask/array/core.py:1314: in __array__
    x = self.compute()
../../../anaconda3/envs/xarray/lib/python3.7/site-packages/dask/base.py:165: in compute
    (result,) = compute(self, traverse=False, **kwargs)
../../../anaconda3/envs/xarray/lib/python3.7/site-packages/dask/base.py:436: in compute
    results = schedule(dsk, keys, **kwargs)
../../../anaconda3/envs/xarray/lib/python3.7/site-packages/dask/local.py:527: in get_sync
    return get_async(apply_sync, 1, dsk, keys, **kwargs)
../../../anaconda3/envs/xarray/lib/python3.7/site-packages/dask/local.py:494: in get_async
    fire_task()
../../../anaconda3/envs/xarray/lib/python3.7/site-packages/dask/local.py:466: in fire_task
    callback=queue.put,
../../../anaconda3/envs/xarray/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
    res = func(*args, **kwds)
../../../anaconda3/envs/xarray/lib/python3.7/site-packages/dask/local.py:227: in execute_task
    result = pack_exception(e, dumps)
../../../anaconda3/envs/xarray/lib/python3.7/site-packages/dask/local.py:222: in execute_task
    result = _execute_task(task, data)
../../../anaconda3/envs/xarray/lib/python3.7/site-packages/dask/core.py:119: in _execute_task
    return func(*args2)
../../../anaconda3/envs/xarray/lib/python3.7/site-packages/dask/optimization.py:982: in __call__
    return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
../../../anaconda3/envs/xarray/lib/python3.7/site-packages/dask/core.py:149: in get
    result = _execute_task(task, cache)
../../../anaconda3/envs/xarray/lib/python3.7/site-packages/dask/core.py:119: in _execute_task
    return func(*args2)
xarray/core/missing.py:830: in _dask_aware_interpnd
    return _interpnd(var, old_x, new_x, func, kwargs)
xarray/core/missing.py:793: in _interpnd
    x, new_x = _floatize_x(x, new_x)
xarray/core/missing.py:577: in _floatize_x
    if _contains_datetime_like_objects(x[i]):
xarray/core/common.py:1595: in _contains_datetime_like_objects
    return is_np_datetime_like(var.dtype) or contains_cftime_datetimes(var)
xarray/core/common.py:1588: in contains_cftime_datetimes
    return _contains_cftime_datetimes(var.data)


array = <memory at 0x7f771d6daef0>

def _contains_cftime_datetimes(array) -> bool:
    """Check if an array contains cftime.datetime objects
    """
    try:
        from cftime import datetime as cftime_datetime
    except ImportError:
        return False
    else:
      if array.dtype == np.dtype("O") and array.size > 0:

E AttributeError: 'memoryview' object has no attribute 'dtype'

xarray/core/common.py:1574: AttributeError
```
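The failure above comes down to a `memoryview` reaching code that expects a NumPy-like array. A minimal sketch of that mismatch follows, assuming only NumPy; `has_object_dtype` is a hypothetical helper written for illustration and is not the fix applied in the PR.

```python
import numpy as np


def has_object_dtype(array) -> bool:
    """Hypothetical helper: coerce the input to an ndarray before reading dtype.

    np.asarray accepts anything exposing the buffer protocol, including
    memoryview, so the attribute lookup below cannot raise AttributeError.
    """
    arr = np.asarray(array)
    return bool(arr.dtype == np.dtype("O") and arr.size > 0)


# A memoryview over a float array: it has no `dtype` attribute of its own.
raw = memoryview(np.arange(3.0))
assert not hasattr(raw, "dtype")

print(has_object_dtype(raw))                            # float data
print(has_object_dtype(np.array(["a"], dtype=object)))  # object data
```

Coercing with `np.asarray` is only one possible way to guard the check; whether it is appropriate inside xarray depends on how the `memoryview` arises in the first place.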

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
649827797 https://github.com/pydata/xarray/pull/4155#issuecomment-649827797 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY0OTgyNzc5Nw== fujiisoup 6815844 2020-06-25T21:30:17Z 2020-06-25T21:30:17Z MEMBER

Hi @pums974

Thanks for sending the PR. I'm working to review it, but it may take more time.

A few comments: Does it work with an unsorted destination? e.g., da.interp(y=[0, -1, 2])

I feel that the basic algorithm, such as an np.interp equivalent, should be implemented upstream. I'm sure the Dask community would welcome this addition. Would you be interested in working on it?
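To make the unsorted-destination question concrete, here is a small sketch using plain `np.interp` rather than the chunked implementation in this PR. The sample values are made up for illustration. It checks that interpolating at unsorted points gives the same answer as sorting the points, interpolating, and restoring the original order, which is one way a chunk-aware routine could handle such input.

```python
import numpy as np

# Source coordinates must be increasing for np.interp; destinations need not be.
x = np.array([-1.0, 0.0, 1.0, 2.0])
y = x ** 2
new_x = np.array([0.0, -1.0, 2.0])  # unsorted destination, as in the question

# Direct evaluation at the unsorted points.
direct = np.interp(new_x, x, y)

# Sort the destinations, interpolate, then undo the sort.
order = np.argsort(new_x)
sorted_result = np.interp(new_x[order], x, y)
restored = np.empty_like(sorted_result)
restored[order] = sorted_result

assert np.allclose(direct, restored)
print(direct)
```

Because every destination here coincides with a source point, the result simply reproduces the matching `y` values in the requested order.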

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
649009492 https://github.com/pydata/xarray/pull/4155#issuecomment-649009492 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY0OTAwOTQ5Mg== pums974 1005109 2020-06-24T19:05:11Z 2020-06-24T19:05:11Z CONTRIBUTOR

No problem, we are all very busy. But thanks for your message.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
648342302 https://github.com/pydata/xarray/pull/4155#issuecomment-648342302 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY0ODM0MjMwMg== rabernat 1197350 2020-06-23T18:36:11Z 2020-06-23T18:36:11Z MEMBER

Thanks for this contribution @pums974! We appreciate your patience in awaiting a review of your PR.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879
644204355 https://github.com/pydata/xarray/pull/4155#issuecomment-644204355 https://api.github.com/repos/pydata/xarray/issues/4155 MDEyOklzc3VlQ29tbWVudDY0NDIwNDM1NQ== pums974 1005109 2020-06-15T15:27:55Z 2020-06-15T15:27:55Z CONTRIBUTOR

On my computer it passes pytest:
```
$> pytest .
======================= test session starts =================================
platform linux -- Python 3.8.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
[...]
===== 3822 passed, 2710 skipped, 77 xfailed, 24 xpassed, 32 warnings in 48.25s ========

$> pip freeze
appdirs==1.4.4
attrs==19.3.0
black==19.10b0
click==7.1.2
dask==2.18.1
flake8==3.8.3
isort==4.3.21
mccabe==0.6.1
more-itertools==8.4.0
numpy==1.18.5
packaging==20.4
pandas==1.0.4
pathspec==0.8.0
pluggy==0.13.1
py==1.8.2
pycodestyle==2.6.0
pyflakes==2.2.0
pyparsing==2.4.7
pytest==5.4.3
python-dateutil==2.8.1
pytz==2020.1
PyYAML==5.3.1
regex==2020.6.8
scipy==1.4.1
six==1.15.0
toml==0.10.1
toolz==0.10.0
typed-ast==1.4.1
wcwidth==0.2.4
-e git+git@github.com:pums974/xarray.git@c47a1d5d8fd7ca401a0dddea67574af00c4d8e3b#egg=xarray
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Implement interp for interpolating between chunks of data (dask) 638909879

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 17.83ms · About: xarray-datasette