home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

8 rows where issue = 340486433 and user = 539688 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 1

  • fspaolo · 8 ✖

issue 1

  • Does interp() work on curvilinear grids (2D coordinates) ? · 8 ✖

author_association 1

  • NONE 8
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
497420732 https://github.com/pydata/xarray/issues/2281#issuecomment-497420732 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzQyMDczMg== fspaolo 539688 2019-05-30T17:50:32Z 2019-05-31T23:43:54Z NONE

@shoyer and @crusaderky That's right, that is how I was actually dealing with this problem prior trying xarray ... by flattening the grid coordinates and performing either gridding (with scipy's griddata) or interpolation (with scipy's map_coordinate) ... instead of performing proper regridding (from cube to cube without having to flatten anything).

As a rule of thumb, any fancy algorithm should first exist for numpy-only data and then potentially it can be wrapped by the xarray library.

This is important information.

For the record, here is so far what I found to be best performant:

``` import xarray as xr from scipy.interpolate import griddata

Here x/y are dummy 1D coords that wont be used.

da1 = xr.DataArray(cube1, [('t', t_cube1) , ('y', range(cube1.shape[1])), ('x', range(cube1.shape[2]))])

Regrid t_cube1 onto t_cube2 first since time will always map 1 to 1 between cubes.

This operation is very fast.

print('regridding in time ...') cube1 = da1.interp(t=t_cube2).values

Regrid each 2D field (X_cube1/Y_cube1 onto X_cube2/Y_cube2), one at a time

print('regridding in space ...') cube3 = np.full_like(cube2, np.nan) for k in range(t_cube2.shape[0]): print('regridding:', k) cube3[:,:,k] = griddata((X_cube1.ravel(), Y_cube1.ravel()), cube1[k,:,:].ravel(), (X_cube2, Y_cube2), fill_value=np.nan, method='linear') ```

Performance is not that bad... for ~150 time steps and ~1500 nodes in x and y it takes less than 10 min.

I think this can be sped up by computing the interpolation weights between grids in the first iteration and cache them (I think xESMF does this).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497475534 https://github.com/pydata/xarray/issues/2281#issuecomment-497475534 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzQ3NTUzNA== fspaolo 539688 2019-05-30T20:32:24Z 2019-05-30T20:58:23Z NONE

@fspaolo where does that huge number come from? I thought you said you have 1500 nodes in total.

Not in total, I meant ~1500 on each x/y dimension (1500 x 1500). To be precise:

``` print(cube1.shape) # (t, y, x) (306, 240, 262)

print(cube2.shape) # (y, x, t) (1452, 1836, 104) ```

Did you select a single point on the t dimension before you applied bisplrep?

Yes.

Also, (pardon the ignorance, I never dealt with geographical data) what kind of information does having your lat and lon being bidimensional convey? Does it imply lat[i, j] < lat[i +1, j] and lon[i, j] < lon[i, j+1] for any possible (i, j)?

It does in my case, but cube1 is a rotated grid, meaning that lat is not the same along the x-axis neither lon is the same along the y-axis, while cube2 is a standard lon/lat grid so I can represent it simply by a 1D lon array (x-axis) and a 1D lat array (y-axis). To have them both with 1D coordinates I would have to either rotate cube2 and work with 1D rotated lon/lat, or unrotate cube1 and work with 1D standard lon/lat... but this gets me to the same problem as in the interpolation above since I have to transform (re-project) every 2D grid in the cube.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497448335 https://github.com/pydata/xarray/issues/2281#issuecomment-497448335 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzQ0ODMzNQ== fspaolo 539688 2019-05-30T19:07:54Z 2019-05-30T19:08:47Z NONE

@crusaderky I also did the above using bisplrep and bispev, and it seems it cannot handle a grid of this size:

``` print(len(x), len(y), len(z)) 62880 62880 62880

print(len(x_new), len(y_new)) 2665872 2665872

Traceback (most recent call last): File "cubesmb.py", line 201, in <module> z_new = bisplev(x_new, y_new, tck) File "/Users/paolofer/anaconda3/lib/python3.7/site-packages/scipy/interpolate/_fitpack_impl.py", line 1047, in bisplev z, ier = _fitpack._bispev(tx, ty, c, kx, ky, x, y, dx, dy) RuntimeError: Cannot produce output of size 2665872x2665872 (size too large) ```

So griddata looks like my best option...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497138085 https://github.com/pydata/xarray/issues/2281#issuecomment-497138085 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzEzODA4NQ== fspaolo 539688 2019-05-29T22:56:07Z 2019-05-30T00:14:09Z NONE

@crusaderky what I'm trying to do is what the title of this opened thread says "Does interp() work on curvilinear grids (2D coordinates)?" Working with (x, y, t) all 1D numpy arrays is an easy problem! But that's not the problem I'm dealing with.

Let me state exactly what my problem is. I have two data cubes, cube1 and cube2:

``` cube1 = xr.DataArray(cube1, dims=['t', 'y', 'x'], coords={'time': ('t', t_cube1), # 1D array 'lon': (['y', 'x'], X_cube1), # 2D array!!! 'lat': (['y', 'x'], Y_cube1)}) # 2D array!!!

cube2 = xr.DataArray(cube2, dims=['y', 'x', 't'], coords={'time': ('t', t_cube2), # 1D array 'lon': (['y', 'x'], X_cube2), # 2D array!!! 'lat': (['y', 'x'], Y_cube2)}) # 2D array!!! ```

All I want is to regrid cube1 onto cube2, or in other words, interpolate X_cube2/Y_cube2/t_cube2 onto cube1. Note that I cannot pass X_cube1/Y_cube1 as 1D arrays because they vary for every grid cell (this is a rotated grid). I can, however, pass X_cube2/Y_cube2 as 1D arrays (this is a standard lon/lat grid).

Also note the inverted time dimension between cube1 and cube2 (but that shouldn't be an issue).

So how to perform this operation... or am I missing something?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497129090 https://github.com/pydata/xarray/issues/2281#issuecomment-497129090 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzEyOTA5MA== fspaolo 539688 2019-05-29T22:17:39Z 2019-05-29T22:17:39Z NONE

@JiaweiZhuang your approach doesn't work either. After installing you package and the dependencies... and following the documentation, I got

Fatal error in MPI_Init_thread: Other MPI error, error stack: MPIR_Init_thread(474)..............: MPID_Init(190).....................: channel initialization failed MPIDI_CH3_Init(89).................: MPID_nem_init(320).................: MPID_nem_tcp_init(173).............: MPID_nem_tcp_get_business_card(420): MPID_nem_tcp_init(379).............: gethostbyname failed, LMC-061769 (errno 1) [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=3191311 : system msg for write_line failure : Bad file descriptor

An MPI error?!

What bothers me are statements like this "For usability and simplicity"...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497127039 https://github.com/pydata/xarray/issues/2281#issuecomment-497127039 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzEyNzAzOQ== fspaolo 539688 2019-05-29T22:09:17Z 2019-05-29T22:16:38Z NONE

@crusaderky It doesn't work. First, it tells me that if x_new/y_new are 2D, then I have to pass them as DataArray's... why do I have to do that? After doing that, it tells me there are conflicts between "overlapping" dimensions?! Ok, I then pass x_new/y_new as 1D arrays... and the result comes nonsense!

The only interpolation that works (both with DataArray.interp() and your splev()) is the one over the time dimension. This is because time is defined as a one-dimensional variable (t).

Why is it so hard to perform an interpolation with spatial coordinates defined with 2D variables?! I would think this is a pretty common operation on climate datasets...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
495790793 https://github.com/pydata/xarray/issues/2281#issuecomment-495790793 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NTc5MDc5Mw== fspaolo 539688 2019-05-24T21:18:25Z 2019-05-24T21:20:57Z NONE

@crusaderky I don't think we need a "proper" 3d interpolation in most cases (i.e. predicting each 3d grid node considering all dimensions simultaneously). If you see my example above, DataArray.interp(x=x_new, y=y_new).interp(t=t_new), I am first interpolating over the spatial coordinates and then over time. If instead I do DataArray.interp(x=x_new, y=y_new, t=t_new), the computing time is prohibited for large ndarrays. In fact, I think DataArray.interp() should figure this out and do this decomposition internally when calling DataArray.interp(x=x_new, y=y_new, t=t_new).

The main limitation here, however, is being able to interpolate over the spatial coordinates when these are defined as 2d arrays. I'll check your package... thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
495421770 https://github.com/pydata/xarray/issues/2281#issuecomment-495421770 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NTQyMTc3MA== fspaolo 539688 2019-05-23T23:37:45Z 2019-05-23T23:53:33Z NONE

Great thread by @JiaweiZhuang! I just posted a question on stackoverflow about this exact problem. After hours navigating through the xarray documentation I finally found this opened issue... (since I'm new to xarray I thought the problem was my lack of understanding on how the package works)

It is surprising that a package targeting n-dimensional gridded datasets (particularly those from the geo/climate sciences) does not handle such a common task with spatial gridded data.

The problem on hand is this: I have two 3d arrays with different dimensions defined by 2d coordinates, all I want is to regrid the first cube onto the second. Is there a way to perform this operation with xarray?

This is what I've tried (which @JiaweiZhuang explained why it doesn't work):

``` da = xr.DataArray(cube, dims=['t', 'y', 'x'], coords={'t': time,
'xc': (['y', 'x'], X),
'yc': (['y', 'x'], Y)})

da_interp = da.interp(x=x_new, y=y_new).interp(t=t_new) ```

Here x_new/y_new should be mapped onto xc/yc (physical coordinates), not onto x/y (logical coordinates).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 11.672ms · About: xarray-datasette