home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

13 rows where author_association = "NONE" and issue = 340486433 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 4

  • fspaolo 8
  • JiaweiZhuang 3
  • tollers 1
  • lgramer 1

issue 1

  • Does interp() work on curvilinear grids (2D coordinates) ? · 13 ✖

author_association 1

  • NONE · 13 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
687336494 https://github.com/pydata/xarray/issues/2281#issuecomment-687336494 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDY4NzMzNjQ5NA== lgramer 49291260 2020-09-04T19:24:32Z 2020-09-04T19:24:32Z NONE

I find xarray so useful in many ways, for which I'm grateful. But there are some current limitations that force me to hesitate before recommending it to colleagues. One is this issue - lack of support (or rather, I suspect simply no "clearly stated support"?) for curvilinear coordinate systems, which are pretty much ubiquitous in the work I do. The other issue which causes me to pause before recommending xarray wholeheartedly is the complexity (and frequent slowness and errors, still - all previously reported) in dealing with GRIB2 file formats that include multiple vertical coordinate systems (e.g., products from the NCEP Unified Post Processing System - UPP). But that's an issue for another thread... Any movement on wrapping scipy griddata (or some suitably more sophisticated scipy tool) within xarray's interface?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
553920017 https://github.com/pydata/xarray/issues/2281#issuecomment-553920017 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDU1MzkyMDAxNw== tollers 42300091 2019-11-14T14:47:52Z 2019-11-14T14:47:52Z NONE

Have there been any updates on the handling of multi-dimensional co-ordinates in xarray, in particular interpolation/indexing using such co-ordinates, as discussed here? For geographical data with curvilinear grids (using latitudes and longitudes) this issue can become a real headache to deal with.

{
    "total_count": 12,
    "+1": 12,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497420732 https://github.com/pydata/xarray/issues/2281#issuecomment-497420732 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzQyMDczMg== fspaolo 539688 2019-05-30T17:50:32Z 2019-05-31T23:43:54Z NONE

@shoyer and @crusaderky That's right, that is how I was actually dealing with this problem prior trying xarray ... by flattening the grid coordinates and performing either gridding (with scipy's griddata) or interpolation (with scipy's map_coordinate) ... instead of performing proper regridding (from cube to cube without having to flatten anything).

As a rule of thumb, any fancy algorithm should first exist for numpy-only data and then potentially it can be wrapped by the xarray library.

This is important information.

For the record, here is so far what I found to be best performant:

``` import xarray as xr from scipy.interpolate import griddata

Here x/y are dummy 1D coords that wont be used.

da1 = xr.DataArray(cube1, [('t', t_cube1) , ('y', range(cube1.shape[1])), ('x', range(cube1.shape[2]))])

Regrid t_cube1 onto t_cube2 first since time will always map 1 to 1 between cubes.

This operation is very fast.

print('regridding in time ...') cube1 = da1.interp(t=t_cube2).values

Regrid each 2D field (X_cube1/Y_cube1 onto X_cube2/Y_cube2), one at a time

print('regridding in space ...') cube3 = np.full_like(cube2, np.nan) for k in range(t_cube2.shape[0]): print('regridding:', k) cube3[:,:,k] = griddata((X_cube1.ravel(), Y_cube1.ravel()), cube1[k,:,:].ravel(), (X_cube2, Y_cube2), fill_value=np.nan, method='linear') ```

Performance is not that bad... for ~150 time steps and ~1500 nodes in x and y it takes less than 10 min.

I think this can be sped up by computing the interpolation weights between grids in the first iteration and cache them (I think xESMF does this).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497475534 https://github.com/pydata/xarray/issues/2281#issuecomment-497475534 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzQ3NTUzNA== fspaolo 539688 2019-05-30T20:32:24Z 2019-05-30T20:58:23Z NONE

@fspaolo where does that huge number come from? I thought you said you have 1500 nodes in total.

Not in total, I meant ~1500 on each x/y dimension (1500 x 1500). To be precise:

``` print(cube1.shape) # (t, y, x) (306, 240, 262)

print(cube2.shape) # (y, x, t) (1452, 1836, 104) ```

Did you select a single point on the t dimension before you applied bisplrep?

Yes.

Also, (pardon the ignorance, I never dealt with geographical data) what kind of information does having your lat and lon being bidimensional convey? Does it imply lat[i, j] < lat[i +1, j] and lon[i, j] < lon[i, j+1] for any possible (i, j)?

It does in my case, but cube1 is a rotated grid, meaning that lat is not the same along the x-axis neither lon is the same along the y-axis, while cube2 is a standard lon/lat grid so I can represent it simply by a 1D lon array (x-axis) and a 1D lat array (y-axis). To have them both with 1D coordinates I would have to either rotate cube2 and work with 1D rotated lon/lat, or unrotate cube1 and work with 1D standard lon/lat... but this gets me to the same problem as in the interpolation above since I have to transform (re-project) every 2D grid in the cube.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497448335 https://github.com/pydata/xarray/issues/2281#issuecomment-497448335 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzQ0ODMzNQ== fspaolo 539688 2019-05-30T19:07:54Z 2019-05-30T19:08:47Z NONE

@crusaderky I also did the above using bisplrep and bispev, and it seems it cannot handle a grid of this size:

``` print(len(x), len(y), len(z)) 62880 62880 62880

print(len(x_new), len(y_new)) 2665872 2665872

Traceback (most recent call last): File "cubesmb.py", line 201, in <module> z_new = bisplev(x_new, y_new, tck) File "/Users/paolofer/anaconda3/lib/python3.7/site-packages/scipy/interpolate/_fitpack_impl.py", line 1047, in bisplev z, ier = _fitpack._bispev(tx, ty, c, kx, ky, x, y, dx, dy) RuntimeError: Cannot produce output of size 2665872x2665872 (size too large) ```

So griddata looks like my best option...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497155229 https://github.com/pydata/xarray/issues/2281#issuecomment-497155229 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzE1NTIyOQ== JiaweiZhuang 25473287 2019-05-30T00:24:36Z 2019-05-30T00:33:49Z NONE

An MPI error?!

@fspaolo Could you post a minimal reproducible example on xESMF's issue tracker? Just to keep this issue clean. The error looks like an ESMF installation problem that can happen on legacy OS, and it can be easily fixed by Docker or other containers.

It is surprising that a package targeting n-dimensional gridded datasets (particularly those from the geo/climate sciences) does not handle such a common task with spatial gridded data.

Just a side comment: This is a common but highly non-trivial task... Even small edges cases like periodic longitudes and polar singularities can cause interesting troubles. Otherwise I would just code up an algorithm in Xarray from scratch instead of relying on a heavy Fortran library. But things will get improved over time...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497138085 https://github.com/pydata/xarray/issues/2281#issuecomment-497138085 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzEzODA4NQ== fspaolo 539688 2019-05-29T22:56:07Z 2019-05-30T00:14:09Z NONE

@crusaderky what I'm trying to do is what the title of this opened thread says "Does interp() work on curvilinear grids (2D coordinates)?" Working with (x, y, t) all 1D numpy arrays is an easy problem! But that's not the problem I'm dealing with.

Let me state exactly what my problem is. I have two data cubes, cube1 and cube2:

``` cube1 = xr.DataArray(cube1, dims=['t', 'y', 'x'], coords={'time': ('t', t_cube1), # 1D array 'lon': (['y', 'x'], X_cube1), # 2D array!!! 'lat': (['y', 'x'], Y_cube1)}) # 2D array!!!

cube2 = xr.DataArray(cube2, dims=['y', 'x', 't'], coords={'time': ('t', t_cube2), # 1D array 'lon': (['y', 'x'], X_cube2), # 2D array!!! 'lat': (['y', 'x'], Y_cube2)}) # 2D array!!! ```

All I want is to regrid cube1 onto cube2, or in other words, interpolate X_cube2/Y_cube2/t_cube2 onto cube1. Note that I cannot pass X_cube1/Y_cube1 as 1D arrays because they vary for every grid cell (this is a rotated grid). I can, however, pass X_cube2/Y_cube2 as 1D arrays (this is a standard lon/lat grid).

Also note the inverted time dimension between cube1 and cube2 (but that shouldn't be an issue).

So how to perform this operation... or am I missing something?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497129090 https://github.com/pydata/xarray/issues/2281#issuecomment-497129090 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzEyOTA5MA== fspaolo 539688 2019-05-29T22:17:39Z 2019-05-29T22:17:39Z NONE

@JiaweiZhuang your approach doesn't work either. After installing you package and the dependencies... and following the documentation, I got

Fatal error in MPI_Init_thread: Other MPI error, error stack: MPIR_Init_thread(474)..............: MPID_Init(190).....................: channel initialization failed MPIDI_CH3_Init(89).................: MPID_nem_init(320).................: MPID_nem_tcp_init(173).............: MPID_nem_tcp_get_business_card(420): MPID_nem_tcp_init(379).............: gethostbyname failed, LMC-061769 (errno 1) [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=3191311 : system msg for write_line failure : Bad file descriptor

An MPI error?!

What bothers me are statements like this "For usability and simplicity"...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497127039 https://github.com/pydata/xarray/issues/2281#issuecomment-497127039 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzEyNzAzOQ== fspaolo 539688 2019-05-29T22:09:17Z 2019-05-29T22:16:38Z NONE

@crusaderky It doesn't work. First, it tells me that if x_new/y_new are 2D, then I have to pass them as DataArray's... why do I have to do that? After doing that, it tells me there are conflicts between "overlapping" dimensions?! Ok, I then pass x_new/y_new as 1D arrays... and the result comes nonsense!

The only interpolation that works (both with DataArray.interp() and your splev()) is the one over the time dimension. This is because time is defined as a one-dimensional variable (t).

Why is it so hard to perform an interpolation with spatial coordinates defined with 2D variables?! I would think this is a pretty common operation on climate datasets...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
495790793 https://github.com/pydata/xarray/issues/2281#issuecomment-495790793 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NTc5MDc5Mw== fspaolo 539688 2019-05-24T21:18:25Z 2019-05-24T21:20:57Z NONE

@crusaderky I don't think we need a "proper" 3d interpolation in most cases (i.e. predicting each 3d grid node considering all dimensions simultaneously). If you see my example above, DataArray.interp(x=x_new, y=y_new).interp(t=t_new), I am first interpolating over the spatial coordinates and then over time. If instead I do DataArray.interp(x=x_new, y=y_new, t=t_new), the computing time is prohibited for large ndarrays. In fact, I think DataArray.interp() should figure this out and do this decomposition internally when calling DataArray.interp(x=x_new, y=y_new, t=t_new).

The main limitation here, however, is being able to interpolate over the spatial coordinates when these are defined as 2d arrays. I'll check your package... thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
495421770 https://github.com/pydata/xarray/issues/2281#issuecomment-495421770 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NTQyMTc3MA== fspaolo 539688 2019-05-23T23:37:45Z 2019-05-23T23:53:33Z NONE

Great thread by @JiaweiZhuang! I just posted a question on stackoverflow about this exact problem. After hours navigating through the xarray documentation I finally found this opened issue... (since I'm new to xarray I thought the problem was my lack of understanding on how the package works)

It is surprising that a package targeting n-dimensional gridded datasets (particularly those from the geo/climate sciences) does not handle such a common task with spatial gridded data.

The problem on hand is this: I have two 3d arrays with different dimensions defined by 2d coordinates, all I want is to regrid the first cube onto the second. Is there a way to perform this operation with xarray?

This is what I've tried (which @JiaweiZhuang explained why it doesn't work):

``` da = xr.DataArray(cube, dims=['t', 'y', 'x'], coords={'t': time,
'xc': (['y', 'x'], X),
'yc': (['y', 'x'], Y)})

da_interp = da.interp(x=x_new, y=y_new).interp(t=t_new) ```

Here x_new/y_new should be mapped onto xc/yc (physical coordinates), not onto x/y (logical coordinates).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
404685906 https://github.com/pydata/xarray/issues/2281#issuecomment-404685906 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQwNDY4NTkwNg== JiaweiZhuang 25473287 2018-07-12T23:58:48Z 2018-07-13T18:24:02Z NONE

Do you have any proposal?

I guess it is not an API design problem yet... The algorithm is not here since interpn doesn't deal with curvilinear grids.

I think we could make dr.interp(xc=lon, yc=lat) work for the N-D -> M-D case by wrapping scipy.interpolate.griddata

My concern with scipy.interpolate.griddata is that the performance might be miserable... griddata takes an arbitrary stream of data points in a D-dimensional space. It doesn't know if those source data points have a gridded/mesh structure. A curvilinear grid mesh needs to be flatten into a stream of points before passed to griddata(). Might not be too bad for nearest-neighbour search, but very inefficient for linear/bilinear method, where knowing the mesh structure beforehand can save a lot of computation.

Utilizingscipy.interpolate.griddata would be a nice feature, but it should probably be used for data point streams (more like a Pandas dataframe method?), not as a way to handle curvilinear grids.

PS: I have some broader concerns regarding interp vs xESMF: JiaweiZhuang/xESMF#24

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
404387651 https://github.com/pydata/xarray/issues/2281#issuecomment-404387651 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQwNDM4NzY1MQ== JiaweiZhuang 25473287 2018-07-12T04:43:09Z 2018-07-13T00:11:44Z NONE

One way I can think of to make interp() work on this example: Define a new coordinate system (i.e. two new coordinate variables) on the source curvilinear grid, and rewrite the destination coordinate using those new coordinate variables (not lat, lon anymore).

But this is absolutely too convoluted...

Updated: see Gridded with Scipy for a similar idea.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 1202.161ms · About: xarray-datasette