issue_comments

18 rows where author_association = "NONE" and user = 25473287 sorted by updated_at descending

issue 8

  • API for multi-dimensional resampling/regridding 4
  • simple command line interface for xarray 3
  • Does interp() work on curvilinear grids (2D coordinates) ? 3
  • Allow DataArray to hold cell boundaries as coordinate variables 2
  • Use apply_ufunc in xESMF regridding package 2
  • apply_ufunc produces illegal coordinate sizes 2
  • apply_ufunc can generate an invalid object. 1
  • Adding tutorials to xarray documentation splash page 1

user 1

  • JiaweiZhuang · 18

author_association 1

  • NONE · 18
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
497155229 https://github.com/pydata/xarray/issues/2281#issuecomment-497155229 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzE1NTIyOQ== JiaweiZhuang 25473287 2019-05-30T00:24:36Z 2019-05-30T00:33:49Z NONE

An MPI error?!

@fspaolo Could you post a minimal reproducible example on xESMF's issue tracker? Just to keep this issue clean. The error looks like an ESMF installation problem that can happen on a legacy OS, and it can be easily fixed with Docker or other containers.

It is surprising that a package targeting n-dimensional gridded datasets (particularly those from the geo/climate sciences) does not handle such a common task with spatial gridded data.

Just a side comment: This is a common but highly non-trivial task... Even small edge cases like periodic longitudes and polar singularities can cause interesting trouble. Otherwise I would just code up an algorithm in xarray from scratch instead of relying on a heavy Fortran library. But things will improve over time...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
416772369 https://github.com/pydata/xarray/issues/2378#issuecomment-416772369 https://api.github.com/repos/pydata/xarray/issues/2378 MDEyOklzc3VlQ29tbWVudDQxNjc3MjM2OQ== JiaweiZhuang 25473287 2018-08-28T23:24:46Z 2018-08-28T23:24:46Z NONE

Just FYI, I wrote an xarray tutorial at https://github.com/geoschem/GEOSChem-python-tutorial with Binder enabled.

I taught it in several GEOS-Chem user workshops and it turned out to work pretty well. Most of our users only know MATLAB and IDL, so I have to teach Python from scratch and then introduce xarray. I found IDL vs xarray a good example to "wow" new users. Manipulating NetCDF files is a real pain in those old languages. There is also a chapter on xESMF, of course😉

I use GEOS-Chem data as an example, but most of the content is quite general and should be useful for other geoscience users.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Adding tutorials to xarray documentation splash page 353150483
404685906 https://github.com/pydata/xarray/issues/2281#issuecomment-404685906 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQwNDY4NTkwNg== JiaweiZhuang 25473287 2018-07-12T23:58:48Z 2018-07-13T18:24:02Z NONE

Do you have any proposal?

I guess it is not an API design problem yet... The algorithm is not here since interpn doesn't deal with curvilinear grids.

I think we could make dr.interp(xc=lon, yc=lat) work for the N-D -> M-D case by wrapping scipy.interpolate.griddata

My concern with scipy.interpolate.griddata is that the performance might be miserable... griddata takes an arbitrary stream of data points in a D-dimensional space. It doesn't know whether those source data points have a gridded/mesh structure. A curvilinear grid mesh needs to be flattened into a stream of points before being passed to griddata(). That might not be too bad for nearest-neighbour search, but it is very inefficient for linear/bilinear methods, where knowing the mesh structure beforehand can save a lot of computation.

Utilizing scipy.interpolate.griddata would be a nice feature, but it should probably be used for data point streams (more like a Pandas dataframe method?), not as a way to handle curvilinear grids.
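
To make the flattening step concrete, here is a minimal sketch of passing a curvilinear grid to scipy.interpolate.griddata. The grid, the test field, and all variable names are made up for illustration; only the griddata call itself is the real SciPy API.

```python
import numpy as np
from scipy.interpolate import griddata

# Hypothetical curvilinear source grid: lon/lat are 2D arrays from a sheared mesh.
j, i = np.meshgrid(np.arange(20), np.arange(30), indexing="ij")
lon2d = i + 0.3 * j          # shape (20, 30)
lat2d = j + 0.1 * i          # shape (20, 30)
field = 2.0 * lon2d + 3.0 * lat2d   # a linear test field

# griddata knows nothing about the mesh, so the grid is flattened
# into an unstructured stream of (lon, lat) points.
points = np.column_stack([lon2d.ravel(), lat2d.ravel()])   # (600, 2)
values = field.ravel()                                      # (600,)

# Destination: a small rectilinear grid lying inside the source domain.
lon_out, lat_out = np.meshgrid(np.linspace(10.0, 20.0, 5),
                               np.linspace(5.0, 12.0, 4))

# Linear interpolation triangulates the point cloud first, which is
# the expensive part that ignores the known mesh structure.
result = griddata(points, values, (lon_out, lat_out), method="linear")
```

Because the test field is linear, the interpolated values should match `2 * lon_out + 3 * lat_out` up to rounding, which is a convenient sanity check. The triangulation is rebuilt on every call, which is where the performance concern comes from.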

PS: I have some broader concerns regarding interp vs xESMF: JiaweiZhuang/xESMF#24

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
404387651 https://github.com/pydata/xarray/issues/2281#issuecomment-404387651 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQwNDM4NzY1MQ== JiaweiZhuang 25473287 2018-07-12T04:43:09Z 2018-07-13T00:11:44Z NONE

One way I can think of to make interp() work on this example: Define a new coordinate system (i.e. two new coordinate variables) on the source curvilinear grid, and rewrite the destination coordinate using those new coordinate variables (not lat, lon anymore).

But this is absolutely too convoluted...

Updated: see Gridded with Scipy for a similar idea.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
383454614 https://github.com/pydata/xarray/issues/2075#issuecomment-383454614 https://api.github.com/repos/pydata/xarray/issues/2075 MDEyOklzc3VlQ29tbWVudDM4MzQ1NDYxNA== JiaweiZhuang 25473287 2018-04-23T05:00:29Z 2018-04-23T05:00:29Z NONE

Looks like the same problem as #1931

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc can generate an invalid object. 316660970
378107700 https://github.com/pydata/xarray/issues/2034#issuecomment-378107700 https://api.github.com/repos/pydata/xarray/issues/2034 MDEyOklzc3VlQ29tbWVudDM3ODEwNzcwMA== JiaweiZhuang 25473287 2018-04-03T02:26:41Z 2018-04-03T02:26:41Z NONE

And this JupyterLab approach will be way better than ncview... Say, you can easily compare multiple NetCDF files by subdividing panels.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  simple command line interface for xarray 310547057
378106951 https://github.com/pydata/xarray/issues/2034#issuecomment-378106951 https://api.github.com/repos/pydata/xarray/issues/2034 MDEyOklzc3VlQ29tbWVudDM3ODEwNjk1MQ== JiaweiZhuang 25473287 2018-04-03T02:21:33Z 2018-04-03T02:21:33Z NONE

This would spawn a web server providing an interactive web-based GUI explorer for all variables in the dataset. You could use this locally or on a remote system.

Seems like JupyterLab is a perfect fit for this purpose. See this geojson extension for example. Notice that you can view a *.geojson file in a standalone window (shown as a map) and do not have to use Jupyter notebooks at all.

It should be possible to view a NetCDF file directly in JupyterLab, with an extension built on top of xarray+GeoViews. @philippjfr should have more insights on this...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  simple command line interface for xarray 310547057
378082894 https://github.com/pydata/xarray/issues/2034#issuecomment-378082894 https://api.github.com/repos/pydata/xarray/issues/2034 MDEyOklzc3VlQ29tbWVudDM3ODA4Mjg5NA== JiaweiZhuang 25473287 2018-04-02T23:45:40Z 2018-04-03T02:04:52Z NONE

thus replacement for ncview

GeoViews can make interactive plots of xarray data. There's an example.

An even more straightforward and customizable way is matplotlib + Jupyter Interact. It can easily replicate all of ncview's functionality.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  simple command line interface for xarray 310547057
367456457 https://github.com/pydata/xarray/issues/1931#issuecomment-367456457 https://api.github.com/repos/pydata/xarray/issues/1931 MDEyOklzc3VlQ29tbWVudDM2NzQ1NjQ1Nw== JiaweiZhuang 25473287 2018-02-21T20:13:29Z 2018-02-21T20:13:29Z NONE

@shoyer OK, I see that keeping the core dims does make sense in some cases. I am fine with doing something like

xr.apply_ufunc(apply_A, dr, input_core_dims=[['x']], output_core_dims=[['x_new']]).rename({'x_new': 'x'})
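
A runnable sketch of this workaround is shown below. The thread does not define `apply_A`, so the matrix `A` and the function body here are hypothetical stand-ins for any operation that changes the length of the core dimension.

```python
import numpy as np
import xarray as xr

# Hypothetical operator: maps 4 input cells onto 2 output cells by pairwise averaging.
A = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]])

def apply_A(data):
    # apply_ufunc moves the core dimension to the last axis,
    # so the matrix is applied along that axis.
    return data @ A.T

dr = xr.DataArray(np.arange(4.0), dims="x", coords={"x": [0, 1, 2, 3]})

# Give the output core dimension a temporary name, then rename it back to 'x'.
out = xr.apply_ufunc(
    apply_A,
    dr,
    input_core_dims=[["x"]],
    output_core_dims=[["x_new"]],
).rename({"x_new": "x"})
```

The result has dimension `x` of length 2. Using a different name for the output core dimension avoids attaching the old length-4 coordinate to the shorter output.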

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc produces illegal coordinate sizes 298834332
367380855 https://github.com/pydata/xarray/issues/1931#issuecomment-367380855 https://api.github.com/repos/pydata/xarray/issues/1931 MDEyOklzc3VlQ29tbWVudDM2NzM4MDg1NQ== JiaweiZhuang 25473287 2018-02-21T16:17:44Z 2018-02-21T16:17:44Z NONE

@jhamman @rabernat Thanks for the help!

Raising an error when encountering this issue and adding keep_core_coords=False to optionally drop the coordinate would be a good solution for me.

But is there any case in which we do want to keep the core coordinate? Since input_core_dims means "dimensions that should not be broadcast", I suppose that the output DataArray has no way to inherit these non-broadcasting dimensions? Should the core coordinate just be dropped by default?

Another, more basic issue: Users are allowed to mess up the coordinate dimension of an existing DataArray. Is this expected behavior?

```
In [1]: import xarray as xr

In [2]: xr.DataArray([0, 1, 2, 3], dims='x', coords={'x':[0, 1]})  # this is not allowed
(...)
ValueError: conflicting sizes for dimension 'x': length 4 on the data but length 2 on coordinate 'x'

In [3]: dr = xr.DataArray([0, 1, 2, 3], dims='x', coords={'x':[0, 1, 2, 3]})

In [4]: dr['x'] = [0, 1]  # but you can mess-up the coordinate dimension afterwards

In [5]: dr
Out[5]:
<xarray.DataArray (x: 4)>
array([0, 1, 2, 3])
Coordinates:
  * x        (x) int64 0 1

In [6]: dr.to_netcdf('wrong_coordinate.nc')
(...)
ValueError: conflicting sizes for dimension 'x': length 4 on 'xarray_dataarray_variable' and length 2 on 'x'
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc produces illegal coordinate sizes 298834332
357549143 https://github.com/pydata/xarray/issues/1822#issuecomment-357549143 https://api.github.com/repos/pydata/xarray/issues/1822 MDEyOklzc3VlQ29tbWVudDM1NzU0OTE0Mw== JiaweiZhuang 25473287 2018-01-14T22:44:15Z 2018-01-14T22:44:15Z NONE

I agree that they can be both implemented, and dask is useful for out-of-core. If anyone would like to contribute, please see JiaweiZhuang/xESMF#3 (comment) for my preliminary experiments with xr.apply_ufunc.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use apply_ufunc in xESMF regridding package 287969295
357533707 https://github.com/pydata/xarray/issues/1822#issuecomment-357533707 https://api.github.com/repos/pydata/xarray/issues/1822 MDEyOklzc3VlQ29tbWVudDM1NzUzMzcwNw== JiaweiZhuang 25473287 2018-01-14T19:05:29Z 2018-01-14T19:05:29Z NONE

Thanks for bringing this up... I've run more experiments and realized that Numba is actually faster than scipy.sparse, and also shows excellent parallel efficiency. See this notebook for all details. So I am considering switching to Numba and adding parallel support in the next version. It should fit better than xr.apply_ufunc in this case. Let's discuss in the original thread if you have further suggestions.
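
For readers unfamiliar with the scipy.sparse baseline, the sketch below assumes the comparison is about applying precomputed regridding weights as a sparse matrix product. The weight matrix, the sizes, and the helper name `apply_weights` are all hypothetical; this is not xESMF's code.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical weights: each of 3 output cells averages 2 neighbouring input cells.
n_in, n_out = 6, 3
rows = np.repeat(np.arange(n_out), 2)     # [0, 0, 1, 1, 2, 2]
cols = np.arange(n_in)                    # [0, 1, 2, 3, 4, 5]
weights = sp.coo_matrix(
    (np.full(n_in, 0.5), (rows, cols)), shape=(n_out, n_in)
).tocsr()

def apply_weights(weights, data):
    """Apply sparse weights along the last axis: (..., n_in) -> (..., n_out)."""
    extra_shape = data.shape[:-1]
    flat = data.reshape(-1, data.shape[-1])   # collapse extra dims: (n_extra, n_in)
    out = weights.dot(flat.T).T               # sparse @ dense: (n_extra, n_out)
    return out.reshape(*extra_shape, weights.shape[0])

data = np.arange(24, dtype=float).reshape(4, 6)   # e.g. 4 time steps, 6 grid cells
regridded = apply_weights(weights, data)
```

Extra dimensions such as time or level are collapsed before the product and restored afterwards, so one weight matrix can be reused for every slice.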

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Use apply_ufunc in xESMF regridding package 287969295
325717752 https://github.com/pydata/xarray/issues/486#issuecomment-325717752 https://api.github.com/repos/pydata/xarray/issues/486 MDEyOklzc3VlQ29tbWVudDMyNTcxNzc1Mg== JiaweiZhuang 25473287 2017-08-29T16:23:07Z 2017-11-09T02:10:28Z NONE

I've wrapped ESMF/ESMPy with xarray: https://github.com/JiaweiZhuang/xESMF

It supports remapping between arbitrary quadrilateral grids, using ESMF's regridding algorithms including bilinear, conservative, nearest neighbour, etc... See this notebook for an example.

The package is still preliminary but it already works. See "Issues & Plans" on the main page for more details.

{
    "total_count": 7,
    "+1": 7,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  API for multi-dimensional resampling/regridding 96211612
343024897 https://github.com/pydata/xarray/issues/486#issuecomment-343024897 https://api.github.com/repos/pydata/xarray/issues/486 MDEyOklzc3VlQ29tbWVudDM0MzAyNDg5Nw== JiaweiZhuang 25473287 2017-11-09T02:09:13Z 2017-11-09T02:09:13Z NONE

I am thinking about the API design for xESMF (JiaweiZhuang/xESMF#9). Any comments are welcome 😃

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  API for multi-dimensional resampling/regridding 96211612
325998681 https://github.com/pydata/xarray/issues/486#issuecomment-325998681 https://api.github.com/repos/pydata/xarray/issues/486 MDEyOklzc3VlQ29tbWVudDMyNTk5ODY4MQ== JiaweiZhuang 25473287 2017-08-30T13:58:00Z 2017-08-30T13:58:00Z NONE

@ocefpaf Any plan for Python3-compatible ESMPy? I only see Python2.7 here: https://github.com/conda-forge/esmpy-feedstock

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  API for multi-dimensional resampling/regridding 96211612
325861556 https://github.com/pydata/xarray/issues/486#issuecomment-325861556 https://api.github.com/repos/pydata/xarray/issues/486 MDEyOklzc3VlQ29tbWVudDMyNTg2MTU1Ng== JiaweiZhuang 25473287 2017-08-30T02:36:08Z 2017-08-30T03:25:43Z NONE

@rabernat Thanks for the suggestion! I'll add tests and docs when time allows.

If you want to look into the details: The package contains two layers (explained in the "Design Idea" section). The first layer has nothing to do with xarray, but just provides a convenient way (using only numpy) to access a useful subset of ESMPy functions. This layer is important because ESMPy's API is too complicated, but once it is done it doesn't need to be changed too often. The second layer wraps the first layer using xarray. Most of the extra features will be added to the second layer.

As a temporary workaround, I've added another notebook for using the low-level wrapper, for interested developers.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  API for multi-dimensional resampling/regridding 96211612
314862336 https://github.com/pydata/xarray/issues/1475#issuecomment-314862336 https://api.github.com/repos/pydata/xarray/issues/1475 MDEyOklzc3VlQ29tbWVudDMxNDg2MjMzNg== JiaweiZhuang 25473287 2017-07-12T18:50:02Z 2017-07-13T01:48:32Z NONE

Probably the simplest option is to use structured dtypes, which should already work with the existing version of xarray, e.g.,

Thanks, that's a nice trick! Supporting da.x_bounds['start'] will definitely be helpful!

However, I am still concerned about 2D boundaries. Using the structured data type, 2D bounds will be an array of size (Nx,Ny,4) instead of (Nx+1,Ny+1). Although this matches the CF convention, it takes 4x memory and needs to be converted back to (Nx+1,Ny+1) for pcolormesh(). Not a big problem though. I will be happy to go this way if (Nx+1,Ny+1)-sized bounds cannot be implemented.
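
The two layouts can be converted into each other with plain numpy, as sketched below. The function names and the vertex ordering within the last axis are assumptions chosen for illustration, not taken from xarray or the CF conventions text.

```python
import numpy as np

def corners_to_bounds(corners):
    """(Nx+1, Ny+1) corner array -> (Nx, Ny, 4) per-cell vertex array."""
    return np.stack(
        [corners[:-1, :-1],   # vertex 0: (i,   j)
         corners[1:, :-1],    # vertex 1: (i+1, j)
         corners[1:, 1:],     # vertex 2: (i+1, j+1)
         corners[:-1, 1:]],   # vertex 3: (i,   j+1)
        axis=-1,
    )

def bounds_to_corners(bounds):
    """(Nx, Ny, 4) per-cell vertex array -> (Nx+1, Ny+1) corner array.

    Assumes neighbouring cells share identical vertices (a contiguous mesh).
    """
    nx, ny, _ = bounds.shape
    corners = np.empty((nx + 1, ny + 1), dtype=bounds.dtype)
    corners[:-1, :-1] = bounds[..., 0]
    corners[1:, :-1] = bounds[..., 1]
    corners[1:, 1:] = bounds[..., 2]
    corners[:-1, 1:] = bounds[..., 3]
    return corners

# Hypothetical longitude corners for a 3 x 2 cell grid.
lon_corners = np.add.outer(np.arange(4.0), 0.1 * np.arange(3.0))   # (4, 3)
lon_bounds = corners_to_bounds(lon_corners)                         # (3, 2, 4)
roundtrip = bounds_to_corners(lon_bounds)                           # (4, 3)
```

The per-cell layout stores 4 * Nx * Ny values against (Nx+1) * (Ny+1) for the corner layout, which approaches the 4x ratio mentioned above as the grid grows. The corner layout is the one pcolormesh() accepts directly.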

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620
314604740 https://github.com/pydata/xarray/issues/1475#issuecomment-314604740 https://api.github.com/repos/pydata/xarray/issues/1475 MDEyOklzc3VlQ29tbWVudDMxNDYwNDc0MA== JiaweiZhuang 25473287 2017-07-11T23:58:20Z 2017-07-11T23:58:20Z NONE

See also #1079 and #1079 (comment)

Thanks! The idea of NDIntervalIndex mentioned at pandas-dev/pandas#7640 comment seems powerful but too complicated to implement? Could there be a simpler way to hook the boundary attribute to DataArray?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
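
As a rough illustration of how the filter on this page maps onto the schema, the sketch below loads the table definition into an in-memory SQLite database and runs the equivalent query with Python's sqlite3 module. Only a few columns are filled in; the first two rows reuse ids and timestamps from this page, and the third row is a made-up one that the filter should exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
""")

rows = [
    (497155229, 25473287, "2019-05-30T00:33:49Z", "NONE"),
    (416772369, 25473287, "2018-08-28T23:24:46Z", "NONE"),
    (1, 99, "2020-01-01T00:00:00Z", "MEMBER"),   # hypothetical row, filtered out
]
conn.executemany(
    "INSERT INTO issue_comments (id, [user], updated_at, author_association) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# Same filter and ordering as the page: author_association = "NONE",
# user = 25473287, sorted by updated_at descending.
query = """
SELECT id, updated_at
FROM issue_comments
WHERE author_association = ? AND [user] = ?
ORDER BY updated_at DESC
"""
result = conn.execute(query, ("NONE", 25473287)).fetchall()
```

The timestamps are stored as ISO 8601 text, so ordering them as strings gives chronological order.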
Powered by Datasette · Queries took 2490.551ms · About: xarray-datasette