issue_comments: 325777712


html_url: https://github.com/pydata/xarray/issues/1534#issuecomment-325777712
issue_url: https://api.github.com/repos/pydata/xarray/issues/1534
id: 325777712
node_id: MDEyOklzc3VlQ29tbWVudDMyNTc3NzcxMg==
user: 4992424
created_at: 2017-08-29T19:42:24Z
updated_at: 2017-08-29T19:42:24Z
author_association: NONE
issue: 253407851

@mmartini-usgs, an entire netCDF file (as long as it only has 1 group, which it most likely does if we're talking about standard atmospheric/oceanic data) would be the equivalent of an xarray.Dataset. Each variable could be represented as a pandas.DataFrame, but with a MultiIndex - an index with multiple levels, one per dimension, which is consistent across variables.
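A minimal sketch of that mapping, using a small made-up Dataset (the dimension and variable names below are placeholders, not from your data):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical two-dimensional Dataset; the names "temperature",
# "time", and "station" are illustrative placeholders.
ds = xr.Dataset(
    {"temperature": (("time", "station"), np.random.rand(3, 2))},
    coords={
        "time": pd.date_range("2017-08-29", periods=3, freq="H"),
        "station": ["A", "B"],
    },
)

# to_dataframe() returns a pandas.DataFrame indexed by a MultiIndex
# whose levels are the Dataset's dimensions.
df = ds.to_dataframe()
print(df.index.names)  # ['time', 'station']
```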

To start with, you should read in your data using the chunks keyword to open_dataset(); this loads all of the data you read as dask arrays. Then you use xarray Dataset and DataArray operations to manipulate them. So instead, you can start by opening your data:

```python
ds = xr.open_dataset('hugefile.nc', chunks={<fill me in>})
ds_lp = ds.resample('H', 'time', 'mean')
```
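For reference, later xarray releases moved to a groupby-style resample API; assuming the time dimension is named time, the equivalent of the call above would be roughly:

```python
# Newer-API equivalent of resample('H', 'time', 'mean');
# assumes the time coordinate is named "time".
ds_lp = ds.resample(time='1H').mean()
```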

You'd have to choose chunks based on the dimensions of your data. Like @rabernat previously mentioned, it's very likely you can perform your entire workflow within xarray without ever having to drop down to pandas; let us know if you can share more details.
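A minimal end-to-end sketch of that idea; the chunk size, dimension name ("time"), and filenames below are placeholder assumptions, not details from this thread:

```python
import xarray as xr

# Chunk sizes are placeholders; pick them from your file's actual
# dimension lengths so each chunk fits comfortably in memory.
ds = xr.open_dataset("hugefile.nc", chunks={"time": 10000})

# Hourly means, computed lazily on the dask-backed arrays,
# entirely within xarray.
ds_lp = ds.resample(time="1H").mean()

# Writing the result triggers the actual computation.
ds_lp.to_netcdf("hugefile_hourly.nc")
```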
