home / github / issue_comments

Menu
  • GraphQL API
  • Search all tables

issue_comments: 365657502

This data as json

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/pull/1899#issuecomment-365657502 https://api.github.com/repos/pydata/xarray/issues/1899 365657502 MDEyOklzc3VlQ29tbWVudDM2NTY1NzUwMg== 291576 2018-02-14T16:13:16Z 2018-02-14T16:13:16Z CONTRIBUTOR

Oh, wow... this worked like a charm for the netcdf4 backend! I have a ~13GB (uncompressed) 4-D netcdf4 variable that was giving me trouble for slicing a 2D surface out of. Here is a snippet where I am grabbing data at random indices in the last dimension. First for a specific latitude, then for the entire domain. ```

CD_subset = rough['CD'][0] wind_inds_decorated <xarray.DataArray (latitude: 3501, longitude: 7001)> array([[33, 15, 25, ..., 52, 66, 35], [ 6, 8, 55, ..., 59, 6, 50], [54, 2, 40, ..., 32, 19, 9], ..., [53, 18, 23, ..., 19, 3, 43], [ 9, 11, 66, ..., 51, 39, 58], [21, 54, 37, ..., 3, 0, 65]]) Dimensions without coordinates: latitude, longitude foo = CD_subset.isel(latitude=0, wind_direction=wind_inds_decorated[0]) foo <xarray.DataArray 'CD' (longitude: 7001)> array([ 0.004052, 0.005915, 0.002771, ..., 0.005604, 0.004715, 0.002756], dtype=float32) Coordinates: scales int16 60 latitude float64 54.99 * longitude (longitude) float64 -130.0 -130.0 -130.0 -130.0 -130.0 ... wind_direction (longitude) int16 165 75 125 5 235 345 315 175 85 35 290 ... foo = CD_subset.isel(wind_direction=wind_inds_decorated) foo <xarray.DataArray 'CD' (latitude: 3501, longitude: 7001)> [24510501 values with dtype=float32] Coordinates: scales int16 60 * latitude (latitude) float64 54.99 54.98 54.97 54.96 54.95 54.95 ... * longitude (longitude) float64 -130.0 -130.0 -130.0 -130.0 -130.0 ... wind_direction (latitude, longitude) int64 165 75 125 5 235 345 315 175 ... ``` All previous attempts at this would result in having to load the entire 13GB array into memory just to get 93.5 MB out. Or, I would try to fetch each individual point, which took way too long. This worked faster than loading the entire thing into memory, and it used less memory, too (I think I maxed out at about 1.2GB of total usage, which is totally acceptable for my use case).

I will try out similar things with the pynio and rasterio backends, and get back to you. Thanks for this work!

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  295838143
Powered by Datasette · Queries took 0.726ms · About: xarray-datasette