issues: 271957479
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
271957479 | MDU6SXNzdWUyNzE5NTc0Nzk= | 1695 | Diagnose groupby/groupby_bins issues | 14314623 | closed | 0 | 3 | 2017-11-07T19:39:38Z | 2017-11-09T16:36:26Z | 2017-11-09T16:36:19Z | CONTRIBUTOR | Code Sample, a copy-pastable example if possible

```python
import numpy as np
import xarray as xr

xr.__version__

# Open lazily with dask, one chunk per time step
ds = xr.open_dataset('../testing/Bianchi_o2.nc', chunks={'TIME': 1})
ds

# This runs as expected
ds.isel(TIME=0).groupby_bins('O2_LINEAR', np.array([0, 20, 40, 60, 100])).max()

# This crashes the kernel
ds.groupby_bins('O2_LINEAR', np.array([0, 20, 40, 60, 100])).max()
```

Problem description

I am working on ocean oxygen data and would like to compute the volume of the ocean contained within a range of concentration values. I am trying to use groupby_bins, but even with this modestly sized dataset (1 degree global resolution, 25 depth levels, 12 time steps) my kernel crashes every time without any error message. I eventually want to perform this step on several TB of ocean model output, so this is concerning.

First, is there an easy way to diagnose the problem further? Second, are there recommendations for computing the sum over groupby_bins for very large datasets (consisting of dask arrays)? |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1695/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
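
The calculation described in the issue body (ocean volume per concentration range) can also be written as a volume-weighted histogram. This is a minimal sketch and is not taken from the issue thread. The helper `volume_per_bin`, the `cell_volume` array, and the toy values are hypothetical stand-ins for the real dataset, which is not available here. The bin conventions also differ: `np.histogram` uses half-open bins `[a, b)` with a closed last bin, whereas `groupby_bins` defaults to right-closed bins `(a, b]`.

```python
import numpy as np


def volume_per_bin(o2, cell_volume, bin_edges):
    """Sum the cell volumes that fall into each concentration bin, per time step.

    o2          : array of shape (time, ...) holding concentrations (NaN = missing)
    cell_volume : array broadcastable to o2.shape[1:] (hypothetical grid-cell volumes)
    bin_edges   : 1-D array of monotonically increasing bin edges
    """
    o2 = np.asarray(o2, dtype=float)
    vol = np.broadcast_to(cell_volume, o2.shape[1:])
    out = np.empty((o2.shape[0], len(bin_edges) - 1))
    for t in range(o2.shape[0]):
        valid = np.isfinite(o2[t])  # skip missing values (e.g. land points)
        out[t], _ = np.histogram(o2[t][valid], bins=bin_edges, weights=vol[valid])
    return out


# Toy data: one time step on a 2x2 grid, with one missing value.
bin_edges = np.array([0, 20, 40, 60, 100])
o2 = np.array([[[10.0, 30.0], [50.0, np.nan]]])
cell_volume = np.array([[1.0, 2.0], [3.0, 4.0]])

result = volume_per_bin(o2, cell_volume, bin_edges)
print(result)
```

For dask-backed data, `dask.array.histogram` accepts the same `bins` and `weights` arguments and evaluates lazily, which may avoid holding the whole array in memory. Whether it resolves the crash reported here is untested.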