issues: 271957479

id                   271957479
node_id              MDU6SXNzdWUyNzE5NTc0Nzk=
number               1695
title                Diagnose groupby/groupby_bins issues
user                 14314623
state                closed
locked               0
comments             3
created_at           2017-11-07T19:39:38Z
updated_at           2017-11-09T16:36:26Z
closed_at            2017-11-09T16:36:19Z
author_association   CONTRIBUTOR
state_reason         completed
repo                 13221727
type                 issue

Code Sample, a copy-pastable example if possible

```python
import numpy as np  # needed below for np.array
import xarray as xr

xr.__version__
```
```
'0.9.6'
```
```python
ds = xr.open_dataset('../testing/Bianchi_o2.nc', chunks={'TIME': 1})
ds
```
```
<xarray.Dataset>
Dimensions:     (DEPTH: 33, LATITUDE: 180, LONGITUDE: 360, TIME: 12, bnds: 2)
Coordinates:
  * LONGITUDE   (LONGITUDE) float64 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5 ...
  * LATITUDE    (LATITUDE) float64 -89.5 -88.5 -87.5 -86.5 -85.5 -84.5 -83.5 ...
  * DEPTH       (DEPTH) float64 0.0 10.0 20.0 30.0 50.0 75.0 100.0 125.0 ...
  * TIME        (TIME) float64 15.0 44.0 73.5 104.0 134.5 165.0 195.5 226.5 ...
Dimensions without coordinates: bnds
Data variables:
    DEPTH_bnds  (DEPTH, bnds) float64 -5.0 5.0 5.0 15.0 15.0 25.0 25.0 40.0 ...
    TIME_bnds   (TIME, bnds) float64 0.5 29.5 29.5 58.75 58.75 88.75 88.75 ...
    O2_LINEAR   (TIME, DEPTH, LATITUDE, LONGITUDE) float64 nan nan nan nan ...
Attributes:
    history:  FERRET V5.70 (alpha) 29-Sep-11
```

This runs as expected:

```python
ds.isel(TIME=0).groupby_bins('O2_LINEAR', np.array([0, 20, 40, 60, 100])).max()
```

This crashes the kernel:

```python
ds.groupby_bins('O2_LINEAR', np.array([0, 20, 40, 60, 100])).max()
```
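For anyone without the NetCDF file, a synthetic dataset with the same shape and per-time-step chunking should stand in for it (a sketch under that assumption; `o2`, `ds_syn`, and the random values are mine, not real oxygen data):

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for Bianchi_o2.nc: same dims and chunking, with
# uniform random values spanning the bin edges used above.
rng = np.random.default_rng(0)
o2 = xr.DataArray(
    rng.uniform(0, 100, size=(12, 33, 180, 360)),
    dims=('TIME', 'DEPTH', 'LATITUDE', 'LONGITUDE'),
    name='O2_LINEAR',
).chunk({'TIME': 1})
ds_syn = o2.to_dataset()

bins = np.array([0, 20, 40, 60, 100])

# Works: a single time step.
ds_syn.isel(TIME=0).groupby_bins('O2_LINEAR', bins).max()

# Reportedly crashes the kernel on the full chunked dataset,
# so it is left commented out here.
# ds_syn.groupby_bins('O2_LINEAR', bins).max()
```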

Problem description

I am working on ocean oxygen data and would like to compute the volume of the ocean contained within a range of concentration values.

I am trying to use groupby_bins, but even with this modest-sized dataset (1-degree global resolution, 33 depth levels, 12 time steps) my kernel crashes every time, without any error message.
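One way to get more signal than a silent kernel death (a sketch using dask's bundled diagnostics, not anything from the report) is to run the reduction under a resource profiler and watch whether memory grows without bound:

```python
import numpy as np
import xarray as xr
from dask.diagnostics import ProgressBar, ResourceProfiler  # ResourceProfiler needs psutil

ds = xr.open_dataset('../testing/Bianchi_o2.nc', chunks={'TIME': 1})
bins = np.array([0, 20, 40, 60, 100])

# ResourceProfiler samples memory and CPU while the dask graph runs; if
# memory climbs steadily toward the crash, the groupby is materializing
# far more than one chunk at a time.
with ResourceProfiler(dt=0.25) as rprof, ProgressBar():
    ds.groupby_bins('O2_LINEAR', bins).max().compute()

print('peak memory (MB):', max(r.mem for r in rprof.results))
```

If the process still dies mid-run, the last progress-bar output at least narrows down how far the graph got.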

I eventually want to perform this step on several TB of ocean model output, so this is concerning.

First of all, is there an easy way to diagnose the problem further? And secondly, are there recommendations for how to compute the sum over groupby_bins for very large datasets (consisting of dask arrays)?
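Until the crash is understood, one dask-friendly way to get per-bin sums without groupby_bins is to build a boolean mask per bin and reduce under it. This is a sketch of a workaround, not xarray's recommended API; the file path and `O2_LINEAR` come from the report, while `per_bin` and the bin labels are mine:

```python
import numpy as np
import xarray as xr

ds = xr.open_dataset('../testing/Bianchi_o2.nc', chunks={'TIME': 1})
o2 = ds['O2_LINEAR']
bins = np.array([0, 20, 40, 60, 100])

# One lazy reduction per bin; each mask stays a chunked dask array, so no
# step needs the whole 4-D field in memory at once. Right-closed intervals
# mimic the pandas.cut convention that groupby_bins uses.
per_bin = []
for lo, hi in zip(bins[:-1], bins[1:]):
    in_bin = (o2 > lo) & (o2 <= hi)
    per_bin.append(o2.where(in_bin).sum())

# Stack the per-bin scalars into one labeled 1-D result.
result = xr.concat(per_bin, dim='O2_LINEAR_bins').assign_coords(
    O2_LINEAR_bins=[f'({lo}, {hi}]' for lo, hi in zip(bins[:-1], bins[1:])]
)
print(result.compute())
```

To get ocean volume per concentration bin instead of summed concentrations, replace `o2.where(in_bin).sum()` with `volume.where(in_bin).sum()`, where `volume` is a (hypothetical) cell-volume DataArray broadcastable against `O2_LINEAR`.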

