issue_comments: 78214797

html_url: https://github.com/pydata/xarray/issues/364#issuecomment-78214797
issue_url: https://api.github.com/repos/pydata/xarray/issues/364
id: 78214797
node_id: MDEyOklzc3VlQ29tbWVudDc4MjE0Nzk3
user: 1217238
created_at: 2015-03-11T07:06:57Z
updated_at: 2015-03-11T07:06:57Z
author_association: MEMBER

body:

The problem is that you've created a new timeofday dimension that is gigantic and orthogonal to all the others. You want timeofday to be a coordinate along the existing time dimension.

d.groupby('timeofday').mean('time') is literally doing the exact same calculation 70128 times. We also implicitly assume that coordinates corresponding to dimensions have unique labels, which is why you end up with 70128 groups instead of the 48 you expected.
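
For concreteness, here is a minimal sketch of the fix. The dataset, variable name, and time range are made up (70128 timestamps is consistent with four years of half-hourly data), and the import is spelled xarray today rather than xray:

```python
import numpy as np
import pandas as pd
import xarray as xr  # spelled `xray` when this comment was written

# Hypothetical stand-in for `d`: four years of half-hourly data (70128 timestamps).
times = pd.date_range("2000-01-01", periods=70128, freq="30min")
d = xr.Dataset(
    {"temperature": ("time", np.random.randn(times.size))},
    coords={"time": times},
)

# Attach timeofday as a coordinate *along* the existing time dimension,
# not as a new orthogonal dimension.
d = d.assign_coords(timeofday=("time", times.hour * 60 + times.minute))

# The 48 half-hour labels repeat along time, so this produces 48 groups:
# one mean per half hour of the day.
climatology = d.groupby("timeofday").mean("time")
```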

Also, unlike pandas, xray currently does the core loop for all groupby operations in pure Python, which means that, yes, it will be slow when you have a very large number of groups (and it loops again to handle your 15 different variables). Using something like Cython or Numba to speed up groupby operations is on my to-do list, but I've found this to be less of a barrier than you might expect for multi-dimensional datasets -- individual group members tend to contain more elements than they would in a DataFrame.
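
For intuition, that loop follows roughly the pattern below (a sketch of the general idea, not xray's actual internals, reusing d and xr from the previous example):

```python
# One Python-level iteration per group (and per variable), so the overhead
# grows with the number of groups rather than the number of array elements:
# 48 groups is cheap, 70128 groups pays the loop cost 70128 times.
grouped = d.groupby("timeofday")
means = [group.mean("time") for _, group in grouped]
climatology = xr.concat(means, dim="timeofday")
```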

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: 60303760