issues: 1322491028


id	node_id	number	title	user	state	locked	assignee	milestone	comments	created_at	updated_at	closed_at	author_association	active_lock_reason	draft	pull_request	body	reactions	performed_via_github_app	state_reason	repo	type
1322491028	I_kwDOAMm_X85O05yU	6850	Slow lazy performance on cloud data	31974425	closed	0			3	2022-07-29T17:05:31Z	2022-09-12T18:39:05Z	2022-09-12T18:39:04Z	NONE				Hi, I am not sure if this is the right place to raise my issue, but I'd appreciate any help! I am trying to do a more complicated calculation with CESM cloud data (on the Pangeo cloud deployment) and am running into an issue with a simpler calculation that is part of the workflow. When taking the derivative, the cell takes a very long time to run at the differencing step, even though this step is lazy and should not compute anything. It should run quickly, but as you can see from the screenshot, it does not: the reported run time is ~20 s, but the wall time is much longer (~2 min). This becomes a serious issue when taking the derivative of multiple variables as part of a larger workflow. @jbusecke and I replicated the differencing step on a randomized dask dataset and, as you can see, the cell runs much faster there. Below I have pasted reproducible code that isolates the problem. I am not sure how to proceed with fixing this slow performance and would appreciate your help, thanks!
```
import xarray as xr
import numpy as np
import dask.array as dsa
import pop_tools
from xgcm import Grid
import xgcm
from intake import open_catalog

# Dask sample dataset
test_values = dsa.random.random((14695, 2400, 3600), chunks=(1, 2400, 3600))
da_sample = xr.DataArray(test_values, dims=['time', 'x', 'y'])
da_sample_u = xr.DataArray(test_values, dims=['time', 'x_u', 'y_u'])
ds_sample = xr.Dataset(data_vars=dict(test_values=da_sample, u=da_sample_u))
%timeit ds_sample.pad({'nlon':(2,2)}).diff('nlon')

# Original dataset
url = "https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/ocean/CESM_POP.yaml"
cat = open_catalog(url)
ds = cat["CESM_POP_hires_control"].to_dask()
ds = ds.drop([d for d in ds.dims if d in ds.coords])
%timeit ds.pad({'nlon':(2,2)}).diff('nlon')
```
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6850/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		completed	13221727	issue
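
The report rests on the expectation that `pad` followed by `diff` on a dask-backed dataset only builds a task graph and computes nothing. A minimal sketch of how one might check that on a small in-memory example is shown below. The dataset `ds_small`, its shape, and the `nlat` dimension name are assumptions made for illustration, not taken from the CESM data; only the `nlon` dimension and the `pad(...).diff(...)` call come from the issue.

```python
import time

import dask.array as dsa
import xarray as xr

# Hypothetical, much smaller stand-in for the dataset in the report.
values = dsa.random.random((20, 60, 90), chunks=(1, 60, 90))
ds_small = xr.Dataset({"test_values": (("time", "nlat", "nlon"), values)})

# Time only the construction of the lazy result.
start = time.perf_counter()
lazy = ds_small.pad({"nlon": (2, 2)}).diff("nlon")
build_seconds = time.perf_counter() - start

# The result is still backed by a dask array; no chunk has been computed yet.
backing = lazy["test_values"].data
n_tasks = len(backing.__dask_graph__())

print(f"graph build: {build_seconds:.3f} s, tasks in graph: {n_tasks}")
print("nlon size after pad and diff:", lazy.sizes["nlon"])  # 90 + 2 + 2 - 1
```

If the build time grows with the number of chunks and variables, that would suggest the cost lies in graph construction rather than in any actual computation, though this small example cannot confirm that for the cloud-hosted dataset.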
