html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2499#issuecomment-431718845,https://api.github.com/repos/pydata/xarray/issues/2499,431718845,MDEyOklzc3VlQ29tbWVudDQzMTcxODg0NQ==,12229877,2018-10-22T00:50:22Z,2018-10-22T00:50:22Z,CONTRIBUTOR,"I'd also try to find a way to use a `groupby` or `apply_along_axis` without stacking and unstacking the data, and to choose chunks that match the layout on disk - i.e. try `lon=1` if the dimension order is `time, lat, lon`. If the time observations are *not* contiguous in memory, it's probably worth reshaping the whole array and writing it back to disk up front.
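
Roughly what I mean by that last step (just a sketch - I'm assuming the dims are named `time`/`lat`/`lon`, and the output filename is only a placeholder):

```python
import xarray as xr

# Open lazily with spatial chunks; each chunk keeps the full time series.
ds = xr.open_dataset(netcdf_precip, chunks={'lat': 200, 'lon': 200})

# Put time last so each pixel's series is contiguous in the rewritten file,
# then run the per-pixel computation against that copy instead.
ds.transpose('lat', 'lon', 'time').to_netcdf('precip_time_last.nc')
```
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,372244156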
https://github.com/pydata/xarray/issues/2499#issuecomment-431657200,https://api.github.com/repos/pydata/xarray/issues/2499,431657200,MDEyOklzc3VlQ29tbWVudDQzMTY1NzIwMA==,12229877,2018-10-21T10:30:23Z,2018-10-21T10:30:23Z,CONTRIBUTOR,"> `dataset = xr.open_dataset(netcdf_precip, chunks={'lat': 1})`

This makes me *really* suspicious - `lat=1` is a very small chunk size, and it leaves the data completely unchunked in time and lon. Without knowing anything else, I'd try `chunks=dict(lat=200, lon=200)` or higher depending on the time dim - Dask is most efficient with chunks of around 10MB for most workloads.
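
For example (just a sketch - 200 is only a starting point, so adjust once we see the actual layout):

```python
import xarray as xr

# Bigger spatial chunks, with the full time series kept in each chunk -
# aim for chunks on the order of 10MB rather than a single lat row.
dataset = xr.open_dataset(netcdf_precip, chunks={'lat': 200, 'lon': 200})

# Check what Dask actually ended up with before running the computation.
print(dataset.chunks)
```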

This also depends on the data layout on disk - can you share `repr(xr.open_dataset(netcdf_precip))`? What does `ncdump` say?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,372244156