Comments on pydata/xarray issue #2314 (https://github.com/pydata/xarray/issues/2314)

Comment by user 4992424, 2020-07-30T14:52:50Z:
https://github.com/pydata/xarray/issues/2314#issuecomment-666422864

Hi @shaprann, I haven't revisited this exact workflow recently, but one really good option (if you can manage the intermediate storage cost) would be to use new tools like http://github.com/pangeo-data/rechunker to pre-process and prepare your data archive prior to analysis.

Comment by user 4992424, 2018-08-30T03:09:41Z:
https://github.com/pydata/xarray/issues/2314#issuecomment-417175383

Can you provide a `gdalinfo` of one of the GeoTiffs? I'm still working on some documentation for use cases with cloud-optimized GeoTiffs to supplement @scottyhq's fantastic example notebook. One of the wrinkles I'm tracking down and trying to document is when exactly the GDAL -> rasterio -> dask -> xarray pipeline eagerly loads the entire file versus when it defers reading or reads subsets of files. So far, it seems that if the GeoTiff is appropriately chunked ahead of time (when it's written to disk), things basically work "automagically."
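
For concreteness, here is a minimal sketch of the rechunker workflow suggested in the first comment. It assumes the archive already lives in a Zarr store; the store paths, target chunk shape, and memory budget are hypothetical placeholders, not values from the issue.

```python
# Minimal sketch of the rechunker pre-processing step suggested above.
# Paths, chunk sizes, and the memory budget are hypothetical placeholders.
import zarr
from rechunker import rechunk

# Existing archive whose on-disk chunks are unfriendly to the analysis.
source = zarr.open("input.zarr")

# Build a lazy rechunking plan: target chunks sized for the intended access
# pattern, plus an intermediate store that absorbs the shuffle (this is the
# "intermediate storage cost" the comment mentions).
plan = rechunk(
    source,
    target_chunks=(1, 1000, 1000),   # e.g. one time step per chunk
    max_mem="1GB",                   # per-worker memory budget
    target_store="rechunked.zarr",
    temp_store="rechunker-tmp.zarr",
)

plan.execute()  # runs the copy via dask
```

After this one-time step, the analysis opens `rechunked.zarr` directly and works against chunks that match how it actually slices the data.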
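
As a companion to the second comment, here is a small sketch of the deferred-read behavior it describes, using the `xarray.open_rasterio` entry point of that era (since superseded by `rioxarray`). Running `gdalinfo` on the file would report its internal block size (e.g. `Block=512x512`); the filename and chunk sizes below are hypothetical and should be aligned with that tiling.

```python
# Sketch of the "automagic" lazy-read path: if the GeoTiff was written with
# internal tiling, passing chunks= keeps the open lazy and dask-backed.
# The filename and chunk sizes are hypothetical placeholders.
import xarray as xr

da = xr.open_rasterio(
    "scene.tif",
    chunks={"band": 1, "x": 512, "y": 512},  # match the file's internal tiling
)

# No pixels have been read yet; the computation below pulls only the tiles
# it actually touches, rather than eagerly loading the entire file.
subset_mean = da.isel(x=slice(0, 512), y=slice(0, 512)).mean()
print(subset_mean.compute())
```

If the chunks argument is omitted, or the file was written untiled, the same call can fall back to reading the whole raster into memory, which is the eager behavior the comment is trying to pin down.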