issues: 318950038
| field | value |
|---|---|
| id | 318950038 |
| node_id | MDU6SXNzdWUzMTg5NTAwMzg= |
| number | 2093 |
| title | Default chunking in GeoTIFF images |
| user | 306380 |
| state | closed |
| locked | 0 |
| assignee |  |
| milestone |  |
| comments | 10 |
| created_at | 2018-04-30T16:21:30Z |
| updated_at | 2020-06-18T06:27:07Z |
| closed_at | 2020-06-18T06:27:07Z |
| author_association | MEMBER |
| active_lock_reason |  |
| draft |  |
| pull_request |  |
| reactions | { "url": "https://api.github.com/repos/pydata/xarray/issues/2093/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
| performed_via_github_app |  |
| state_reason | completed |
| repo | 13221727 |
| type | issue |

body:

> Given a tiled GeoTIFF image, I'm looking for the best practice for reading it as a chunked dataset. I did this in this notebook by first opening the file with rasterio, looking at the block sizes, and then using those to inform the `chunks=` argument.
>
> Could xarray do this by default? In dask.array, every time this has come up we've shot it down: automatic chunking is error-prone and hard to do well. However, in these cases the object we're being given usually also conveys its chunking in a way that matches how dask.array thinks about it, so the extra cognitive load on the user has been somewhat low.
>
> Rasterio's model and API feel much more foreign to me than a project like NetCDF or h5py. I find myself wanting a simpler, more automatic option.
>
> Thoughts on this? Is this in-scope? If so, what is the right API, and what is the right policy for making xarray/dask.array chunks larger than GeoTIFF chunks?
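The workflow described in the body's first paragraph — open the file with rasterio, read its internal block (tile) sizes, and pass those as dask chunk sizes — might look roughly like the sketch below. This is a minimal illustration, assuming a tiled GeoTIFF at a hypothetical path `example.tif`; `xr.open_rasterio` was the relevant API when this issue was filed (it has since been deprecated in favor of rioxarray):

```python
import rasterio
import xarray as xr

path = "example.tif"  # hypothetical tiled GeoTIFF

# Inspect the file's internal tiling: block_shapes lists one
# (rows, cols) tile shape per band.
with rasterio.open(path) as src:
    block_rows, block_cols = src.block_shapes[0]

# Use the tile shape as the dask chunk size so each chunk aligns
# with the GeoTIFF's on-disk layout and each read stays tile-local.
da = xr.open_rasterio(path, chunks={"band": 1, "y": block_rows, "x": block_cols})
```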
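On the closing question — how to make xarray/dask.array chunks larger than the GeoTIFF tiles — one plausible policy is to grow each dask chunk to a whole multiple of the tile length along each dimension, so chunks stay tile-aligned while approaching a target size. A sketch, where `aligned_chunk` is a hypothetical helper rather than anything in xarray or dask:

```python
def aligned_chunk(block_len: int, dim_len: int, target_len: int) -> int:
    """Pick a chunk length along one dimension: a whole multiple of the
    tile length, at least one tile, and no larger than the dimension."""
    tiles = max(1, target_len // block_len)
    return min(dim_len, tiles * block_len)

# e.g. 256-pixel tiles on a 10_000-pixel axis, aiming for ~2048-pixel chunks:
aligned_chunk(256, 10_000, 2048)  # -> 2048 (8 tiles per chunk)
```

Rounding to tile multiples keeps each dask task reading whole GeoTIFF blocks, which avoids re-reading the same compressed tile from neighboring chunks.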