
issue #2093: Default chunking in GeoTIFF images

  • id: 318950038 (node_id: MDU6SXNzdWUzMTg5NTAwMzg=)
  • user: 306380
  • state: closed (state_reason: completed)
  • comments: 10
  • created_at: 2018-04-30T16:21:30Z
  • updated_at: 2020-06-18T06:27:07Z
  • closed_at: 2020-06-18T06:27:07Z
  • author_association: MEMBER
  • reactions: 0
  • repo: 13221727 · type: issue

Given a tiled GeoTIFF image, I'm looking for the best practice for reading it as a chunked dataset. I did this in this notebook by first opening the file with rasterio, looking at the block sizes, and then using those to inform the chunks= argument to xarray.open_rasterio. This works, but it is somewhat cumbersome because I also had to dive into the rasterio API. Do we want to provide defaults here? (A minimal sketch of that manual workflow is shown below.)
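
For concreteness, here is a minimal sketch of that manual workflow; the path and variable names are illustrative, not taken from the notebook:

```python
import rasterio
import xarray as xr

path = "example.tif"  # hypothetical tiled GeoTIFF

# Ask rasterio for the file's internal tile (block) shape.
with rasterio.open(path) as src:
    block_y, block_x = src.block_shapes[0]  # (rows, cols) of band 1's blocks

# Use the native block sizes as the dask chunk sizes.
da = xr.open_rasterio(path, chunks={"band": 1, "y": block_y, "x": block_x})
print(da.chunks)
```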

In dask.array, every time this has come up we've shot it down: automatic chunking is error-prone and hard to do well. However, in those cases the object we're given usually also conveys its chunking in a way that matches how dask.array thinks about it, so the extra cognitive load on the user has been fairly low. Rasterio's model and API, though, feel much more foreign to me than a project like NetCDF or h5py. I find myself wanting a chunks=True or chunks='100MB' option.

Thoughts on this? Is this in scope? If so, what is the right API, and what is the right policy for making xarray/dask.array chunks larger than GeoTIFF chunks?
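
To make that last question concrete, one possible policy (purely an illustration, not something xarray implements) would be to grow each dask chunk as a whole multiple of the GeoTIFF block shape until it reaches roughly a target byte size. The aligned_chunks helper and the ~100 MB default below are hypothetical:

```python
import math

import numpy as np
import rasterio

def aligned_chunks(path, target_bytes=100e6):
    """Hypothetical policy: dask chunks are whole multiples of the
    GeoTIFF block shape, grown until each chunk is ~target_bytes."""
    with rasterio.open(path) as src:
        block_y, block_x = src.block_shapes[0]
        itemsize = np.dtype(src.dtypes[0]).itemsize
        height, width = src.height, src.width

    # Number of whole blocks that fit in the per-band byte budget.
    blocks_per_chunk = max(1, int(target_bytes // (block_y * block_x * itemsize)))
    # Grow roughly equally along y and x.
    factor = max(1, int(math.sqrt(blocks_per_chunk)))

    return {
        "band": 1,
        "y": min(height, block_y * factor),
        "x": min(width, block_x * factor),
    }

# e.g. xr.open_rasterio("example.tif", chunks=aligned_chunks("example.tif"))
```

Keeping chunk boundaries on multiples of the native blocks means each dask task reads whole tiles rather than slicing through them.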

