home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

1 row where issue = 318950038 and user = 1197350 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 1

  • rabernat · 1 ✖

issue 1

  • Default chunking in GeoTIFF images · 1 ✖

author_association 1

  • MEMBER 1
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
385451876 https://github.com/pydata/xarray/issues/2093#issuecomment-385451876 https://api.github.com/repos/pydata/xarray/issues/2093 MDEyOklzc3VlQ29tbWVudDM4NTQ1MTg3Ng== rabernat 1197350 2018-04-30T16:26:22Z 2018-04-30T16:26:22Z MEMBER

There is precedent for auto-aligning dask chunks with the underlying dataset chunks. This is what we do with the auto_chunk argument in open_zarr: http://xarray.pydata.org/en/latest/generated/xarray.open_zarr.html#xarray.open_zarr

On Mon, Apr 30, 2018 at 12:21 PM, Matthew Rocklin notifications@github.com wrote:

Given a tiled GeoTIFF image I'm looking for the best practice in reading it as a chunked dataset. I did this in this notebook https://gist.github.com/mrocklin/3df315e93d4bdeccf76db93caca2a9bd by first opening the file with rasterio, looking at the block sizes, and then using those to inform the argument to chunks= in xarray.open_rasterio. This works, but is somewhat cumbersome because I also had to dive into the rasterio API. Do we want to provide defaults here?

In dask.array every time this has come up we've always shot it down, automatic chunking is error prone and hard to do well. However in these cases the object we're being given usually also conveys its chunking in a way that matches how dask.array thinks about it, so the extra cognitive load on the user has been somewhat low. Rasterio's model and API feel much more foreign to me though than a project like NetCDF or H5Py. I find myself wanting a chunks=True or chunks='100MB' option.

Thoughts on this? Is this in-scope? If so then what is the right API and what is the right policy for how to make xarray/dask.array chunks larger than GeoTIFF chunks?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/pydata/xarray/issues/2093, or mute the thread https://github.com/notifications/unsubscribe-auth/ABJFJoFj8_VGvsJ5oZQD4WTgW8Xx3lSyks5ttzoKgaJpZM4Ts2Hm .

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Default chunking in GeoTIFF images 318950038

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 583.013ms · About: xarray-datasette