issues: 1599056009

id: 1599056009
node_id: I_kwDOAMm_X85fT6iJ
number: 7559
title: Support specifying chunk sizes using labels (e.g. frequency string)
user: 2448579
state: open
locked: 0
comments: 2
created_at: 2023-02-24T17:44:03Z
updated_at: 2023-02-25T03:46:49Z
author_association: MEMBER

Is your feature request related to a problem?

dask.dataframe supports repartitioning or rechunking using a frequency string (freq kwarg).

I think this would be a useful addition to .chunk. It would help with some groupby problems (as suggested in this comment) and generally make a few problems amenable to blockwise/map_blocks solutions.

Describe the solution you'd like

  1. One solution is to allow .chunk(lon=5, time="MS"). The drawback is that this syntax mixes integer chunk sizes (lon=5) with a label-based frequency string (time="MS").
  2. So perhaps a second method, chunk_by_labels, would be useful, where chunk_by_labels(lon=5, time="MS") would rechunk the data so that a single chunk contains 5° of longitude points and one month of time. Alternatively, this could be spelled .chunk(lon=5, time="MS", by="labels").
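
To make the difference between integer and label-based chunking concrete, here is a minimal sketch of how label-based chunk sizes might be derived for a numeric coordinate. The helper name label_chunk_sizes is hypothetical and not part of xarray; the sketch assumes sorted, ascending labels and anchors the bins at the first label.

```python
import math
from itertools import groupby


def label_chunk_sizes(labels, width):
    """Return chunk sizes such that each chunk spans `width` label units.

    Hypothetical helper for illustration only. Assumes `labels` is sorted
    in ascending order; bins are anchored at the first label.
    """
    if not labels:
        return ()
    start = labels[0]
    # Consecutive labels falling in the same `width`-wide bin form one chunk.
    bins = groupby(labels, key=lambda value: math.floor((value - start) / width))
    return tuple(len(list(group)) for _, group in bins)


# Longitude axis at 0.5 degree spacing: 0.0, 0.5, ..., 19.5 (40 points).
lon = [i * 0.5 for i in range(40)]

# Label-based: each chunk covers 5 degrees, i.e. 10 points at this spacing.
lon_chunks = label_chunk_sizes(lon, 5)
print(lon_chunks)  # (10, 10, 10, 10)
```

With the existing integer semantics, lon=5 on this axis would instead mean 5 points per chunk, which is only 2.5° of longitude. That is the ambiguity a separate method or a by="labels" switch would resolve.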

Describe alternatives you've considered

Have the user do this manually, but that is tedious and requires somewhat advanced knowledge.
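
For reference, the manual route amounts to computing explicit chunk sizes from the time labels. The following is a rough sketch using only the standard library; the helper name month_chunk_sizes is an assumption for illustration, and it assumes the time axis is sorted.

```python
from datetime import date, timedelta
from itertools import groupby


def month_chunk_sizes(times):
    """Return chunk sizes such that each chunk holds one calendar month.

    Hypothetical helper for illustration only. Assumes `times` is sorted.
    """
    by_month = groupby(times, key=lambda t: (t.year, t.month))
    return tuple(len(list(group)) for _, group in by_month)


# Daily time axis covering January through March 2023.
times = [date(2023, 1, 1) + timedelta(days=i) for i in range(90)]

time_chunks = month_chunk_sizes(times)
print(time_chunks)  # (31, 28, 31)
```

A tuple like this can then be passed as explicit chunk sizes, for example ds.chunk(time=time_chunks), where ds stands for the user's dataset. A freq-style argument would hide this bookkeeping from the user.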

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7559/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: 13221727
type: issue
