Comment on pydata/xarray issue #783 (https://github.com/pydata/xarray/issues/783#issuecomment-193514285), posted 2016-03-07T23:59:44Z, author association: CONTRIBUTOR

@shoyer, the problem can be "resolved" by manually specifying the chunk size, e.g., https://gist.github.com/76dccfed2ff8e33b3a2a, specifically line 46: rlzns = rlzns.chunk({'Time':30}). The actual number appears to be unimportant; 1 and 1000 also work.
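For concreteness, the workaround looks roughly like this (the file pattern below is a placeholder; the Time dimension name is from the gist):

```python
import xarray as xr

# Open the per-file realizations as a single dask-backed dataset
# (the file pattern is a placeholder for the actual output files).
rlzns = xr.open_mfdataset('realization_*.nc')

# Manually re-specify the chunk size along Time (line 46 of the gist);
# the exact value appears not to matter: 1, 30, and 1000 all work.
rlzns = rlzns.chunk({'Time': 30})
```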

So, following @mrocklin, my intuition is that the xarray rechunking algorithm has a bug: I'm guessing the chunk sizes of the dask arrays spawned for each file end up incompatible or inconsistent. Under some condition the chunk size gets perturbed by an off-by-one error. Setting the chunk size manually appears to ensure that small chunk sizes are preserved as specified, while large chunk sizes are capped at the maximum chunk size of each dask array.

Is it by design that chunk sizes are automatically changed following the indexing of rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr,:]?
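To show what I'm looking at, something like the following (continuing from the sketch above, with Ntr and rnum as placeholder values standing in for the real loop variables) prints the chunk sizes before and after the slice:

```python
# Hypothetical example values; in the real script these come from the loop
# over realizations and the number of trajectories per realization.
Ntr = 100
rnum = 0

# Chunk sizes of the underlying dask array before the slice
print(rlzns.xParticle.chunks)

# The indexing step in question
xp = rlzns.xParticle[rnum*Ntr:(rnum+1)*Ntr, :]

# Chunk sizes after the slice; the question is whether these are being
# rewritten (perturbed by one) rather than carried through from the parent array
print(xp.chunks)
```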

@shoyer, I'm assuming you or your team could work through this bug quickly, but if not, could you please give me some high-level guidance on how to sort it out? In the short term I can just set the chunk size manually to 100, which I will confirm works in my application.
