
issue_comments: 417242425


html_url: https://github.com/pydata/xarray/issues/2389#issuecomment-417242425
issue_url: https://api.github.com/repos/pydata/xarray/issues/2389
id: 417242425
node_id: MDEyOklzc3VlQ29tbWVudDQxNzI0MjQyNQ==
user: 1882397
created_at: 2018-08-30T08:53:21Z
updated_at: 2018-08-30T08:53:21Z
author_association: NONE

body:

Ah, that seems to do the trick. I get about 4.5s for both now, and the time spent pickling stuff is down to reasonable levels (0.022s). Also, the number of function calls dropped from 1e8 to 3e5 :-)

There still seems to be some inefficiency in the pickled graph output; I'm getting a warning about large objects in the graph:

```
/Users/adrianseyboldt/anaconda3/lib/python3.6/site-packages/distributed/worker.py:840: UserWarning: Large object of size 1.31 MB detected in task graph:
  ('store-03165bae-ac28-11e8-b137-56001c88cd01', <xa ... t 0x316112cc0>)
Consider scattering large objects ahead of time
with client.scatter to reduce scheduler burden and
keep data on workers

    future = client.submit(func, big_data)    # bad

    big_future = client.scatter(big_data)     # good
    future = client.submit(func, big_future)  # good

  % (format_bytes(len(b)), s))
```
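
For reference, a minimal self-contained sketch of the pattern the warning suggests, using a local `distributed.Client`; the array and function used here are just illustrative, not the actual xarray store pipeline:

```python
import numpy as np
from distributed import Client

client = Client()  # local scheduler + workers, just for the sketch

big_array = np.random.random((1000, 1000))  # ~8 MB, enough to trigger the warning

# Passing the array directly embeds it in the task graph, so it gets
# pickled and shipped through the scheduler on every submit:
# future = client.submit(np.sum, big_array)      # warns about a large object

# Scattering first moves the data to a worker once; the submit then only
# puts a small Future reference into the graph:
big_future = client.scatter(big_array)
future = client.submit(np.sum, big_future)
print(future.result())
```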

The size scales linearly with the number of chunks (it is 13 MB for 5000 chunks). This doesn't seem nearly as problematic as the original issue, though.
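
A rough way to check that scaling (this uses plain `dask.array` instead of the real xarray-to-store graph, so the absolute sizes won't match, but the per-chunk growth should show up):

```python
import pickle
import dask.array as da

# Pickle the task graph for arrays with an increasing number of chunks;
# the serialized size should grow roughly linearly with the chunk count.
for n_chunks in (100, 1000, 5000):
    x = da.zeros(n_chunks * 10, chunks=10)  # n_chunks chunks of 10 elements each
    graph = dict(x.__dask_graph__())
    print(n_chunks, len(pickle.dumps(graph)))
```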

This is after applying both #2391 and #2261.

reactions:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

issue: 355264812