issue_comments: 324735578

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	performed_via_github_app	issue
https://github.com/pydata/xarray/issues/1525#issuecomment-324735578	https://api.github.com/repos/pydata/xarray/issues/1525	324735578	MDEyOklzc3VlQ29tbWVudDMyNDczNTU3OA==	306380	2017-08-24T19:37:27Z	2017-08-24T19:37:27Z	MEMBER	To be explicit, by default `da.from_array` currently names arrays by hashing all of the data within them. This can be somewhat slow depending on which hashing libraries you have on your machine, generally something like 500-1000 MB/s. This buys you a deterministic name for your array. If someone else with the exact same data does the exact same operations, then Dask can track that and avoid repeated work. So you have to choose between two options: avoid repeated work, or avoid hashing data. The choice really depends on how often you plan to repeat the same computation on the same data that comes from the same numpy array. If you only ever call `a.chunk(...)` once per array then there is no reason to hash.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		252707680
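
The trade-off described in the comment can be illustrated with a toy model. This is only a sketch using the standard library, not Dask's actual naming code: the helpers `content_name` and `random_name` are hypothetical, and SHA-256 is an arbitrary choice of hash. In Dask itself, passing `name=False` to `da.from_array` is the way to skip the hashing step and get a random name instead.

```python
import hashlib
import uuid


def content_name(data: bytes, prefix: str = "array") -> str:
    """Deterministic name: reads every byte, so the cost grows with data size."""
    return f"{prefix}-{hashlib.sha256(data).hexdigest()}"


def random_name(prefix: str = "array") -> str:
    """Cheap name: never reads the data, but differs on every call."""
    return f"{prefix}-{uuid.uuid4().hex}"


data = bytes(range(256)) * 4096  # 1 MiB of example data

# Identical bytes give identical names, so repeated work can be recognized.
same = content_name(data) == content_name(data)

# Random names never match, so nothing is shared, but nothing is hashed either.
different = random_name() != random_name()
```

If each array is only wrapped once, the deterministic name buys nothing and the random name avoids a full pass over the data.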