
issue_comments: 338424196


html_url: https://github.com/pydata/xarray/pull/1489#issuecomment-338424196
issue_url: https://api.github.com/repos/pydata/xarray/issues/1489
id: 338424196
node_id: MDEyOklzc3VlQ29tbWVudDMzODQyNDE5Ng==
user: 1217238
created_at: 2017-10-21T18:49:57Z
updated_at: 2017-10-21T18:49:57Z
author_association: MEMBER

@mrocklin are you saying that it's easier to properly rechunk data on the xarray side (as arrays) before converting to dask dataframes? That does make sense -- we have some nice structure (as multi-dimensional arrays) that is lost once the data gets put in a DataFrame.

In this case, I suppose we really should add a keyword argument like dims_order to to_dask_dataframe() that lets the user choose how they want to order dimensions on the result.
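To make the effect of a dimension-ordering option concrete, here is a minimal pure-Python sketch. It does not use xarray or dask; `flatten_rows`, its `dims_order` parameter, and the toy data are hypothetical names chosen only for illustration. It shows how the same 2-D array produces differently ordered rows depending on which dimension varies fastest.

```python
from itertools import product


def flatten_rows(values, sizes, dims_order):
    """Return one row per array element, iterating dimensions in `dims_order`.

    `values` maps an index tuple (ordered like the keys of `sizes`) to a value.
    The last dimension in `dims_order` varies fastest in the output.
    This is a hypothetical helper, not part of xarray or dask.
    """
    dims = list(sizes)
    rows = []
    for combo in product(*(range(sizes[d]) for d in dims_order)):
        index = dict(zip(dims_order, combo))
        key = tuple(index[d] for d in dims)  # back to the storage order
        rows.append({**index, "value": values[key]})
    return rows


# Toy 2 x 3 array with dimensions ("x", "y"); value encodes its own position.
sizes = {"x": 2, "y": 3}
values = {(i, j): 10 * i + j for i in range(2) for j in range(3)}

rows_xy = flatten_rows(values, sizes, ["x", "y"])  # y varies fastest
rows_yx = flatten_rows(values, sizes, ["y", "x"])  # x varies fastest
```

With `["x", "y"]` the rows follow the storage order of the toy array; with `["y", "x"]` consecutive rows jump between different `x` positions.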

Initially, I was concerned about the resulting dask graphs when flattening out arrays in the wrong order. Although that would have bad performance implications if you need to stream the data from disk, I see now the total number of chunks no longer blows up, thanks to @pitrou's impressive rewrite of dask.array.reshape().
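The streaming concern can be illustrated with a small counting exercise. The sketch below is again pure Python and makes no claim about how `dask.array.reshape()` works internally; `chunks_touched`, the chunk layout, and the partition size are all assumptions made for illustration. It counts how many source chunks each block of flattened rows would need to read.

```python
from itertools import product


def chunks_touched(sizes, chunk_sizes, dims_order, rows_per_partition):
    """Count the distinct source chunks read by each output partition.

    Rows are generated by iterating dimensions in `dims_order` (last varies
    fastest) and grouped into partitions of `rows_per_partition` rows.
    Hypothetical helper for illustration only.
    """
    dims = list(sizes)
    touched = []
    current = set()
    combos = product(*(range(sizes[d]) for d in dims_order))
    for n, combo in enumerate(combos):
        index = dict(zip(dims_order, combo))
        # Identify the source chunk that holds this element.
        current.add(tuple(index[d] // chunk_sizes[d] for d in dims))
        if (n + 1) % rows_per_partition == 0:
            touched.append(len(current))
            current = set()
    if current:  # trailing partial partition
        touched.append(len(current))
    return touched


# A 4 x 6 array split into two chunks along "x" (each chunk spans all of "y").
sizes = {"x": 4, "y": 6}
chunk_sizes = {"x": 2, "y": 6}

in_order = chunks_touched(sizes, chunk_sizes, ["x", "y"], rows_per_partition=12)
out_of_order = chunks_touched(sizes, chunk_sizes, ["y", "x"], rows_per_partition=12)
```

In this toy layout, flattening along the chunked dimension first lets each partition read a single chunk, while the other order makes every partition read from both chunks, which is the kind of access pattern that hurts when data has to be streamed from disk.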

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: 245624267