issue_comments: 325447523
This data as json
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/pydata/xarray/issues/1534#issuecomment-325447523 | https://api.github.com/repos/pydata/xarray/issues/1534 | 325447523 | MDEyOklzc3VlQ29tbWVudDMyNTQ0NzUyMw== | 1197350 | 2017-08-28T19:03:09Z | 2017-08-28T19:03:09Z | MEMBER | Marinna, you are correct: in the present release of Xarray, converting to a pandas DataFrame loads all of the data eagerly into memory as a regular pandas object, giving up dask's parallel capabilities and potentially consuming lots of memory. With chunked Xarray data, it would be preferable to convert to a dask.dataframe rather than a regular pandas DataFrame, which would carry over some of the performance benefits. This is a known issue (https://github.com/pydata/xarray/issues/1462) with a solution in the works (https://github.com/pydata/xarray/pull/1489), so hopefully a near-future release of Xarray will have the feature you seek. Alternatively, if you describe the filtering, masking, and other QA/QC that you need to do in more detail, we may be able to help you accomplish this entirely within Xarray. Good luck! Ryan | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | 253407851 |