issues: 1704950804


id: 1704950804
node_id: I_kwDOAMm_X85ln3wU
number: 7833
title: Slow performance of concat()
user: 703554
state: closed
locked: 0
assignee: (none)
milestone: (none)
comments: 3
created_at: 2023-05-11T02:39:36Z
updated_at: 2023-06-02T14:36:12Z
closed_at: 2023-06-02T14:36:12Z
author_association: CONTRIBUTOR

What is your issue?

In attempting to concatenate many datasets along a large dimension (total size ~100,000,000), I'm finding very slow performance, e.g. tens of seconds just to concatenate two datasets.

With some profiling, I find that all the time is being spent in this list comprehension:

https://github.com/pydata/xarray/blob/51554f2638bc9e4a527492136fe6f54584ffa75d/xarray/core/concat.py#L584

I don't know exactly what's going on here, but it doesn't look right: if the size of the dimension being concatenated is large, this list comprehension can run for millions of iterations, which doesn't seem related to the intended behaviour.

Sorry I don't have an MRE for this yet, but please let me know if I can help further.
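
For concreteness, here is a rough sketch of what a reproducer might look like. This is a hypothetical example rather than the reporter's actual workload: the dataset layout, variable names, and sizes are illustrative, and the arrays are kept well below the ~100,000,000 elements mentioned above so it runs in modest memory.

```python
# Hypothetical reproducer (names and sizes are illustrative): build two datasets
# sharing a long indexed dimension, then time xr.concat and profile it to see
# where the time goes.
import cProfile
import time

import numpy as np
import xarray as xr

n = 10_000_000  # per-dataset length; the report above involves ~100,000,000 total

ds1 = xr.Dataset(
    {"var": ("x", np.ones(n, dtype="float32"))},
    coords={"x": np.arange(n)},
)
ds2 = xr.Dataset(
    {"var": ("x", np.ones(n, dtype="float32"))},
    coords={"x": np.arange(n, 2 * n)},
)

# Time a single concatenation of the two datasets along the shared dimension.
start = time.perf_counter()
combined = xr.concat([ds1, ds2], dim="x")
print(f"concat took {time.perf_counter() - start:.2f}s for {combined.sizes['x']:,} elements")

# Optionally profile a second run to confirm which internal call dominates.
cProfile.run("xr.concat([ds1, ds2], dim='x')", sort="cumulative")
```

If the slowdown reproduces, the cProfile output should show most of the cumulative time inside xarray/core/concat.py, consistent with the observation above.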

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7833/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
