issue_comments
6 rows where issue = 1646267547 sorted by updated_at descending
| id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1489341690 | https://github.com/pydata/xarray/issues/7697#issuecomment-1489341690 | https://api.github.com/repos/pydata/xarray/issues/7697 | IC_kwDOAMm_X85YxYz6 | dcherian 2448579 | 2023-03-29T21:20:59Z | 2023-03-29T21:20:59Z | MEMBER | We still construct a dataset representation for each file, which involves reading all coordinates etc. The consistency checking is bypassed at the "concatenation" stage. You could also speed this up with dask by setting up a cluster and using … | total_count: 0 | | open_mfdataset very slow 1646267547 |
| 1489312337 | https://github.com/pydata/xarray/issues/7697#issuecomment-1489312337 | https://api.github.com/repos/pydata/xarray/issues/7697 | IC_kwDOAMm_X85YxRpR | groutr 10678620 | 2023-03-29T20:59:24Z | 2023-03-29T20:59:24Z | NONE | @dcherian I'll look at that. I thought the … @headtr1ck I was just informed that the underlying filesystem is actually a networked filesystem. The PR might still be useful, but the latest profile seems more reasonable in light of my new info. | total_count: 0 | | open_mfdataset very slow 1646267547 |
| 1489302292 | https://github.com/pydata/xarray/issues/7697#issuecomment-1489302292 | https://api.github.com/repos/pydata/xarray/issues/7697 | IC_kwDOAMm_X85YxPMU | dcherian 2448579 | 2023-03-29T20:53:37Z | 2023-03-29T20:53:37Z | MEMBER | Fundamentally, xarray has to touch every file because there is no guarantee they are consistent with each other. A number of us now use kerchunk to create virtual aggregate datasets that can be read a lot faster. | total_count: 0 | | open_mfdataset very slow 1646267547 |
| 1489267595 | https://github.com/pydata/xarray/issues/7697#issuecomment-1489267595 | https://api.github.com/repos/pydata/xarray/issues/7697 | IC_kwDOAMm_X85YxGuL | groutr 10678620 | 2023-03-29T20:30:49Z | 2023-03-29T20:33:28Z | NONE | I tried setting the engine to 'netcdf4' and while it did help a little bit, it still seems slow on my system. Here is my profile with … I'm not sure what to make of this profile. I don't see anything in the file_manager that would be especially slow. Perhaps it is a filesystem bottleneck at this point (given that the CPU time is 132s of the total 288s duration). | total_count: 0 | | open_mfdataset very slow 1646267547 |
| 1489146483 | https://github.com/pydata/xarray/issues/7697#issuecomment-1489146483 | https://api.github.com/repos/pydata/xarray/issues/7697 | IC_kwDOAMm_X85YwpJz | headtr1ck 43316012 | 2023-03-29T19:02:39Z | 2023-03-29T19:02:39Z | COLLABORATOR | It seems that this problematic code is mostly used to determine the engine that is used to finally open it. Did you try specifying the correct engine directly? | total_count: 1, +1: 1 | | open_mfdataset very slow 1646267547 |
| 1489083542 | https://github.com/pydata/xarray/issues/7697#issuecomment-1489083542 | https://api.github.com/repos/pydata/xarray/issues/7697 | IC_kwDOAMm_X85YwZyW | Illviljan 14371165 | 2023-03-29T18:17:35Z | 2023-03-29T18:17:35Z | MEMBER | Looks like you almost got this figured out! Do you want to create a PR for this? | total_count: 0 | | open_mfdataset very slow 1646267547 |
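Editor's note: the advice in dcherian's first comment (set up a dask cluster) and headtr1ck's comment (specify the engine directly) can be combined. A minimal sketch, assuming local NetCDF files matching a hypothetical `data/*.nc` pattern; the `data_vars`/`coords`/`compat` options are xarray's documented way to skip per-file equality checks and are not quoted from this thread:

```python
import xarray as xr
from dask.distributed import Client

client = Client()  # local dask cluster; file opens then run as parallel tasks

ds = xr.open_mfdataset(
    "data/*.nc",          # hypothetical file pattern
    engine="netcdf4",     # skip backend auto-detection (headtr1ck's suggestion)
    parallel=True,        # open and preprocess each file via dask.delayed
    data_vars="minimal",  # only concatenate variables that carry the concat dim
    coords="minimal",
    compat="override",    # take non-concatenated variables from the first file
)
```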
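dcherian's third comment recommends kerchunk for building virtual aggregate datasets. A sketch under the kerchunk API as of roughly this thread's date (`SingleHdf5ToZarr`, `MultiZarrToZarr`); the file paths and the `time` concatenation dimension are assumptions, not details from the issue:

```python
import fsspec
import xarray as xr
from kerchunk.hdf import SingleHdf5ToZarr
from kerchunk.combine import MultiZarrToZarr

# Build one reference set per file (reads only metadata, not the data itself).
refs = []
for of in fsspec.open_files("data/*.nc"):  # hypothetical paths
    with of as f:
        refs.append(SingleHdf5ToZarr(f, of.path).translate())

# Combine into a single virtual dataset along an assumed "time" dimension.
combined = MultiZarrToZarr(refs, concat_dims=["time"]).translate()

# Subsequent opens read the aggregate through the zarr engine without
# touching every file's coordinates again.
ds = xr.open_dataset(
    "reference://",
    engine="zarr",
    backend_kwargs={
        "consolidated": False,
        "storage_options": {"fo": combined},
    },
)
```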
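The profiles groutr mentions are not reproduced in this table. One way to generate a comparable cumulative-time profile with only the standard library (again with a hypothetical file pattern):

```python
import cProfile
import pstats

import xarray as xr

# Profile the multi-file open and report the hottest cumulative-time entries.
cProfile.run(
    "xr.open_mfdataset('data/*.nc', engine='netcdf4')",
    "open_mf.prof",
)
pstats.Stats("open_mf.prof").sort_stats("cumulative").print_stats(20)
```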
CREATE TABLE [issue_comments] (
[html_url] TEXT,
[issue_url] TEXT,
[id] INTEGER PRIMARY KEY,
[node_id] TEXT,
[user] INTEGER REFERENCES [users]([id]),
[created_at] TEXT,
[updated_at] TEXT,
[author_association] TEXT,
[body] TEXT,
[reactions] TEXT,
[performed_via_github_app] TEXT,
[issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
ON [issue_comments] ([user]);
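The `idx_issue_comments_issue` index above is what supports the filter this page applies ("6 rows where issue = 1646267547 sorted by updated_at descending"). A sketch of the equivalent query through Python's sqlite3 module, assuming the Datasette database file is named `github.db`:

```python
import sqlite3

con = sqlite3.connect("github.db")  # hypothetical database filename
rows = con.execute(
    """
    SELECT id, user, created_at, updated_at, body
    FROM issue_comments
    WHERE issue = ?              -- served by idx_issue_comments_issue
    ORDER BY updated_at DESC
    """,
    (1646267547,),
).fetchall()
```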
