issue_comments
10 rows where author_association = "MEMBER" and issue = 212561278, sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
344437569 | https://github.com/pydata/xarray/issues/1301#issuecomment-344437569 | https://api.github.com/repos/pydata/xarray/issues/1301 | MDEyOklzc3VlQ29tbWVudDM0NDQzNzU2OQ== | jhamman 2443309 | 2017-11-14T23:41:57Z | 2017-11-14T23:41:57Z | MEMBER | @friedrichknuth, any chance you can take a look at this with the latest v0.10 release candidate? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 212561278 | |
291516997 | https://github.com/pydata/xarray/issues/1301#issuecomment-291516997 | https://api.github.com/repos/pydata/xarray/issues/1301 | MDEyOklzc3VlQ29tbWVudDI5MTUxNjk5Nw== | rabernat 1197350 | 2017-04-04T14:27:18Z | 2017-04-04T14:27:18Z | MEMBER | My understanding is that you are concatenating across the variable […]. My tests showed that it's not necessarily the concat step that is slowing this down. Your profiling suggests that it's a netCDF datetime decoding issue. I wonder if @shoyer or @jhamman have any ideas about how to improve performance here. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 212561278 | |
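A quick way to test the datetime-decoding hypothesis from the comment above is to open the same files with and without CF datetime decoding and compare. This is only a minimal sketch: the `data/*.nc` glob is a hypothetical stand-in for the reporter's files, and it assumes `decode_times` is forwarded from `open_mfdataset` to `open_dataset`.

```python
import time

import xarray as xr

files = "data/*.nc"  # hypothetical glob; substitute the files from the original report

start = time.time()
ds = xr.open_mfdataset(files)  # default: CF conventions, including datetimes, are decoded
print("with datetime decoding: %.2f s" % (time.time() - start))

start = time.time()
ds_raw = xr.open_mfdataset(files, decode_times=False)  # skip datetime decoding only
print("without datetime decoding: %.2f s" % (time.time() - start))
```

If the second call is dramatically faster, that points at datetime decoding rather than the concat step.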
286220317 | https://github.com/pydata/xarray/issues/1301#issuecomment-286220317 | https://api.github.com/repos/pydata/xarray/issues/1301 | MDEyOklzc3VlQ29tbWVudDI4NjIyMDMxNw== | rabernat 1197350 | 2017-03-13T19:40:50Z | 2017-03-13T19:40:50Z | MEMBER | And the length of […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 212561278 | |
286219858 | https://github.com/pydata/xarray/issues/1301#issuecomment-286219858 | https://api.github.com/repos/pydata/xarray/issues/1301 | MDEyOklzc3VlQ29tbWVudDI4NjIxOTg1OA== | rabernat 1197350 | 2017-03-13T19:39:15Z | 2017-03-13T19:39:15Z | MEMBER | There is definitely something funky with these datasets that is causing xarray to go very slow. This is fast: [code block omitted]. But even just trying to print the repr is slow: [code block omitted]. Maybe some of this has to do with the change at 0.9.0 to allowing index-less dimensions (i.e. coordinates are optional). All of these datasets have such a dimension, e.g. […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 212561278 | |
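To see whether a dataset has the index-less dimensions mentioned in the comment above, one can list dimensions that lack a coordinate variable. A minimal sketch, with `example.nc` as a hypothetical file name:

```python
import xarray as xr

ds = xr.open_dataset("example.nc")  # hypothetical file name

# Dimensions without a corresponding coordinate are the "index-less"
# dimensions that became legal in xarray 0.9.0.
indexless = [dim for dim in ds.dims if dim not in ds.coords]
print("dimensions without coordinates:", indexless)
```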
285149350 | https://github.com/pydata/xarray/issues/1301#issuecomment-285149350 | https://api.github.com/repos/pydata/xarray/issues/1301 | MDEyOklzc3VlQ29tbWVudDI4NTE0OTM1MA== | rabernat 1197350 | 2017-03-08T19:52:11Z | 2017-03-08T19:52:11Z | MEMBER | I just tried this on a few different datasets. Comparing python 2.7, xarray 0.7.2, dask 0.7.1 (an old environment I had on hand) with python 2.7, xarray 0.9.1-28-g1cad803, dask 0.13.0 (my current "production" environment), I could not reproduce. The up-to-date stack was faster by a factor of < 2. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 212561278 | |
285110824 | https://github.com/pydata/xarray/issues/1301#issuecomment-285110824 | https://api.github.com/repos/pydata/xarray/issues/1301 | MDEyOklzc3VlQ29tbWVudDI4NTExMDgyNA== | shoyer 1217238 | 2017-03-08T17:35:49Z | 2017-03-08T17:35:49Z | MEMBER | Indeed, this is highly recommended, see http://dask.pydata.org/en/latest/faq.html |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 212561278 | |
284915063 | https://github.com/pydata/xarray/issues/1301#issuecomment-284915063 | https://api.github.com/repos/pydata/xarray/issues/1301 | MDEyOklzc3VlQ29tbWVudDI4NDkxNTA2Mw== | shoyer 1217238 | 2017-03-08T01:16:58Z | 2017-03-08T01:16:58Z | MEMBER | Hmm. It might be interesting to try […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 212561278 | |
284914442 | https://github.com/pydata/xarray/issues/1301#issuecomment-284914442 | https://api.github.com/repos/pydata/xarray/issues/1301 | MDEyOklzc3VlQ29tbWVudDI4NDkxNDQ0Mg== | jhamman 2443309 | 2017-03-08T01:13:35Z | 2017-03-08T01:13:35Z | MEMBER | This is what I'm seeing for my […]. Weren't there some recent changes to the thread lock related to dask distributed? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 212561278 | |
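One way to probe the thread-lock question raised above is to pass an explicit lock when opening the files. A minimal sketch, assuming the `lock` keyword that `open_mfdataset` accepted around this release and a hypothetical file glob:

```python
import threading

import xarray as xr

files = "data/*.nc"  # hypothetical glob

# Force a plain threading.Lock for all file access, to see whether the
# per-file HDF5/netCDF lock is implicated in the slowdown.
ds = xr.open_mfdataset(files, lock=threading.Lock())
print(ds)
```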
284908153 | https://github.com/pydata/xarray/issues/1301#issuecomment-284908153 | https://api.github.com/repos/pydata/xarray/issues/1301 | MDEyOklzc3VlQ29tbWVudDI4NDkwODE1Mw== | shoyer 1217238 | 2017-03-08T00:38:55Z | 2017-03-08T00:38:55Z | MEMBER | Wow, that is pretty bad. Try setting […]. If that doesn't help, try downgrading dask to see if it's responsible. Profiling results from […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 212561278 | |
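For the profiling results requested above, cProfile (or `%prun` in IPython) can wrap the open call. A minimal sketch with a hypothetical glob:

```python
import cProfile
import pstats

import xarray as xr

files = "data/*.nc"  # hypothetical glob

# Profile the open call and show the 20 most expensive functions by
# cumulative time; in IPython, `%prun xr.open_mfdataset(files)` is equivalent.
profiler = cProfile.Profile()
profiler.runcall(xr.open_mfdataset, files)
pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)
```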
284905152 | https://github.com/pydata/xarray/issues/1301#issuecomment-284905152 | https://api.github.com/repos/pydata/xarray/issues/1301 | MDEyOklzc3VlQ29tbWVudDI4NDkwNTE1Mg== | jhamman 2443309 | 2017-03-08T00:22:10Z | 2017-03-08T00:22:10Z | MEMBER | I've also noticed that we have a bottleneck here. @shoyer - any idea what we changed that could impact this? Could this be coming from a change upstream in dask? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 212561278 |
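Since the regression appears tied to a library change, recording the exact versions next to a timing makes the environment comparisons in the earlier comments reproducible. A minimal sketch, assuming a hypothetical glob and using `timeit` so the identical snippet can be run against both the old and new stacks:

```python
import timeit

import dask
import numpy as np
import xarray as xr

files = "data/*.nc"  # hypothetical glob; use identical files in each environment

print("xarray:", xr.__version__)
print("dask:  ", dask.__version__)
print("numpy: ", np.__version__)

# Time open_mfdataset a few times; run this unchanged against each stack
# (e.g. 0.8.2 vs. 0.9.1) so the numbers are directly comparable.
seconds = timeit.timeit(lambda: xr.open_mfdataset(files).close(), number=3)
print("mean open_mfdataset time: %.2f s" % (seconds / 3))
```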
CREATE TABLE [issue_comments] (
    [html_url] TEXT,
    [issue_url] TEXT,
    [id] INTEGER PRIMARY KEY,
    [node_id] TEXT,
    [user] INTEGER REFERENCES [users]([id]),
    [created_at] TEXT,
    [updated_at] TEXT,
    [author_association] TEXT,
    [body] TEXT,
    [reactions] TEXT,
    [performed_via_github_app] TEXT,
    [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);