issue_comments
3 rows where author_association = "NONE", issue = 212561278 and user = 10554254 sorted by updated_at descending
This data as json, CSV (advanced)
Suggested facets: reactions, created_at (date), updated_at (date)
issue 1
- open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 · 3 ✖
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
344949160 | https://github.com/pydata/xarray/issues/1301#issuecomment-344949160 | https://api.github.com/repos/pydata/xarray/issues/1301 | MDEyOklzc3VlQ29tbWVudDM0NDk0OTE2MA== | friedrichknuth 10554254 | 2017-11-16T15:01:59Z | 2017-11-16T15:02:48Z | NONE | Looks like it has been resolved! Tested with the latest pre-release v0.10.0rc2 on the dataset linked by najascutellatus above. https://marine.rutgers.edu/~michaesm/netcdf/data/
xarray==0.10.0rc2-1-g8267fdb dask==0.15.4 ``` 194381 function calls (188429 primitive calls) in 0.869 seconds Ordered by: internal time List reduced from 469 to 10 due to restriction <10> ncalls tottime percall cumtime percall filename:lineno(function) 50 0.393 0.008 0.393 0.008 {numpy.core.multiarray.arange} 50 0.164 0.003 0.557 0.011 indexing.py:266(index_indexer_1d) 5 0.083 0.017 0.085 0.017 netCDF4.py:185(open_netcdf4_group) 190 0.024 0.000 0.066 0.000 netCDF4.py:256(open_store_variable) 190 0.022 0.000 0.022 0.000 netCDF4_.py:29(init) 50 0.018 0.000 0.021 0.000 {operator.getitem} 5145/3605 0.012 0.000 0.019 0.000 indexing.py:493(shape) 2317/1291 0.009 0.000 0.094 0.000 _abcoll.py:548(update) 26137 0.006 0.000 0.013 0.000 {isinstance} 720 0.005 0.000 0.006 0.000 {method 'getncattr' of 'netCDF4._netCDF4.Variable' objects}
Ordered by: internal time List reduced from 659 to 10 due to restriction <10> ncalls tottime percall cumtime percall filename:lineno(function) 30 87.527 2.918 87.527 2.918 {pandas._libs.tslib.array_to_timedelta64} 65 7.055 0.109 7.059 0.109 {operator.getitem} 80 0.799 0.010 0.799 0.010 {numpy.core.multiarray.arange} 7895/4420 0.502 0.000 0.524 0.000 utils.py:412(shape) 68 0.442 0.007 0.442 0.007 {pandas._libs.algos.ensure_object} 80 0.350 0.004 1.150 0.014 indexing.py:318(_index_indexer_1d) 60/30 0.296 0.005 88.407 2.947 timedeltas.py:158(_convert_listlike) 30 0.284 0.009 0.298 0.010 algorithms.py:719(checked_add_with_arr) 123 0.140 0.001 0.140 0.001 {method 'astype' of 'numpy.ndarray' objects} 1049/719 0.096 0.000 96.513 0.134 {numpy.core.multiarray.array} ``` |
{ "total_count": 3, "+1": 1, "-1": 0, "laugh": 0, "hooray": 2, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 212561278 | |
293619896 | https://github.com/pydata/xarray/issues/1301#issuecomment-293619896 | https://api.github.com/repos/pydata/xarray/issues/1301 | MDEyOklzc3VlQ29tbWVudDI5MzYxOTg5Ng== | friedrichknuth 10554254 | 2017-04-12T15:42:18Z | 2017-04-12T15:42:18Z | NONE | decode_times=False significantly reduces read time, but the proportional performance discrepancy between xarray 0.8.2 and 0.9.1 remains the same. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 212561278 | |
286220522 | https://github.com/pydata/xarray/issues/1301#issuecomment-286220522 | https://api.github.com/repos/pydata/xarray/issues/1301 | MDEyOklzc3VlQ29tbWVudDI4NjIyMDUyMg== | friedrichknuth 10554254 | 2017-03-13T19:41:25Z | 2017-03-13T19:41:25Z | NONE | Looks like the issue might be that xarray 0.9.1 is decoding all timestamps on load. xarray==0.9.1, dask==0.13.0 ``` da.set_options(get=da.async.get_sync) %prun -l 10 ds = xr.open_mfdataset('./*.nc')
Ordered by: internal time List reduced from 625 to 10 due to restriction <10> ncalls tottime percall cumtime percall filename:lineno(function)
18 57.057 3.170 57.057 3.170 {pandas.tslib.array_to_timedelta64}
39 0.860 0.022 0.863 0.022 {operator.getitem}
48 0.402 0.008 0.402 0.008 {numpy.core.multiarray.arange}
4341/2463 0.257 0.000 0.273 0.000 utils.py:412(shape)
88 0.245 0.003 0.245 0.003 {pandas.algos.ensure_object}
48 0.158 0.003 0.561 0.012 indexing.py:318(_index_indexer_1d)
36/18 0.135 0.004 57.509 3.195 timedeltas.py:150(_convert_listlike)
18 0.126 0.007 0.130 0.007 nanops.py:815(_checked_add_with_arr)
51 0.070 0.001 0.070 0.001 {method 'astype' of 'numpy.ndarray' objects}
676/475 0.047 0.000 58.853 0.124 {numpy.core.multiarray.array}
xarray==0.8.2, dask==0.13.0 ``` da.set_options(get=da.async.get_sync) %prun -l 10 ds = xr.open_mfdataset('./*.nc')
Ordered by: internal time List reduced from 621 to 10 due to restriction <10> ncalls tottime percall cumtime percall filename:lineno(function) 2571/1800 0.178 0.000 0.184 0.000 utils.py:387(shape) 18 0.174 0.010 0.174 0.010 {numpy.core.multiarray.arange} 16 0.079 0.005 0.079 0.005 {numpy.core.multiarray.concatenate} 483/420 0.077 0.000 0.125 0.000 {numpy.core.multiarray.array} 15 0.054 0.004 0.197 0.013 indexing.py:259(index_indexer_1d) 3 0.041 0.014 0.043 0.014 netCDF4.py:181(init) 105 0.013 0.000 0.057 0.001 netCDF4_.py:196(open_store_variable) 15 0.012 0.001 0.013 0.001 {operator.getitem} 2715/1665 0.007 0.000 0.178 0.000 indexing.py:343(shape) 5971 0.006 0.000 0.006 0.000 collections.py:71(setitem) ``` The version of dask is held constant in each test. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2 212561278 |
Advanced export
JSON shape: default, array, newline-delimited, object
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [performed_via_github_app] TEXT, [issue] INTEGER REFERENCES [issues]([id]) ); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
user 1