issue_comments
28 rows where issue = 331668890 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1468649950 | https://github.com/pydata/xarray/issues/2227#issuecomment-1468649950 | https://api.github.com/repos/pydata/xarray/issues/2227 | IC_kwDOAMm_X85XidHe | dcherian 2448579 | 2023-03-14T18:49:51Z | 2023-03-14T18:54:16Z | MEMBER | A reproducible example would help, but indexing with dask arrays is a bit limited. With https://github.com/pydata/xarray/pull/5873 it's possible it will raise an error and ask you to compute the indexer. Also see https://github.com/dask/dask/issues/4156 EDIT: your slowdown is probably because it's computing |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
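A minimal sketch of the workaround hinted at above: materialise a dask-backed boolean indexer with an explicit `.compute()` before passing it to `isel`, so the indexing step does not trigger a hidden compute. All array names here are hypothetical.
```python
import dask.array
import xarray as xr

da = xr.DataArray(
    dask.array.random.random((1_000, 100), chunks=(100, 100)),
    dims=("time", "x"),
)
mask = (da.mean("x") > 0.5).compute()  # compute the indexer once, explicitly
subset = da.isel(time=mask)            # the selection itself stays lazy
```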
1467929278 | https://github.com/pydata/xarray/issues/2227#issuecomment-1467929278 | https://api.github.com/repos/pydata/xarray/issues/2227 | IC_kwDOAMm_X85XftK- | dschwoerer 5637662 | 2023-03-14T11:32:10Z | 2023-03-14T11:32:10Z | CONTRIBUTOR | I see, they are not the same - the slow one is still a dask array, the other one is not:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
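A quick way to check the distinction dschwoerer describes, i.e. whether a result is still a lazy dask array, is to inspect the wrapped data (a small sketch; the helper name is made up):
```python
import dask.array

def is_dask_backed(da):
    """True if the DataArray still wraps a lazy dask array."""
    # equivalently: da.chunks is not None
    return isinstance(da.data, dask.array.Array)
```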
1464180874 | https://github.com/pydata/xarray/issues/2227#issuecomment-1464180874 | https://api.github.com/repos/pydata/xarray/issues/2227 | IC_kwDOAMm_X85XRaCK | shoyer 1217238 | 2023-03-10T18:04:23Z | 2023-03-10T18:04:23Z | MEMBER | @dschwoerer are you sure that you are actually calculating the same thing in both cases? What exactly do the values of |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
1463894170 | https://github.com/pydata/xarray/issues/2227#issuecomment-1463894170 | https://api.github.com/repos/pydata/xarray/issues/2227 | IC_kwDOAMm_X85XQUCa | dschwoerer 5637662 | 2023-03-10T14:36:43Z | 2023-03-10T14:36:43Z | CONTRIBUTOR | I just changed
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
558700154 | https://github.com/pydata/xarray/issues/2227#issuecomment-558700154 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDU1ODcwMDE1NA== | dcherian 2448579 | 2019-11-26T16:08:24Z | 2019-11-26T16:08:24Z | MEMBER | I don't know much about indexing, but that PR propagates a "new" indexes property as part of #1603 (work towards enabling more flexible indexing); it doesn't change anything about "indexing" itself. I think the dask docs may be more relevant to what you are asking about: https://docs.dask.org/en/latest/array-slicing.html |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
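The dask slicing docs linked above distinguish cheap from expensive indexing patterns; roughly, and not exhaustively:
```python
import dask.array as darr
import numpy as np

x = darr.ones((10_000, 1_000), chunks=(1_000, 1_000))

y1 = x[42, :100]            # integers and slices: cheap, fully lazy
y2 = x[np.arange(500), :]   # NumPy fancy indexing along one axis: supported
y3 = x[x.sum(axis=1) > 0]   # dask boolean indexer: output shape unknown until compute
```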
558693816 | https://github.com/pydata/xarray/issues/2227#issuecomment-558693816 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDU1ODY5MzgxNg== | Hoeze 1200058 | 2019-11-26T15:54:25Z | 2019-11-26T15:54:25Z | NONE | Hi, I'd like to understand how |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
533193480 | https://github.com/pydata/xarray/issues/2227#issuecomment-533193480 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDUzMzE5MzQ4MA== | shoyer 1217238 | 2019-09-19T15:49:24Z | 2019-09-19T15:49:24Z | MEMBER | Yes, align checks The real mystery here is why |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
533119743 | https://github.com/pydata/xarray/issues/2227#issuecomment-533119743 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDUzMzExOTc0Mw== | dcherian 2448579 | 2019-09-19T13:00:40Z | 2019-09-19T13:00:40Z | MEMBER | I think align tries to optimize that case, so maybe something's also possible there? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
533036570 | https://github.com/pydata/xarray/issues/2227#issuecomment-533036570 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDUzMzAzNjU3MA== | crusaderky 6213168 | 2019-09-19T08:57:44Z | 2019-09-19T08:57:44Z | MEMBER | Can we short-circuit the special case where the index of the array used for slicing is the same object as the index being sliced, so that no alignment is needed? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
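The code block in crusaderky's comment was truncated in this export; the gist of the proposal, sketched below, is an identity test that would let xarray skip alignment entirely (an illustration, not xarray's actual implementation):
```python
def can_skip_alignment(da, indexer, dim):
    # If the indexer carries the very same pandas.Index object as the
    # dimension being sliced, alignment is a no-op and can be skipped.
    return da.indexes[dim] is indexer.indexes[dim]
```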
533033540 | https://github.com/pydata/xarray/issues/2227#issuecomment-533033540 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDUzMzAzMzU0MA== | crusaderky 6213168 | 2019-09-19T08:49:32Z | 2019-09-19T08:49:32Z | MEMBER | Before #3319:
```
%timeit ds.a.values[time_filter]
158 ms ± 1.14 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit ds.a.isel(time=time_filter.values)
2.57 s ± 3.65 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit ds.a.isel(time=time_filter)
3.12 s ± 37.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
After #3319:
```
%timeit ds.a.isel(time=time_filter.values)
665 ms ± 6.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit ds.a.isel(time=time_filter)
1.15 s ± 1.55 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
Good job! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
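The `ds` and `time_filter` used by the `%timeit` snippets in this thread were defined earlier in the issue and are not captured in this export; a plausible reconstruction, sized to match the 55,000,000-element boolean mask that appears later in the thread:
```python
import numpy as np
import xarray as xr

n = 55_000_000
ds = xr.Dataset(
    {"a": ("time", np.random.randn(n))},
    coords={"time": np.arange(n)},
)
time_filter = ds.time > n // 2  # boolean DataArray used as the indexer
```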
532804542 | https://github.com/pydata/xarray/issues/2227#issuecomment-532804542 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDUzMjgwNDU0Mg== | shoyer 1217238 | 2019-09-18T18:17:22Z | 2019-09-18T18:17:22Z | MEMBER | https://github.com/pydata/xarray/pull/3319 gives us about a 2x performance boost. It could likely be much faster, but at least this fixes the regression. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
532787342 | https://github.com/pydata/xarray/issues/2227#issuecomment-532787342 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDUzMjc4NzM0Mg== | shoyer 1217238 | 2019-09-18T17:33:38Z | 2019-09-18T17:33:38Z | MEMBER | Yes, I'm seeing similar numbers, about 10x slower indexing in a DataArray. This seems to have gotten slower over time. It would be good to track this down and add a benchmark! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
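A regression benchmark along the lines shoyer suggests could follow xarray's asv conventions (a `setup` method plus `time_*` methods); a sketch with hypothetical names, using a smaller array than the thread's reconstruction above:
```python
import numpy as np
import xarray as xr

class IselBooleanIndexing:
    def setup(self):
        n = 10_000_000
        self.ds = xr.Dataset(
            {"a": ("time", np.random.randn(n))},
            coords={"time": np.arange(n)},
        )
        self.time_filter = self.ds.time > n // 2

    def time_isel_bool_dataarray(self):
        self.ds.a.isel(time=self.time_filter)

    def time_isel_bool_ndarray(self):
        self.ds.a.isel(time=self.time_filter.values)
```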
532780068 | https://github.com/pydata/xarray/issues/2227#issuecomment-532780068 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDUzMjc4MDA2OA== | dcherian 2448579 | 2019-09-18T17:14:38Z | 2019-09-18T17:14:38Z | MEMBER | On master I'm seeing
```
%timeit ds.a.isel(time=time_filter)
3.65 s ± 29.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit ds.a.isel(time=time_filter.values)
2.99 s ± 15 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit ds.a.values[time_filter]
227 ms ± 6.59 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
Can someone else reproduce? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
454162334 | https://github.com/pydata/xarray/issues/2227#issuecomment-454162334 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDQ1NDE2MjMzNA== | max-sixty 5635139 | 2019-01-14T21:09:49Z | 2019-01-14T21:09:49Z | MEMBER | In an effort to reduce the issue backlog, I'll close this, but please reopen if you disagree |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
425224969 | https://github.com/pydata/xarray/issues/2227#issuecomment-425224969 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDQyNTIyNDk2OQ== | WeatherGod 291576 | 2018-09-27T20:05:05Z | 2018-09-27T20:05:05Z | CONTRIBUTOR | It would be ten files opened via xr.open_mfdataset() concatenated across a time dimension, each one looking like:
```
netcdf convect_gust_20180301_0000 {
dimensions:
	latitude = 3502 ;
	longitude = 7002 ;
variables:
	double latitude(latitude) ;
		latitude:_FillValue = NaN ;
		latitude:_Storage = "contiguous" ;
		latitude:_Endianness = "little" ;
	double longitude(longitude) ;
		longitude:_FillValue = NaN ;
		longitude:_Storage = "contiguous" ;
		longitude:_Endianness = "little" ;
	float gust(latitude, longitude) ;
		gust:_FillValue = NaNf ;
		gust:units = "m/s" ;
		gust:description = "gust winds" ;
		gust:_Storage = "chunked" ;
		gust:_ChunkSizes = 701, 1401 ;
		gust:_DeflateLevel = 8 ;
		gust:_Shuffle = "true" ;
		gust:_Endianness = "little" ;

// global attributes:
		:start_date = "03/01/2018 00:00" ;
		:end_date = "03/01/2018 01:00" ;
		:interval = "half-open" ;
		:init_date = "02/28/2018 22:00" ;
		:history = "Created 2018-09-12 15:53:44.468144" ;
		:description = "Convective Downscaling, format V2.0" ;
		:_NCProperties = "version=1|netcdflibversion=4.6.1|hdf5libversion=1.10.1" ;
		:_SuperblockVersion = 0 ;
		:_IsNetcdf4 = 1 ;
		:_Format = "netCDF-4" ;
}
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
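For context, the multi-file access pattern described above would look roughly like this; the glob is hypothetical and the keyword spellings follow current xarray rather than the 2018 API:
```python
import xarray as xr

# ten hourly files concatenated along a new "time" dimension
ds = xr.open_mfdataset(
    "convect_gust_20180301_*.nc",
    combine="nested",
    concat_dim="time",
)
```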
424945257 | https://github.com/pydata/xarray/issues/2227#issuecomment-424945257 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDQyNDk0NTI1Nw== | jhamman 2443309 | 2018-09-27T03:16:40Z | 2018-09-27T03:16:40Z | MEMBER | @WeatherGod - are you reading data from netCDF files by chance? If so, can you share the compression/chunk layout for those ( |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
424795330 | https://github.com/pydata/xarray/issues/2227#issuecomment-424795330 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDQyNDc5NTMzMA== | WeatherGod 291576 | 2018-09-26T17:06:44Z | 2018-09-26T17:06:44Z | CONTRIBUTOR | No, it does not make a difference. The example above peaks at around 5GB of memory (a bit much, but manageable). And it peaks similarly if we chunk it like you suggested. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
424549023 | https://github.com/pydata/xarray/issues/2227#issuecomment-424549023 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDQyNDU0OTAyMw== | shoyer 1217238 | 2018-09-26T00:54:24Z | 2018-09-26T00:54:24Z | MEMBER | @WeatherGod does adding something like |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
424485235 | https://github.com/pydata/xarray/issues/2227#issuecomment-424485235 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDQyNDQ4NTIzNQ== | WeatherGod 291576 | 2018-09-25T20:14:02Z | 2018-09-25T20:14:02Z | CONTRIBUTOR | Yeah, it looks like if |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
424479421 | https://github.com/pydata/xarray/issues/2227#issuecomment-424479421 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDQyNDQ3OTQyMQ== | WeatherGod 291576 | 2018-09-25T19:54:59Z | 2018-09-25T19:54:59Z | CONTRIBUTOR | Just for posterity, though, here is my simplified (working!) example:
```
import numpy as np
import xarray as xr

da = xr.DataArray(np.random.randn(10, 3000, 7000),
                  dims=('time', 'latitude', 'longitude'))
window = da.rolling(time=2).construct('win')
indexes = window.argmax(dim='win')
result = window.isel(win=indexes)
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
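Worth noting: because `indexes` in the example above is itself a (time, latitude, longitude) DataArray, `window.isel(win=indexes)` performs vectorized (pointwise) indexing and materialises a large intermediate, which is consistent with the memory spikes reported below. For this simplified argmax case the same values come from a plain reduction (a sketch with smaller dimensions than the example above):
```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.random.randn(10, 300, 700),
                  dims=('time', 'latitude', 'longitude'))
window = da.rolling(time=2).construct('win')

# equivalent to window.isel(win=window.argmax(dim='win')) for this example,
# without building the multidimensional indexer
result = window.max(dim='win')
```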
424477465 | https://github.com/pydata/xarray/issues/2227#issuecomment-424477465 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDQyNDQ3NzQ2NQ== | WeatherGod 291576 | 2018-09-25T19:48:20Z | 2018-09-25T19:48:20Z | CONTRIBUTOR | Huh, strange... I just tried a simplified version of what I was doing (particularly, no dask arrays), and everything worked fine. I'll have to investigate further. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
424473282 | https://github.com/pydata/xarray/issues/2227#issuecomment-424473282 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDQyNDQ3MzI4Mg== | max-sixty 5635139 | 2018-09-25T19:35:57Z | 2018-09-25T19:35:57Z | MEMBER | @WeatherGod do you have a reproducible example? I'm happy to have a look |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
424470752 | https://github.com/pydata/xarray/issues/2227#issuecomment-424470752 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDQyNDQ3MDc1Mg== | WeatherGod 291576 | 2018-09-25T19:27:28Z | 2018-09-25T19:27:28Z | CONTRIBUTOR | I am looking into a similar performance issue with isel, but it seems that the issue is that it is creating arrays that are much bigger than needed. For my multidimensional case (time/x/y/window), what should end up only taking a few hundred MB is spiking up to 10's of GB of used RAM. Don't know if this might be a possible source of performance issues. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
396725591 | https://github.com/pydata/xarray/issues/2227#issuecomment-396725591 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDM5NjcyNTU5MQ== | shoyer 1217238 | 2018-06-12T20:38:47Z | 2018-06-12T20:38:47Z | MEMBER | My measurements:
Given the size of this gap, I suspect this could be improved with some investigation and profiling, but there is certainly an upper limit on the possible performance gain. One simple example is that indexing the dataset needs to index both |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
396675613 | https://github.com/pydata/xarray/issues/2227#issuecomment-396675613 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDM5NjY3NTYxMw== | rabernat 1197350 | 2018-06-12T17:45:48Z | 2018-06-12T17:45:48Z | MEMBER | Another part of the matrix of possibilities. Takes about half the time if you pass |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
396675417 | https://github.com/pydata/xarray/issues/2227#issuecomment-396675417 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDM5NjY3NTQxNw== | JohnMrziglod 4180033 | 2018-06-12T17:45:14Z | 2018-06-12T17:45:14Z | NONE | I am sorry @rabernat and @maxim-lian, the variable name time and the simple greater-than-filter example are misleading. In general, it is about using a boolean mask via |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
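In other words, the pattern under discussion is an arbitrary boolean mask passed to `isel`, not a contiguous time range; a minimal sketch with assumed data:
```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"a": ("time", np.random.randn(1_000))})

# any elementwise condition works, not just "time > t0"
mask = (ds.a > 0) & (ds.a < 1)   # boolean DataArray along "time"
subset = ds.isel(time=mask)
```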
396662676 | https://github.com/pydata/xarray/issues/2227#issuecomment-396662676 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDM5NjY2MjY3Ng== | max-sixty 5635139 | 2018-06-12T17:02:34Z | 2018-06-12T17:02:34Z | MEMBER | @rabernat that's a good solution where it's a slice.
When does it actually need to align a bool array? If you try to pass an array of unequal length, it doesn't work anyway:
```python
In [12]: ds.a.isel(time=time_filter[:-1])
IndexError: Boolean array size 54999999 is used to index array with shape (55000000,).
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 | |
396660606 | https://github.com/pydata/xarray/issues/2227#issuecomment-396660606 | https://api.github.com/repos/pydata/xarray/issues/2227 | MDEyOklzc3VlQ29tbWVudDM5NjY2MDYwNg== | rabernat 1197350 | 2018-06-12T16:55:55Z | 2018-06-12T16:55:55Z | MEMBER | I don't have experience using Here's how I would recommend writing the query using label-based selection:
|
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Slow performance of isel 331668890 |
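The recommended snippet itself was truncated in this export; for a time-range query, label-based selection usually looks like the following (an assumed illustration, not the original code):
```python
import numpy as np
import pandas as pd
import xarray as xr

ds = xr.Dataset(
    {"a": ("time", np.random.randn(365))},
    coords={"time": pd.date_range("2018-01-01", periods=365)},
)

# a slice on the time label avoids building a boolean mask entirely
subset = ds.sel(time=slice("2018-03-01", "2018-06-01"))
```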
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);