issue_comments
12 rows where issue = 208903781 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
328731021 | https://github.com/pydata/xarray/issues/1279#issuecomment-328731021 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDMyODczMTAyMQ== | jhamman 2443309 | 2017-09-12T04:13:37Z | 2017-09-12T04:13:37Z | MEMBER | see #1568 for PR that adds this |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
328690191 | https://github.com/pydata/xarray/issues/1279#issuecomment-328690191 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDMyODY5MDE5MQ== | jhamman 2443309 | 2017-09-11T23:48:58Z | 2017-09-12T04:13:15Z | MEMBER | @darothen and @shoyer - Here's a little wrapper function that does the dask and bottleneck piece...
I don't think this would be all that difficult to drop into our current |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
328724745 | https://github.com/pydata/xarray/issues/1279#issuecomment-328724745 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDMyODcyNDc0NQ== | jhamman 2443309 | 2017-09-12T03:30:20Z | 2017-09-12T03:30:20Z | MEMBER | @darothen - I'll open a PR in a few minutes. I'll fix the typos. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
328724595 | https://github.com/pydata/xarray/issues/1279#issuecomment-328724595 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDMyODcyNDU5NQ== | darothen 4992424 | 2017-09-12T03:29:29Z | 2017-09-12T03:29:29Z | NONE | @shoyer - This output is usually provided as a sequence of daily netCDF files, each on a ~2 degree global grid with 24 timesteps per file (so shape 24 x 96 x 144). For convenience, I usually concatenate these files into yearly datasets, so they'll have a shape (8736 x 96 x 144). I haven't played too much with how to chunk the data, but it's not uncommon for me to load 20-50 of these files simultaneously (each holding a year's worth of data) and treat each year as an "ensemble member" dimension, so my data has shape (50 x 8736 x 96 x 144). Yes, keeping everything in dask array land is preferable, I suppose. @jhamman - Wow, that worked pretty much perfectly! There's a handful of typos (you switch from "a" to "x" halfway through), and there's a lot of room for optimization by chunksize. But it just works, which is absolutely ridiculous. I just pushed a ~200 GB dataset on my cluster with ~50 cores and it screamed through the calculation. Is there any way this could be pushed before 0.10.0? It's a killer enhancement. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
328315251 | https://github.com/pydata/xarray/issues/1279#issuecomment-328315251 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDMyODMxNTI1MQ== | shoyer 1217238 | 2017-09-10T02:24:22Z | 2017-09-10T02:24:22Z | MEMBER | @darothen Can you give an example of typical My sense is that we would do better to keep everything in the form of (dask) arrays, rather than converting into dataframes. For the highest performance, I would make a dask array routine that combines ghosting, map_blocks and bottleneck's rolling window functions. Then it should be straightforward to drop it into rolling in place of the existing bottleneck routine. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
328314676 | https://github.com/pydata/xarray/issues/1279#issuecomment-328314676 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDMyODMxNDY3Ng== | darothen 4992424 | 2017-09-10T02:04:33Z | 2017-09-10T02:04:33Z | NONE | In light of #1489 is there a way to move forward here with In soliciting the atmospheric chemistry community for a few illustrative examples for gcpy, it's become apparent that indices computed from re-sampled timeseries would be killer, attention-grabbing functionality. For instance, the EPA air quality standard we use for ozone involves taking hourly data, computing 8-hour rolling means for each day of your dataset, and then picking the maximum of those means for each day ("MDA8 ozone"). Similar metrics exist for other pollutants. With traditional xarray data-structures, it's trivial to compute this quantity (assuming we have hourly data and using the new resample API from #1272):
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
302137119 | https://github.com/pydata/xarray/issues/1279#issuecomment-302137119 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDMwMjEzNzExOQ== | shoyer 1217238 | 2017-05-17T15:59:58Z | 2017-05-17T15:59:58Z | MEMBER | @darothen we would need to add xarray -> dask dataframe conversion functions, which don't currently exist. Otherwise I think we would still need to rewrite this (but of course the dataframe implementation could be a useful reference point). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
301489242 | https://github.com/pydata/xarray/issues/1279#issuecomment-301489242 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDMwMTQ4OTI0Mg== | darothen 4992424 | 2017-05-15T14:18:55Z | 2017-05-15T14:18:55Z | NONE | Dask dataframes have recently been updated so that rolling operations work (dask/dask#2198). Does this open a pathway to enable rolling on dask arrays within xarray? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
284133376 | https://github.com/pydata/xarray/issues/1279#issuecomment-284133376 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDI4NDEzMzM3Ng== | shoyer 1217238 | 2017-03-04T07:06:25Z | 2017-03-04T07:06:25Z | MEMBER |
Yes, that would work for such cases. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
284132513 | https://github.com/pydata/xarray/issues/1279#issuecomment-284132513 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDI4NDEzMjUxMw== | jhamman 2443309 | 2017-03-04T06:45:11Z | 2017-03-04T06:45:11Z | MEMBER | An idea...since we only have 1-D rolling methods in xarray, couldn't we just use |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
281185199 | https://github.com/pydata/xarray/issues/1279#issuecomment-281185199 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDI4MTE4NTE5OQ== | shoyer 1217238 | 2017-02-20T21:28:37Z | 2017-02-20T21:28:37Z | MEMBER |
Yes, this is correct -- we automatically compute dask arrays when converting to pandas, because pandas does not have any notion of lazy arrays. Note that we currently have two versions of rolling window operations:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 | |
281101281 | https://github.com/pydata/xarray/issues/1279#issuecomment-281101281 | https://api.github.com/repos/pydata/xarray/issues/1279 | MDEyOklzc3VlQ29tbWVudDI4MTEwMTI4MQ== | rabernat 1197350 | 2017-02-20T15:01:44Z | 2017-02-20T15:01:44Z | MEMBER | It seems like the most efficient way to handle this would be to use ghost cells equal to the window length. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Rolling window operation does not work with dask arrays 208903781 |
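The ghost-cell idea raised in the comments above (pad each chunk with values borrowed from its neighbour, run the rolling function per chunk, then trim the padding) can be illustrated with a small sketch. This is not the wrapper function posted in the thread, which is not reproduced in this export. It is a deliberately library-free illustration, and the names `rolling_mean` and `chunked_rolling_mean` are hypothetical. It assumes a trailing window, for which `window - 1` ghost cells on the left of each chunk are enough; the thread suggests a depth equal to the window length, which is also sufficient.

```python
def rolling_mean(values, window):
    """Trailing rolling mean; positions without a full window yield None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def chunked_rolling_mean(values, window, chunk_size):
    """Apply rolling_mean chunk by chunk, borrowing ghost cells from the left."""
    depth = window - 1  # ghost cells needed for a trailing window (assumption)
    result = []
    for start in range(0, len(values), chunk_size):
        ghost_start = max(0, start - depth)
        # The chunk plus the ghost cells that precede it.
        padded = values[ghost_start:start + chunk_size]
        rolled = rolling_mean(padded, window)
        # Drop the outputs that belong to the ghost cells.
        result.extend(rolled[start - ghost_start:])
    return result


# Illustrative data only: 24 "hourly" values, an 8-step window, chunks of 6.
data = [float(i * i % 7) for i in range(24)]
whole = rolling_mean(data, window=8)
chunked = chunked_rolling_mean(data, window=8, chunk_size=6)
```

Because every chunk sees the same `window` values it would have seen in the unchunked array, `chunked` matches `whole` element for element. In a real implementation the per-chunk function would be a bottleneck rolling routine and the padding and trimming would be handled by dask's ghosting/overlap machinery, as the thread proposes.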
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);