issue_comments
5 rows where issue = 1479121713 and user = 5821660 (kmuehlbauer), sorted by updated_at descending
Issue 1479121713: expand dimension by re-allocating larger arrays with more space "at the end of the corresponding dimension", block copying previously existing data, and autofill newly created entries with a default value (note: alternative to reindex, but much faster for extending large arrays along, for example, the time dimension)
1340771454 | IC_kwDOAMm_X85P6ox- | kmuehlbauer 5821660 | MEMBER | created 2022-12-07T10:50:28Z | updated 2022-12-07T10:50:28Z | issue 1479121713
https://github.com/pydata/xarray/issues/7363#issuecomment-1340771454

Does this more or less represent your Dataset?

```python
import numpy as np
import xarray as xr

# create two timeseries, the second is for reindex
itime = np.arange(0, 3208464).astype("<M8[s]")
itime2 = np.arange(0, 4000000).astype("<M8[s]")

# create two datasets with the time only
ds1 = xr.Dataset({"time": itime})
ds2 = xr.Dataset({"time": itime2})

# add random data to ds1
ds1 = ds1.expand_dims("station")
ds1 = ds1.assign({"test": (["station", "time"], np.random.rand(106, 3208464))})
```

Now we reindex with the longer timeseries; it only takes a couple of seconds on my machine. Data is unchanged after reindex.
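The reindex call and the output showing the unchanged data were not captured in this export. Below is a minimal, hedged sketch of what that step could look like, assuming the ds1 and ds2 built in the comment above; the exact call and fill value used in the original comment are not known.

```python
# Hypothetical sketch (not the exact code from the comment): extend ds1 along
# "time" using the longer time coordinate from ds2. fill_value=np.nan is an
# assumption; NaN is also xarray's default fill for missing float data.
ds1_long = ds1.reindex(time=ds2["time"], fill_value=np.nan)

# Sanity check that the previously existing data is unchanged after reindex.
n_old = ds1.sizes["time"]
assert np.array_equal(
    ds1_long["test"].isel(time=slice(0, n_old)).values,
    ds1["test"].values,
)
```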
1340712532 | IC_kwDOAMm_X85P6aZU | kmuehlbauer 5821660 | MEMBER | created 2022-12-07T10:20:40Z | updated 2022-12-07T10:20:40Z | issue 1479121713
https://github.com/pydata/xarray/issues/7363#issuecomment-1340712532

@jerabaul29 Concerning the possible slowness of reindex: I think it uses some sorting internally, but isn't a timeseries sorted anyway? Nevertheless, you have a point that reindex might not be the right tool for this use case. It would be nice if we could create your dataset in memory with some random data and check the different proposed solutions for their performance.
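As a purely illustrative follow-up to the benchmarking idea in the comment above, here is a small, self-contained sketch that times reindex against a pre-allocate-and-copy approach on a scaled-down stand-in for the dataset. The sizes, the alternative approach, and any timings it produces are assumptions, not results from the original discussion.

```python
import time

import numpy as np
import xarray as xr

# Scaled-down stand-in for the dataset described in the issue (106 stations,
# millions of time steps); increase n_old/n_new to match the real sizes.
n_station, n_old, n_new = 106, 50_000, 60_000
old_time = np.arange(0, n_old).astype("<M8[s]")
new_time = np.arange(0, n_new).astype("<M8[s]")
ds = xr.Dataset(
    {"test": (["station", "time"], np.random.rand(n_station, n_old))},
    coords={"time": old_time},
)

# Candidate 1: reindex onto the longer time coordinate.
t0 = time.perf_counter()
via_reindex = ds.reindex(time=new_time)
t_reindex = time.perf_counter() - t0

# Candidate 2: pre-allocate the enlarged array and block-copy the old data.
t0 = time.perf_counter()
buf = np.full((n_station, n_new), np.nan)
buf[:, :n_old] = ds["test"].values
via_copy = xr.Dataset(
    {"test": (["station", "time"], buf)}, coords={"time": new_time}
)
t_copy = time.perf_counter() - t0

print(f"reindex: {t_reindex:.3f} s, pre-allocate + copy: {t_copy:.3f} s")
```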
1340482904 | IC_kwDOAMm_X85P5iVY | kmuehlbauer 5821660 | MEMBER | created 2022-12-07T06:59:11Z | updated 2022-12-07T06:59:11Z | issue 1479121713
https://github.com/pydata/xarray/issues/7363#issuecomment-1340482904

@jerabaul29 Does your Dataset with the 3 million time points fit into your machine's memory? Are the arrays dask-backed? Unfortunately that cannot be seen in the screenshots. Calculating from the sizes, this is 106 x 3_208_464 single measurements -> 340_097_184 values. Assuming float (8 bytes), this leads to 2_720_777_472 bytes, roughly 2.7 GB, which should fit in most setups. I'm not completely sure, but there is a good chance that reindex creates a completely new Dataset, which means the computer has to hold the original as well as the new Dataset (roughly 3.2 GB) in memory. This adds up to almost 6 GB of RAM. Depending on your machine and other running tasks, this might run into RAM issues. But the xarray devs will know better. @keewis' suggestion of creating and concatenating a new array with predefined values, which is file-backed, could resolve the issues you are currently facing.
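The back-of-the-envelope estimate from the comment above, written out explicitly (decimal gigabytes; the peak figure assumes the original and the reindexed Dataset are held in memory at the same time):

```python
# Memory estimate for the float64 "test" variable described above.
n_station, n_old, n_new = 106, 3_208_464, 4_000_000
bytes_per_value = 8  # float64

original_bytes = n_station * n_old * bytes_per_value   # 2_720_777_472
reindexed_bytes = n_station * n_new * bytes_per_value  # 3_392_000_000
peak_bytes = original_bytes + reindexed_bytes          # both held at once

print(f"original:  {original_bytes / 1e9:.2f} GB")   # ~2.72 GB
print(f"reindexed: {reindexed_bytes / 1e9:.2f} GB")  # ~3.39 GB
print(f"peak:      {peak_bytes / 1e9:.2f} GB")       # ~6.11 GB
```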
1339450779 | IC_kwDOAMm_X85P1mWb | kmuehlbauer 5821660 | MEMBER | created 2022-12-06T14:13:30Z | updated 2022-12-06T14:13:30Z | issue 1479121713
https://github.com/pydata/xarray/issues/7363#issuecomment-1339450779

You could take the exact times you have and just add the additional times. You might even create those additional ones by giving a time interval and the number of steps. I'd need to look up the details, but I'm currently only on my phone.
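A possible way to build those additional times from an interval and a count, sketched with plain NumPy; the one-second interval, the step count, and the variable names are assumptions for illustration only.

```python
import numpy as np

# Existing time coordinate (placeholder; in practice take it from the Dataset,
# e.g. existing_time = ds["time"].values).
existing_time = np.arange(0, 3_208_464).astype("<M8[s]")

# Hypothetical interval and count for the additional times.
step = np.timedelta64(1, "s")
n_extra = 791_536

# Additional times start one step after the last existing timestamp.
extra_time = existing_time[-1] + step * np.arange(1, n_extra + 1)

# Full coordinate, ready to feed into reindex (or to size the enlarged array).
new_time = np.concatenate([existing_time, extra_time])
```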
1339403307 | IC_kwDOAMm_X85P1awr | kmuehlbauer 5821660 | MEMBER | created 2022-12-06T13:39:06Z | updated 2022-12-06T13:39:33Z | issue 1479121713
https://github.com/pydata/xarray/issues/7363#issuecomment-1339403307

Would xarray.Dataset.reindex do what you want? You would need to extend your time array/coordinate appropriately and feed it to reindex. Maybe you also need to provide the fill_value keyword to get the new portions filled with the correct fill value.
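A minimal sketch of this suggestion on a toy Dataset; the variable names, sizes, and the NaN fill value are placeholders, not from the original issue.

```python
import numpy as np
import xarray as xr

# Small stand-in Dataset with a "time" dimension and one data variable.
time = np.arange(0, 10).astype("<M8[s]")
ds = xr.Dataset(
    {"test": (["time"], np.random.rand(10))},
    coords={"time": time},
)

# Extend the time coordinate and feed it to reindex; fill_value controls what
# the newly created entries contain (a dict mapping variable names to fill
# values should also be accepted by recent xarray versions).
new_time = np.arange(0, 15).astype("<M8[s]")
extended = ds.reindex(time=new_time, fill_value=np.nan)
```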
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);