issue_comments


8 rows where author_association = "MEMBER" and issue = 1479121713 sorted by updated_at descending


Commenters: kmuehlbauer (5), keewis (3)

Issue: expand dimension by re-allocating larger arrays with more space "at the end of the corresponding dimension", block copying previously existing data, and autofill newly created entry by a default value (note: alternative to reindex, but much faster for extending large arrays along, for example, the time dimension)
**keewis** (MEMBER) commented on 2022-12-07T11:09:03Z · https://github.com/pydata/xarray/issues/7363#issuecomment-1340809916

> implementing a "grow_coordinate" function to grow / reallocate larger arrays copying the previous chunk along a coordinate

this sounds a lot like `pad` with `mode="constant"`?

> is it possible that xarray makes no assumptions of this kind

xarray uses pandas indexes for alignment and indexing (if you have a recent version of xarray you should see the "Indexes" section in the HTML repr), so yes, it will always use a search that is more efficient than a linear search, as long as the data is sorted. This was also the reason why you had to use `swap_dims` / `set_index` to create an index along the coordinate you wanted to reindex.
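For illustration, a minimal sketch of growing an array along a dimension with `pad` in constant mode; the array and sizes here are invented:

```python
import numpy as np
import xarray as xr

# illustrative array with a "time" dimension
da = xr.DataArray(np.arange(5.0), dims="time")

# append 3 entries at the end of "time", filled with NaN
grown = da.pad(time=(0, 3), mode="constant", constant_values=np.nan)
print(dict(grown.sizes))  # {'time': 8}
```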

Reactions: hooray 1
**kmuehlbauer** (MEMBER) commented on 2022-12-07T10:50:28Z · https://github.com/pydata/xarray/issues/7363#issuecomment-1340771454

Does this more or less represent your Dataset?

```python
import numpy as np
import xarray as xr

# create two timeseries; the second one is for the reindex
itime = np.arange(0, 3208464).astype("<M8[s]")
itime2 = np.arange(0, 4000000).astype("<M8[s]")

# create two datasets with the time only
ds1 = xr.Dataset({"time": itime})
ds2 = xr.Dataset({"time": itime2})

# add random data to ds1
ds1 = ds1.expand_dims("station")
ds1 = ds1.assign({"test": (["station", "time"], np.random.rand(106, 3208464))})
```

Now we reindex with the longer timeseries; it only takes a couple of seconds on my machine:

```python
%%time
ds3 = ds1.reindex(time=ds2.time)
```
```
CPU times: user 3.16 s, sys: 649 ms, total: 3.81 s
Wall time: 3.81 s
```

Data is unchanged after reindex:

```python
xr.testing.assert_equal(ds1.test, ds3.test.isel(time=slice(0, 3208464)))
```

**kmuehlbauer** (MEMBER) commented on 2022-12-07T10:20:40Z · https://github.com/pydata/xarray/issues/7363#issuecomment-1340712532

@jerabaul29 Concerning the possible slowness of reindex, I think it does some sorting internally. But isn't a timeseries sorted anyway? Nevertheless you have a point that reindex might not be the right tool for this use case. It would be nice if we could create your dataset in memory with some random data and check the different proposed solutions for their performance.

**kmuehlbauer** (MEMBER) commented on 2022-12-07T06:59:11Z · https://github.com/pydata/xarray/issues/7363#issuecomment-1340482904

@jerabaul29 Does your Dataset with the 3 million time points fit into your machine's memory? Are the arrays dask-backed? Unfortunately that can't be seen from the screenshots. Calculating from the sizes, this is 106 x 3_208_464 single measurements -> 340_097_184 values. Assuming float64 (8 bytes), this leads to 2_720_777_472 bytes, roughly 2.7 GB, which should fit in most setups. I'm not really sure, but there is a good chance that reindex creates a completely new Dataset, which means the computer has to hold the original as well as the new Dataset (which is roughly 3.2 GB). This adds up to almost 6 GB of RAM. Depending on your machine and other tasks this might run into RAM issues. But the xarray devs will know better.

@keewis' suggestion of creating and concatenating a new array with predefined values, which is file-backed, could resolve the issues you are currently facing.

**keewis** (MEMBER) commented on 2022-12-06T16:55:40Z · https://github.com/pydata/xarray/issues/7363#issuecomment-1339675640

I'm a bit surprised. Could you post a repr of `timestamps_extended_basis`? That might help figure out what exactly happened.

If everything else fails, you might also create a new xarray object with just the new values, and then use `xr.concat` to combine the two?
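For illustration, a minimal sketch of that approach; the arrays, sizes, and fill value here are invented:

```python
import numpy as np
import xarray as xr

# existing data along "time" (illustrative)
old = xr.DataArray(
    np.random.rand(4), dims="time", coords={"time": np.arange(4)}
)

# a new block covering only the appended time steps, pre-filled with NaN
new = xr.DataArray(
    np.full(3, np.nan), dims="time", coords={"time": np.arange(4, 7)}
)

# combine both along "time"
combined = xr.concat([old, new], dim="time")
```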

**keewis** (MEMBER) commented on 2022-12-06T15:39:20Z · https://github.com/pydata/xarray/issues/7363#issuecomment-1339568566

I think this is because you don't have an index along the dimension. Try any of
```python
previous_observations.set_coords(["timestamps"]).swap_dims({"time": "timestamps"}).reindex(...)
previous_observations.set_index({"time": "timestamps"}).reindex(...)
```
(the only difference is the name of the dimension / coordinate you end up with)
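To make the second variant concrete, a self-contained sketch with invented names and sizes:

```python
import numpy as np
import xarray as xr

# dataset whose "timestamps" variable is not an index (illustrative)
ds = xr.Dataset(
    {
        "obs": ("time", np.random.rand(5)),
        "timestamps": ("time", np.arange(0, 5).astype("<M8[s]")),
    }
)

# promote "timestamps" to the index of the "time" dimension, then reindex;
# entries beyond the old range are filled with NaN
target = np.arange(0, 8).astype("<M8[s]")
extended = ds.set_index({"time": "timestamps"}).reindex(time=target)
```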

**kmuehlbauer** (MEMBER) commented on 2022-12-06T14:13:30Z · https://github.com/pydata/xarray/issues/7363#issuecomment-1339450779

You could take the exact times you have and just append the additional times. You might even create those additional ones by giving a time interval and a count. I'd have to look it up, but I'm currently only on my phone.
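For illustration, a sketch of building such an extended time coordinate from an interval and a count; all names and values here are invented:

```python
import numpy as np

# existing times (illustrative), one value per second
times = np.arange(0, 10).astype("<M8[s]")

# append `count` new steps spaced `interval` apart after the last time
interval = np.timedelta64(1, "s")
count = 5
extra = times[-1] + interval * np.arange(1, count + 1)
extended_times = np.concatenate([times, extra])
```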

**kmuehlbauer** (MEMBER) commented on 2022-12-06T13:39:06Z (edited 2022-12-06T13:39:33Z) · https://github.com/pydata/xarray/issues/7363#issuecomment-1339403307

Would `xarray.Dataset.reindex` do what you want?

You would need to extend your time array/coordinate appropriately and feed it to reindex. You may also need to provide the `fill_value` keyword to get the new portions filled with the correct fill value.
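For illustration, a minimal sketch of that approach; the dataset, sizes, and fill value here are invented:

```python
import numpy as np
import xarray as xr

# illustrative dataset indexed by "time"
ds = xr.Dataset(
    {"obs": ("time", np.random.rand(5))},
    coords={"time": np.arange(0, 5).astype("<M8[s]")},
)

# extended time coordinate; entries beyond the old range get the fill value
new_time = np.arange(0, 8).astype("<M8[s]")
extended = ds.reindex(time=new_time, fill_value=-9999.0)
```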

