github: issue_comments: 1 row where issue = 342531772 and user = 4441338 sorted by updated

1 row where issue = 342531772 and user = 4441338 sorted by updated_at descending

Search:

descending

id	html_url	issue_url	node_id	user	created_at	updated_at ▲	author_association	body	reactions	performed_via_github_app	issue
673565228	https://github.com/pydata/xarray/issues/2300#issuecomment-673565228	https://api.github.com/repos/pydata/xarray/issues/2300	MDEyOklzc3VlQ29tbWVudDY3MzU2NTIyOA==	LunarLanding 4441338	2020-08-13T16:04:04Z	2020-08-13T16:04:04Z	NONE	I arrived here due to a different use case / problem, which ultimately I solved, but I think there's value in documenting it here. My use case is the following workflow: 1 . take raw data, build a dataset, append it to a zarr store Z 2 . analyze the data on Z, then maybe goto 1. Step 2's performance is much better when data on Z is chunked properly along the appending dimension 'frame' (chunks of size 50), however step 1 only adds 1 element along it. I end up with Z having chunks (1,1,1,1,1...) on 'frame'. On xarray 0.16.0, this seems solvable via the encoding parameter, if we take care to only use it on the store creation. Before that version, I was using something like the monkey patch posted by @chrisbarber . Code: ```python import shutil import xarray as xr import numpy as np import tempfile zarr_path = tempfile.mkdtemp() def append_test(ds,chunks): shutil.rmtree(zarr_path) `for i in range(21): d = ds.isel(frame=slice(i,i+1)) d = d.chunk(chunks) d.to_zarr(zarr_path,consolidated=True,(dict(mode='a',append_dim='frame') if i>0 else {})) dsa = xr.open_zarr(str(zarr_path),consolidated=True) print(dsa.chunks,dsa.dims)` sometime before 0.16.0 import contextlib @contextlib.contextmanager def change_determine_zarr_chunks(chunks): orig_determine_zarr_chunks = xr.backends.zarr._determine_zarr_chunks try: def new_determine_zarr_chunks( enc_chunks, var_chunks, ndim, name): da = ds[name] zchunks = tuple(chunks[dim] if (dim in chunks and chunks[dim] is not None) else da.shape[i] for i,dim in enumerate(da.dims)) return zchunks xr.backends.zarr._determine_zarr_chunks = new_determine_zarr_chunks yield finally: xr.backends.zarr._determine_zarr_chunks = orig_determine_zarr_chunks chunks = {'frame':10,'other':50} ds = xr.Dataset({'data':xr.DataArray(data=np.random.rand(100,100),dims=('frame','other'))}) append_test(ds,chunks) with change_determine_zarr_chunks(chunks): append_test(ds,chunks) with 0.16.0 def append_test_encoding(ds,chunks): shutil.rmtree(zarr_path) `encoding = {} for k,v in ds.variables.items(): encoding[k]={'chunks':tuple(chunks[dk] if dk in chunks else v.shape[i] for i,dk in enumerate(v.dims))} for i in range(21): d = ds.isel(frame=slice(i,i+1)) d = d.chunk(chunks) d.to_zarr(zarr_path,consolidated=True,(dict(mode='a',append_dim='frame') if i>0 else dict(encoding = encoding))) dsa = xr.open_zarr(str(zarr_path),consolidated=True) print(dsa.chunks,dsa.dims)` append_test_encoding(ds,chunks) ``` `Frozen(SortedKeysDict({'frame': (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), 'other': (50, 50)})) Frozen(SortedKeysDict({'frame': 21, 'other': 100})) Frozen(SortedKeysDict({'frame': (10, 10, 1), 'other': (50, 50)})) Frozen(SortedKeysDict({'frame': 21, 'other': 100})) Frozen(SortedKeysDict({'frame': (10, 10, 1), 'other': (50, 50)})) Frozen(SortedKeysDict({'frame': 21, 'other': 100}))`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		zarr and xarray chunking compatibility and `to_zarr` performance 342531772

Advanced export

JSON shape: default, array, newline-delimited, object

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);

issue_comments

1 row where issue = 342531772 and user = 4441338 sorted by updated_at descending

sometime before 0.16.0

with 0.16.0

Advanced export