
issue_comments


4 rows where author_association = "COLLABORATOR" and issue = 717410970 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
735372002 https://github.com/pydata/xarray/issues/4496#issuecomment-735372002 https://api.github.com/repos/pydata/xarray/issues/4496 MDEyOklzc3VlQ29tbWVudDczNTM3MjAwMg== aurghs 35919497 2020-11-29T10:29:34Z 2020-11-29T10:29:34Z COLLABORATOR

@ravwojdyla I think that currently there is no way to do this, but it would be nice to have an interface that allows defining different chunks for each variable. The main problem I see in implementing it is keeping the `xr.open_dataset(..., chunks=)`, `ds.chunk`, and `ds.chunks` interfaces backwards compatible. A new issue for that would probably be better, since this refactor is already a little tricky and your proposal could be implemented separately.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Flexible backends - Harmonise zarr chunking with other backends chunking 717410970
721590240 https://github.com/pydata/xarray/issues/4496#issuecomment-721590240 https://api.github.com/repos/pydata/xarray/issues/4496 MDEyOklzc3VlQ29tbWVudDcyMTU5MDI0MA== aurghs 35919497 2020-11-04T08:35:08Z 2020-11-04T09:22:01Z COLLABORATOR

@weiji14 Thank you very much for your feedback. I think we should also align xr.open_mfdataset. In the case of engine="zarr" and chunks=-1 there is a UserWarning in xr.open_dataset as well, but I think it should be removed.

For the future, maybe we should evaluate integrating/using the dask function dask.array.core.normalize_chunks (https://docs.dask.org/en/latest/array-api.html#dask.array.core.normalize_chunks) with the previous_chunks key (see comment https://github.com/pydata/xarray/pull/2530#discussion_r247352940). It could be particularly useful for (re-)chunking while taking into account the previous chunks or the on-disk chunks, especially if the on-disk chunks are small.
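The idea behind previous_chunks can be sketched in a few lines: when rechunking data that has small on-disk chunks, pick a new chunk size that is a whole multiple of the on-disk chunk, so that chunk boundaries stay aligned. The helper below is purely illustrative (its name and logic are assumptions, not dask's actual implementation, which normalize_chunks handles far more generally):

```python
def align_chunk(dim_size: int, disk_chunk: int, target: int) -> int:
    """Pick a chunk size along one dimension that is a whole multiple of
    the on-disk chunk and does not exceed the target size.

    Illustrative sketch only; dask's normalize_chunks(..., previous_chunks=...)
    is the real, more general mechanism.
    """
    if disk_chunk >= target:
        # On-disk chunks already meet or exceed the target: keep them.
        return min(disk_chunk, dim_size)
    # Largest multiple of the on-disk chunk that fits within the target.
    multiple = max(1, target // disk_chunk)
    return min(multiple * disk_chunk, dim_size)
```

For example, with 10,000 elements stored in on-disk chunks of 100 and a target of 1024 elements, this picks aligned chunks of 1000 rather than 1024, avoiding chunks that straddle on-disk chunk boundaries.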

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Flexible backends - Harmonise zarr chunking with other backends chunking 717410970
720785384 https://github.com/pydata/xarray/issues/4496#issuecomment-720785384 https://api.github.com/repos/pydata/xarray/issues/4496 MDEyOklzc3VlQ29tbWVudDcyMDc4NTM4NA== aurghs 35919497 2020-11-02T23:32:48Z 2020-11-03T09:28:48Z COLLABORATOR

I think we can keep talking here about the xarray chunking interface.

It seems that the chunking interface is a tricky problem in xarray. Several already-implemented interfaces are involved:
  • dask: da.rechunk, da.from_array
  • xarray: xr.open_dataset
  • xarray: ds.chunk
  • xarray-zarr: xr.open_dataset(engine="zarr") (≈ xr.open_zarr)

They are similar, but there are some inconsistencies.

dask
The allowed values for chunking in dask are:
  • a dictionary (or tuple)
  • integers > 0
  • -1: no chunking (along this dimension)
  • auto: allow the chunking (in this dimension) to accommodate ideal chunk sizes (default 128MiB)

The allowed values inside the dictionary are -1, auto, and None (no change to the chunking along this dimension).
Note: None isn't supported outside the dictionary.
Note: if the chunking along some dimension is not specified, the chunking along that dimension will not change (e.g. {} is equivalent to {0: None}).
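The dictionary semantics described above can be sketched as a small resolver. This is a hypothetical helper, not dask code, and real dask's auto computes an ideal chunk size from a byte limit, which is simplified to "keep the current chunking" here:

```python
def resolve_dask_chunks(spec, current, shape):
    """Resolve a dask-style chunks dict against the current chunking.

    spec: {axis: size | -1 | None | "auto"}; axes missing from the dict
    keep their current chunks. Illustrative sketch only: real dask's
    "auto" targets an ideal byte size (default 128 MiB) rather than
    keeping the current chunking.
    """
    out = []
    for axis, dim in enumerate(shape):
        c = spec.get(axis, None)
        if c is None or c == "auto":
            out.append(current[axis])   # None / unspecified: no change
        elif c == -1:
            out.append(dim)             # -1: one chunk spans the dimension
        else:
            out.append(min(c, dim))     # explicit integer chunk size
    return tuple(out)
```

For example, {} leaves the chunking untouched, while {0: -1} collapses axis 0 into a single chunk and leaves the other axes unchanged.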

xarray: xr.open_dataset (all engines != "zarr")
It works as in dask, but None is also supported. If chunks is None, dask is not used at all.

xarray: ds.chunk
It works as in dask, but None is also supported. None is equivalent to a dictionary with all values None (and to the empty dictionary).

xarray: xr.open_dataset(engine="zarr")
It works as in dask, except that:
  • None is supported. If chunks is None, dask is not used at all.
  • If the chunking along some dimension is not specified, the encoded chunks are used.
  • auto is equivalent to the empty dictionary: the encoded chunks are used.
  • auto inside the dictionary is passed on to dask and behaves as in dask.
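The zarr-engine rules above can be summarised as a hypothetical resolver. The function name and logic are illustrative assumptions, not xarray's actual code path, and handing auto inside the dictionary on to dask is simplified to passing the value through unchanged:

```python
def resolve_zarr_open_chunks(chunks, encoded, shape):
    """Sketch of the xr.open_dataset(engine="zarr") chunking rules
    described above (hypothetical helper, not xarray internals).

    chunks:  None | "auto" | {} | {dim: size | -1 | "auto"}
    encoded: on-disk (encoded) chunks, {dim: size}
    shape:   full dimension sizes, {dim: size}
    """
    if chunks is None:
        return None                   # no dask at all: eager load
    if chunks == "auto" or chunks == {}:
        return dict(encoded)          # fall back to the encoded chunks
    # Dict form: unspecified dimensions fall back to the encoded chunks;
    # -1 expands to the full dimension size; any other value (including
    # "auto") is handed on unchanged, as dask would receive it.
    resolved = dict(encoded)
    for dim, c in chunks.items():
        resolved[dim] = shape[dim] if c == -1 else c
    return resolved
```

This makes the inconsistency visible: for the zarr engine, both "auto" and {} resolve to the encoded chunks, whereas in dask "auto" triggers auto-chunking and {} means "no change".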

Points to be discussed:

1) auto and {}
The main problem is how to make the dask and xarray-zarr behaviors uniform.

Option 1
Maybe the encoded chunking provided by the backend can be seen simply as the current on-disk chunking of the data. According to the dask interface, if the chunks for some dimension are None or not defined in a dictionary, the current chunking along that dimension doesn't change. From this perspective, we would have:
  • with auto, dask auto-chunking is used.
  • with -1, dask is used but with no chunking.
  • with {}, the backend's encoded chunks (when available) are used for on-disk data (xr.open_dataset), and the current chunking for already opened datasets (ds.chunk).

Note: the ds.chunk behavior would be unchanged.
Note: xr.open_dataset would be unchanged, except for engine="zarr", since currently var.encoding["chunks"] is defined only by zarr.

Option 2
We could use a different new value for the encoded chunks (e.g. "encoded", name TBC). Something like:
open_dataset(chunks="encoded")
open_dataset(chunks={"x": "encoded", "y": 10, ...})
Both expressions could be supported.
Cons:
  • chunks="encoded": with zarr, the user would probably always need to specify explicitly that the encoded chunks should be used.
  • chunks={"x": "encoded", ...}: the user must specify explicitly in the dictionary which dimensions should be chunked with the encoded chunks, which is very inconvenient (but is it really used? @weiji14 do you have some idea about it?).
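Option 2 could resolve as follows. The "encoded" token and this helper are hypothetical, sketched only to make the proposal concrete:

```python
def resolve_with_encoded(chunks, encoded):
    """Hypothetical resolution for the proposed "encoded" token (Option 2).

    chunks:  "encoded" | {dim: size | "encoded"}
    encoded: on-disk (encoded) chunks, {dim: size}
    Neither the token nor this helper is part of xarray; both are
    illustrative assumptions.
    """
    if chunks == "encoded":
        # Top-level form: use the on-disk chunking for every dimension.
        return dict(encoded)
    # Dict form: "encoded" per dimension falls back to the on-disk chunk.
    return {dim: (encoded[dim] if c == "encoded" else c)
            for dim, c in chunks.items()}
```

Under this scheme the default could stay aligned with dask ({} means "no change"), and opting into the on-disk chunking would always be an explicit request.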

2) None
chunks=None should produce the same result in xr.open_dataset and ds.chunk.

@shoyer, @alexamici, @jhamman, @dcherian, @weiji14 suggestions are welcome

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Flexible backends - Harmonise zarr chunking with other backends chunking 717410970
706098129 https://github.com/pydata/xarray/issues/4496#issuecomment-706098129 https://api.github.com/repos/pydata/xarray/issues/4496 MDEyOklzc3VlQ29tbWVudDcwNjA5ODEyOQ== aurghs 35919497 2020-10-09T10:18:10Z 2020-10-09T10:18:10Z COLLABORATOR
  • The key value auto is redundant because it has the same behavior as {}, we could remove one of them.

That's not completely true: with no dask installed, auto falls back to chunks=None, while {} raises an error. That behavior probably makes sense.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Flexible backends - Harmonise zarr chunking with other backends chunking 717410970


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 15.135ms · About: xarray-datasette