issue_comments


3 rows where issue = 887711474, sorted by updated_at descending

id: 845604377
html_url: https://github.com/pydata/xarray/issues/5290#issuecomment-845604377
issue_url: https://api.github.com/repos/pydata/xarray/issues/5290
node_id: MDEyOklzc3VlQ29tbWVudDg0NTYwNDM3Nw==
user: shoyer (1217238)
created_at: 2021-05-21T02:22:15Z
updated_at: 2021-05-21T02:22:15Z
author_association: MEMBER
issue: Inconclusive error messages using to_zarr with regions (887711474)
reactions: none

> The (potentially very large) coordinate still needs to fit in memory though... either when creating the dummy zarr store (which could be done differently) or when opening it. Is that correct?

This is correct.

> That won't work for my use case when the coordinate is very large. Do you know an alternative? Would it help if I store the coordinate with a non-dimension name?

Yes, this would work.
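
A minimal sketch of that idea (all names here are hypothetical): renaming the coordinate away from its dimension name keeps it a plain, chunkable array instead of an in-memory pandas.Index.

import numpy as np
import xarray as xr

# Hypothetical dataset whose dimension coordinate is too large to index in memory.
ds = xr.Dataset(
    {"signal": ("time", np.zeros(1_000_000))},
    coords={"time": np.arange(1_000_000)},
)

# As a dimension coordinate, "time" is held as an in-memory pandas.Index.
# Renaming only the variable makes it a non-dimension coordinate, which
# can stay a plain (chunkable) array instead.
ds = ds.rename_vars({"time": "time_values"})
ds = ds.chunk({"time": 10_000})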

We probably do want to change this behavior in the future as part of the changes related to https://github.com/pydata/xarray/issues/1603, e.g., to support out-of-core indexing. See also https://github.com/pydata/xarray/blob/master/design_notes/flexible_indexes_notes.md

id: 839758891
html_url: https://github.com/pydata/xarray/issues/5290#issuecomment-839758891
issue_url: https://api.github.com/repos/pydata/xarray/issues/5290
node_id: MDEyOklzc3VlQ29tbWVudDgzOTc1ODg5MQ==
user: niowniow (5802846)
created_at: 2021-05-12T13:08:51Z
updated_at: 2021-05-12T20:28:10Z
author_association: CONTRIBUTOR
issue: Inconclusive error messages using to_zarr with regions (887711474)
reactions: none

Thanks a lot! Very helpful comments. I'll check out your PR.

If I understand it correctly, zarr does some auto-chunking while saving coordinates even without setting specific encodings, at least for bigger coordinate arrays. I can get what I want by creating a zarr store with compute=False and then manually deleting everything except the metadata at the filesystem level. Each subsequent to_zarr() call with region then results in only one coordinate chunk being created on disk. Reading with xr.open_zarr() works as expected: the coordinate contains nan except for the region written before. The (potentially very large) coordinate still needs to fit in memory though... either when creating the dummy zarr store (which could be done differently) or when opening it. Is that correct? That won't work for my use case when the coordinate is very large.
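
A minimal sketch of the workflow described above, with made-up path and sizes (the manual filesystem-level deletion of chunk files is omitted):

import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2021-01-01", periods=1000, freq="min")
ds = xr.Dataset(
    {"signal": ("time", np.zeros(1000))},
    coords={"time": time},
).chunk({"time": 100})

# Create the store layout and metadata only; chunked data is not written.
# (The dimension coordinate itself is still written eagerly, which is why
# it has to fit in memory at this step.)
ds.to_zarr("store.zarr", compute=False)

# Later, fill one region at a time; the slice aligns with chunk boundaries.
ds.isel(time=slice(0, 100)).to_zarr("store.zarr", region={"time": slice(0, 100)})

# Unwritten chunks of "signal" read back as fill values.
restored = xr.open_zarr("store.zarr")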

Do you know an alternative? Would it help if I store the coordinate with a non-dimension name? I guess it all boils down to the way xarray recreates the Dataset from the zarr store. The only way I can think of right now to make useful "chunked indices" is some form of hierarchical indexing: each chunk is represented by the first index in that chunk, which would probably only work for sequential indices. I don't know if such indexing exists for pandas. Maybe hierarchical chunking could be useful for some very large datasets!? I don't know if that would create too much overhead, but it would be a structured way to access long-term high-res data. In a way, I think that's what I'm trying to implement. I would be happy about any pointers to existing solutions.
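
A toy sketch of that "first index per chunk" idea, assuming monotonically increasing (sequential) values; all names are hypothetical:

import numpy as np

class ChunkedIndex:
    """Toy two-level index: each chunk is represented by its first value."""

    def __init__(self, chunks):
        self.chunks = [np.asarray(c) for c in chunks]
        # Top level: the first value of every chunk.
        self.firsts = np.array([c[0] for c in self.chunks])

    def locate(self, value):
        # First find the chunk that could contain `value`...
        i = int(np.searchsorted(self.firsts, value, side="right")) - 1
        # ...then search only within that chunk.
        j = int(np.searchsorted(self.chunks[i], value))
        return i, j

idx = ChunkedIndex([np.arange(0, 10), np.arange(10, 20)])
print(idx.locate(13))  # (1, 3): chunk 1, offset 3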

Regarding the documentation: I could provide an example with a time coordinate, which would illustrate two issues I encountered (both are shown in the sketch below):

* region requires index-space coordinates (I know: it's already explained in the docs... :)
* the aforementioned "coordinates need to be predefined" issue
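
A compact, self-contained illustration of both points (store path and sizes are made up):

import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2021-01-01", periods=48, freq="h")
ds = xr.Dataset(
    {"signal": ("time", np.zeros(48))},
    coords={"time": time},
).chunk({"time": 24})

ds.to_zarr("example.zarr", compute=False)  # coordinates must be predefined

# region is specified in integer index space...
ds.isel(time=slice(0, 24)).to_zarr("example.zarr", region={"time": slice(0, 24)})

# ...not in label space; a datetime-based slice would be rejected:
# ds.to_zarr("example.zarr", region={"time": slice("2021-01-01", "2021-01-02")})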

(Sorry if this bug report is not the right place to ask all these questions)

id: 839069111
html_url: https://github.com/pydata/xarray/issues/5290#issuecomment-839069111
issue_url: https://api.github.com/repos/pydata/xarray/issues/5290
node_id: MDEyOklzc3VlQ29tbWVudDgzOTA2OTExMQ==
user: shoyer (1217238)
created_at: 2021-05-11T19:45:01Z
updated_at: 2021-05-11T19:45:37Z
author_association: MEMBER
issue: Inconclusive error messages using to_zarr with regions (887711474)
reactions: none

Hi @niowniow, thanks for the feedback and the code example here. I've been refactoring the Zarr region re-write functionality in https://github.com/pydata/xarray/pull/5252, so your feedback is timely.

It might be worth trying the code in that PR to see if it changes anything, but in the best case I suspect it would just give you a different error message.

To clarify:

- Writing to a region requires that all the desired variables already exist in the Zarr store (e.g., via to_zarr() with compute=False). The main point of writing to a region is that it should be safe to do in parallel, which is not the case for creating new variables. So I think encountering errors in this case is expected, although I agree these error messages are not very informative!
- "chunks in coords are not taken into account while saving!?" is correct, but this is actually more a limitation of Xarray's current data model. Coords with the same name as their dimension are converted into an in-memory pandas.Index object -- they can't be stored as dask arrays. You can actually still write the data in chunks, but you have to do it by supplying the encoding parameter to to_zarr, e.g., ds.to_zarr(path, mode='w', encoding={'x': {'chunks': 10}}, compute=False, consolidated=True).
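
Expanded into a self-contained form (the store path and sizes are made up; the encoding call itself is the one quoted above):

import numpy as np
import xarray as xr

# Hypothetical dataset with a dimension coordinate "x".
ds = xr.Dataset(
    {"data": ("x", np.arange(100.0))},
    coords={"x": np.arange(100)},
)

# "x" lives in memory as a pandas.Index, but the encoding argument
# still controls its on-disk chunking in the Zarr store.
ds.to_zarr(
    "store.zarr",
    mode="w",
    encoding={"x": {"chunks": 10}},
    compute=False,
    consolidated=True,
)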

If you have any specific suggestions for where the docs might be clarified those would certainly be appreciated!


issue_comments table schema:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);