html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/5290#issuecomment-839758891,https://api.github.com/repos/pydata/xarray/issues/5290,839758891,MDEyOklzc3VlQ29tbWVudDgzOTc1ODg5MQ==,5802846,2021-05-12T13:08:51Z,2021-05-12T20:28:10Z,CONTRIBUTOR,"Thanks a lot! Very helpful comments. I'll check out your PR.
If I understand correctly, zarr does some automatic chunking when saving coordinates, even without specific encodings being set, at least for larger coordinate arrays.
I can get what I want by creating a zarr store with compute=False and then manually deleting everything except the metadata at the filesystem level. After that, each call to to_zarr() with region results in only one coordinate chunk being created on disk.
Reading with xr.open_zarr() works as expected: the coordinate contains NaN except for the region written before.
The (potentially very large) coordinate still needs to fit in memory, though, either when creating the dummy zarr store (which could be done differently) or when opening it. Is that correct? That won't work for my use case when the coordinate is very large.
Do you know an alternative? Would it help if I stored the coordinate under a non-dimension name? I guess it all boils down to the way xarray recreates the Dataset from the zarr store. The only way I can think of right now to make useful ""chunked indices"" is some form of hierarchical indexing, where each chunk is represented by the first index in that chunk. That would probably only work for sequential indices. I don't know if such indexing exists for pandas. Maybe hierarchical chunking could be useful for some very large datasets? I don't know if that would create too much overhead, but it would be a structured way to access long-term high-res data. In a way, I think that's what I'm trying to implement. I would be happy about any pointers to existing solutions.
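
To make the idea a bit more concrete, here is a minimal sketch of such a two-level lookup in plain Python. It is not something xarray or pandas provides as far as I know; the names chunk_starts and find_chunk and the label values are hypothetical, and it assumes the labels are sorted (sequential).

```python
from bisect import bisect_right

# First label of each chunk, in sorted order (hypothetical values).
# Only this small list has to be held in memory, not the full coordinate.
chunk_starts = [0, 1000, 2000, 3000]


def find_chunk(label):
    # Return the position of the chunk whose label range contains `label`.
    i = bisect_right(chunk_starts, label) - 1
    if i < 0:
        raise KeyError(label)
    return i


# Only the chunk found here would then need to be loaded and searched.
chunk_id = find_chunk(1234)
```

The lookup cost grows with the number of chunks rather than with the length of the coordinate, which is why it only makes sense for sorted labels.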
Regarding the documentation: I could provide an example with a time coordinate, which would illustrate two issues I encountered.
* region requires index space coordinates (I know: it's already explained in the docs... :)
* the aforementioned ""coordinates need to be predefined"" issue.
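
For the first point, a small sketch of how a label range on a time coordinate could be translated into the integer slice that region expects. The hourly coordinate and the helper name label_range_to_region are made up for illustration; only NumPy is assumed, and the coordinate has to be sorted.

```python
import numpy as np

# Hypothetical hourly time coordinate covering 2021-01-01 to 2021-01-10.
time = np.arange('2021-01-01', '2021-01-11', dtype='datetime64[h]')


def label_range_to_region(coord, start, stop):
    # Translate the half-open label range [start, stop) into an integer slice.
    lo = int(np.searchsorted(coord, np.datetime64(start, 'h'), side='left'))
    hi = int(np.searchsorted(coord, np.datetime64(stop, 'h'), side='left'))
    return slice(lo, hi)


# This dict could then be passed as region=... to to_zarr().
region = {'time': label_range_to_region(time, '2021-01-03', '2021-01-04')}
```

The resulting slice still has to line up with the zarr chunk boundaries of the store.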
(Sorry if this bug report is not the right place to ask all these questions.)
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,887711474