html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/6334#issuecomment-1091315261,https://api.github.com/repos/pydata/xarray/issues/6334,1091315261,IC_kwDOAMm_X85BDCY9,35919497,2022-04-07T08:33:43Z,2022-04-07T08:37:33Z,COLLABORATOR,"Thank you for this fix!
It looks good to me. I would prefer a separate function for the check, but that's fine too.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1160073438
https://github.com/pydata/xarray/pull/6276#issuecomment-1041249397,https://api.github.com/repos/pydata/xarray/issues/6276,1041249397,IC_kwDOAMm_X84-EDR1,35919497,2022-02-16T08:46:05Z,2022-02-16T08:46:24Z,COLLABORATOR,"I would prefer to avoid using `**kwargs`; an explicit list of parameters would make the code more readable.
But I think it's fine that way too :)
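
For illustration, a minimal sketch of the two styles (hypothetical function names):
```python
# explicit parameters: every supported option is visible in the signature
def open_store(path, mask_and_scale=True, decode_times=True):
    ...

# **kwargs: the supported options are hidden from the signature and from help()
def open_store_any(path, **kwargs):
    ...
```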
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1138440632
https://github.com/pydata/xarray/pull/5959#issuecomment-969122428,https://api.github.com/repos/pydata/xarray/issues/5959,969122428,IC_kwDOAMm_X845w6J8,35919497,2021-11-15T17:08:38Z,2021-11-15T17:08:38Z,COLLABORATOR,"> @alexamici Could you please have a final look into this?
Thank you @kmuehlbauer!
LGTM","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1048309254
https://github.com/pydata/xarray/pull/5609#issuecomment-887324339,https://api.github.com/repos/pydata/xarray/issues/5609,887324339,IC_kwDOAMm_X840436z,35919497,2021-07-27T08:37:35Z,2021-07-27T08:38:30Z,COLLABORATOR,"> I would try to stay as close to `open_dataset` as possible, which would make migrating to `rioxarray`'s engine easier once we deprecate `open_rasterio`. If I understand the signature of `open_dataset` correctly, this is called `backend_kwargs`?
The idea was to eventually deprecate `backend_kwargs` in `open_dataset`. Currently, in the `open_dataset` signature, you can use either `backend_kwargs` or plain `**kwargs` to pass additional parameters to the backend. I would avoid adding `backend_kwargs` to the `open_rasterio` interface.
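
For illustration, a minimal sketch of the two equivalent spellings (assuming the `netcdf4` engine and its `lock` option):
```python
import xarray as xr

# both spellings pass `lock` on to the backend
ds1 = xr.open_dataset('file.nc', engine='netcdf4', backend_kwargs={'lock': None})
ds2 = xr.open_dataset('file.nc', engine='netcdf4', lock=None)
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,945434599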
https://github.com/pydata/xarray/pull/5300#issuecomment-843328294,https://api.github.com/repos/pydata/xarray/issues/5300,843328294,MDEyOklzc3VlQ29tbWVudDg0MzMyODI5NA==,35919497,2021-05-18T16:31:13Z,2021-05-18T16:31:13Z,COLLABORATOR,That's perfect. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,891253662
https://github.com/pydata/xarray/issues/5329#issuecomment-843101733,https://api.github.com/repos/pydata/xarray/issues/5329,843101733,MDEyOklzc3VlQ29tbWVudDg0MzEwMTczMw==,35919497,2021-05-18T11:48:37Z,2021-05-18T12:24:21Z,COLLABORATOR,"I don't think it's a bug: `filename_or_obj` in `open_dataset` can be a file, a file-like object, bytes, or a URL. The accepted inputs depend on the engine, so it doesn't make sense to raise a `FileNotFoundError` when the engine is not defined by the user or not automatically detected by xarray.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,894125618
https://github.com/pydata/xarray/issues/5302#issuecomment-840820163,https://api.github.com/repos/pydata/xarray/issues/5302,840820163,MDEyOklzc3VlQ29tbWVudDg0MDgyMDE2Mw==,35919497,2021-05-13T20:39:14Z,2021-05-13T20:39:14Z,COLLABORATOR,"I was thinking about something like that to fix the error message too.
Let me try to implement it, but I really have no time this week or the next, sorry. I can do it after the 23rd if that's OK for you.
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,891281614
https://github.com/pydata/xarray/pull/5033#issuecomment-819654025,https://api.github.com/repos/pydata/xarray/issues/5033,819654025,MDEyOklzc3VlQ29tbWVudDgxOTY1NDAyNQ==,35919497,2021-04-14T16:31:52Z,2021-04-14T16:31:52Z,COLLABORATOR,"@Illviljan I see your point, but the subclassing doesn't add too much complexity, and for consistency it would be better to add a check on the class.
After that, I think we can merge it.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,831008649
https://github.com/pydata/xarray/issues/5150#issuecomment-819571176,https://api.github.com/repos/pydata/xarray/issues/5150,819571176,MDEyOklzc3VlQ29tbWVudDgxOTU3MTE3Ng==,35919497,2021-04-14T14:40:34Z,2021-04-14T14:41:19Z,COLLABORATOR,"I can try to reproduce the error and fix it, but I need to know at least the xarray version.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,856915051
https://github.com/pydata/xarray/pull/5135#issuecomment-819563132,https://api.github.com/repos/pydata/xarray/issues/5135,819563132,MDEyOklzc3VlQ29tbWVudDgxOTU2MzEzMg==,35919497,2021-04-14T14:29:59Z,2021-04-14T14:29:59Z,COLLABORATOR,@bcbnz could you check if this also fixes #5132?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,853644364
https://github.com/pydata/xarray/pull/5135#issuecomment-818808309,https://api.github.com/repos/pydata/xarray/issues/5135,818808309,MDEyOklzc3VlQ29tbWVudDgxODgwODMwOQ==,35919497,2021-04-13T15:03:39Z,2021-04-13T15:03:39Z,COLLABORATOR,"> @aurghs can you confirm that all failures are unrelated to the changes?
In my understanding, the errors are not related to the changes.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,853644364
https://github.com/pydata/xarray/pull/5033#issuecomment-816517328,https://api.github.com/repos/pydata/xarray/issues/5033,816517328,MDEyOklzc3VlQ29tbWVudDgxNjUxNzMyOA==,35919497,2021-04-09T08:31:05Z,2021-04-09T08:31:27Z,COLLABORATOR,"> Making a backend doesn't have to be super difficult either depending if you already have a nice 3rd party module you can thinly wrap to return a Dataset instead of whatever is the default
I agree. Adding a plugin is not really very difficult, but in some cases it could be discouraging, especially if you are just exploring how the backends work.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,831008649
https://github.com/pydata/xarray/pull/5135#issuecomment-816050897,https://api.github.com/repos/pydata/xarray/issues/5135,816050897,MDEyOklzc3VlQ29tbWVudDgxNjA1MDg5Nw==,35919497,2021-04-08T18:37:46Z,2021-04-08T18:37:46Z,COLLABORATOR,"At this point, `api.normalize_path` is probably the best choice.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,853644364
https://github.com/pydata/xarray/pull/5135#issuecomment-816021103,https://api.github.com/repos/pydata/xarray/issues/5135,816021103,MDEyOklzc3VlQ29tbWVudDgxNjAyMTEwMw==,35919497,2021-04-08T17:52:46Z,2021-04-08T17:52:46Z,COLLABORATOR,"> LGTM. I don't know how we would test this...
I don't have any good ideas about it.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,853644364
https://github.com/pydata/xarray/pull/5065#issuecomment-811810701,https://api.github.com/repos/pydata/xarray/issues/5065,811810701,MDEyOklzc3VlQ29tbWVudDgxMTgxMDcwMQ==,35919497,2021-04-01T10:21:15Z,2021-04-01T11:01:44Z,COLLABORATOR,"> ```python
> new_var = var.chunk(chunks, name=name2, lock=lock)
> new_var.encoding = var.encoding
> ```
Here you are modifying `_maybe_chunk`, but `_maybe_chunk` is also used in `Dataset.chunk`.
It would probably be better to change `xarray/backends/api.py`, here:
https://github.com/pydata/xarray/blob/ddc352faa6de91f266a1749773d08ae8d6f09683/xarray/backends/api.py#L296-L307
But maybe in this case we also want to drop `encoding[""chunks""]` if it is not compatible with the dask chunks.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943
https://github.com/pydata/xarray/issues/5098#issuecomment-811820701,https://api.github.com/repos/pydata/xarray/issues/5098,811820701,MDEyOklzc3VlQ29tbWVudDgxMTgyMDcwMQ==,35919497,2021-04-01T10:40:37Z,2021-04-01T10:41:44Z,COLLABORATOR,"I think this is a consequence of the refactor done by @alexamici when he removed `_normalize_path`: https://github.com/pydata/xarray/pull/4701
We decided to delegate the path interpretation to the backends.
I'll have a look to understand how to fix it.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,847014702
https://github.com/pydata/xarray/pull/5065#issuecomment-811818035,https://api.github.com/repos/pydata/xarray/issues/5065,811818035,MDEyOklzc3VlQ29tbWVudDgxMTgxODAzNQ==,35919497,2021-04-01T10:35:29Z,2021-04-01T10:36:01Z,COLLABORATOR,"> Hmm. I would also be happy with explicitly deleting `chunks` from encoding for now. It's not adding a lot of technical debt.
I see two reasons for keeping it:
- We should be able to read and write the data with the same structure on disk.
- The user may be interested in this information.
But it seems to me that having two different definitions of chunks (the dask one and the encoded one) is not very intuitive, and it's not easy to define a clear default for writing.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943
https://github.com/pydata/xarray/pull/5065#issuecomment-811209453,https://api.github.com/repos/pydata/xarray/issues/5065,811209453,MDEyOklzc3VlQ29tbWVudDgxMTIwOTQ1Mw==,35919497,2021-03-31T16:27:05Z,2021-03-31T17:50:19Z,COLLABORATOR,"~`rechunk`~ `Variable.chunk` is always used when you open data with dask, even if you are using the default chunking. So in this way, you will always drop the encoding when dask is used (≈ always).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943
https://github.com/pydata/xarray/pull/5065#issuecomment-811284237,https://api.github.com/repos/pydata/xarray/issues/5065,811284237,MDEyOklzc3VlQ29tbWVudDgxMTI4NDIzNw==,35919497,2021-03-31T17:45:29Z,2021-03-31T17:49:17Z,COLLABORATOR,"> Does it actually get called every time we load a dataset with chunks?
Yes
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943
https://github.com/pydata/xarray/pull/5065#issuecomment-811199910,https://api.github.com/repos/pydata/xarray/issues/5065,811199910,MDEyOklzc3VlQ29tbWVudDgxMTE5OTkxMA==,35919497,2021-03-31T16:20:30Z,2021-03-31T16:31:32Z,COLLABORATOR,"> Should the Zarr backend be setting this?
Yes, the zarr backend already defines them: `preferred_chunks` is set from `chunks`. We decided to separate `chunks` and `preferred_chunks`:
- `preferred_chunks` is used by the backend to define the default chunks that xarray will use.
- `chunks` are the on-disk chunks.
They are not necessarily the same.
Maybe we can drop `preferred_chunks` after it is used.
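
A rough sketch of how a backend might set both (hypothetical helper; `preferred_chunks` is keyed by dimension name):
```python
# hypothetical sketch of a backend filling in both encoding entries
def set_chunk_encoding(encoding, dims, on_disk_chunks):
    # on-disk chunks: a tuple, as stored in the file
    encoding['chunks'] = tuple(on_disk_chunks)
    # preferred chunks: the default chunking xarray should use, per dimension
    encoding['preferred_chunks'] = dict(zip(dims, on_disk_chunks))
```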
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943
https://github.com/pydata/xarray/pull/5065#issuecomment-808399567,https://api.github.com/repos/pydata/xarray/issues/5065,808399567,MDEyOklzc3VlQ29tbWVudDgwODM5OTU2Nw==,35919497,2021-03-26T17:34:44Z,2021-03-26T18:08:04Z,COLLABORATOR,"Perhaps we could also remove ``overwrite_encoded_chunks``; it shouldn't be necessary any more.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943
https://github.com/pydata/xarray/issues/4118#issuecomment-808057690,https://api.github.com/repos/pydata/xarray/issues/4118,808057690,MDEyOklzc3VlQ29tbWVudDgwODA1NzY5MA==,35919497,2021-03-26T09:09:38Z,2021-03-26T09:09:38Z,COLLABORATOR,"We could also provide a use-case in remote sensing: it would be really useful in interferometric processing for managing Sentinel-1 IW and EW SLC data, which has multiple tiles (bursts) partially overlapping in one direction (azimuth).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,628719058
https://github.com/pydata/xarray/issues/4118#issuecomment-806403993,https://api.github.com/repos/pydata/xarray/issues/4118,806403993,MDEyOklzc3VlQ29tbWVudDgwNjQwMzk5Mw==,35919497,2021-03-25T06:41:09Z,2021-03-25T06:42:58Z,COLLABORATOR,@alexamici and I can write the technical part of the proposal. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,628719058
https://github.com/pydata/xarray/issues/5053#issuecomment-801920057,https://api.github.com/repos/pydata/xarray/issues/5053,801920057,MDEyOklzc3VlQ29tbWVudDgwMTkyMDA1Nw==,35919497,2021-03-18T13:19:27Z,2021-03-18T16:09:17Z,COLLABORATOR,"Unfortunately, a previous development version had internal plugins for the backends. They have now been removed, but you need to re-install xarray to remove the entry points.
No release shipped the internal plugins, so users should not have this problem.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,834641104
https://github.com/pydata/xarray/issues/4491#issuecomment-786652376,https://api.github.com/repos/pydata/xarray/issues/4491,786652376,MDEyOklzc3VlQ29tbWVudDc4NjY1MjM3Ng==,35919497,2021-02-26T13:37:20Z,2021-02-26T13:37:20Z,COLLABORATOR,"If you want, I can take care of moving `pynio` to an external repository, but we should decide where.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,715730538
https://github.com/pydata/xarray/issues/4380#issuecomment-751489633,https://api.github.com/repos/pydata/xarray/issues/4380,751489633,MDEyOklzc3VlQ29tbWVudDc1MTQ4OTYzMw==,35919497,2020-12-27T16:43:56Z,2021-01-08T07:20:24Z,COLLABORATOR,"> > Does encoding['chunks'] serve any purpose after you've loaded a Zarr store and all the variables are defined as dask arrays?
>
> No. I run into this frequently and it is annoying. @rabernat do you remember why you chose to keep `chunks` around in `encoding`
`encoding[""chunks""]` is used in `to_zarr`. That seems reasonable: I expect to be able to read and re-write a Zarr store without modifying the chunking on disk.
It seems to me that the dask chunks are used in writing only when `encoding[""chunks""]` is not defined or is no longer compatible with the variable shapes. In the other cases `encoding[""chunks""]` is used.
So if you want to use the encoded chunks, you have to be sure that they are still compatible with the variable shapes and that each Zarr chunk is contained in only one dask chunk.
If you want to use the dask chunks, you can (see the sketch below):
- Delete the encoded chunking, as done by @eric-czech.
- Use encoding when you write: `ds.to_zarr('/tmp/ds3.zarr', mode='w', encoding={'x': {}})`.
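
For example, a minimal sketch of both options (assuming a dataset `ds` opened from a Zarr store):
```python
# option 1: delete the encoded chunking so the dask chunks win on write
for var in ds.variables.values():
    var.encoding.pop('chunks', None)
ds.to_zarr('/tmp/ds3.zarr', mode='w')

# option 2: override the encoding for a specific variable at write time
ds.to_zarr('/tmp/ds3.zarr', mode='w', encoding={'x': {}})
```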
Maybe this interface is a little bit confusing.
It would probably be better to move `overwrite_encoded_chunks` from `open_dataset` to `to_zarr`. The `open_dataset` interface would be cleaner, and it would be clear how to use the dask chunks in writing.
Concerning the different chunking per variable, I link here this related issue:
https://github.com/pydata/xarray/issues/4623","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,686608969
https://github.com/pydata/xarray/issues/4380#issuecomment-751481163,https://api.github.com/repos/pydata/xarray/issues/4380,751481163,MDEyOklzc3VlQ29tbWVudDc1MTQ4MTE2Mw==,35919497,2020-12-27T15:32:10Z,2020-12-27T15:32:29Z,COLLABORATOR,"I'm not sure, but this error seems to be a bug: there is a check on the final chunk whose inequality seems to have the wrong direction.
The part of the code that decides which chunking should be used, when both dask chunking and encoded chunking are defined, is the following: https://github.com/pydata/xarray/blob/ac234619d5471e789b0670a673084dbb01df4f9e/xarray/backends/zarr.py#L141-L173
The aim of these checks, as described in the comment, is to avoid having multiple dask chunks in one zarr chunk. According to this logic, the inequality at line 163:
https://github.com/pydata/xarray/blob/ac234619d5471e789b0670a673084dbb01df4f9e/xarray/backends/zarr.py#L163 has the wrong direction. It should be:
`if dchunks[-1] < zchunk`, but that condition seems to me to always hold.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,686608969
https://github.com/pydata/xarray/pull/4726#issuecomment-750313634,https://api.github.com/repos/pydata/xarray/issues/4726,750313634,MDEyOklzc3VlQ29tbWVudDc1MDMxMzYzNA==,35919497,2020-12-23T14:02:54Z,2020-12-23T14:29:21Z,COLLABORATOR,"> Btw & not for here. There are other warnings from the backends refactor. Would be nice if you could hunt them down and either fix or suppress them.
I was just doing that now. Some of them are fixed in https://github.com/pydata/xarray/pull/4728. ","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,773717776
https://github.com/pydata/xarray/pull/4721#issuecomment-749649237,https://api.github.com/repos/pydata/xarray/issues/4721,749649237,MDEyOklzc3VlQ29tbWVudDc0OTY0OTIzNw==,35919497,2020-12-22T16:45:58Z,2020-12-22T16:45:58Z,COLLABORATOR,This cleanup involves only apiv2. I'll merge it.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,773048869
https://github.com/pydata/xarray/issues/2148#issuecomment-747391520,https://api.github.com/repos/pydata/xarray/issues/2148,747391520,MDEyOklzc3VlQ29tbWVudDc0NzM5MTUyMA==,35919497,2020-12-17T11:47:44Z,2020-12-17T11:47:44Z,COLLABORATOR,I think this has been fixed at some point. It can be closed.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,324032926
https://github.com/pydata/xarray/issues/4496#issuecomment-735372002,https://api.github.com/repos/pydata/xarray/issues/4496,735372002,MDEyOklzc3VlQ29tbWVudDczNTM3MjAwMg==,35919497,2020-11-29T10:29:34Z,2020-11-29T10:29:34Z,COLLABORATOR,"@ravwojdyla I think that currently there is no way to do this. But it would be nice to have an interface that allows defining different chunks for each variable.
The main problem that I see in implementing this is keeping the `xr.open_dataset(..., chunks=)`, `ds.chunk` and `ds.chunks` interfaces backwards compatible.
A new issue for that would probably be better, since this refactor is already a little tricky and your proposal could be implemented separately.
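
For illustration, a hypothetical per-variable chunking interface (not implemented; the mapping shown is made up):
```python
import xarray as xr

# hypothetical: a different chunking for each variable in a single call
ds = xr.open_dataset(
    'file.nc',
    chunks={'temperature': {'x': 100}, 'salinity': {'x': 50}},
)
```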
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,717410970
https://github.com/pydata/xarray/pull/4577#issuecomment-732208262,https://api.github.com/repos/pydata/xarray/issues/4577,732208262,MDEyOklzc3VlQ29tbWVudDczMjIwODI2Mg==,35919497,2020-11-23T14:47:45Z,2020-11-24T06:48:10Z,COLLABORATOR,"- I have replaced `entrypoints` with `pkg_resources`. I can't see any drawbacks in this change, only advantages; the main one is that we remove an external dependency.
- I have added the tests.
- I didn't remove the signature inspection, since we already discussed it at length with @shoyer, and in the end we decided to keep the inspection and add checks on the signature to make sure that neither `*args` nor `**kwargs` is used.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,741714847
https://github.com/pydata/xarray/issues/4496#issuecomment-721590240,https://api.github.com/repos/pydata/xarray/issues/4496,721590240,MDEyOklzc3VlQ29tbWVudDcyMTU5MDI0MA==,35919497,2020-11-04T08:35:08Z,2020-11-04T09:22:01Z,COLLABORATOR,"@weiji14 Thank you very much for your feedback. I think we should also align `xr.open_mfdataset`.
In the case of `engine == ""zarr""` and `chunks == -1` there is a UserWarning in `xr.open_dataset` as well, but I think it should be removed.
Maybe in the future we should evaluate integrating/using the dask function `dask.array.core.normalize_chunks`
(https://docs.dask.org/en/latest/array-api.html#dask.array.core.normalize_chunks) with the key `previous_chunks` (see comment https://github.com/pydata/xarray/pull/2530#discussion_r247352940).
It could be particularly useful for (re-)chunking while taking into account the previous chunks or the on-disk chunks, especially if the on-disk chunks are small.
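
For illustration, a sketch of calling it with hypothetical sizes:
```python
import numpy as np
import dask.array as da

# target ~128 MiB chunks while respecting the small on-disk chunks
da.core.normalize_chunks(
    'auto', shape=(100_000, 100_000), dtype=np.dtype('f8'),
    previous_chunks=(512, 512),
)
```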
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,717410970
https://github.com/pydata/xarray/issues/4496#issuecomment-720785384,https://api.github.com/repos/pydata/xarray/issues/4496,720785384,MDEyOklzc3VlQ29tbWVudDcyMDc4NTM4NA==,35919497,2020-11-02T23:32:48Z,2020-11-03T09:28:48Z,COLLABORATOR,"I think we can keep talking about the xarray chunking interface here.
The chunking interface seems to be a tricky problem in xarray. Several different interfaces are already involved:
- dask: `da.rechunk`, `da.from_array`
- xarray: `xr.open_dataset`
- xarray: `ds.chunk`
- xarray-zarr: `xr.open_dataset(engine=""zarr"")` (≈ `xr.open_zarr`)
They are similar, but there are some inconsistencies.
**dask**
The allowed values for chunking in dask are:
- dictionary (or tuple)
- integers > 0
- `-1`: no chunking (along this dimension)
- `auto`: allow the chunking (in this dimension) to accommodate ideal chunk sizes (default 128MiB)
The allowed values in the dictionary are: `-1`, `auto`, `None` (no change to the chunking along this dimension)
Note: `None` isn't supported outside the dictionary.
Note: if chunking along some dimension is not specified, then the chunking along that dimension will not change (e.g. `{}` is equivalent to `{0: None}`).
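
For illustration, a small sketch of these values in dask:
```python
import numpy as np
import dask.array as da

x = da.from_array(np.zeros((100, 100)), chunks=(10, 10))
x.rechunk({0: -1, 1: 'auto'})  # dim 0: one chunk; dim 1: auto-sized
x.rechunk({0: 25})             # dim 1 unspecified: its chunking is unchanged
```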
**xarray: `xr.open_dataset` for all the engines != ""zarr""**
It works like dask, but `None` is also supported. If `chunks` is `None` then it doesn't use dask at all.
**xarray: `ds.chunk`**
It works like dask, but `None` is also supported. `None` is equivalent to a dictionary with all values `None` (and equivalent to the empty dictionary).
**xarray: xr.open_dataset(engine=""zarr"")**
It works like dask, except that:
- `None` is supported. If `chunks` is `None` then it doesn't use dask at all.
- If chunking along some dimension is not specified, then the encoded chunks are used.
- `auto` is equivalent to the empty dictionary: encoded chunks are used.
- `auto` inside the dictionary is passed on to dask and behaves as in dask.
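
For illustration (assuming a local Zarr store `store.zarr`):
```python
import xarray as xr

ds = xr.open_dataset('store.zarr', engine='zarr', chunks={})      # encoded chunks
ds = xr.open_dataset('store.zarr', engine='zarr', chunks='auto')  # currently the same
ds = xr.open_dataset('store.zarr', engine='zarr', chunks=None)    # no dask at all
```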
**Points to be discussed:**
1) **`auto` and `{}`**
The main problem is how to unify dask and xarray-zarr.
**Option 1**
Maybe the encoded chunking provided by the backend can be seen just as the current on-disk data chunking. According to the dask interface, if the chunks for some dimension are `None` or not defined in the dictionary, then the current chunking along that dimension doesn't change. From this perspective, we would have:
- with `auto` it uses dask auto-chunking.
- with `-1` it uses dask but no chunking.
- with `{}` it uses the backend encoded chunks (when available) for on-disk data (`xr.open_dataset`) and the current chunking for already opened datasets (`ds.chunk`)
Note: `ds.chunk` behavior would be unchanged
Note: `xr.open_dataset` would be unchanged, except for `engine=""zarr""`, since currently `var.encoding[""chunks""]` is defined only by zarr.
**Option 2**
We could use a different new value for the encoded chunks (e.g. `encoded`, TBC). Something like:
`open_dataset(chunks=""encoded"")`
`open_dataset(chunks={""x"": ""encoded"", ""y"": 10, ...})`
Both expressions could be supported.
cons:
- `chunks=""encoded""`: with zarr the user probably needs to specify always to use the encoded chunks.
- `chunks=""encoded""`: the user must specify explicitly in the dictionary which dimension should be chunked with the encoded chunks, that's very inconvenient (but is it really used? @weiji14 do you have some idea about it?).
2) **`None`**
`chunks=None` should produce the same result in `xr.open_dataset` and `ds.chunk`.
@shoyer, @alexamici, @jhamman, @dcherian, @weiji14 suggestions are welcome","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,717410970
https://github.com/pydata/xarray/issues/4490#issuecomment-718346664,https://api.github.com/repos/pydata/xarray/issues/4490,718346664,MDEyOklzc3VlQ29tbWVudDcxODM0NjY2NA==,35919497,2020-10-29T04:07:46Z,2020-10-29T04:07:46Z,COLLABORATOR,"Taking into account the comments in this issue and the calls, I would propose this solution: https://github.com/pydata/xarray/pull/4547","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,715374721
https://github.com/pydata/xarray/issues/4539#issuecomment-716527698,https://api.github.com/repos/pydata/xarray/issues/4539,716527698,MDEyOklzc3VlQ29tbWVudDcxNjUyNzY5OA==,35919497,2020-10-26T12:56:00Z,2020-10-26T12:56:00Z,COLLABORATOR,"I've tried to replicate the error, but I couldn't.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,729117202
https://github.com/pydata/xarray/issues/4496#issuecomment-706098129,https://api.github.com/repos/pydata/xarray/issues/4496,706098129,MDEyOklzc3VlQ29tbWVudDcwNjA5ODEyOQ==,35919497,2020-10-09T10:18:10Z,2020-10-09T10:18:10Z,COLLABORATOR,"> * The key value `auto` is redundant because it has the same behavior as `{}`, we could remove one of them.
That's not completely true: with no dask installed, `auto` falls back to `chunks=None`, while `{}` raises an error. That probably makes sense.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,717410970
https://github.com/pydata/xarray/issues/4490#issuecomment-704607745,https://api.github.com/repos/pydata/xarray/issues/4490,704607745,MDEyOklzc3VlQ29tbWVudDcwNDYwNzc0NQ==,35919497,2020-10-06T23:35:30Z,2020-10-06T23:35:30Z,COLLABORATOR,"I agree: `open_dataset()` currently has a very long signature that should be changed.
The interface you proposed is obviously clearer, but a class could give the false impression that all backends support all the decoding options listed in the class. I see two other alternatives:
- Instead of a class we could use a dictionary (see the sketch below). Pros 1, 2 and 3 would still hold.
- With the interface proposed by @alexamici in #4309, pros 2 and 3 would still hold, and pro 1 partially (since the `open_dataset` interface would be greatly simplified).
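
A minimal sketch of the dictionary alternative (the `decoders` keyword and its keys are hypothetical):
```python
import xarray as xr

# hypothetical: decoding options passed as a plain dictionary
ds = xr.open_dataset(
    'file.nc',
    decoders={'mask_and_scale': True, 'decode_times': False},
)
```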
For both of these proposals we would lose tab autocompletion but, on the other hand, the user would be relieved of managing a class.
Finally, I'm not sure the separation between `backend_kwargs` and `decode` would be clear to the user, since they both contain arguments that are passed to the backend, especially if the backend needs more specific decoding options that must be set in `backend_kwargs`. In this sense, #4309 seems less error-prone.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,715374721
https://github.com/pydata/xarray/issues/2197#issuecomment-393898567,https://api.github.com/repos/pydata/xarray/issues/2197,393898567,MDEyOklzc3VlQ29tbWVudDM5Mzg5ODU2Nw==,35919497,2018-06-01T14:30:46Z,2018-06-01T14:32:53Z,COLLABORATOR,"Also with oversampling we have the same problem (2007-02-02 02:00:00 is out of bounds):
``` python
import numpy as np
import pandas as pd
import xarray as xr
time = np.arange('2007-01-01 00:00:00', '2007-02-02 00:00:00', dtype='datetime64[h]')
arr = xr.DataArray(
np.arange(time.size), coords=[time,], dims=('time',), name='data'
)
resampler = arr.resample(time='3h', base=2, label='right')
resampler
DatetimeIndex(['2007-01-01 02:00:00', '2007-01-01 05:00:00',
'2007-01-01 08:00:00', '2007-01-01 11:00:00',
'2007-01-01 14:00:00', '2007-01-01 17:00:00',
'2007-01-01 20:00:00', '2007-01-01 23:00:00',
'2007-01-02 02:00:00', '2007-01-02 05:00:00',
...
'2007-01-31 23:00:00', '2007-02-01 02:00:00',
'2007-02-01 05:00:00', '2007-02-01 08:00:00',
'2007-02-01 11:00:00', '2007-02-01 14:00:00',
'2007-02-01 17:00:00', '2007-02-01 20:00:00',
'2007-02-01 23:00:00', '2007-02-02 02:00:00'],
dtype='datetime64[ns]', name='time', length=257, freq='3H')
```
The fix is really easy; I can try to make a pull request.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,327591169
https://github.com/pydata/xarray/issues/2148#issuecomment-389935163,https://api.github.com/repos/pydata/xarray/issues/2148,389935163,MDEyOklzc3VlQ29tbWVudDM4OTkzNTE2Mw==,35919497,2018-05-17T16:52:16Z,2018-05-17T17:27:07Z,COLLABORATOR,"The coordinates are grouped correctly:
```python
list(arr.groupby('x'))
[(1,
array([1., 1., 1.])
Coordinates:
* x (x) int64 1 1 1
x2 (x) int64 1 2 3),
(2,
array([1., 1.])
Coordinates:
* x (x) int64 2 2
x2 (x) int64 4 5)]
```
I think the grouping makes sense. But once the groups are collapsed with some operation, I'm not sure a corresponding meaningful operation can be found to apply to the grouped coordinates.
In the following cases the mean after `groupby()` works as expected:
```python
arr = xr.DataArray(
np.ones(5),
dims=('x',),
coords={
'x': ('x', np.array([1, 1, 1, 2, 2])),
'x1': ('x', np.array([1, 1, 1, 2, 2])),
'x2': ('x', np.array([1, 2, 3, 4, 5])),
}
)
arr.groupby('x1').mean('x')
array([1., 1.])
Coordinates:
* x1 (x1) int64 1 2
arr.groupby((xr.DataArray([1,1,1,2,2], dims=('x'), name='x3'))).mean('x')
array([1., 1.])
Coordinates:
* x3 (x3) int64 1 2
```
Also, if I try to group by an array whose name matches the dimension along which we take the mean, I get the same problem:
```python
arr.groupby(xr.DataArray([1,1,1,2,2], dims=('x'), name='x')).mean('x')
array([1., 1.])
Coordinates:
* x (x) int64 1 2
x1 (x) int64 1 1 1 2 2
x2 (x) int64 1 2 3 4 5
```
If I use another dimension name, we again get strange behaviour:
```python
arr = xr.DataArray(
np.ones((5, 2)),
dims=('x', 'y'),
coords={
'x': ('x', np.array([1, 1, 1, 2, 2])),
'x1': ('x', np.array([1, 1, 1, 2, 2])),
'x2': ('x', np.array([1, 2, 3, 4, 5])),
}
)
arr.groupby(xr.DataArray([1,1,1,2,2], dims=('x'), name='y')).mean('x')
array([1., 1., 1., 1.])
Coordinates:
* y (y) int64 1 2
```
In this case it should probably raise an error.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,324032926