html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/pull/5065#issuecomment-811481334,https://api.github.com/repos/pydata/xarray/issues/5065,811481334,MDEyOklzc3VlQ29tbWVudDgxMTQ4MTMzNA==,1217238,2021-03-31T21:35:11Z,2021-03-31T21:35:11Z,MEMBER,"> Why is `chunk` getting called here? Does it actually get called every time we load a dataset with chunks? If so, we will need a more sophisticated solution. This happens specifically on this line: https://github.com/pydata/xarray/blob/ddc352faa6de91f266a1749773d08ae8d6f09683/xarray/core/dataset.py#L438 So perhaps it would make sense to copy `encoding` specifically in this case, e.g., ```python new_var = var.chunk(chunks, name=name2, lock=lock) new_var.encoding = var.encoding ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-811458761,https://api.github.com/repos/pydata/xarray/issues/5065,811458761,MDEyOklzc3VlQ29tbWVudDgxMTQ1ODc2MQ==,1217238,2021-03-31T20:54:46Z,2021-03-31T20:54:46Z,MEMBER,"Hmm. I would also be happy with explicitly deleting `chunks` from encoding for now. It's not adding a lot of technical debt. In the long term, the whole handling of encoding should be revisited, e.g., see https://github.com/pydata/xarray/issues/5082","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-807140762,https://api.github.com/repos/pydata/xarray/issues/5065,807140762,MDEyOklzc3VlQ29tbWVudDgwNzE0MDc2Mg==,1217238,2021-03-25T17:26:46Z,2021-03-25T17:26:46Z,MEMBER,"> FWIW, I would also favor dropping `encoding['chunks']` after indexing, coarsening, interpolating, etc. Basically anything that changes the array shape or chunk structure. 
We already drop all of `encoding` after indexing. My guess is that we do the same for coarsening and interpolations as well (though I haven't checked).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-807111762,https://api.github.com/repos/pydata/xarray/issues/5065,807111762,MDEyOklzc3VlQ29tbWVudDgwNzExMTc2Mg==,1217238,2021-03-25T17:08:09Z,2021-03-25T17:08:09Z,MEMBER,"> Xarray knows to drop the `dtype` encoding after an arithmetic operation. How does that work? To me `.chunk` feels like a similar case: an operation that invalidates any existing encoding. To be honest, the existing convention is quite ad hoc, just based on what seemed most appropriate at the time. https://github.com/pydata/xarray/issues/1614 is the most comprehensive description of the current state of things. We were considering saying that `attrs` and `encoding` should always use the same rules, but perhaps we should be more aggressive about dropping `encoding`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-806154872,https://api.github.com/repos/pydata/xarray/issues/5065,806154872,MDEyOklzc3VlQ29tbWVudDgwNjE1NDg3Mg==,1217238,2021-03-24T20:10:19Z,2021-03-24T20:10:19Z,MEMBER,"I'm a little conflicted about dealing with `encoding['chunks']` specifically in `chunk()`: - On one hand, it feels inconsistent for only this single method in xarray to modify part of `encoding`. Nothing else in xarray (after CF decoding) does this. Effectively `encoding['chunks']` is now becoming a part of xarray's data model. - On the other hand, this would absolutely fix a recurrent pain point for users, and in that sense it's worth doing. 
Maybe this isn't such a big deal in this particular case, especially if we don't think we would need to add such encoding-specific logic to any other methods. But are we really sure about that -- what about cases like indexing? I guess the other alternative is to make `chunk()` and the various other methods that would change chunking drop `encoding` entirely. I don't know if this would really be a better comprehensive solution (I know dropping `attrs` is much hated), but at least it's an easier mental model.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943