html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/2844#issuecomment-778717611,https://api.github.com/repos/pydata/xarray/issues/2844,778717611,MDEyOklzc3VlQ29tbWVudDc3ODcxNzYxMQ==,22566757,2021-02-14T03:35:55Z,2021-02-14T03:35:55Z,CONTRIBUTOR,"> ~Does anyone know why the `xr.open_dataset(....)` call is echoed in the warning message. Is this intentional? Cc @dcherian @DWesl~
>
It seems you've already figured this out, but for anyone else with this question, the repeat of the call on that file is part of the warning that the file does not have all the variables the attributes refer to. You can fix this by recreating the file with the listed variables added (`areacella`, or by deleting the attribute from the variables (`cell_measures`). You can also ignore the warning using the machinery in the `warnings` module.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-778632780,https://api.github.com/repos/pydata/xarray/issues/2844,778632780,MDEyOklzc3VlQ29tbWVudDc3ODYzMjc4MA==,2448579,2021-02-13T15:17:21Z,2021-02-13T15:17:21Z,MEMBER,Great I'll merge before the next release if no one else has comments. Thanks @DWesl ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-778629061,https://api.github.com/repos/pydata/xarray/issues/2844,778629061,MDEyOklzc3VlQ29tbWVudDc3ODYyOTA2MQ==,22566757,2021-02-13T14:46:25Z,2021-02-13T14:46:25Z,CONTRIBUTOR,I think this looks good.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-586277583,https://api.github.com/repos/pydata/xarray/issues/2844,586277583,MDEyOklzc3VlQ29tbWVudDU4NjI3NzU4Mw==,24736507,2020-02-14T12:59:36Z,2021-02-11T17:47:29Z,NONE,"Hello @DWesl! Thanks for updating this PR. We checked the lines you've touched for [PEP 8](https://www.python.org/dev/peps/pep-0008) issues, and found:
There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:
##### Comment last updated at 2021-02-11 17:47:29 UTC","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-777487024,https://api.github.com/repos/pydata/xarray/issues/2844,777487024,MDEyOklzc3VlQ29tbWVudDc3NzQ4NzAyNA==,2448579,2021-02-11T14:12:04Z,2021-02-11T14:12:04Z,MEMBER,"> One option is to change decode_coords from bool to Union[bool, str] and allow decode_coords to be either ""all"" (this PR) or ""coordinates"", or True ( == ""coordinates"", backwards compatible). I can bring this up at the next dev meeting.
Updated to implement this proposal after getting OK at the previous meeting.
@DWesl can you take a look at the most recent changes please?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-762899193,https://api.github.com/repos/pydata/xarray/issues/2844,762899193,MDEyOklzc3VlQ29tbWVudDc2Mjg5OTE5Mw==,2448579,2021-01-19T15:03:02Z,2021-01-19T15:03:02Z,MEMBER,"> there were some substantial concerns about backwards compatibility
:( yes this is currently backward incompatible (so the next release will bump major version). There's reluctance to add yet another `decode_*` kwarg to `open_dataset`, and also reluctance to issue a warning saying that the behaviour of `decode_coords` will change in a future version (since this would be very common).
One option is to change `decode_coords` from `bool` to `Union[bool, str]` and allow `decode_coords` to be either `""all""` (this PR) or `""coordinates""`, or `True` ( == `""coordinates""`, backwards compatible). I can bring this up at the next dev meeting.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-761843786,https://api.github.com/repos/pydata/xarray/issues/2844,761843786,MDEyOklzc3VlQ29tbWVudDc2MTg0Mzc4Ng==,3460034,2021-01-17T16:58:34Z,2021-01-17T16:58:34Z,CONTRIBUTOR,"I unfortunately have not been following along with the recent developments in this PR, so these may have already been previously covered. Sorry if that is the case!
However, earlier on in the development of this PR, there were some substantial concerns about backwards compatibility (e.g., for libraries like MetPy that currently rely on `grid_mapping` and the like being in `attrs`) and the improved propagation of `encoding` through operations (so that moving these to `encoding` doesn't mean they are unnecessarily lost). What is the current status with regards to these?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-761842344,https://api.github.com/repos/pydata/xarray/issues/2844,761842344,MDEyOklzc3VlQ29tbWVudDc2MTg0MjM0NA==,22566757,2021-01-17T16:48:39Z,2021-01-17T16:48:39Z,CONTRIBUTOR,Looks good to me. I was wondering where those docstrings were.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-674158261,https://api.github.com/repos/pydata/xarray/issues/2844,674158261,MDEyOklzc3VlQ29tbWVudDY3NDE1ODI2MQ==,2448579,2020-08-14T16:33:31Z,2020-08-14T16:33:31Z,MEMBER,"I think it's OK if we skip `ancillary_variables`, the CF data model does not map perfectly to xarray's data model so this kind of annoyance will always be present. Also thanks @DocOtak for that pathological example :D (now I know what those mathematicians talk about)
> A method or addition to __getitem__ on the accessor returning a Dataset with the linked variables include
This has already been implemented though it could use a lot more testing: https://cf-xarray.readthedocs.io/en/latest/examples/introduction.html#Feature:-Accessing-variables-by-standard-names","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-671094846,https://api.github.com/repos/pydata/xarray/issues/2844,671094846,MDEyOklzc3VlQ29tbWVudDY3MTA5NDg0Ng==,3460034,2020-08-09T20:03:09Z,2020-08-09T20:03:09Z,CONTRIBUTOR,"For what it's worth, here's a bit of an elaboration on my view from https://github.com/pydata/xarray/issues/4215#issuecomment-656789017 and https://github.com/pydata/xarray/issues/4121#issuecomment-639758583:
`ancillary_variables` really fit a role of ""non-coordinate linked variables,"" which doesn't have a good place in xarray's current data model. Both the current behavior (any links are completely lost if not manually handled) and linking as coordinates (conceptually confusing and [causing spurious proliferation](https://github.com/pydata/xarray/pull/2844#issuecomment-670808763)) are problematic. Short of a ""linked"" or ""subsidiary"" variable concept being added to xarray's data model (could be worth discussing on a new issue?), `cf-xarray` seems like the best place for managing `ancillary_variables`, either through
- A method that extracts `ancilliary_variables` from a Dataset given a variable name (https://github.com/xarray-contrib/cf-xarray/issues/5#issue-628569748)
- Parsing the `ancilliary_variables` attribute and replacing the string with a Dataset or list of DataArrays or Variables containing the actual ancillary variables (https://github.com/pydata/xarray/issues/4121#issue-630573329 and https://github.com/pydata/xarray/pull/2844#issuecomment-670991386)
- A method or addition to `__getitem__` on the accessor returning a Dataset with the linked variables included (https://github.com/pydata/xarray/pull/2844#issuecomment-670996109)
For those interested, the issue for `ancillary_variables` in cf-xarray is https://github.com/xarray-contrib/cf-xarray/issues/5.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-670996109,https://api.github.com/repos/pydata/xarray/issues/2844,670996109,MDEyOklzc3VlQ29tbWVudDY3MDk5NjEwOQ==,22566757,2020-08-09T02:17:07Z,2020-08-09T16:36:12Z,CONTRIBUTOR,"That's two people with that view so I made the change.
Again, I feel that the quality flags are essentially meaningless on their own, useful primarily in the context of their associated variables, like the items currently put in the XArray `coords` attribute, which, admittedly, is only those variables identified by CF as dimension or auxiliary coordinates at the moment, and should remain associated with the relevant variable even if it is extracted into a `DataArray`. Since all of the other people who have opinions on the matter seem to disagree with me, I changed the code to preserve the present behavior with regards to `ancillary_variables`. I can always monkey-patch it back in if it really bothers me, or add a `Dataset.__getitem__` wrapper to `xarray-contrib/cf-xarray` so that the `ancillary_variables` stay associated when I pull variables out, or move back to `SciTools/iris`.
On a related note, I should probably check whether this breaks conversion to an `iris.Cube`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-670991386,https://api.github.com/repos/pydata/xarray/issues/2844,670991386,MDEyOklzc3VlQ29tbWVudDY3MDk5MTM4Ng==,868027,2020-08-09T01:11:09Z,2020-08-09T01:11:09Z,CONTRIBUTOR,"Yes, my view is that things in `ancillary_variables` should stay in the attrs of their variable (DataArray) and not moved to coords. Currently this PR will remove the `ancillary_variables` from the attrs of the variables in a file which have it. This appears to break CF defined connection between associated variables (like uncertainty and QC). While the information isn't lost, I would need to look in encoding to get it. It looks like the [first reply](https://github.com/pydata/xarray/issues/4215#issuecomment-656789017) in #4215 also didn't like putting `ancillary_variables` in the coords.
What would be really awesome is some sort of variable proxy I could replace the string names with actual references/pointers to the correct DataArray in the Dataset. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-670816691,https://api.github.com/repos/pydata/xarray/issues/2844,670816691,MDEyOklzc3VlQ29tbWVudDY3MDgxNjY5MQ==,22566757,2020-08-08T03:25:17Z,2020-08-08T03:53:39Z,CONTRIBUTOR,"You are correct; `ancillary_variables` is neither `grid_mapping` or `bounds`.
My personal view is that the quality information should stay with the variable it describes unless explicitly dropped; I think your view is that quality information can always be extracted from the original dataset, and that no variable should carry quality information for a different variable. At this point it would be simple to remove `ancillary_variables` from the attributes processed by this PR. There was a suggestion earlier of adding a `decode_aux_vars` argument to control the new behavior as a means of avoiding back-compatibility breaks like this one. I will leave that as a question for the maintainers; there is also some related discussion at #4215.
I should point out that a similar situation arises for `grid_mapping`; `ds.coords[""time""]` will include the `grid_mapping` variable in its coordinates.
In contrast, `ds.coords[""x""]` will not include the bounds for the `x` variable, since it has more dimensions than `ds.coords[""x""]`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-670808763,https://api.github.com/repos/pydata/xarray/issues/2844,670808763,MDEyOklzc3VlQ29tbWVudDY3MDgwODc2Mw==,868027,2020-08-08T02:12:08Z,2020-08-08T02:12:08Z,CONTRIBUTOR,"I decided to try out this PR on some of the data files we are working with at my data office. In our datasets we have per variable quality flag information per variable uncertainty information. The CF way of tying all these together is via the [ancillary_variables attribute](http://cfconventions.org/cf-conventions/cf-conventions.html#ancillary-data). This PR pulls all these out into the Dataset coordinates. Since in the xarray data model (right now) the coordinates apply to an entire dataset, this feels inappropriate and maybe even breaking. The ancillary_variables attribute is not used in CF grid mapping or bounds as far as I can tell.
Here is an example using this PR (note that all the varN type names will be replaced with better variable names before we publish these):
```python
In [1]: import xarray as xr
In [2]: ds = xr.open_dataset(""examples/converted/06AQ19840719.nc"")
In [3]: ds
Out[3]:
Dimensions: (N_LEVELS: 24, N_PROF: 38)
Coordinates:
var1_qc (N_PROF, N_LEVELS) float32 ...
var4_qc (N_PROF, N_LEVELS) float32 ...
var5_qc (N_PROF, N_LEVELS) float32 ...
var6_qc (N_PROF, N_LEVELS) float32 ...
var7_qc (N_PROF, N_LEVELS) float32 ...
var8_qc (N_PROF, N_LEVELS) float32 ...
var9_qc (N_PROF, N_LEVELS) float32 ...
var10_qc (N_PROF, N_LEVELS) float32 ...
var11_qc (N_PROF, N_LEVELS) float32 ...
var12_qc (N_PROF, N_LEVELS) float32 ...
var13_qc (N_PROF, N_LEVELS) float32 ...
var14_qc (N_PROF, N_LEVELS) float32 ...
var15_qc (N_PROF, N_LEVELS) float32 ...
pressure (N_PROF, N_LEVELS) float64 ...
latitude (N_PROF) float64 ...
longitude (N_PROF) float64 ...
time (N_PROF) datetime64[ns] ...
expocode (N_PROF) object ...
station (N_PROF) object ...
cast (N_PROF) int8 ...
sample (N_PROF, N_LEVELS) object ...
Dimensions without coordinates: N_LEVELS, N_PROF
Data variables:
var0 (N_PROF) object ...
var1 (N_PROF, N_LEVELS) object ...
var2 (N_PROF) float32 ...
var3 (N_PROF, N_LEVELS) float32 ...
var4 (N_PROF, N_LEVELS) float32 ...
var5 (N_PROF, N_LEVELS) float32 ...
var6 (N_PROF, N_LEVELS) float32 ...
var7 (N_PROF, N_LEVELS) float32 ...
var8 (N_PROF, N_LEVELS) float32 ...
var9 (N_PROF, N_LEVELS) float32 ...
var10 (N_PROF, N_LEVELS) float32 ...
var11 (N_PROF, N_LEVELS) float32 ...
var12 (N_PROF, N_LEVELS) float32 ...
var13 (N_PROF, N_LEVELS) float32 ...
var14 (N_PROF, N_LEVELS) float32 ...
var15 (N_PROF, N_LEVELS) float32 ...
Attributes:
Conventions: CF-1.8 CCHDO-0.1.dev157+g52933e0.d20200707
```
This looks especially confusing when you ask for one specific variable:
```python
In [15]: ds.var6
Out[15]:
array([[33.3965, 33.5742, 34.8769, ..., 34.9858, 34.9852, nan],
[33.1129, 34.0742, 34.6595, ..., nan, nan, nan],
[32.5328, 33.2687, 34.2262, ..., nan, nan, nan],
...,
[35.0686, 35.09 , 35.1415, ..., nan, nan, nan],
[35.0303, 35.0295, 35.0715, ..., nan, nan, nan],
[35.0682, 35.0756, 35.0622, ..., nan, nan, nan]],
dtype=float32)
Coordinates:
var1_qc (N_PROF, N_LEVELS) float32 ...
var4_qc (N_PROF, N_LEVELS) float32 ...
var5_qc (N_PROF, N_LEVELS) float32 ...
var6_qc (N_PROF, N_LEVELS) float32 ...
var7_qc (N_PROF, N_LEVELS) float32 ...
var8_qc (N_PROF, N_LEVELS) float32 ...
var9_qc (N_PROF, N_LEVELS) float32 ...
var10_qc (N_PROF, N_LEVELS) float32 ...
var11_qc (N_PROF, N_LEVELS) float32 ...
var12_qc (N_PROF, N_LEVELS) float32 ...
var13_qc (N_PROF, N_LEVELS) float32 ...
var14_qc (N_PROF, N_LEVELS) float32 ...
var15_qc (N_PROF, N_LEVELS) float32 ...
pressure (N_PROF, N_LEVELS) float64 ...
latitude (N_PROF) float64 ...
longitude (N_PROF) float64 ...
time (N_PROF) datetime64[ns] ...
expocode (N_PROF) object ...
station (N_PROF) object ...
cast (N_PROF) int8 ...
sample (N_PROF, N_LEVELS) object ...
Dimensions without coordinates: N_PROF, N_LEVELS
Attributes:
whp_name: CTDSAL
whp_unit: PSS-78
standard_name: sea_water_practical_salinity
units: 1
reference_scale: PSS-78
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-670765806,https://api.github.com/repos/pydata/xarray/issues/2844,670765806,MDEyOklzc3VlQ29tbWVudDY3MDc2NTgwNg==,22566757,2020-08-07T22:29:20Z,2020-08-07T22:29:20Z,CONTRIBUTOR,"The MinimumVersionsPolicy error appears to be a series of internal `conda` errors, and is probably unrelated.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-670730008,https://api.github.com/repos/pydata/xarray/issues/2844,670730008,MDEyOklzc3VlQ29tbWVudDY3MDczMDAwOA==,22566757,2020-08-07T22:02:47Z,2020-08-07T22:02:47Z,CONTRIBUTOR,pydata/xarray-data#19,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-670728389,https://api.github.com/repos/pydata/xarray/issues/2844,670728389,MDEyOklzc3VlQ29tbWVudDY3MDcyODM4OQ==,2448579,2020-08-07T21:56:29Z,2020-08-07T21:56:29Z,MEMBER,"@DWesl can you put delete the ""bounds"" attribute and send in a PR to xarray-data?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-667744209,https://api.github.com/repos/pydata/xarray/issues/2844,667744209,MDEyOklzc3VlQ29tbWVudDY2Nzc0NDIwOQ==,22566757,2020-08-03T00:14:24Z,2020-08-03T19:42:02Z,CONTRIBUTOR,"The `rasm` dataset has coordinates `xc` and `yc`, which reference bounds `xv` and `yv` respectively, which I do not see in the variable list with `decode_coords=False`. It would appear that pydata/xarray-data#4 did not include the bounds in the updated dataset when adding coordinates to `rasm.nc`, so this warning is correct. I do not know that file, so I'm probably not the best person to add bounds. Should I wait for an update to `pydata/xarray-data`, or should I ask sphinx to ignore the warning?
Another option is to just delete the `bounds` attributes of `xc` and `yc` in `rasm.nc`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-667737160,https://api.github.com/repos/pydata/xarray/issues/2844,667737160,MDEyOklzc3VlQ29tbWVudDY2NzczNzE2MA==,22566757,2020-08-02T23:13:26Z,2020-08-02T23:13:26Z,CONTRIBUTOR,"The example the doc build doesn't like:
```python
ds = xr.tutorial.load_dataset(""rasm"")
ds.to_zarr(""rasm.zarr"", mode=""w"")
import zarr
zgroup = zarr.open(""rasm.zarr"")
print(zgroup.tree())
dict(zgroup[""Tair""].attrs)
```
I'll need to look into the `rasm` dataset to figure out why there is a warning now.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-497948836,https://api.github.com/repos/pydata/xarray/issues/2844,497948836,MDEyOklzc3VlQ29tbWVudDQ5Nzk0ODgzNg==,22566757,2019-06-01T14:20:08Z,2020-07-14T15:55:17Z,CONTRIBUTOR,"On 5/31/2019 11:50 AM, dcherian wrote:
> It isn't just MetPy though. I'm sure there's existing code relying on
> adding |grid_mapping| and |bounds| to |attrs| in order to write
> CF-compliant files. So there's a (potentially big) backward
> compatibility issue. This becomes worse if in the future we keep
> interpreting more CF attributes and moving them to |encoding| :/.
At present, the proper, CF-compliant way to do this is to have both
|grid_mapping| and |bounds| variables in |data_vars|, and maintain the
attributes yourself, including making sure the variables get copied into
the result after relevant |ds[var_name]| and |ds.sel(axis=bounds)|
operations.
If you decide to move these variables to |coords|, the |bounds| variables will still get dropped on any subsetting operation, including those where the relevant axis was retained, the |grid_mapping| variables will be included in the result of all subsetting operations (including pulling out, for example, a time coordinate), and both will be included in some |coordinates| attribute when written to disk, breaking CF compliance.
This PR only really addresses getting these variables in |coords| initially and
keeping them out of the global |coordinates| attribute when writing to disk.
>
> Since I'm doing this primarily to get grid_mapping and bounds
> variables out of ds.data_vars.
>
> I'm +1 on this but I wonder whether saving them in |attrs| and using
> that information when encoding coordinates would be the more pragmatic
> choice.
You have a point about |grid_mapping|, but applying the MetPy approach
of saving the information in another, more directly useful format
(|cartopy.Projection| instances) immediately after loading the file
would be a way around that.
For |bounds|, I think |pd.PeriodIndex| would be the most natural
representation for time, and |pd.IntervalIndex| for most other 1-D cases,
but that still leaves |bounds| for two-or-more-dimensional coordinates.
That's a design choice I'll leave to the maintainers.
>
> We could define |encoding| as containing a specified set of CF
> attributes that control on-disk representation such as |units|,
> |scale_factor|, |contiguous| etc. and leaving everything else in
> |attrs|. A full list of attributes that belong in |encoding| could be in
> the docs so that downstream packages can fully depend on this behaviour.
>
> Currently I see |coordinates| is interpreted and moved to |encoding|. In
> the above proposal, this would be left in |attrs| but its value would
> still be interpreted if |decode_coords=True|.
>
> What do you think?
At present, |set(ds[var_name].attrs[""coordinates""].split())| and
|set(ds[var_name].coords) - set(ds[var_name].indexes[dim_name])|
would be identical, since the |coordinates| attribute is essentially
computed from the second expression on write.
Do you have a use case in mind where you need specifically the list of
CF auxiliary coordinates, or is that just an example of something that
would change under the new proposal? I assume |units| would be moved to
|encoding| only for |datetime64[ns]| and |timedelta64[ns]| variables.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-644407583,https://api.github.com/repos/pydata/xarray/issues/2844,644407583,MDEyOklzc3VlQ29tbWVudDY0NDQwNzU4Mw==,2448579,2020-06-15T21:47:08Z,2020-06-15T21:47:08Z,MEMBER,"Hi @DWesl . I sincerely apologize for not responding earlier.
Reading over this thread again, I think I agree with @shoyer these should live in `encoding`, because we've technically ""decoded"" the attribute and converted these variables to coords.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-644405067,https://api.github.com/repos/pydata/xarray/issues/2844,644405067,MDEyOklzc3VlQ29tbWVudDY0NDQwNTA2Nw==,22566757,2020-06-15T21:40:49Z,2020-06-15T21:40:49Z,CONTRIBUTOR,"This PR currently puts `grid_mapping` and `bounds` in `encoding` once it is done with them. Is that where XArray wants to put them, or should they be somewhere else?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-633253434,https://api.github.com/repos/pydata/xarray/issues/2844,633253434,MDEyOklzc3VlQ29tbWVudDYzMzI1MzQzNA==,22566757,2020-05-24T16:09:04Z,2020-05-24T16:09:04Z,CONTRIBUTOR,"Should I change this to put `grid_mapping` and `bounds` back in `attrs`, or should I leave them in `encoding`?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-598016992,https://api.github.com/repos/pydata/xarray/issues/2844,598016992,MDEyOklzc3VlQ29tbWVudDU5ODAxNjk5Mg==,1217238,2020-03-12T05:43:30Z,2020-03-12T05:43:30Z,MEMBER,"We could probably make it a rule that encoding gets preserved in exactly
the same way as attrs. I think there are some old issues about formalizing
metadata conventions...
On Wed, Mar 11, 2020 at 2:19 PM Ryan May wrote:
> Thanks for the info. Based on that, I lean towards attrs.
>
> I think a better rationale, though, would be to formalize the role of
> encoding in xarray.
>
> —
> You are receiving this because you commented.
> Reply to this email directly, view it on GitHub
> , or
> unsubscribe
>
> .
>
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-597883721,https://api.github.com/repos/pydata/xarray/issues/2844,597883721,MDEyOklzc3VlQ29tbWVudDU5Nzg4MzcyMQ==,221526,2020-03-11T21:19:31Z,2020-03-11T21:19:31Z,CONTRIBUTOR,"Thanks for the info. Based on that, I lean towards `attrs`.
I think a better rationale, though, would be to formalize the role of `encoding` in xarray.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-597375929,https://api.github.com/repos/pydata/xarray/issues/2844,597375929,MDEyOklzc3VlQ29tbWVudDU5NzM3NTkyOQ==,22566757,2020-03-10T23:54:41Z,2020-03-10T23:54:41Z,CONTRIBUTOR,"I think the choice is between `attrs` and `encoding`, not both.
If it helps lean your decision one way or the other, `attrs` tends to stay associated with `Dataset`s through more operations than `encoding`, so `parse_cf()` would have to be called fairly soon after opening if the information ends up in `encoding`, while putting it in `attrs` gives users a bit more time for that.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-597374274,https://api.github.com/repos/pydata/xarray/issues/2844,597374274,MDEyOklzc3VlQ29tbWVudDU5NzM3NDI3NA==,221526,2020-03-10T23:47:43Z,2020-03-10T23:47:43Z,CONTRIBUTOR,"As a downstream user, I just want to be told what to do (assuming `encoding` is part of the public API for xarray). I'd love not to have to modify our code, but that's not essential necessarily.
So to clarify: is this about whether they should be in one spot or the other? Or is it about having `grid_mapping` and `bounds` in both?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-596954223,https://api.github.com/repos/pydata/xarray/issues/2844,596954223,MDEyOklzc3VlQ29tbWVudDU5Njk1NDIyMw==,2448579,2020-03-10T08:03:37Z,2020-03-10T08:03:37Z,MEMBER,"My preference is for attrs but someone else should weigh in. cc @pydata/xarray
Maybe @dopplershift , as MetPy maintainer has thoughts on this matter?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-587466093,https://api.github.com/repos/pydata/xarray/issues/2844,587466093,MDEyOklzc3VlQ29tbWVudDU4NzQ2NjA5Mw==,22566757,2020-02-18T13:43:12Z,2020-02-18T13:43:12Z,CONTRIBUTOR,"The test failures seem to all be due to recent changes in `cftime`/`CFTimeIndex`, which I haven't touched.
Is sticking the `grid_mapping` and `bounds` attributes in `encoding` good, or should I put them back in `attrs`?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-586273656,https://api.github.com/repos/pydata/xarray/issues/2844,586273656,MDEyOklzc3VlQ29tbWVudDU4NjI3MzY1Ng==,22566757,2020-02-14T12:47:06Z,2020-02-14T12:47:06Z,CONTRIBUTOR,"I just noticed [pandas.PeriodIndex](https://pandas.pydata.org/docs/user_guide/timeseries.html#time-span-representation) would be an alternative to [pandas.IntervalIndex](https://pandas.pydata.org/docs/reference/api/pandas.IntervalIndex.html#pandas.IntervalIndex) for time data if which side the interval is closed on is largely irrelevant for such data.
Is there an interest in using these for 1D coordinates with bounds? I think `ds.groupby_bins()` already returns an `IntervalIndex`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-497760666,https://api.github.com/repos/pydata/xarray/issues/2844,497760666,MDEyOklzc3VlQ29tbWVudDQ5Nzc2MDY2Ng==,2448579,2019-05-31T15:50:28Z,2019-05-31T15:50:28Z,MEMBER,"It isn't just MetPy though. I'm sure there's existing code relying on adding `grid_mapping` and `bounds` to `attrs` in order to write CF-compliant files. So there's a (potentially big) backward compatibility issue. This becomes worse if in the future we keep interpreting more CF attributes and moving them to `encoding` :/.
> Since I'm doing this primarily to get grid_mapping and bounds variables out of ds.data_vars.
I'm +1 on this but I wonder whether saving them in `attrs` and using that information when encoding coordinates would be the more pragmatic choice.
We could define `encoding` as containing a specified set of CF attributes that control on-disk representation such as `units`, `scale_factor`, `contiguous` etc. and leaving everything else in `attrs`. A full list of attributes that belong in `encoding` could be in the docs so that downstream packages can fully depend on this behaviour.
Currently I see `coordinates` is interpreted and moved to `encoding`. In the above proposal, this would be left in `attrs` but its value would still be interpreted if `decode_coords=True`.
What do you think?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-497566742,https://api.github.com/repos/pydata/xarray/issues/2844,497566742,MDEyOklzc3VlQ29tbWVudDQ5NzU2Njc0Mg==,22566757,2019-05-31T04:00:17Z,2019-05-31T04:00:17Z,CONTRIBUTOR,"Switched to use `in` rather than `is not None`.
Re: `grid_mapping` in `.encoding` not `.attrs`
[MetPy assumes `grid_mapping` will be in `.attrs`](https://github.com/Unidata/MetPy/blob/master/metpy/xarray.py#L200). Since the [xarray documentation mentions this capability](http://xarray.pydata.org/en/latest/weather-climate.html#cf-compliant-coordinate-variables), should I be making concurrent changes to MetPy to allow this to continue?
If so, would it be sufficient to change their `.attrs` references to `.encoding` and mentioning in both sets of documentation that the user should call `ds.metpy.parse_cf()` immediately after loading to ensure the information is available for MetPy to use? I don't entirely understand the accessor API.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-497558317,https://api.github.com/repos/pydata/xarray/issues/2844,497558317,MDEyOklzc3VlQ29tbWVudDQ5NzU1ODMxNw==,22566757,2019-05-31T03:04:06Z,2019-05-31T03:13:53Z,CONTRIBUTOR,"This is briefly mentioned above, in
https://github.com/pydata/xarray/pull/2844#discussion_r270595609
The rationale was that everywhere else xarray uses CF attributes for something, the original values of those attributes are recorded in `var.encoding`, not `var.attrs`, and consistency across a code base is a good thing. Since I'm doing this primarily to get `grid_mapping` and `bounds` variables out of `ds.data_vars`, I don't have strong opinions on the subject.
If you feel strongly to the contrary, there's an idea at the top of this thread for getting `bounds` information encoded in terms xarray already uses in some cases (`Dataset.groupby_bins()`), and the diffs for this PR should help you figure out what needs changing to support this.
For `grid_mapping` there's
http://xarray.pydata.org/en/latest/weather-climate.html#cf-compliant-coordinate-variables
which is enough for my uses.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-497553898,https://api.github.com/repos/pydata/xarray/issues/2844,497553898,MDEyOklzc3VlQ29tbWVudDQ5NzU1Mzg5OA==,2448579,2019-05-31T02:37:31Z,2019-05-31T02:37:31Z,MEMBER,I'm a little confused on why these are in encoding and not in attrs.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-497308484,https://api.github.com/repos/pydata/xarray/issues/2844,497308484,MDEyOklzc3VlQ29tbWVudDQ5NzMwODQ4NA==,358378,2019-05-30T12:11:02Z,2019-05-30T12:11:02Z,CONTRIBUTOR,"I'd like this one to be merged very much. Is there anything holding this back?
Also, it might be nice to update the documentation with info on this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-478298195,https://api.github.com/repos/pydata/xarray/issues/2844,478298195,MDEyOklzc3VlQ29tbWVudDQ3ODI5ODE5NQ==,1217238,2019-03-30T23:27:19Z,2019-03-30T23:27:19Z,MEMBER,"The current VariableCoder interface only works for coders that work at the level of individual variables. But coordinates only make sense on a dataset, so we'll need a different abstraction for that, e.g., a DatasetCoder?
For now, it's probably fine to keep this logic where you have it currently.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-478290053,https://api.github.com/repos/pydata/xarray/issues/2844,478290053,MDEyOklzc3VlQ29tbWVudDQ3ODI5MDA1Mw==,22566757,2019-03-30T21:17:17Z,2019-03-30T21:17:17Z,CONTRIBUTOR,"I can shift this to use encoding only, but I'm having trouble figuring out where that code would go.
Would the preferred path be to create VariableCoder classes for each and add them to encode_cf_variable, then add tests to xarray.tests.test_coding?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093
https://github.com/pydata/xarray/pull/2844#issuecomment-476586154,https://api.github.com/repos/pydata/xarray/issues/2844,476586154,MDEyOklzc3VlQ29tbWVudDQ3NjU4NjE1NA==,22566757,2019-03-26T11:31:05Z,2019-03-26T11:31:05Z,CONTRIBUTOR,"Related to #1475 and #2288 , but this is just keeping the metadata consistent where already present, not extending the data model to include bounds, cells, or projections. I should add a test to ensure saving still works if the bounds are lost when pulling out variables.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,424265093