html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6323#issuecomment-1497455189,https://api.github.com/repos/pydata/xarray/issues/6323,1497455189,IC_kwDOAMm_X85ZQVpV,15570875,2023-04-05T13:06:12Z,2023-04-05T13:06:12Z,NONE,"In a future where `encoding` has been removed from Xarray's data model entirely, would `open_dataset_with_encoding`, or whatever name gets settled on, still exist? It's not clear to me if removal from the data model means just removing it from Xarray's data structures, or if it also means removing it from Xarray's APIs.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1158378382
https://github.com/pydata/xarray/issues/6323#issuecomment-1496845498,https://api.github.com/repos/pydata/xarray/issues/6323,1496845498,IC_kwDOAMm_X85ZOAy6,15570875,2023-04-05T02:46:02Z,2023-04-05T02:46:02Z,NONE,"In the hypothetical invocation `open_dataset(..., return_encoding=True)`, do you envision the returned encoding as being a separate returned object, or would it still be an attribute on the Dataset object?
I'm guessing the latter, because the subsequent statement 'disable all encoding propagation by discarding encoding attributes once a Dataset has been modified' doesn't make much sense to me for the former.
If so, after encoding attributes are discarded, would there still be an encoding attribute on the Dataset object that the user could reset to the values prior to the Dataset modification? This would enable the user to propagate encoding values through their workflow.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1158378382
https://github.com/pydata/xarray/pull/4409#issuecomment-691146816,https://api.github.com/repos/pydata/xarray/issues/4409,691146816,MDEyOklzc3VlQ29tbWVudDY5MTE0NjgxNg==,15570875,2020-09-11T15:00:01Z,2020-09-11T15:00:01Z,NONE,"I disagree that this is deterministic. If I run the script multiple times, the plot title varies, and I consider the plot title part of the output.
I have jupyter notebooks that create figures and use this code idiom. If I refactor code of mine that is used by these notebooks, I would like to rerun the notebooks to confirm that the notebook results don't change. Having the plot titles change at random complicates this comparison.
I think sorting the coordinates would avoid this difficulty that I encounter.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694448177
https://github.com/pydata/xarray/pull/4409#issuecomment-691120654,https://api.github.com/repos/pydata/xarray/issues/4409,691120654,MDEyOklzc3VlQ29tbWVudDY5MTEyMDY1NA==,15570875,2020-09-11T14:15:42Z,2020-09-11T14:15:42Z,NONE,"Here's another example that yields non-deterministic coordinate order, which propagates into a plot title when selection is done on the coordinates. When I run the code below, the title is sometimes `x = 0.0, y = 0.0` and sometimes `y = 0.0, x = 0.0`.
This is in a new conda environment that I created using the command `conda create -n title_order python=3.7 matplotlib xarray`. Output from `xr.show_versions()` is below.
I think the non-determinism is coming from the command `ds_subset = ds[['var']]`.
```
import numpy as np
import xarray as xr
xlen = 3
ylen = 4
zlen = 5
x = xr.DataArray(np.linspace(0.0, 1.0, xlen), dims='x')
y = xr.DataArray(np.linspace(0.0, 1.0, ylen), dims='y')
z = xr.DataArray(np.linspace(0.0, 1.0, zlen), dims='z')
vals = np.arange(xlen*ylen*zlen, dtype='float64').reshape((xlen, ylen, zlen))
da = xr.DataArray(vals, dims=('x', 'y', 'z'), coords={'x': x, 'y': y, 'z': z})
ds = xr.Dataset({'var': da})
print('coords for var in original Dataset')
print(ds['var'].coords)
print('**********')
ds_subset = ds[['var']]
print('coords for var after subsetting')
print(ds_subset['var'].coords)
print('**********')
p = ds_subset['var'].isel(x=0,y=0).plot()
print('title for plot() with dim selection')
print(p[0].axes.get_title())
```
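As a rough illustration of the sorting idea, here is a minimal sketch of building the title from the scalar coordinates in sorted name order. The helper `make_title` is hypothetical and is not xarray's actual title code; it only assumes the scalar coordinates are available as a name-to-value mapping.
```python
def make_title(scalar_coords):
    # Hypothetical helper, not part of xarray.
    # Sorting by coordinate name makes the title independent of the
    # order in which the coordinates happen to be stored or iterated.
    parts = [f'{name} = {value}' for name, value in sorted(scalar_coords.items())]
    return ', '.join(parts)

# The same coordinates supplied in two different orders give the same title.
title_a = make_title({'x': 0.0, 'y': 0.0})
title_b = make_title({'y': 0.0, 'x': 0.0})
print(title_a)
print(title_b)
```
With something like this, rerunning a notebook would give the same title regardless of the coordinate order that comes out of the subsetting step.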
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08)
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1127.13.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None
xarray: 0.16.0
pandas: 1.1.2
numpy: 1.19.1
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.3.1
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20200814
pip: 20.2.3
conda: None
pytest: None
IPython: None
sphinx: None
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694448177
https://github.com/pydata/xarray/issues/3606#issuecomment-563480104,https://api.github.com/repos/pydata/xarray/issues/3606,563480104,MDEyOklzc3VlQ29tbWVudDU2MzQ4MDEwNA==,15570875,2019-12-09T23:02:24Z,2019-12-09T23:02:24Z,NONE,"How about
> If `deep=True`, a deep copy is made of the data array. Otherwise, a shallow copy is made, and the returned data array's values are a new view of this data array's values.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,535043825
https://github.com/pydata/xarray/issues/2921#issuecomment-489752548,https://api.github.com/repos/pydata/xarray/issues/2921,489752548,MDEyOklzc3VlQ29tbWVudDQ4OTc1MjU0OA==,15570875,2019-05-06T19:52:47Z,2019-05-06T19:52:47Z,NONE,"It looks like `ds.time.encoding` is not getting set when `open_mfdataset` is opening multiple files. I suspect that this is leading to the surprising unit for `time` when the `ds` is written out. The code below demonstrates that `ds.time.encoding` is set by `open_mfdataset` in the single-file case and is not set in the multi-file case. However, `ds.time_bounds.encoding` is set in both the single- and multi-file cases.
The possibility of this is alluded to in a [comment](https://github.com/pydata/xarray/issues/2436#issuecomment-449737841) in #2436, which relates the issue to #1614.
```
import numpy as np
import xarray as xr
# create time and time_bounds DataArrays for Jan-1850 and Feb-1850
time_bounds_vals = np.array([[0.0, 31.0], [31.0, 59.0]])
time_vals = time_bounds_vals.mean(axis=1)
time_var = xr.DataArray(time_vals, dims='time',
coords={'time':time_vals})
time_bounds_var = xr.DataArray(time_bounds_vals, dims=('time', 'd2'),
coords={'time':time_vals})
# create Dataset of time and time_bounds
ds = xr.Dataset(coords={'time':time_var}, data_vars={'time_bounds':time_bounds_var})
ds.time.attrs = {'bounds':'time_bounds', 'calendar':'noleap',
'units':'days since 1850-01-01'}
# write Jan-1850 values to file
ds.isel(time=slice(0,1)).to_netcdf('Jan-1850.nc', unlimited_dims='time')
# write Feb-1850 values to file
ds.isel(time=slice(1,2)).to_netcdf('Feb-1850.nc', unlimited_dims='time')
# use open_mfdataset to read in a single file
decode_times = True
decode_cf = True
ds = xr.open_mfdataset(['Jan-1850.nc'],
decode_cf=decode_cf, decode_times=decode_times)
print('time and time_bounds encoding, single-file open_mfdataset')
print(ds.time.encoding)
print(ds.time_bounds.encoding)
# use open_mfdataset to read in files, combining into 1 Dataset
decode_times = True
decode_cf = True
ds = xr.open_mfdataset(['Jan-1850.nc', 'Feb-1850.nc'],
decode_cf=decode_cf, decode_times=decode_times)
print('--------------------')
print('time and time_bounds encoding, multi-file open_mfdataset')
print(ds.time.encoding)
print(ds.time_bounds.encoding)
```
produces
```
time and time_bounds encoding, single-file open_mfdataset
{'zlib': False, 'shuffle': False, 'complevel': 0, 'fletcher32': False, 'contiguous': False, 'chunksizes': (512,), 'source': '/gpfs/fs1/work/klindsay/analysis/CESM2_coup_carb_cycle_JAMES/Jan-1850.nc', 'original_shape': (1,), 'dtype': dtype('float64'), '_FillValue': nan, 'units': 'days since 1850-01-01', 'calendar': 'noleap'}
{'zlib': False, 'shuffle': False, 'complevel': 0, 'fletcher32': False, 'contiguous': False, 'chunksizes': (1, 2), 'source': '/gpfs/fs1/work/klindsay/analysis/CESM2_coup_carb_cycle_JAMES/Jan-1850.nc', 'original_shape': (1, 2), 'dtype': dtype('float64'), '_FillValue': nan, 'units': 'days since 1850-01-01', 'calendar': 'noleap'}
--------------------
time and time_bounds encoding, multi-file open_mfdataset
{}
{'zlib': False, 'shuffle': False, 'complevel': 0, 'fletcher32': False, 'contiguous': False, 'chunksizes': (1, 2), 'source': '/gpfs/fs1/work/klindsay/analysis/CESM2_coup_carb_cycle_JAMES/Jan-1850.nc', 'original_shape': (1, 2), 'dtype': dtype('float64'), '_FillValue': nan, 'units': 'days since 1850-01-01', 'calendar': 'noleap'}
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,437418525
https://github.com/pydata/xarray/issues/521#issuecomment-474906014,https://api.github.com/repos/pydata/xarray/issues/521,474906014,MDEyOklzc3VlQ29tbWVudDQ3NDkwNjAxNA==,15570875,2019-03-20T16:09:54Z,2019-03-20T16:09:54Z,NONE,"@AJueling , do you know the provenance of the file with `time.attrs.bounds /= 'time_bound'`? If that file is being produced by an NCAR or CESM supplied workflow, then I am willing to see if the workflow can be corrected to keep `time.attrs.bounds = 'time_bound'`. With this mismatch, it seems hopeless for xarray to automatically figure out how to handle this file as it was intended to be handled.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,99836561
https://github.com/pydata/xarray/issues/521#issuecomment-474897336,https://api.github.com/repos/pydata/xarray/issues/521,474897336,MDEyOklzc3VlQ29tbWVudDQ3NDg5NzMzNg==,15570875,2019-03-20T15:52:45Z,2019-03-20T15:52:45Z,NONE,"@rabernat , it is not clear to me that issue 2 is an objective error in the metadata.
The [CF conventions](http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#cell-boundaries) section on the `bounds` attribute states:
> Since a boundary variable is considered to be part of a coordinate variable’s metadata, it is not necessary to provide it with attributes such as `long_name` and `units`.
>
> Boundary variable attributes which determine the coordinate type (`units`, `standard_name`, `axis` and `positive`) or those which affect the interpretation of the array values (`units`, `calendar`, `leap_month`, `leap_year` and `month_lengths`) must always agree exactly with the same attributes of its associated coordinate, scalar coordinate or auxiliary coordinate variable. To avoid duplication, however, it is recommended that these are not provided to a boundary variable.
I conclude from this that software parsing CF metadata should have the variable identified by the `bounds` attribute inherit the attributes mentioned above from the variable with the `bounds` attribute. @spencerkclark describes this as a workaround. One could argue that, based on the CF conventions text, xarray would be justified in doing that automatically.
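To make the inheritance idea concrete, here is a minimal sketch that works on plain dictionaries of attributes rather than on xarray objects. The function `inherit_bounds_attrs` and the sample attribute values are hypothetical, chosen only for illustration; the attribute list is the one from the CF text quoted above. This is not how xarray currently behaves.
```python
# Attributes that, per the CF text quoted above, must agree between a
# coordinate variable and its boundary variable.
INHERITED_ATTRS = ('units', 'standard_name', 'axis', 'positive',
                   'calendar', 'leap_month', 'leap_year', 'month_lengths')

def inherit_bounds_attrs(attrs_by_var):
    # Hypothetical helper: attrs_by_var maps variable name -> attribute dict.
    # Returns a copy in which each boundary variable has inherited any
    # missing attributes from the coordinate that points at it.
    out = {name: dict(attrs) for name, attrs in attrs_by_var.items()}
    for name, attrs in attrs_by_var.items():
        bounds_name = attrs.get('bounds')
        if bounds_name is None or bounds_name not in out:
            # No bounds attribute, or it names a variable that does not exist.
            continue
        for key in INHERITED_ATTRS:
            if key in attrs:
                # Only fill in what is missing; never overwrite existing values.
                out[bounds_name].setdefault(key, attrs[key])
    return out

# Illustrative values only: calendar is present on time but not on time_bound.
attrs = {
    'time': {'bounds': 'time_bound', 'units': 'days since 1850-01-01',
             'calendar': 'noleap'},
    'time_bound': {'units': 'days since 1850-01-01'},
}
fixed = inherit_bounds_attrs(attrs)
print(fixed['time_bound'])
```
Note that the sketch silently skips a `bounds` attribute that names a variable not present in the file, so it cannot recover from the kind of name mismatch discussed next.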
However, this is confounded by issue 3, that `time.attrs.bounds /= 'time_bound'`, which I agree is an error in the metadata. As a CESM-POP developer, I'm surprised to see that. Raw model output from CESM-POP has `time.attrs.bounds = 'time_bound'`. So it seems like something in a post-processing workflow has the net effect of changing `time.attrs.bounds`, but is preserving the name of the variable `bounds`. That is problematic.
If CESM-POP were to adhere more closely to the CF recommendation in this section, I think it would drop `time_bound.attrs.units`, not add `time_bound.attrs.calendar`. But I don't think that is what you are suggesting.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,99836561
https://github.com/pydata/xarray/issues/2752#issuecomment-461519515,https://api.github.com/repos/pydata/xarray/issues/2752,461519515,MDEyOklzc3VlQ29tbWVudDQ2MTUxOTUxNQ==,15570875,2019-02-07T17:22:49Z,2019-02-07T17:22:49Z,NONE,"Thanks for the quick responses @jhamman and @dcherian. I had focused on the descriptive text below the function signature, which mentions defaults for some, but not all arguments. I now realize that I need to also examine the function signature in the docs. Sorry for the noise.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,407750967