html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6323#issuecomment-1497455189,https://api.github.com/repos/pydata/xarray/issues/6323,1497455189,IC_kwDOAMm_X85ZQVpV,15570875,2023-04-05T13:06:12Z,2023-04-05T13:06:12Z,NONE,"In a future where `encoding` has been removed from Xarray's data model entirely, would `open_dataset_with_encoding`, or whatever name gets settled on, still exist? It's not clear to me if removal from the data model means just removing it from Xarray's data structures, or if it also means removing it from Xarray's APIs.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1158378382
https://github.com/pydata/xarray/issues/6323#issuecomment-1496845498,https://api.github.com/repos/pydata/xarray/issues/6323,1496845498,IC_kwDOAMm_X85ZOAy6,15570875,2023-04-05T02:46:02Z,2023-04-05T02:46:02Z,NONE,"In the hypothetical invocation `open_dataset(..., return_encoding=True)`, do you envision the returned encoding as being a separate returned object, or would it still be an attribute on the Dataset object? I'm guessing the latter, because the subsequent statement 'disable all encoding propagation by discarding encoding attributes once a Dataset has been modified' doesn't make much sense to me for the former. If so, after encoding attributes are discarded, would there still be an encoding attribute on the Dataset object that the user could reset to the values prior to the Dataset modification? 
This would enable the user to propagate encoding values through their workflow.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1158378382
https://github.com/pydata/xarray/pull/4409#issuecomment-691146816,https://api.github.com/repos/pydata/xarray/issues/4409,691146816,MDEyOklzc3VlQ29tbWVudDY5MTE0NjgxNg==,15570875,2020-09-11T15:00:01Z,2020-09-11T15:00:01Z,NONE,"I disagree that this is deterministic. If I run the script multiple times, the plot title varies, and I consider the plot title part of the output. I have jupyter notebooks that create figures and use this code idiom. If I refactor code of mine that is used by these notebooks, I would like to rerun the notebooks to confirm that the notebook results don't change. Having the plot titles change at random complicates this comparison. I think sorting the coordinates would avoid this difficulty that I encounter.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694448177
https://github.com/pydata/xarray/pull/4409#issuecomment-691120654,https://api.github.com/repos/pydata/xarray/issues/4409,691120654,MDEyOklzc3VlQ29tbWVudDY5MTEyMDY1NA==,15570875,2020-09-11T14:15:42Z,2020-09-11T14:15:42Z,NONE,"Here's another example that yields non-deterministic coordinate order, which propagates into a plot title when selection is done on the coordinates. When I run the code below, the title is sometimes `x = 0.0, y = 0.0` and sometimes `y = 0.0, x = 0.0`. This is in a new conda environment that I created using the command `conda create -n title_order python=3.7 matplotlib xarray`. Output from `xr.show_versions()` is below. I think the non-determinism is coming from the command `ds_subset = ds[['var']]`. 
```
import numpy as np
import xarray as xr

xlen = 3
ylen = 4
zlen = 5

x = xr.DataArray(np.linspace(0.0, 1.0, xlen), dims=('x'))
y = xr.DataArray(np.linspace(0.0, 1.0, ylen), dims=('y'))
z = xr.DataArray(np.linspace(0.0, 1.0, zlen), dims=('z'))

vals = np.arange(xlen*ylen*zlen, dtype='float64').reshape((xlen, ylen, zlen))

da = xr.DataArray(vals, dims=('x', 'y', 'z'), coords={'x': x, 'y': y, 'z': z})

ds = xr.Dataset({'var': da})
print('coords for var in original Dataset')
print(ds['var'].coords)
print('**********')

ds_subset = ds[['var']]
print('coords for var after subsetting')
print(ds_subset['var'].coords)
print('**********')

p = ds_subset['var'].isel(x=0,y=0).plot()
print('title for plot() with dim selection')
print(p[0].axes.get_title())
```
Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1127.13.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None

xarray: 0.16.0
pandas: 1.1.2
numpy: 1.19.1
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.3.1
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20200814
pip: 20.2.3
conda: None
pytest: None
IPython: None
sphinx: None
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694448177
https://github.com/pydata/xarray/issues/3606#issuecomment-563480104,https://api.github.com/repos/pydata/xarray/issues/3606,563480104,MDEyOklzc3VlQ29tbWVudDU2MzQ4MDEwNA==,15570875,2019-12-09T23:02:24Z,2019-12-09T23:02:24Z,NONE,"How about

> If `deep=True`, a deep copy is made of the data array. Otherwise, a shallow copy is made, and the returned data array's values are a new view of this data array's values.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,535043825
https://github.com/pydata/xarray/issues/2921#issuecomment-489752548,https://api.github.com/repos/pydata/xarray/issues/2921,489752548,MDEyOklzc3VlQ29tbWVudDQ4OTc1MjU0OA==,15570875,2019-05-06T19:52:47Z,2019-05-06T19:52:47Z,NONE,"It looks like `ds.time.encoding` is not getting set when `open_mfdataset` is opening multiple files. I suspect that this is leading to the surprising unit for `time` when the `ds` is written out. The code below demonstrates that `ds.time.encoding` is set by `open_mfdataset` in the single-file case and is not set in the multi-file case. However, `ds.time_bounds.encoding` is set in both the single- and multi-file cases. The possibility of this is alluded to in a [comment](https://github.com/pydata/xarray/issues/2436#issuecomment-449737841) in #2436, which relates the issue to #1614. 
```
import numpy as np
import xarray as xr

# create time and time_bounds DataArrays for Jan-1850 and Feb-1850
time_bounds_vals = np.array([[0.0, 31.0], [31.0, 59.0]])
time_vals = time_bounds_vals.mean(axis=1)
time_var = xr.DataArray(time_vals, dims='time', coords={'time':time_vals})
time_bounds_var = xr.DataArray(time_bounds_vals, dims=('time', 'd2'), coords={'time':time_vals})

# create Dataset of time and time_bounds
ds = xr.Dataset(coords={'time':time_var}, data_vars={'time_bounds':time_bounds_var})
ds.time.attrs = {'bounds':'time_bounds', 'calendar':'noleap', 'units':'days since 1850-01-01'}

# write Jan-1850 values to file
ds.isel(time=slice(0,1)).to_netcdf('Jan-1850.nc', unlimited_dims='time')

# write Feb-1850 values to file
ds.isel(time=slice(1,2)).to_netcdf('Feb-1850.nc', unlimited_dims='time')

# use open_mfdataset to read in files, combining into 1 Dataset
decode_times = True
decode_cf = True
ds = xr.open_mfdataset(['Jan-1850.nc'], decode_cf=decode_cf, decode_times=decode_times)
print('time and time_bounds encoding, single-file open_mfdataset')
print(ds.time.encoding)
print(ds.time_bounds.encoding)

# use open_mfdataset to read in files, combining into 1 Dataset
decode_times = True
decode_cf = True
ds = xr.open_mfdataset(['Jan-1850.nc', 'Feb-1850.nc'], decode_cf=decode_cf, decode_times=decode_times)
print('--------------------')
print('time and time_bounds encoding, multi-file open_mfdataset')
print(ds.time.encoding)
print(ds.time_bounds.encoding)
```

produces

```
time and time_bounds encoding, single-file open_mfdataset
{'zlib': False, 'shuffle': False, 'complevel': 0, 'fletcher32': False, 'contiguous': False, 'chunksizes': (512,), 'source': '/gpfs/fs1/work/klindsay/analysis/CESM2_coup_carb_cycle_JAMES/Jan-1850.nc', 'original_shape': (1,), 'dtype': dtype('float64'), '_FillValue': nan, 'units': 'days since 1850-01-01', 'calendar': 'noleap'}
{'zlib': False, 'shuffle': False, 'complevel': 0, 'fletcher32': False, 'contiguous': False, 'chunksizes': (1, 2), 
'source': '/gpfs/fs1/work/klindsay/analysis/CESM2_coup_carb_cycle_JAMES/Jan-1850.nc', 'original_shape': (1, 2), 'dtype': dtype('float64'), '_FillValue': nan, 'units': 'days since 1850-01-01', 'calendar': 'noleap'}
--------------------
time and time_bounds encoding, multi-file open_mfdataset
{}
{'zlib': False, 'shuffle': False, 'complevel': 0, 'fletcher32': False, 'contiguous': False, 'chunksizes': (1, 2), 'source': '/gpfs/fs1/work/klindsay/analysis/CESM2_coup_carb_cycle_JAMES/Jan-1850.nc', 'original_shape': (1, 2), 'dtype': dtype('float64'), '_FillValue': nan, 'units': 'days since 1850-01-01', 'calendar': 'noleap'}
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,437418525
https://github.com/pydata/xarray/issues/521#issuecomment-474906014,https://api.github.com/repos/pydata/xarray/issues/521,474906014,MDEyOklzc3VlQ29tbWVudDQ3NDkwNjAxNA==,15570875,2019-03-20T16:09:54Z,2019-03-20T16:09:54Z,NONE,"@AJueling , do you know the provenance of the file with `time.attrs.bounds /= 'time_bound'`? If that file is being produced by an NCAR or CESM supplied workflow, then I am willing to see if the workflow can be corrected to keep `time.attrs.bounds = 'time_bound'`. With this mismatch, it seems hopeless for xarray to automatically figure out how to handle this file as it was intended to be handled.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,99836561
https://github.com/pydata/xarray/issues/521#issuecomment-474897336,https://api.github.com/repos/pydata/xarray/issues/521,474897336,MDEyOklzc3VlQ29tbWVudDQ3NDg5NzMzNg==,15570875,2019-03-20T15:52:45Z,2019-03-20T15:52:45Z,NONE,"@rabernat , it is not clear to me that issue 2 is an objective error in the metadata. 
The [CF conventions](http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#cell-boundaries) section on the `bounds` attribute states:

> Since a boundary variable is considered to be part of a coordinate variable’s metadata, it is not necessary to provide it with attributes such as `long_name` and `units`.
>
> Boundary variable attributes which determine the coordinate type (`units`, `standard_name`, `axis` and `positive`) or those which affect the interpretation of the array values (`units`, `calendar`, `leap_month`, `leap_year` and `month_lengths`) must always agree exactly with the same attributes of its associated coordinate, scalar coordinate or auxiliary coordinate variable. To avoid duplication, however, it is recommended that these are not provided to a boundary variable.

I conclude from this that software parsing CF metadata should have the variable identified by the `bounds` attribute inherit the attributes mentioned above from the variable with the `bounds` attribute. @spencerkclark describes this as a workaround. One could argue that based on the CF conventions text, xarray would be justified in doing that automatically. However, this is confounded by issue 3, that `time.attrs.bounds /= 'time_bound'`, which I agree is an error in the metadata. As a CESM-POP developer, I'm surprised to see that. Raw model output from CESM-POP has `time.attrs.bounds = 'time_bound'`. So it seems like something in a post-processing workflow has the net effect of changing `time.attrs.bounds`, but is preserving the name of the variable `bounds`. That is problematic. If CESM-POP were to adhere more closely to the CF recommendation in this section, I think it would drop `time_bound.attrs.units`, not add `time_bound.attrs.calendar`. 
But I don't think that is what you are suggesting.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,99836561
https://github.com/pydata/xarray/issues/2752#issuecomment-461519515,https://api.github.com/repos/pydata/xarray/issues/2752,461519515,MDEyOklzc3VlQ29tbWVudDQ2MTUxOTUxNQ==,15570875,2019-02-07T17:22:49Z,2019-02-07T17:22:49Z,NONE,"Thanks for the quick responses @jhamman and @dcherian. I had focused on the descriptive text below the function signature, which mentions defaults for some, but not all arguments. I now realize that I need to also examine the function signature in the docs. Sorry for the noise.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,407750967