id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 927336712,MDU6SXNzdWU5MjczMzY3MTI=,5510,Can't remove coordinates attribute from DataArrays ,40183561,open,0,,,11,2021-06-22T15:05:43Z,2023-09-29T11:48:50Z,,CONTRIBUTOR,,,," **What happened**: Coordinates added to some variables unexpectedly. I noticed this after outputting to netCDF. What I have: ``` variables: double time(time) ; time:bounds = ""time_bnds"" ; time:axis = ""T"" ; time:long_name = ""valid_time"" ; time:standard_name = ""time"" ; time:units = ""days since 1850-01-01"" ; time:calendar = ""gregorian"" ; double time_bnds(time, bnds) ; time_bnds:_FillValue = NaN ; time_bnds:coordinates = ""reftime leadtime height"" ; double lat(lat) ; lat:bounds = ""lat_bnds"" ; lat:units = ""degrees_north"" ; lat:axis = ""Y"" ; lat:long_name = ""latitude"" ; lat:standard_name = ""latitude"" ; double lat_bnds(lat, bnds) ; lat_bnds:_FillValue = NaN ; lat_bnds:coordinates = ""reftime height"" ; double lon(lon) ; lon:bounds = ""lon_bnds"" ; lon:units = ""degrees_east"" ; lon:axis = ""X"" ; lon:long_name = ""Longitude"" ; lon:standard_name = ""longitude"" ; double lon_bnds(lon, bnds) ; lon_bnds:_FillValue = NaN ; lon_bnds:coordinates = ""reftime height"" ; double height ; height:_FillValue = NaN ; height:units = ""m"" ; height:axis = ""Z"" ; height:positive = ""up"" ; height:long_name = ""height"" ; height:standard_name = ""height"" ; float tas(time, lat, lon) ; tas:_FillValue = 1.e+20f ; tas:standard_name = ""air_temperature"" ; tas:long_name = ""Near-Surface Air Temperature"" ; tas:comment = ""near-surface (usually, 2 meter) air temperature"" ; tas:units = ""K"" ; tas:cell_methods = ""area: time: mean"" ; tas:cell_measures = ""area: areacella"" ; tas:history = ""2019-05-11T15:53:32Z altered by CMOR: Treated scalar dimension: \'height\'. 
2019-05-11T15:53:32Z altered by CMOR: Reordered dimensions, original order: lat lon time."" ; tas:coordinates = ""height reftime leadtime"" ; tas:missing_value = 1.e+20f ; int reftime ; reftime:long_name = ""Start date of the forecast"" ; reftime:standard_name = ""forecast_reference_time"" ; reftime:units = ""days since 1850-01-01"" ; reftime:calendar = ""gregorian"" ; double leadtime(time) ; leadtime:_FillValue = NaN ; leadtime:long_name = ""Time elapsed since the start of the forecast"" ; leadtime:standard_name = ""forecast_period"" ; leadtime:units = ""days"" ; int realization ; realization:long_name = ""realization"" ; realization:comment = ""For more information on the ripf, refer to the variant_label, initialization_description, physics_description and forcing_description global attributes"" ; realization:coordinates = ""reftime height"" ; ``` On `time_bnds`, `lon_bnds`, `lat_bnds` and `realization` there is `coordinates` that I wouldn't expect to be there. **What you expected to happen**: Looking only at the coordinates attribute, I expected my ncdump to show: ``` variables: int reftime ; reftime:long_name = ""Start date of the forecast"" ; reftime:standard_name = ""forecast_reference_time"" ; reftime:units = ""days since 1850-01-01"" ; reftime:calendar = ""gregorian"" ; double leadtime(time) ; leadtime:long_name = ""Time elapsed since the start of the forecast"" ; leadtime:standard_name = ""forecast_period"" ; leadtime:units = ""days"" ; int realization ; realization:long_name = ""realization"" ; realization:comment = ""For more information on the ripf, refer to the variant_label, initialization_description, physics_description and forcing_description global attributes"" ; double time(time) ; time:bounds = ""time_bnds"" ; time:axis = ""T"" ; time:standard_name = ""time"" ; time:units = ""days since 1850-01-01"" ; time:calendar = ""gregorian"" ; time:long_name = ""valid_time"" ; double time_bnds(time, bnds) ; time_bnds:units = ""days since 1850-01-01"" ; 
double lat(lat) ; lat:bounds = ""lat_bnds"" ; lat:units = ""degrees_north"" ; lat:axis = ""Y"" ; lat:long_name = ""latitude"" ; lat:standard_name = ""latitude"" ; double lat_bnds(lat, bnds) ; double lon(lon) ; lon:bounds = ""lon_bnds"" ; lon:units = ""degrees_east"" ; lon:axis = ""X"" ; lon:long_name = ""Longitude"" ; lon:standard_name = ""longitude"" ; double lon_bnds(lon, bnds) ; double height ; height:units = ""m"" ; height:axis = ""Z"" ; height:positive = ""up"" ; height:long_name = ""height"" ; height:standard_name = ""height"" ; float tas(time, lat, lon) ; tas:standard_name = ""air_temperature"" ; tas:long_name = ""Near-Surface Air Temperature"" ; tas:comment = ""near-surface (usually, 2 meter) air temperature"" ; tas:units = ""K"" ; tas:cell_methods = ""area: time: mean"" ; tas:cell_measures = ""area: areacella"" ; tas:history = ""2019-05-11T15:53:32Z altered by CMOR: Treated scalar dimension: \'height\'. 2019-05-11T15:53:32Z altered by CMOR: Reordered dimensions, original order: lat lon time."" ; tas:missing_value = 1.e+20f ; tas:_FillValue = 1.e+20f ; tas:coordinates = ""height reftime leadtime"" ; ``` I tried to remove this in the xarray dataset, but whatever I tried they always ended up back in there: ``` >>> import xarray as xr >>> ds = xr.open_dataset(""file.nc"", use_cftime=True) # show coords on realization >>> ds.realization array(1, dtype=int32) Coordinates: height float64 ... reftime object ... Attributes: long_name: realization comment: For more information on the ripf, refer to the variant_label,... # try reset_coords - removes the coords >>> ds.realization.reset_coords(names=[""height"", ""reftime""], drop=True) array(1, dtype=int32) Attributes: long_name: realization comment: For more information on the ripf, refer to the variant_label,... 
# set realization with result of reset_coords >>> ds[""realization""] = ds.realization.reset_coords(names=[""height"", ""reftime""], drop=True) # coords back in >>> ds.realization array(1, dtype=int32) Coordinates: height float64 ... reftime object ... Attributes: long_name: realization comment: For more information on the ripf, refer to the variant_label,... # try drop_vars - same thing happens >>> ds.realization.drop_vars((""height"", ""reftime"")) array(1, dtype=int32) Attributes: long_name: realization comment: For more information on the ripf, refer to the variant_label,... >>> ds[""realization""] = ds.realization.drop_vars((""height"", ""reftime"")) >>> ds.realization array(1, dtype=int32) Coordinates: height float64 ... reftime object ... Attributes: long_name: realization comment: For more information on the ripf, refer to the variant_label,... # tried creating a new variable to see if the same thing happens - it does >>> ds[""test""] = ds.realization.drop_vars((""height"", ""reftime"")) >>> ds.test array(1, dtype=int32) Coordinates: height float64 ... reftime object ... Attributes: long_name: realization comment: For more information on the ripf, refer to the variant_label,... ``` This seems like incorrect behaviour, but perhaps it is expected? 
**Minimal Complete Verifiable Example**:

```python
>>> import numpy as np
>>> import xarray as xr
>>> data = xr.DataArray(np.random.randn(2, 3), dims=(""x"", ""y""), coords={""x"": [10, 20]})
>>> ds = xr.Dataset({""foo"": data, ""bar"": (""x"", [1, 2]), ""fake"": 10})
>>> ds = ds.assign_coords({""reftime"": np.array(""2004-11-01T00:00:00"", dtype=np.datetime64)})
>>> ds = ds.assign({""test"": 1})
# the scalar coordinate reftime is attached to the new variable
>>> ds.test
array(1)
Coordinates:
    reftime  datetime64[ns] 2004-11-01
# reset_coords drops it from the returned DataArray
>>> ds.test.reset_coords(names=[""reftime""], drop=True)
array(1)
# assigning the result back into the dataset attaches reftime again
>>> ds[""test""] = ds.test.reset_coords(names=[""reftime""], drop=True)
>>> ds.test
array(1)
Coordinates:
    reftime  datetime64[ns] 2004-11-01
>>> ds.to_netcdf(""file.nc"")
```

```
ncdump -h file.nc
netcdf file {
dimensions:
    x = 2 ;
    y = 3 ;
variables:
    int64 x(x) ;
    double foo(x, y) ;
        foo:_FillValue = NaN ;
        foo:coordinates = ""reftime"" ;
    int64 bar(x) ;
        bar:coordinates = ""reftime"" ;
    int64 fake ;
        fake:coordinates = ""reftime"" ;
    int64 reftime ;
        reftime:units = ""days since 2004-11-01 00:00:00"" ;
        reftime:calendar = ""proleptic_gregorian"" ;
    int64 test ;
        test:coordinates = ""reftime"" ;
}
```

**Environment**:
Output of xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 (default, Mar 27 2019, 16:54:48) [Clang 4.0.1 (tags/RELEASE_401/final)] python-bits: 64 OS: Darwin OS-release: 18.7.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: ('en_GB', 'UTF-8') libhdf5: 1.10.5 libnetcdf: 4.6.3 xarray: 0.18.2 pandas: 1.1.3 numpy: 1.19.2 scipy: None netCDF4: 1.5.4 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.4.1 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.30.0 distributed: 2.30.0 matplotlib: None cartopy: None seaborn: None numbagg: None pint: None setuptools: 54.1.1 pip: 21.0.1 conda: None pytest: 6.2.2 IPython: 7.21.0 sphinx: 1.8.1
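
To check which variables carry a `coordinates` attribute in a header like the one under **What happened**, the `ncdump -h` text can be scanned with a regular expression. This is only a minimal sketch and not part of xarray: the helper name `vars_with_coordinates` is hypothetical, and it assumes the header has already been captured as a string.

```python
import re

# Matches header entries of the form  name:coordinates = <quoted value> ;
# \x22 is the double-quote character that ncdump puts around attribute values.
_COORD_ATTR = re.compile(r'(\w+):coordinates\s*=\s*\x22([^\x22]*)\x22')


def vars_with_coordinates(header):
    '''Map each variable name to the list of names in its coordinates attribute.'''
    return {name: value.split() for name, value in _COORD_ATTR.findall(header)}
```

Applied to the header shown under **What happened**, this would report `time_bnds`, `lat_bnds`, `lon_bnds`, `tas` and `realization`, which makes the unexpected entries easy to spot.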
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5510/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,reopened,13221727,issue
928349653,MDExOlB1bGxSZXF1ZXN0Njc2MzY3MDQ1,5514,Allow user to explicitly disable coordinates attribute,40183561,closed,0,,,2,2021-06-23T14:55:22Z,2021-07-01T16:20:42Z,2021-07-01T15:50:33Z,CONTRIBUTOR,,0,pydata/xarray/pulls/5514,"
- [x] Closes #5510
- [x] Tests added - added one test - not sure if it should be somewhere else
- [x] Passes `pre-commit run --all-files`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5514/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
913431220,MDU6SXNzdWU5MTM0MzEyMjA=,5448,_FillValue added with output to netcdf,40183561,closed,0,,,1,2021-06-07T11:10:57Z,2021-06-07T16:43:04Z,2021-06-07T16:43:04Z,CONTRIBUTOR,,,,"
**What happened**:

When opening a netCDF file with xarray, then outputting to netCDF, `_FillValue` is added. This is an issue for the coordinate variables time, lat and lon, as the CF conventions say that coordinate variables cannot have missing values, so we don't want this added `_FillValue` in this case.

E.g. before working with xarray:

```
double lat(lat) ;
    lat:bounds = ""lat_bnds"" ;
    lat:units = ""degrees_north"" ;
    lat:axis = ""Y"" ;
    lat:long_name = ""latitude"" ;
    lat:standard_name = ""latitude"" ;
```

After working with xarray:

```
double lat(lat) ;
    lat:_FillValue = NaN ;
    lat:bounds = ""lat_bnds"" ;
    lat:units = ""degrees_north"" ;
    lat:axis = ""Y"" ;
    lat:long_name = ""latitude"" ;
    lat:standard_name = ""latitude"" ;
```

`_FillValue` is also added to the netCDF file for other variables, but the main issue is the coordinate variables.
By setting `ds.lat.encoding['_FillValue'] = None` before outputting to NetCDF, the `lat:_FillValue = NaN ;` can be removed, but I don't think it should be there in the first place? **What you expected to happen**: Output in file after `to_netcdf` to be the same as the input file. **Minimal Complete Verifiable Example**: ```python import xarray as xr my_file = xr.open_dataset(""ts_Amon_HadGEM3-GC31-LL_ssp126_r1i1p1f3_gn_201501-204912.nc"", use_cftime=True) my_file.to_netcdf(""my_file_output.nc"") ``` The file I used can be retrieved using https://dap.ceda.ac.uk/badc/cmip6/data/CMIP6/ScenarioMIP/MOHC/HadGEM3-GC31-LL/ssp126/r1i1p1f3/Amon/ts/gn/v20200114/ts_Amon_HadGEM3-GC31-LL_ssp126_r1i1p1f3_gn_201501-204912.nc, but this happens with any file. Then comparing the ncdumps: ``` ncdump -h ts_Amon_HadGEM3-GC31-LL_ssp126_r1i1p1f3_gn_201501-204912.nc netcdf ts_Amon_HadGEM3-GC31-LL_ssp126_r1i1p1f3_gn_201501-204912 { dimensions: time = UNLIMITED ; // (420 currently) bnds = 2 ; lat = 144 ; lon = 192 ; variables: double time(time) ; time:bounds = ""time_bnds"" ; time:units = ""days since 1850-01-01"" ; time:calendar = ""360_day"" ; time:axis = ""T"" ; time:long_name = ""time"" ; time:standard_name = ""time"" ; double time_bnds(time, bnds) ; double lat(lat) ; lat:bounds = ""lat_bnds"" ; lat:units = ""degrees_north"" ; lat:axis = ""Y"" ; lat:long_name = ""Latitude"" ; lat:standard_name = ""latitude"" ; double lat_bnds(lat, bnds) ; double lon(lon) ; lon:bounds = ""lon_bnds"" ; lon:units = ""degrees_east"" ; lon:axis = ""X"" ; lon:long_name = ""Longitude"" ; lon:standard_name = ""longitude"" ; double lon_bnds(lon, bnds) ; float ts(time, lat, lon) ; ts:standard_name = ""surface_temperature"" ; ts:long_name = ""Surface Temperature"" ; ts:comment = ""Temperature of the lower boundary of the atmosphere"" ; ts:units = ""K"" ; ts:original_name = ""mo: (stash: m01s00i024, lbproc: 128)"" ; ts:cell_methods = ""area: time: mean"" ; ts:cell_measures = ""area: areacella"" ; ts:history = 
""2020-01-13T10:05:19Z altered by CMOR: replaced missing value flag (-1.07374e+09) with standard missing value (1e+20)."" ; ts:missing_value = 1.e+20f ; ts:_FillValue = 1.e+20f ; ``` and ``` ncdump -h my_file_output.nc netcdf my_file_output { dimensions: time = UNLIMITED ; // (420 currently) bnds = 2 ; lat = 144 ; lon = 192 ; variables: double time(time) ; time:_FillValue = NaN ; time:bounds = ""time_bnds"" ; time:axis = ""T"" ; time:long_name = ""time"" ; time:standard_name = ""time"" ; time:units = ""days since 1850-01-01"" ; time:calendar = ""360_day"" ; double time_bnds(time, bnds) ; time_bnds:_FillValue = NaN ; double lat(lat) ; lat:_FillValue = NaN ; lat:bounds = ""lat_bnds"" ; lat:units = ""degrees_north"" ; lat:axis = ""Y"" ; lat:long_name = ""Latitude"" ; lat:standard_name = ""latitude"" ; double lat_bnds(lat, bnds) ; lat_bnds:_FillValue = NaN ; double lon(lon) ; lon:_FillValue = NaN ; lon:bounds = ""lon_bnds"" ; lon:units = ""degrees_east"" ; lon:axis = ""X"" ; lon:long_name = ""Longitude"" ; lon:standard_name = ""longitude"" ; double lon_bnds(lon, bnds) ; lon_bnds:_FillValue = NaN ; float ts(time, lat, lon) ; ts:_FillValue = 1.e+20f ; ts:standard_name = ""surface_temperature"" ; ts:long_name = ""Surface Temperature"" ; ts:comment = ""Temperature of the lower boundary of the atmosphere"" ; ts:units = ""K"" ; ts:original_name = ""mo: (stash: m01s00i024, lbproc: 128)"" ; ts:cell_methods = ""area: time: mean"" ; ts:cell_measures = ""area: areacella"" ; ts:history = ""2020-01-13T10:05:19Z altered by CMOR: replaced missing value flag (-1.07374e+09) with standard missing value (1e+20)."" ; ts:missing_value = 1.e+20f ; ``` shows that ``_FillValue`` has been added to `time`, `time_bnds`, `lat`, `lat_bnds`, `lon` and `lon_bnds` **Environment**:
Output of xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 (default, Mar 27 2019, 16:54:48) [Clang 4.0.1 (tags/RELEASE_401/final)] python-bits: 64 OS: Darwin OS-release: 18.7.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.10.5 libnetcdf: 4.6.3 xarray: 0.18.2 pandas: 1.1.3 numpy: 1.19.2 scipy: None netCDF4: 1.5.4 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.4.1 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.30.0 distributed: 2.30.0 matplotlib: None cartopy: None seaborn: None numbagg: None pint: None setuptools: 54.1.1 pip: 21.0.1 conda: None pytest: 6.2.2 IPython: 7.21.0 sphinx: 1.8.1
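
The header comparison above can also be done programmatically. The following is a minimal sketch, assuming both `ncdump -h` outputs have been captured as multi-line strings; the helper name `added_header_lines` is hypothetical. It compares by set membership rather than position, because xarray may reorder attributes on output (as with `time:units` and `time:calendar` above).

```python
def added_header_lines(before, after):
    '''Return the stripped lines of `after` that occur nowhere in `before`.'''
    # Compare stripped lines so indentation differences do not matter.
    seen = {line.strip() for line in before.splitlines()}
    stripped = (line.strip() for line in after.splitlines())
    # Keep the order of `after`; skip blank lines and anything already present.
    return [line for line in stripped if line and line not in seen]
```

For the two headers in this issue, the result would be the `_FillValue = NaN` entries on `time`, `time_bnds`, `lat`, `lat_bnds`, `lon` and `lon_bnds`.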
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5448/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue