issues: 538068264

id: 538068264
node_id: MDU6SXNzdWU1MzgwNjgyNjQ=
number: 3624
title: Issue serializing arrays of times with certain dtype and _FillValue encodings
user: 6628425
state: closed
locked: 0
comments: 0
created_at: 2019-12-15T15:44:08Z
updated_at: 2020-01-15T15:22:30Z
closed_at: 2020-01-15T15:22:30Z
author_association: MEMBER
reactions: 0
state_reason: completed
repo: 13221727
type: issue

MCVE Code Sample

```
In [1]: import numpy as np; import pandas as pd; import xarray as xr

In [2]: times = pd.date_range('2000', periods=3)

In [3]: da = xr.DataArray(times, dims=['a'], coords=[[1, 2, 3]], name='foo')

In [4]: da.encoding['_FillValue'] = 1.0e20

In [5]: da.encoding['dtype'] = np.dtype('float64')

In [6]: da.to_dataset().to_netcdf('test.nc')
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-6-cbc6b2cfdf9a> in <module>
----> 1 da.to_dataset().to_netcdf('test.nc')

~/Software/xarray/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf)
   1548             unlimited_dims=unlimited_dims,
   1549             compute=compute,
-> 1550             invalid_netcdf=invalid_netcdf,
   1551         )
   1552

~/Software/xarray/xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf)
   1071         # to be parallelized with dask
   1072         dump_to_store(
-> 1073             dataset, store, writer, encoding=encoding, unlimited_dims=unlimited_dims
   1074         )
   1075         if autoclose:

~/Software/xarray/xarray/backends/api.py in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
   1117         variables, attrs = encoder(variables, attrs)
   1118
-> 1119     store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
   1120
   1121

~/Software/xarray/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    291             writer = ArrayWriter()
    292
--> 293         variables, attributes = self.encode(variables, attributes)
    294
    295         self.set_attributes(attributes)

~/Software/xarray/xarray/backends/common.py in encode(self, variables, attributes)
    380         # All NetCDF files get CF encoded by default, without this attempting
    381         # to write times, for example, would fail.
--> 382         variables, attributes = cf_encoder(variables, attributes)
    383         variables = {k: self.encode_variable(v) for k, v in variables.items()}
    384         attributes = {k: self.encode_attribute(v) for k, v in attributes.items()}

~/Software/xarray/xarray/conventions.py in cf_encoder(variables, attributes)
    758     _update_bounds_encoding(variables)
    759
--> 760     new_vars = {k: encode_cf_variable(v, name=k) for k, v in variables.items()}
    761
    762     # Remove attrs from bounds variables (issue #2921)

~/Software/xarray/xarray/conventions.py in <dictcomp>(.0)
    758     _update_bounds_encoding(variables)
    759
--> 760     new_vars = {k: encode_cf_variable(v, name=k) for k, v in variables.items()}
    761
    762     # Remove attrs from bounds variables (issue #2921)

~/Software/xarray/xarray/conventions.py in encode_cf_variable(var, needs_copy, name)
    248         variables.UnsignedIntegerCoder(),
    249     ]:
--> 250         var = coder.encode(var, name=name)
    251
    252     # TODO(shoyer): convert all of these to use coders, too:

~/Software/xarray/xarray/coding/variables.py in encode(self, variable, name)
    163         if fv is not None:
    164             # Ensure _FillValue is cast to same dtype as data's
--> 165             encoding["_FillValue"] = data.dtype.type(fv)
    166         fill_value = pop_to(encoding, attrs, "_FillValue", name=name)
    167         if not pd.isnull(fill_value):

OverflowError: Python int too large to convert to C long
```

Expected Output

I think this should succeed in writing to a netCDF file (it worked in xarray 0.14.0 and earlier).

Problem Description

I think this (admittedly very subtle) issue was introduced in https://github.com/pydata/xarray/pull/3502. Essentially, at the time the data enters CFMaskCoder.encode it does not necessarily have the dtype it will ultimately be encoded with. In this example, the data has dtype int64 in memory, but when stored in the netCDF file it will be a double-precision float.
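
For reference, the failing cast from CFMaskCoder.encode (line 165 in the traceback above) can be reproduced in isolation; this is a minimal sketch using the same fill value as the MCVE:

```
import numpy as np

fv = 1.0e20  # the _FillValue from the MCVE

# Casting to the intended on-disk dtype is fine:
np.dtype('float64').type(fv)  # -> 1e+20

# Casting to the in-memory dtype of the encoded times (int64) overflows,
# because 1e20 exceeds the range of a 64-bit integer:
np.dtype('int64').type(fv)    # OverflowError: Python int too large to convert to C long
```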

A possible solution might be to rely on encoding['dtype'] (if it exists) to determine the type to cast the '_FillValue' and 'missing_value' encoding values to, rather than relying solely on data.dtype (which could remain as a fallback).
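
A minimal sketch of that idea, assuming a hypothetical helper (_cast_fill_value is not xarray API; the real logic would live inside CFMaskCoder.encode):

```
import numpy as np

def _cast_fill_value(fv, encoding, data_dtype):
    # Hypothetical helper: prefer the on-disk dtype from encoding['dtype'],
    # falling back to the in-memory data dtype when no target dtype is set.
    target = np.dtype(encoding.get('dtype', data_dtype))
    return target.type(fv)

# With the MCVE's encodings this succeeds, since encoding['dtype'] is
# float64 even though the in-memory time data is int64:
_cast_fill_value(1.0e20, {'dtype': np.dtype('float64')}, np.dtype('int64'))  # -> 1e+20
```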

cc: @spencerahill

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 | packaged by conda-forge | (default, Dec 6 2019, 08:36:57) [Clang 9.0.0 (tags/RELEASE_900/final)]
python-bits: 64
OS: Darwin
OS-release: 19.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.1

xarray: master
pandas: 0.25.3
numpy: 1.17.3
scipy: 1.3.2
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.9.0
distributed: 2.9.0
matplotlib: 3.1.2
cartopy: None
seaborn: None
numbagg: None
setuptools: 42.0.2.post20191203
pip: 19.3.1
conda: None
pytest: 5.3.2
IPython: 7.10.1
sphinx: None
```
