id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1773886549,I_kwDOAMm_X85pu1xV,7942,Numpy raises warning in `xarray.coding.times.cast_to_int_if_safe`,132147,closed,0,,,2,2023-06-26T05:03:46Z,2023-09-17T08:15:27Z,2023-09-17T08:15:27Z,CONTRIBUTOR,,,,"### What happened?

In recent versions of numpy, calling `numpy.asarray(arr, dtype=numpy.int64)` will raise a warning if the input array contains `numpy.nan` values. This line of code is used in `xarray.coding.times.cast_to_int_if_safe(num)`:

```python
def cast_to_int_if_safe(num) -> np.ndarray:
    int_num = np.asarray(num, dtype=np.int64)
    if (num == int_num).all():
        num = int_num
    return num
```

The function still returns the correct True/False values regardless of the warning.

### What did you expect to happen?

No warning to be printed

### Minimal Complete Verifiable Example

```Python
import numpy
import xarray

one_day = numpy.timedelta64(1, 'D')
nat = numpy.timedelta64('nat')

timedelta_values = (numpy.arange(5) * one_day).astype('timedelta64[ns]')
timedelta_values[2] = nat
timedelta_values[4] = nat

dataset = xarray.Dataset(data_vars={
    'timedeltas': xarray.DataArray(data=timedelta_values, dims=['x'])
})
dataset.to_netcdf('out.nc')
```

### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

```Python
$ python3 safe_cast.py
/home/hea211/projects/emsarray/.conda/lib/python3.10/site-packages/xarray/coding/times.py:618: RuntimeWarning: invalid value encountered in cast
  int_num = np.asarray(num, dtype=np.int64)

$ ncdump out.nc
netcdf out {
dimensions:
    x = 5 ;
variables:
    double timedeltas(x) ;
        timedeltas:_FillValue = NaN ;
        timedeltas:units = ""days"" ;
data:

 timedeltas = 0, 1, _, 3, _ ;
}
```

### Anything else we need to know?

I saw the `numpy.can_cast` function and tried to use that to solve the issue (see PR #7834), however this function did not do what I expected it to.

A search for other solutions to see whether an array of floating point values is representable as integers turned up [Numpy: Check if float array contains whole numbers](https://stackoverflow.com/questions/35042128/numpy-check-if-float-array-contains-whole-numbers) on Stack Overflow. There are a few solutions given in that question, although each has its drawbacks. The most complete solution appears to be [is_integer_ufunc](https://gitlab.com/madphysicist/is_integer_ufunc), which is a ufunc written in C. Unfortunately this is not installable via pip/conda, and is not included in numpy.

### Environment
In [2]: import xarray as xr
   ...: xr.show_versions()
/home/hea211/projects/emsarray/.conda/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn(""Setuptools is replacing distutils."")

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.0 (default, Mar 3 2022, 09:58:08) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-73-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_AU.UTF-8
LOCALE: ('en_AU', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.1

xarray: 2023.4.2
pandas: 2.0.1
numpy: 1.24.3
scipy: None
netCDF4: 1.6.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.4.1
distributed: 2023.4.1
matplotlib: 3.7.1
cartopy: 0.21.1
seaborn: None
numbagg: None
fsspec: 2023.5.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.6.3
pip: 22.3.1
conda: None
pytest: 7.3.1
mypy: 1.3.0
IPython: 8.12.0
sphinx: 4.3.2
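
As a possible direction for the cast check described under *What happened?*, the sketch below screens out values that cannot be represented as `int64` before casting, so the cast itself should not emit a `RuntimeWarning`. The name `cast_to_int_if_safe_quiet` is hypothetical and this is not xarray's implementation. It also assumes that returning non-float input unchanged is acceptable, which differs from the original function.

```python
import numpy as np


def cast_to_int_if_safe_quiet(num):
    # Hypothetical variant of cast_to_int_if_safe; not part of xarray.
    num = np.asarray(num)

    # Assumption: only floating point input needs the check.
    if num.dtype.kind != 'f':
        return num

    # NaN and infinity can never be represented as integers, so skip the cast.
    if not np.isfinite(num).all():
        return num

    # Values outside the int64 range would also make the cast invalid.
    # 2.0**63 is exactly representable as a float64.
    if ((num < -2.0**63) | (num >= 2.0**63)).any():
        return num

    # Every value is finite and in range, so this cast does not warn.
    int_num = num.astype(np.int64)
    if (num == int_num).all():
        return int_num
    return num
```

Arrays containing `numpy.nan` come back unchanged, which matches what the original function returns for them, since `num == int_num` is never true for a NaN element.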
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7942/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1705163672,PR_kwDOAMm_X85QQiiY,7834,Use `numpy.can_cast` instead of casting and checking,132147,closed,0,,,5,2023-05-11T06:36:06Z,2023-06-26T05:06:30Z,2023-06-26T05:06:29Z,CONTRIBUTOR,,1,pydata/xarray/pulls/7834,"In numpy >= 1.24 unsafe casting raises a RuntimeWarning for an operation that xarray does often to check if casting is safe. `numpy.can_cast` looks like an alternative approach designed for this exact case.

- [ ] Tests added
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [ ] New functions/methods are listed in `api.rst`
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7834/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1071806607,PR_kwDOAMm_X84va6lN,6049,Attempt datetime coding using cftime when pandas fails,132147,closed,0,,,2,2021-12-06T07:12:35Z,2022-01-04T00:28:15Z,2021-12-24T11:48:22Z,CONTRIBUTOR,,0,pydata/xarray/pulls/6049,"A netCDF4 dataset we use has a time variable defined as:

```
double time(time) ;
    time:axis = ""T"" ;
    time:bounds = ""time_bnds"" ;
    time:calendar = ""gregorian"" ;
    time:long_name = ""time"" ;
    time:standard_name = ""time"" ;
    time:units = ""days since 1970-01-01 00:00:00 00"" ;
```

Note the `units` attribute, specifically a timezone offset of `00` without any `+-` sign. xarray can successfully open this dataset and parse the time units, making a time variable with the expected values. However, when attempting to save this dataset (e.g. after slicing some geographic bounds or selecting a subset of variables), xarray would raise an error while trying to reformat the time `units`.

This fix applies the same logic used in the decoding step to the encoding step - specifically, attempt to use `pandas` but if that fails then use `cftime`. The decoding step catches `ValueError` to do this, but `ValueError` was not caught in the encode workflow.

- [x] Tests added
- [x] Passes `pre-commit run --all-files`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6049/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull