html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/pull/4314#issuecomment-676539274,https://api.github.com/repos/pydata/xarray/issues/4314,676539274,MDEyOklzc3VlQ29tbWVudDY3NjUzOTI3NA==,5635139,2020-08-19T16:47:02Z,2020-08-19T16:47:02Z,MEMBER,"I just saw I was up for reviewing this — sorry for missing that. Looks good. @keewis 's suggestion for the back-compat code looks like a nice refinement — it includes a version check which will ease disentangling it. Could we add that and then merge?","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,673757961 https://github.com/pydata/xarray/pull/4314#issuecomment-673727756,https://api.github.com/repos/pydata/xarray/issues/4314,673727756,MDEyOklzc3VlQ29tbWVudDY3MzcyNzc1Ng==,5635139,2020-08-13T21:50:15Z,2020-08-13T21:50:15Z,MEMBER,"There are some duplicated commits — 3x `Recreate gaj...` — though still weird that GH and git disagree. I'd suggest the squash and push -f... ``` * e23529e0 - (HEAD -> astypekeepattrs, dnowacki-usgs/astypekeepattrs, pr/dnowacki-usgs/4314) Merge branch 'astypekeepattrs' of https://github.com/dnowacki-usgs/xarray into astypekeepattrs (3 hours ago) |\ | * ac42be80 - improve error handling and check if sparse array (6 days ago) | * e2d38550 - fix dtype test (7 days ago) | * e079cf86 - Merge branch 'astypekeepattrs' of https://github.com/dnowacki-usgs/xarray into astypekeepattrs (7 days ago) | |\ | | * 1296f590 - Merge branch 'master' into pr/dnowacki-usgs/4314 (8 days ago) | | |\ | | * | 5c101d85 - Accept keep_attrs flag, use apply_ufunc in variable.py, expand tests (8 days ago) | | * | d75e1fb9 - Recreate @gajomi's #2070 to keep attrs when calling astype() (8 days ago) | * | | eb86c8a6 - add to whats new (7 days ago) | * | | db014baf - Ignore casting if we get a TypeError (indicates sparse <= 0.10.0) (7 days ago) | * | | 01d1b378 - Accept keep_attrs flag, use apply_ufunc in variable.py, expand tests (7 days ago) | * | | 00bfeaa3 - Recreate @gajomi's #2070 to keep attrs when calling astype() (7 days ago) * | | | dba93b18 - improve error handling and check if sparse array (3 hours ago) * | | | 4856eec5 - fix dtype test (3 hours ago) * | | | 983f17d5 - add to whats new (3 hours ago) * | | | 9f9b06c6 - Ignore casting if we get a TypeError (indicates sparse <= 0.10.0) (3 hours ago) * | | | a277c2a4 - Accept keep_attrs flag, use apply_ufunc in variable.py, expand tests (3 hours ago) * | | | 833e444f - Recreate @gajomi's #2070 to keep attrs when calling astype() (3 hours ago) * | | | 7ee97989 - Accept keep_attrs flag, use apply_ufunc in variable.py, expand tests (4 hours ago) * | | | dd4f5494 - Recreate @gajomi's #2070 to keep attrs when calling astype() (4 hours ago) * | | | 7daad4fc - (upstream/master, upstream/HEAD, origin/master, master) Implement interp for interpolating between chunks of data (dask) (#4155) (2 days ago) * | | | d514b120 - Add @mathause to current core developers. (#4335) (2 days ago) * | | | ceadecd5 - install sphinx-autosummary-accessors from conda-forge (#4332) (3 days ago) * | | | 1791c3b6 - Use sphinx-accessors-autosummary (#4323) (4 days ago) * | | | df7b2eae - ndrolling fixes (#4329) (4 days ago) * | | | f02ca537 - DOC: fix typo argmin -> argmax in DataArray.argmax docstring (#4327) (5 days ago) * | | | 6e9edff5 - pin sphinx to 3.1(#4326) (5 days ago) * | | | 1d3dee08 - nd-rolling (#4219) (6 days ago) |/ / / * | | e04e21d6 - Implicit dask import 4164 (#4318) (7 days ago) * | | 7ba19e12 - allow customizing the inline repr of a duck array (#4248) (7 days ago) | |/ |/| * | 98dc1f4e - silence the known docs CI issues (#4316) (8 days ago) * | ed6fff2c - enh: fixed #4302 (#4315) (8 days ago) * | 1101eca6 - Remove all unused and warn-raising methods from AbstractDataStore (#4310) (8 days ago) |/ * e1dafe67 - (dnowacki-usgs/master) Fix map_blocks example (#4305) (10 days ago) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,673757961 https://github.com/pydata/xarray/pull/4314#issuecomment-673724685,https://api.github.com/repos/pydata/xarray/issues/4314,673724685,MDEyOklzc3VlQ29tbWVudDY3MzcyNDY4NQ==,5635139,2020-08-13T21:41:40Z,2020-08-13T21:41:40Z,MEMBER,"The weirdest part is that the local diff is fine; here's the output of `git --no-pager diff upstream/master`. Is it maybe because one of the merge commits was mine rather than @dnowacki-usgs ? Sorry if I contributed to this. One override is to apply this patch on master and force push this branch. I'm happy to do it if @dnowacki-usgs agrees.
```diff diff --git a/doc/whats-new.rst b/doc/whats-new.rst index 793f72fb..4b7b117b 100644 --- a/doc/whats-new.rst +++ b/doc/whats-new.rst @@ -21,7 +21,9 @@ v0.16.1 (unreleased) Breaking changes ~~~~~~~~~~~~~~~~ - +- :py:meth:`DataArray.astype` and :py:meth:`Dataset.astype` now preserve attributes. Keep the + old behavior by passing `keep_attrs=False` (:issue:`2049`, :pull:`4314`). + By `Dan Nowacki `_ and `Gabriel Joel Mitchell `_. New Features ~~~~~~~~~~~~ diff --git a/xarray/core/common.py b/xarray/core/common.py index bc5035b6..7b6fc13c 100644 --- a/xarray/core/common.py +++ b/xarray/core/common.py @@ -1305,6 +1305,48 @@ class DataWithCoords(SupportsArithmetic, AttrAccessMixin): dask=""allowed"", ) + def astype(self, dtype, casting=""unsafe"", copy=True, keep_attrs=True): + """""" + Copy of the xarray object, with data cast to a specified type. + Leaves coordinate dtype unchanged. + + Parameters + ---------- + dtype : str or dtype + Typecode or data-type to which the array is cast. + casting : {'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional + Controls what kind of data casting may occur. Defaults to 'unsafe' + for backwards compatibility. + + * 'no' means the data types should not be cast at all. + * 'equiv' means only byte-order changes are allowed. + * 'safe' means only casts which can preserve values are allowed. + * 'same_kind' means only safe casts or casts within a kind, + like float64 to float32, are allowed. + * 'unsafe' means any data conversions may be done. + copy : bool, optional + By default, astype always returns a newly allocated array. If this + is set to False and the `dtype` requirement is satisfied, the input + array is returned instead of a copy. + keep_attrs : bool, optional + By default, astype keeps attributes. Set to False to remove + attributes in the returned object. + + See also + -------- + np.ndarray.astype + dask.array.Array.astype + """""" + from .computation import apply_ufunc + + return apply_ufunc( + duck_array_ops.astype, + self, + kwargs=dict(dtype=dtype, casting=casting, copy=copy), + keep_attrs=keep_attrs, + dask=""allowed"", + ) + def __enter__(self: T) -> T: return self diff --git a/xarray/core/duck_array_ops.py b/xarray/core/duck_array_ops.py index 3d192882..7b28d396 100644 --- a/xarray/core/duck_array_ops.py +++ b/xarray/core/duck_array_ops.py @@ -149,6 +149,33 @@ masked_invalid = _dask_or_eager_func( ) +def astype(data, **kwargs): + try: + return data.astype(**kwargs) + except TypeError as e: + # FIXME: This should no longer be necessary in future versions of sparse + # Current versions of sparse (v0.10.0) don't support the ""casting"" kwarg + # This was fixed by https://github.com/pydata/sparse/pull/392 + try: + import sparse + except ImportError: + sparse = None + if ( + ""got an unexpected keyword argument 'casting'"" in repr(e) + and sparse is not None + and isinstance(data, sparse._coo.core.COO) + ): + warnings.warn( + ""The current version of sparse does not support the 'casting' argument. It will be ignored in the call to astype()."", + RuntimeWarning, + stacklevel=4, + ) + kwargs.pop(""casting"") + else: + raise e + return data.astype(**kwargs) + + def asarray(data, xp=np): return ( data diff --git a/xarray/core/ops.py b/xarray/core/ops.py index 36753179..dad37f3b 100644 --- a/xarray/core/ops.py +++ b/xarray/core/ops.py @@ -42,7 +42,7 @@ NUM_BINARY_OPS = [ NUMPY_SAME_METHODS = [""item"", ""searchsorted""] # methods which don't modify the data shape, so the result should still be # wrapped in an Variable/DataArray -NUMPY_UNARY_METHODS = [""astype"", ""argsort"", ""clip"", ""conj"", ""conjugate""] +NUMPY_UNARY_METHODS = [""argsort"", ""clip"", ""conj"", ""conjugate""] PANDAS_UNARY_FUNCTIONS = [""isnull"", ""notnull""] # methods which remove an axis REDUCE_METHODS = [""all"", ""any""] diff --git a/xarray/core/variable.py b/xarray/core/variable.py index 1f86a403..6d53a4fe 100644 --- a/xarray/core/variable.py +++ b/xarray/core/variable.py @@ -360,6 +360,47 @@ class Variable( ) self._data = data + def astype(self, dtype, casting=""unsafe"", copy=True, keep_attrs=True): + """""" + Copy of the Variable object, with data cast to a specified type. + + Parameters + ---------- + dtype : str or dtype + Typecode or data-type to which the array is cast. + casting : {'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional + Controls what kind of data casting may occur. Defaults to 'unsafe' + for backwards compatibility. + + * 'no' means the data types should not be cast at all. + * 'equiv' means only byte-order changes are allowed. + * 'safe' means only casts which can preserve values are allowed. + * 'same_kind' means only safe casts or casts within a kind, + like float64 to float32, are allowed. + * 'unsafe' means any data conversions may be done. + copy : bool, optional + By default, astype always returns a newly allocated array. If this + is set to False and the `dtype` requirement is satisfied, the input + array is returned instead of a copy. + keep_attrs : bool, optional + By default, astype keeps attributes. Set to False to remove + attributes in the returned object. + + See also + -------- + np.ndarray.astype + dask.array.Array.astype + """""" + from .computation import apply_ufunc + + return apply_ufunc( + duck_array_ops.astype, + self, + kwargs=dict(dtype=dtype, casting=casting, copy=copy), + keep_attrs=keep_attrs, + dask=""allowed"", + ) + def load(self, **kwargs): """"""Manually trigger loading of this variable's data from disk or a remote source into memory and return this variable. diff --git a/xarray/tests/test_dataarray.py b/xarray/tests/test_dataarray.py index f10404d7..77969e5c 100644 --- a/xarray/tests/test_dataarray.py +++ b/xarray/tests/test_dataarray.py @@ -1874,6 +1874,19 @@ class TestDataArray: bar = Variable([""x"", ""y""], np.zeros((10, 20))) assert_equal(self.dv, np.maximum(self.dv, bar)) + def test_astype_attrs(self): + for v in [self.va.copy(), self.mda.copy(), self.ds.copy()]: + v.attrs[""foo""] = ""bar"" + assert list(v.attrs.items()) == list(v.astype(float).attrs.items()) + assert [] == list(v.astype(float, keep_attrs=False).attrs.items()) + + def test_astype_dtype(self): + original = DataArray([-1, 1, 2, 3, 1000]) + converted = original.astype(float) + assert_array_equal(original, converted) + assert np.issubdtype(original.dtype, np.integer) + assert np.issubdtype(converted.dtype, np.floating) + def test_is_null(self): x = np.random.RandomState(42).randn(5, 6) x[x < 0] = np.nan diff --git a/xarray/tests/test_dataset.py b/xarray/tests/test_dataset.py index da7621dc..9a95eb24 100644 --- a/xarray/tests/test_dataset.py +++ b/xarray/tests/test_dataset.py @@ -5607,6 +5607,17 @@ class TestDataset: np.testing.assert_equal(padded[""var1""].isel(dim2=[0, -1]).data, 42) np.testing.assert_equal(padded[""dim2""][[0, -1]].data, np.nan) + def test_astype_attrs(self): + data = create_test_data(seed=123) + data.attrs[""foo""] = ""bar"" + + assert list(data.attrs.items()) == list(data.astype(float).attrs.items()) + assert list(data.var1.attrs.items()) == list( + data.astype(float).var1.attrs.items() + ) + assert [] == list(data.astype(float, keep_attrs=False).attrs.items()) + assert [] == list(data.astype(float, keep_attrs=False).var1.attrs.items()) + # Py.test tests ```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,673757961 https://github.com/pydata/xarray/pull/4314#issuecomment-672295017,https://api.github.com/repos/pydata/xarray/issues/4314,672295017,MDEyOklzc3VlQ29tbWVudDY3MjI5NTAxNw==,5635139,2020-08-11T21:45:01Z,2020-08-11T21:45:01Z,MEMBER,Great — I think this looks good — any final thoughts?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,673757961 https://github.com/pydata/xarray/pull/4314#issuecomment-670248276,https://api.github.com/repos/pydata/xarray/issues/4314,670248276,MDEyOklzc3VlQ29tbWVudDY3MDI0ODI3Ng==,5635139,2020-08-06T23:59:16Z,2020-08-06T23:59:16Z,MEMBER,"I'm fine with reraising — though I think we should reraise, otherwise isn't it a confusing message? The try-expect is probably fine given the narrowness (very open to holding too low a standard tho); one step up would be checking it is a spare array.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,673757961 https://github.com/pydata/xarray/pull/4314#issuecomment-670157140,https://api.github.com/repos/pydata/xarray/issues/4314,670157140,MDEyOklzc3VlQ29tbWVudDY3MDE1NzE0MA==,5635139,2020-08-06T19:46:25Z,2020-08-06T19:46:25Z,MEMBER,"For sure — good idea to defer to numpy rather than raise — should we do that only on existing versions though? Otherwise we won't roll over to the new version when they do release their next version? (the version check can be a single line, there are lots of examples in the repo, lmk if that's not clear)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,673757961 https://github.com/pydata/xarray/pull/4314#issuecomment-670095439,https://api.github.com/repos/pydata/xarray/issues/4314,670095439,MDEyOklzc3VlQ29tbWVudDY3MDA5NTQzOQ==,5635139,2020-08-06T18:18:27Z,2020-08-06T18:18:27Z,MEMBER,What do you think about reraising the error with a message about upgrading? Is this a narrow enough use case that we can avoid backward-compat fix in xarray?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,673757961 https://github.com/pydata/xarray/pull/4314#issuecomment-669610153,https://api.github.com/repos/pydata/xarray/issues/4314,669610153,MDEyOklzc3VlQ29tbWVudDY2OTYxMDE1Mw==,5635139,2020-08-06T00:16:33Z,2020-08-06T00:16:33Z,MEMBER,"I'm not sure re the test at first glance, I don't see how those tests are related, but master seems to pass. I pushed a merge to master as a check — hope that's OK. Let's see if that sheds any light","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,673757961