issue_comments: 1076832515
html_url: https://github.com/pydata/xarray/issues/6149#issuecomment-1076832515
issue_url: https://api.github.com/repos/pydata/xarray/issues/6149
node_id: IC_kwDOAMm_X85ALykD · user: 25624127 · author_association: CONTRIBUTOR
created_at: 2022-03-23T21:21:28Z · updated_at: 2022-03-23T21:21:28Z

I was able to isolate the part of my code that causes this numpy warning to be thrown.

Here's how the warning was thrown:

1. Chunk a Dataset using Dask.
2. Calculate the length of time by subtracting the upper and lower bounds of each time coordinate (dtype `datetime64[ns]`). This produces a DataArray with a dtype of `timedelta64[ns]`.
3. Calculate the weights using xarray's grouped arithmetic.
4. The numpy warning is thrown after xarray's `groupby.sum()` call: https://github.com/numpy/numpy/blob/c63f2c2bab48e0446b34f3bba5574729327d68d1/numpy/core/fromnumeric.py#L86
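Step 2 can be sketched with plain numpy (the bound values below are hypothetical, not from the original dataset):

```python
import numpy as np

# Hypothetical lower/upper time bounds with the same dtype as in the issue.
lower = np.array(["2000-01-01", "2000-02-01"], dtype="datetime64[ns]")
upper = np.array(["2000-02-01", "2000-03-01"], dtype="datetime64[ns]")

# Subtracting datetime64[ns] arrays yields timedelta64[ns].
time_lengths = upper - lower
print(time_lengths.dtype)  # timedelta64[ns]
```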

My workaround is to convert the dtype of the underlying dask.array from `timedelta64[ns]` to `float64` before performing the grouped arithmetic. I'm still unclear why this only affects dask.array, though.
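The cast itself can be illustrated with numpy alone (hypothetical values; the casted numbers become nanosecond counts, so the ratios used for weighting are preserved):

```python
import numpy as np

# A single 31-day time length as timedelta64[ns], as produced by
# subtracting datetime64[ns] time bounds.
lengths = np.array([31], dtype="timedelta64[D]").astype("timedelta64[ns]")

# The workaround: cast to float64 before the grouped arithmetic.
as_float = lengths.astype(np.float64)
print(as_float)  # nanosecond counts as floats
```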

Related code:

```python
def _get_weights(self, data_var: xr.DataArray) -> xr.DataArray:
    """Calculates weights for a data variable using time bounds."""
    with xr.set_options(keep_attrs=True):
        time_lengths: xr.DataArray = self._time_bounds[:, 1] - self._time_bounds[:, 0]

    # Must convert the dtype from timedelta64[ns] to float64, specifically
    # when chunking DataArrays using Dask. Otherwise, the numpy warning
    # below is thrown: `DeprecationWarning: The `dtype` and `signature`
    # arguments to ufuncs only select the general DType and not details such
    # as the byte order or time unit (with rare exceptions see release
    # notes). To avoid this warning please use the scalar types
    # `np.float64`, or string notation.`
    time_lengths = time_lengths.astype(np.float64)
    time_grouped = self._groupby_multiindex(time_lengths)
    weights: xr.DataArray = time_grouped / time_grouped.sum()  # type: ignore

    self._validate_weights(data_var, weights)
    return weights

def _groupby_multiindex(self, data_var: xr.DataArray) -> DataArrayGroupBy:
    """Adds the MultiIndex to the data variable and groups by it."""
    dv = data_var.copy()
    dv.coords[self._multiindex_name] = ("time", self._multiindex)
    dv_gb = dv.groupby(self._multiindex_name)
    return dv_gb
```
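The grouped division in `_get_weights` normalizes each group's time lengths into weights that sum to 1. The arithmetic for a single group can be sketched with numpy alone (hypothetical lengths, in days):

```python
import numpy as np

# Hypothetical time lengths for the members of one group.
group_lengths = np.array([31.0, 28.0, 31.0])

# Divide each length by the group total to get fractional weights.
weights = group_lengths / group_lengths.sum()
print(weights.sum())  # 1.0
```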
