issue_comments: 1008312049

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/pull/5876#issuecomment-1008312049 https://api.github.com/repos/pydata/xarray/issues/5876 1008312049 IC_kwDOAMm_X848GZ7x 6628425 2022-01-09T14:51:33Z 2022-01-09T17:53:08Z MEMBER

@dcherian @mathause @andersy005 given a DataArray of times with missing values, do you have any thoughts on what the preferred result of da.dt.season would be?

One option would be to return np.nan in place of the missing time values:

```
In [3]: times = [np.datetime64("NaT"), np.datetime64("2000-01-01")]

In [4]: da = xr.DataArray(times, dims=["x"])

In [5]: da.dt.season
Out[5]:
<xarray.DataArray 'season' (x: 2)>
array([nan, 'DJF'], dtype=object)
Dimensions without coordinates: x
```

This would be somewhat in line with how pandas handles this in other contexts (e.g. https://github.com/pydata/xarray/pull/5876#discussion_r734447120). But this sort of awkwardly returns a DataArray of mixed types. Another option, and this is how @pierreloicq implemented things originally, would be to simply return a string label for missing values, e.g.

    In [5]: da.dt.season
    Out[5]:
    <xarray.DataArray 'season' (x: 2)>
    array(['nan', 'DJF'], dtype='<U32')
    Dimensions without coordinates: x

As I've thought about this more, this feels nicer, because the types are consistent and "season" is just a category label, and "nan" categorizes these values just as well as np.nan. Do you agree?
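To make the trade-off concrete, here is a minimal plain-Python sketch of the two options. It is only an illustration, not xarray's implementation: `season_labels` and `SEASONS` are hypothetical names, `None` stands in for `NaT`, and the month-to-season lookup assumes the usual DJF/MAM/JJA/SON grouping.

```python
import math
from datetime import datetime

# Assumed meteorological grouping: Dec-Feb, Mar-May, Jun-Aug, Sep-Nov.
SEASONS = ("DJF", "MAM", "JJA", "SON")


def season_labels(times, missing):
    """Return one season label per time; `missing` is used where the time is None.

    Hypothetical helper for illustration only; None plays the role of NaT.
    """
    return [
        missing if t is None else SEASONS[(t.month // 3) % 4]
        for t in times
    ]


times = [None, datetime(2000, 1, 1)]

# Option 1: a float NaN marks the missing time, so the result mixes float and str.
mixed = season_labels(times, missing=math.nan)

# Option 2: a string label marks the missing time, so every element is a str.
uniform = season_labels(times, missing="nan")

print(sorted({type(v).__name__ for v in mixed}))    # ['float', 'str']
print(sorted({type(v).__name__ for v in uniform}))  # ['str']
print(uniform)                                      # ['nan', 'DJF']
```

With the string label, downstream code can treat every element as a category name without special-casing a float, which is the consistency argument made above.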

{
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1030492705