html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/3159#issuecomment-519561332,https://api.github.com/repos/pydata/xarray/issues/3159,519561332,MDEyOklzc3VlQ29tbWVudDUxOTU2MTMzMg==,1217238,2019-08-08T15:12:27Z,2019-08-08T15:12:27Z,MEMBER,"Yes, I think it would make sense to add an option to is_scalar() to
indicate whether or not 0-d arrays should be considered ""scalars"".
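As a rough sketch only (this is not the actual xarray implementation, and the `include_0d` keyword is an assumed name), such an option could look something like this:

```python
from collections.abc import Iterable

import numpy as np


def is_scalar(value, include_0d=True):
    # Hypothetical sketch; `include_0d` is an assumed parameter name.
    # When include_0d is True, anything reporting ndim == 0 counts as a scalar.
    if include_0d and getattr(value, 'ndim', None) == 0:
        return True
    # Strings and bytes are iterable, but are treated as scalars here.
    if isinstance(value, (str, bytes)):
        return True
    # Everything else is a scalar only if it is neither iterable nor an array.
    return not isinstance(value, (Iterable, np.ndarray))
```

With `include_0d=False`, `np.array(1.0)` is rejected while a plain `1.0` is still accepted, which is the distinction needed to raise on 0-d input.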
On Thu, Aug 8, 2019 at 6:44 AM Gerardo Rivera wrote:
> That's a good point. I think in this case, given that it's passed to an
> arg that expects an array, we should raise on 0d.
>
> I was expecting to rely on the current implementation of is_scalar to do
> the type checking since I'm moving _check_data_shape above
> as_compatible_data to do something like this:
>
> if utils.is_scalar(data) and coords is not None:
>
> Otherwise everything would be filtered out, since as_compatible_data returns
> a 0d array given a scalar value.
>
> https://github.com/pydata/xarray/blob/8d46bf09f20e022baca98b4050584d93c0ea118b/xarray/core/variable.py#L195-L196
>
> The only approach I can think of is copying is_scalar without the
> getattr(value, 'ndim', None) == 0 check, so that 0d arrays are filtered out
> and only true scalars are duplicated.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,472100381
https://github.com/pydata/xarray/pull/3159#issuecomment-518822084,https://api.github.com/repos/pydata/xarray/issues/3159,518822084,MDEyOklzc3VlQ29tbWVudDUxODgyMjA4NA==,1217238,2019-08-06T20:02:35Z,2019-08-06T20:02:35Z,MEMBER,"If the default value is `NaN`, we could reuse xarray's pre-existing sentinel value for NA:
https://github.com/pydata/xarray/blob/55593a8bcaf2edb79034507990eac9c55b41a07d/xarray/core/dtypes.py#L8","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,472100381
https://github.com/pydata/xarray/pull/3159#issuecomment-518761396,https://api.github.com/repos/pydata/xarray/issues/3159,518761396,MDEyOklzc3VlQ29tbWVudDUxODc2MTM5Ng==,1217238,2019-08-06T17:13:27Z,2019-08-06T17:13:27Z,MEMBER,"> * Use a scalar array
This is the case that I'm not sure we want to support.
I think the rule we want is something like ""scalar values are repeated automatically,"" but 0-dimensional arrays are kind of a strange case -- are they really scalars or multi-dimensional arrays? My inclination is to treat these like multi-dimensional arrays, in which case we should raise an error to avoid hiding errors.
In particular, one thing that an xarray user *might* expect, but which I think we don't want to support, is full [broadcasting](https://jakevdp.github.io/PythonDataScienceHandbook/02.05-computation-on-arrays-broadcasting.html) of multi-dimensional arrays to match the shape of coordinates.
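To make the distinction concrete, here is a small NumPy-only sketch (the variable names are just for illustration) contrasting scalar repetition, full broadcasting, and a 0-d array:

```python
import numpy as np

shape = (2, 3)

# A Python scalar is simply repeated to fill the requested shape.
filled = np.full(shape, 5.0)

# Full broadcasting would also stretch a 1-d array across the missing axis.
row = np.arange(3)
broadcast = np.broadcast_to(row, shape)

# A 0-d array has ndim == 0 but is still an ndarray, not a Python scalar.
zero_d = np.array(5.0)
print(zero_d.ndim, isinstance(zero_d, np.ndarray), np.isscalar(zero_d))
```

The first case is the one I think we want; the second is the implicit broadcasting that could hide shape mistakes, and the third is the ambiguous case.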
> * Use `None` to get an empty array
Rather than using `None`, I would suggest using a custom sentinel value. Somebody might actually want an array full of `None` values! If users want an empty DataArray, make them omit the argument entirely, e.g., `xr.DataArray(coords=coords, dims=dims)`.
The way we do this in xarray is with a `ReprObject`, e.g., see here for `apply_ufunc`:
https://github.com/pydata/xarray/blob/1757dffac2fa493d7b9a074b84cf8c830a706688/xarray/core/computation.py#L26
https://github.com/pydata/xarray/blob/1757dffac2fa493d7b9a074b84cf8c830a706688/xarray/core/computation.py#L692
There is also the question of what values should be inside such an empty array. Here I think there are roughly two options:
1. Fill the unspecified array with `np.nan`, to indicate invalid values.
2. Just use `np.empty`, which means the array can be filled with arbitrary invalid data.
It looks like you've currently implemented option (2), but again I'm not sure that is the most sensible default behavior for xarray. The performance gains from not filling in array values with a constant are typically *very* small (writing constant values into memory is very fast). Pandas also seems to use `NaN` as the default value:
```
>>> pandas.Series(index=[1, 2])
1   NaN
2   NaN
dtype: float64
```
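For reference, the two options correspond roughly to the following NumPy calls (a minimal sketch; the shape is arbitrary):

```python
import numpy as np

shape = (2, 3)

# Option 1: every element is NaN, so missing data is explicit.
nan_filled = np.full(shape, np.nan)

# Option 2: memory is allocated but not initialised; contents are arbitrary.
uninitialised = np.empty(shape)

print(nan_filled)
```

Only the first gives reproducible contents; whatever `np.empty` returns depends on what happened to be in memory.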
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,472100381