id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1913983402,I_kwDOAMm_X85yFRGq,8233,numbagg & flox,5635139,closed,0,,,13,2023-09-26T17:33:32Z,2023-10-15T07:48:56Z,2023-10-09T15:40:29Z,MEMBER,,,,"### What is your issue?

I've been doing some work recently on our old friend [numbagg](https://github.com/numbagg/numbagg), improving the ewm routines & adding some more.

I'm keen to get numbagg back in shape, doing the things that it does best, and trimming anything it doesn't. I notice that it has [grouped calcs](https://github.com/numbagg/numbagg/blob/main/numbagg/grouped.py). Am I correct to think that [flox](https://github.com/xarray-contrib/flox) does this better? I haven't been up with the latest. flox looks like it's particularly focused on dask arrays, whereas [numpy_groupies](https://github.com/ml31415/numpy-groupies), one of the inspirations for this, was applicable to numpy arrays too.
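
For context, a grouped calculation here means reducing values by an integer group label. A minimal NumPy-only sketch of the idea follows; it illustrates the operation rather than the numbagg or flox APIs, and the array names are made up for the example:

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
labels = np.array([0, 1, 0, 2, 1])  # group id for each value

# bincount accumulates the weights per label, giving per-group sums;
# without weights it gives per-group counts
n_groups = labels.max() + 1
group_sums = np.bincount(labels, weights=values, minlength=n_groups)
group_counts = np.bincount(labels, minlength=n_groups)
group_means = group_sums / group_counts
```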

At least from the xarray perspective, are we OK to deprecate these numbagg functions, and direct folks to flox?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8233/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
365973662,MDU6SXNzdWUzNjU5NzM2NjI=,2459,Stack + to_array before to_xarray is much faster than a simple to_xarray,5635139,closed,0,,,13,2018-10-02T16:13:26Z,2020-07-02T20:39:01Z,2020-07-02T20:39:01Z,MEMBER,,,,"I was seeing some slow performance around `to_xarray()` on MultiIndexed series, and found that unstacking one of the dimensions before running `to_xarray()`, and then restacking with `to_array()` was ~30x faster. The speedup is consistent at larger data sizes.

To reproduce:

Create a series with a MultiIndex, ensuring the MultiIndex isn't a simple product:

```python
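import numpy as np
import pandas as pd

s = pd.Series(
    np.random.rand(100000),
    index=pd.MultiIndex.from_product([
        list('abcdefhijk'),
        list('abcdefhijk'),
        # pd.DatetimeIndex(start=...) no longer works in recent pandas; use date_range
        pd.date_range(start='2000-01-01', periods=1000, freq='B'),
    ]))

# Keep every third element so the index is no longer a full product
cropped = s[::3]
cropped.index = pd.MultiIndex.from_tuples(cropped.index, names=list('xyz'))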

cropped.head()

# x  y  z         
# a  a  2000-01-03    0.993989
#      2000-01-06    0.850518
#      2000-01-11    0.068944
#      2000-01-14    0.237197
#      2000-01-19    0.784254
# dtype: float64
```

Two approaches for getting this into xarray:
1 - Simple `.to_xarray()`:

```python
current_version = cropped.to_xarray()

<xarray.DataArray (x: 10, y: 10, z: 1000)>
array([[[0.993989,      nan, ...,      nan, 0.721663],
        [     nan,      nan, ..., 0.58224 ,      nan],
        ...,
        [     nan, 0.369382, ...,      nan,      nan],
        [0.98558 ,      nan, ...,      nan, 0.403732]],

       [[     nan,      nan, ..., 0.493711,      nan],
        [     nan, 0.126761, ...,      nan,      nan],
        ...,
        [0.976758,      nan, ...,      nan, 0.816612],
        [     nan,      nan, ..., 0.982128,      nan]],

       ...,

       [[     nan, 0.971525, ...,      nan,      nan],
        [0.146774,      nan, ...,      nan, 0.419806],
        ...,
        [     nan,      nan, ..., 0.700764,      nan],
        [     nan, 0.502058, ...,      nan,      nan]],

       [[0.246768,      nan, ...,      nan, 0.079266],
        [     nan,      nan, ..., 0.802297,      nan],
        ...,
        [     nan, 0.636698, ...,      nan,      nan],
        [0.025195,      nan, ...,      nan, 0.629305]]])
Coordinates:
  * x        (x) object 'a' 'b' 'c' 'd' 'e' 'f' 'h' 'i' 'j' 'k'
  * y        (y) object 'a' 'b' 'c' 'd' 'e' 'f' 'h' 'i' 'j' 'k'
  * z        (z) datetime64[ns] 2000-01-03 2000-01-04 ... 2003-10-30 2003-10-31
```

This takes *536 ms*

2 - Unstack in pandas first, and then use `to_array` to do the equivalent of a restack:

```python
proposed_version = (
    cropped
    .unstack('y')
    .to_xarray()
    .to_array('y')
)
```

This takes *17.3 ms*

To confirm these are identical:

```python
proposed_version_adj = (
    proposed_version
    .assign_coords(y=proposed_version['y'].astype(object))
    .transpose(*current_version.dims)
)

proposed_version_adj.equals(current_version)
# True
```
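
The comparison above can be reproduced end to end with a sketch along these lines. It assumes a recent pandas and xarray are installed; the `timed` helper and the variable names are illustrative only, and the timings will differ from the figures quoted above depending on versions and hardware:

```python
import time

import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [
        list('abcdefhijk'),
        list('abcdefhijk'),
        pd.date_range('2000-01-01', periods=1000, freq='B'),
    ],
    names=list('xyz'),
)
s = pd.Series(np.random.rand(len(index)), index=index)
cropped = s.iloc[::3]  # no longer a full product of the levels


def timed(fn):
    # Return the result of fn() and the wall-clock seconds it took
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


direct, t_direct = timed(lambda: cropped.to_xarray())
via_unstack, t_unstack = timed(
    lambda: cropped.unstack('y').to_xarray().to_array('y')
)

# Align dimension order before comparing; NaNs mark the missing entries
via_unstack = via_unstack.transpose(*direct.dims)
same = np.allclose(direct.values, via_unstack.values, equal_nan=True)

print(f'direct: {t_direct:.4f}s, unstack route: {t_unstack:.4f}s, same values: {same}')
```

Comparing raw values with `equal_nan=True` sidesteps the coordinate dtype difference that the `astype(object)` step above works around.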

#### Problem description

A default operation is much slower than a (potentially) equivalent operation that's not the default.

I need to look further into what's causing this. I suspect the `.reindex(full_idx)` call, but I'm unclear why the alternative route is so much faster, and whether there's a fix we can make so that the default path is fast.


#### Output of ``xr.show_versions()``

<details>
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.14.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.93-linuxkit-aufs
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.utf8
LOCALE: None.None

xarray: 0.10.9
pandas: 0.23.4
numpy: 1.15.2
scipy: 1.1.0
netCDF4: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
PseudonetCDF: None
rasterio: None
iris: None
bottleneck: 1.2.1
cyordereddict: None
dask: None
distributed: None
matplotlib: 2.2.3
cartopy: 0.16.0
seaborn: 0.9.0
setuptools: 40.4.3
pip: 18.0
conda: None
pytest: 3.8.1
IPython: 5.8.0
sphinx: None
</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2459/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
115210260,MDU6SXNzdWUxMTUyMTAyNjA=,645,Display of PeriodIndex,5635139,closed,0,,,13,2015-11-05T05:01:22Z,2015-12-30T05:59:05Z,2015-12-30T05:59:05Z,MEMBER,,,,"Not the greatest issue but:
While coordinates given as a `PeriodIndex` are stored in that form, the `DataArray` repr shows their integer (`int64`) representation, so seeing which dates we're dealing with takes an extra step each time.

Or correct me if I'm making some basic mistake.

``` python
In [23]:

data_array = xray.DataArray(
    data=pd.Series(np.random.rand(20), index=pd.period_range(start='2000', periods=20, name='Date'))
)
data_array
Out[23]:
<xray.DataArray (Date: 20)>
array([ 0.95861189,  0.3607297 ,  0.9890032 ,  0.77674314,  0.39461886,
        0.98425749,  0.79044973,  0.81376587,  0.07091318,  0.02757213,
        0.87366025,  0.0496346 ,  0.45433931,  0.3339866 ,  0.67261248,
        0.91684965,  0.60889737,  0.33469611,  0.94966724,  0.50328461])
Coordinates:
  * Date     (Date) int64 10957 10958 10959 10960 10961 10962 10963 10964 ...

In [25]:

data_array.to_series()
Out[25]:
Date
2000-01-01    0.958612
2000-01-02    0.360730
2000-01-03    0.989003
2000-01-04    0.776743
2000-01-05    0.394619
2000-01-06    0.984257
2000-01-07    0.790450
2000-01-08    0.813766
2000-01-09    0.070913
2000-01-10    0.027572
2000-01-11    0.873660
2000-01-12    0.049635
2000-01-13    0.454339
2000-01-14    0.333987
2000-01-15    0.672612
2000-01-16    0.916850
2000-01-17    0.608897
2000-01-18    0.334696
2000-01-19    0.949667
2000-01-20    0.503285
Freq: D, dtype: float64
```
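
One possible workaround, sketched here on the assumption that timestamps are an acceptable stand-in for periods: convert the `PeriodIndex` with `to_timestamp()` before building the array, so that the coordinate is held as `datetime64`. The example uses the current `xarray` import name rather than `xray`, and its variable names are illustrative:

```python
import numpy as np
import pandas as pd
import xarray as xr

periods = pd.period_range(start='2000', periods=20, freq='D', name='Date')

# Swap the PeriodIndex for a DatetimeIndex (start of each period)
dates = periods.to_timestamp()

data_array = xr.DataArray(
    np.random.rand(20),
    coords={'Date': dates},
    dims='Date',
)
print(data_array['Date'].values[:3])
```

This loses the frequency information that a `PeriodIndex` carries, so it only helps with display.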
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/645/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue