html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/7221#issuecomment-1294262457,https://api.github.com/repos/pydata/xarray/issues/7221,1294262457,IC_kwDOAMm_X85NJOC5,1217238,2022-10-28T00:27:22Z,2022-10-28T00:27:22Z,MEMBER,"I no longer remember why I added these checks, but I certainly did not expect to see this sort of performance penalty!","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198
https://github.com/pydata/xarray/pull/7221#issuecomment-1293860075,https://api.github.com/repos/pydata/xarray/issues/7221,1293860075,IC_kwDOAMm_X85NHrzr,4160723,2022-10-27T17:40:52Z,2022-10-27T17:40:52Z,MEMBER,"Thanks @hmaarrfk!

> I haven't fully understood why we had that code though?

Me neither. I don't remember ever seeing this assertion error raised while refactoring things. Any idea @shoyer?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198
https://github.com/pydata/xarray/pull/7221#issuecomment-1293815240,https://api.github.com/repos/pydata/xarray/issues/7221,1293815240,IC_kwDOAMm_X85NHg3I,14371165,2022-10-27T16:58:45Z,2022-10-27T16:58:45Z,MEMBER,"```
       before           after         ratio
     [c000690c]       [24753f1f]
-     3.17±0.02ms      1.94±0.01ms     0.61  merge.DatasetAddVariable.time_variable_insertion(100)
-        81.5±2ms        17.0±0.2ms     0.21  merge.DatasetAddVariable.time_variable_insertion(1000)

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
```

Nice improvements.
:) I haven't fully understood why we had that code though?","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198
https://github.com/pydata/xarray/pull/7221#issuecomment-1291948502,https://api.github.com/repos/pydata/xarray/issues/7221,1291948502,IC_kwDOAMm_X85NAZHW,90008,2022-10-26T12:19:49Z,2022-10-26T12:23:46Z,CONTRIBUTOR,"I know it is not comparable, but I was really curious what ""dictionary insertion"" costs, in order to understand whether my comparisons were fair:
code

```python
from tqdm import tqdm
import xarray as xr
from time import perf_counter
import numpy as np

N = 1000

# Everybody is lazy loading now, so let's force modules to get instantiated
dummy_dataset = xr.Dataset()
dummy_dataset['a'] = 1
dummy_dataset['b'] = 1
del dummy_dataset

time_elapsed = np.zeros(N)

# dataset = xr.Dataset()
dataset = {}
for i in tqdm(range(N)):
    # for i in range(N):
    time_start = perf_counter()
    dataset[f""var{i}""] = i
    time_end = perf_counter()
    time_elapsed[i] = time_end - time_start

# %%
from matplotlib import pyplot as plt

plt.plot(np.arange(N), time_elapsed * 1E6, label='Time to add one variable')
plt.xlabel(""Number of existing variables"")
plt.ylabel(""Time to add a variable (us)"")
plt.ylim([0, 10])
plt.title(""Dictionary insertion"")
plt.grid(True)
```
![image](https://user-images.githubusercontent.com/90008/198024147-0965787a-32be-409b-959c-1b87adbc633a.png)

I think xarray gives me 3 orders of magnitude of ""thinking"" benefit, so I'll take it!

```
python --version
Python 3.9.13
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198
https://github.com/pydata/xarray/pull/7221#issuecomment-1291894024,https://api.github.com/repos/pydata/xarray/issues/7221,1291894024,IC_kwDOAMm_X85NAL0I,90008,2022-10-26T11:32:32Z,2022-10-26T11:32:32Z,CONTRIBUTOR,"Ok. I'll want to rethink them. I know it looks like quadratic time, but I really would like to test n=1000 and I have an idea.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198
https://github.com/pydata/xarray/pull/7221#issuecomment-1291523800,https://api.github.com/repos/pydata/xarray/issues/7221,1291523800,IC_kwDOAMm_X85M-xbY,14371165,2022-10-26T05:27:11Z,2022-10-26T05:27:11Z,MEMBER,Now the asv finishes at least! Could you make a separate PR for the asv? I don't think it runs it when comparing to the main branch.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198
https://github.com/pydata/xarray/pull/7221#issuecomment-1291501993,https://api.github.com/repos/pydata/xarray/issues/7221,1291501993,IC_kwDOAMm_X85M-sGp,14371165,2022-10-26T04:56:39Z,2022-10-26T04:57:37Z,MEMBER,"I like large datasets as well. I seem to remember getting caught in similar places when creating my datasets. I think I solved it by using Variable instead; does doing something like this improve the performance for you?
```python
import xarray as xr

dataset = xr.Dataset()
dataset['a'] = xr.Variable(dims=""time"", data=[1])
dataset['b'] = xr.Variable(dims=""time"", data=[2])
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198
https://github.com/pydata/xarray/pull/7221#issuecomment-1291493769,https://api.github.com/repos/pydata/xarray/issues/7221,1291493769,IC_kwDOAMm_X85M-qGJ,14371165,2022-10-26T04:44:43Z,2022-10-26T04:44:43Z,MEMBER,"```
Error: [ 75.90%] ··· dataset_creation.Creation.time_dataset_creation  failed
[ 75.90%] ···· asv: benchmark timed out (timeout 60.0s)
```

Maybe 1000 loops is too much. Start with 100 maybe? We still want these benchmarks to be decently fast in the CI.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198
https://github.com/pydata/xarray/pull/7221#issuecomment-1291450556,https://api.github.com/repos/pydata/xarray/issues/7221,1291450556,IC_kwDOAMm_X85M-fi8,90008,2022-10-26T03:32:53Z,2022-10-26T03:32:53Z,CONTRIBUTOR,"I'm somewhat confused; I can run the benchmark locally:

```
[ 1.80%] ··· dataset_creation.Creation.time_dataset_creation  4.37±0s
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198
https://github.com/pydata/xarray/pull/7221#issuecomment-1291447746,https://api.github.com/repos/pydata/xarray/issues/7221,1291447746,IC_kwDOAMm_X85M-e3C,90008,2022-10-26T03:27:36Z,2022-10-26T03:27:36Z,CONTRIBUTOR,":/ not fun, the benchmark is failing.
Not sure why.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198
https://github.com/pydata/xarray/pull/7221#issuecomment-1291399714,https://api.github.com/repos/pydata/xarray/issues/7221,1291399714,IC_kwDOAMm_X85M-TIi,90008,2022-10-26T02:14:40Z,2022-10-26T02:14:40Z,CONTRIBUTOR,"> Would be interesting to see whether this was covered by our existing asv benchmarks.

I wasn't able to find something that really benchmarked ""large"" datasets.

> Would be a good benchmark to add if we don't have one already.

Added one.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198
https://github.com/pydata/xarray/pull/7221#issuecomment-1291389702,https://api.github.com/repos/pydata/xarray/issues/7221,1291389702,IC_kwDOAMm_X85M-QsG,90008,2022-10-26T01:59:57Z,2022-10-26T01:59:57Z,CONTRIBUTOR,"> out of interest, how did you find this?

Spyder profiler","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198
https://github.com/pydata/xarray/pull/7221#issuecomment-1291388733,https://api.github.com/repos/pydata/xarray/issues/7221,1291388733,IC_kwDOAMm_X85M-Qc9,5635139,2022-10-26T01:58:00Z,2022-10-26T01:58:00Z,MEMBER,"Gosh, that's quite dramatic! Impressive find @hmaarrfk. (Out of interest, how did you find this?)

I can see how that's quadratic when looping like that. I wonder whether using `.assign(var1=1, var2=2, ...)` has the same behavior?

Would be interesting to see whether this was covered by our existing asv benchmarks. Would be a good benchmark to add if we don't have one already.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1423312198