issue_comments: 1292216344

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/7224#issuecomment-1292216344 https://api.github.com/repos/pydata/xarray/issues/7224 1292216344 IC_kwDOAMm_X85NBagY 14371165 2022-10-26T15:21:54Z 2022-10-26T15:21:54Z MEMBER

I have thought a little about this as well and went back through my old code. Building a `data_vars` dict first and creating the Dataset once at the end seems to be the way to go:

```python
import numpy as np
import xarray as xr
from time import perf_counter

# %% Inputs
# np.char.add is the public alias of np.core.defchararray.add
names = np.char.add("long_variable_name", np.arange(0, 100).astype(str))
time = np.array([0, 1])
coords = dict(time=time)
value = np.array(["0", "b"], dtype=str)

# %% Insert to Dataset with DataArray:
time_start = perf_counter()
ds = xr.Dataset(coords=coords)
for v in names:
    ds[v] = xr.DataArray(data=value, coords=coords)
time_end = perf_counter()
time_elapsed = time_end - time_start
print("Insert to Dataset with DataArray:", time_elapsed)

# %% Insert to Dataset with Variable:
time_start = perf_counter()
ds = xr.Dataset(coords=coords)
for v in names:
    ds[v] = xr.Variable("time", value)
time_end = perf_counter()
time_elapsed = time_end - time_start
print("Insert to Dataset with Variable:", time_elapsed)

# %% Insert to Dataset with tuple:
time_start = perf_counter()
ds = xr.Dataset(coords=coords)
for v in names:
    ds[v] = ("time", value)
time_end = perf_counter()
time_elapsed = time_end - time_start
print("Insert to Dataset with tuple:", time_elapsed)

# %% Dict of DataArray then create Dataset:
time_start = perf_counter()
data_vars = dict()
for v in names:
    data_vars[v] = xr.DataArray(data=value, coords=coords)
ds = xr.Dataset(data_vars=data_vars, coords=coords)
time_end = perf_counter()
time_elapsed = time_end - time_start
print("Dict of DataArrays then create Dataset:", time_elapsed)

# %% Dict of Variables then create Dataset:
time_start = perf_counter()
data_vars = dict()
for v in names:
    data_vars[v] = xr.Variable("time", value)
ds = xr.Dataset(data_vars=data_vars, coords=coords)
time_end = perf_counter()
time_elapsed = time_end - time_start
print("Dict of Variables then create Dataset:", time_elapsed)

# %% Dict of tuples then create Dataset:
time_start = perf_counter()
data_vars = dict()
for v in names:
    data_vars[v] = ("time", value)
ds = xr.Dataset(data_vars=data_vars, coords=coords)
time_end = perf_counter()
time_elapsed = time_end - time_start
print("Dict of tuples then create Dataset:", time_elapsed)
```

Insert to Dataset with DataArray: 0.3787728999996034
Insert to Dataset with Variable: 0.3083788999997523
Insert to Dataset with tuple: 0.30018929999960164
Dict of DataArrays then create Dataset: 0.07277609999982815
Dict of Variables then create Dataset: 0.005166500000086671
Dict of tuples then create Dataset: 0.003186699999787379 # Winner! :)
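
For reference, the fastest variant from the timings above can be condensed into a small helper. This is only a minimal sketch: `build_dataset` is a hypothetical name, not part of xarray, and it assumes every variable shares the same single dimension and the same values, as in the benchmark.

```python
import numpy as np
import xarray as xr


def build_dataset(names, dim, values, coords):
    """Hypothetical helper: collect (dims, data) tuples in a plain dict,
    then construct the Dataset in a single call at the end."""
    data_vars = {str(name): (dim, values) for name in names}
    return xr.Dataset(data_vars=data_vars, coords=coords)


# Same shape of input as the benchmark: 100 variables along "time".
time = np.array([0, 1])
names = [f"long_variable_name{i}" for i in range(100)]
value = np.array(["0", "b"], dtype=str)

ds = build_dataset(names, "time", value, coords={"time": time})
print(len(ds.data_vars))
```

The point of the design is that the Dataset is constructed once, instead of being updated on every `ds[v] = ...` assignment inside the loop.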

{
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1423948375