
issue_comments


3 rows where author_association = "NONE" and issue = 365973662 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
648721465 https://github.com/pydata/xarray/issues/2459#issuecomment-648721465 https://api.github.com/repos/pydata/xarray/issues/2459 MDEyOklzc3VlQ29tbWVudDY0ODcyMTQ2NQ== brey 5442433 2020-06-24T09:55:00Z 2020-06-24T09:55:00Z NONE

Hi All. I stumbled across the same issue while trying to convert a 5000-column dataframe to xarray (it was never going to happen...). I found a workaround and am posting the test below. Hope it helps.

```python
import xarray as xr
import pandas as pd
import numpy as np

xr.__version__
# '0.15.1'

pd.__version__
# '1.0.5'

df = pd.DataFrame(np.random.randn(200, 500))

%%time
# Slow path: the built-in conversion
one = df.to_xarray()
# CPU times: user 29.6 s, sys: 60.4 ms, total: 29.6 s
# Wall time: 29.7 s

%%time
# Workaround: build one 1-D variable per column, then assemble the Dataset
dic = {}
for name in df.columns:
    dic.update({name: (['index'], df[name].values)})

two = xr.Dataset(dic, coords={'index': ('index', df.index.values)})
# CPU times: user 17.6 ms, sys: 158 µs, total: 17.8 ms
# Wall time: 17.8 ms

one.equals(two)
# True
```
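The same workaround can be wrapped in a small function. This is only a sketch: `dataframe_to_dataset` is a hypothetical helper name, not part of xarray, and it assumes a flat (non-MultiIndex) index, falling back to the dimension name `index` when the index is unnamed.

```python
import pandas as pd
import xarray as xr


def dataframe_to_dataset(df: pd.DataFrame) -> xr.Dataset:
    """Build a Dataset with one 1-D variable per DataFrame column.

    Hypothetical helper; assumes a flat (non-MultiIndex) index.
    """
    # Use the index name as the dimension name, or "index" if it is unnamed.
    dim = df.index.name or "index"
    # Each column becomes a 1-D variable along the shared dimension.
    data_vars = {name: ((dim,), df[name].to_numpy()) for name in df.columns}
    return xr.Dataset(data_vars, coords={dim: df.index.to_numpy()})


df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
ds = dataframe_to_dataset(df)
print(ds)
```

Whether this is still faster than `df.to_xarray()` likely depends on the xarray and pandas versions in use, so it is worth timing both on your own data.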

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Stack + to_array before to_xarray is much faster that a simple to_xarray 365973662
586552823 https://github.com/pydata/xarray/issues/2459#issuecomment-586552823 https://api.github.com/repos/pydata/xarray/issues/2459 MDEyOklzc3VlQ29tbWVudDU4NjU1MjgyMw== tqfjo 40251676 2020-02-15T04:31:54Z 2020-02-15T04:31:54Z NONE

@crusaderky Thanks for the pointer to xarray.DataArray(df) -- that makes my life a ton easier.


That said, if it helps anyone to know, I did just want a DataArray, but figured there was no alternative to first running the rather singular to_xarray. I also still find the runtime surprising, though I know nothing about xarray's internals.
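For anyone who also just wants a DataArray, here is a minimal sketch of passing a DataFrame straight to the `xarray.DataArray` constructor. The dimension names `row` and `column` and the toy data are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy 2 x 3 frame; the labels are illustrative only.
df = pd.DataFrame(
    np.arange(6.0).reshape(2, 3),
    index=["r0", "r1"],
    columns=["a", "b", "c"],
)

# The constructor takes the values from the frame and infers the
# coordinates from its index and columns; dims names the two axes.
da = xr.DataArray(df, dims=("row", "column"))
print(da)
```

This yields a single 2-D array, so it skips the per-column Dataset that `to_xarray` builds.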

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Stack + to_array before to_xarray is much faster that a simple to_xarray 365973662
586066908 https://github.com/pydata/xarray/issues/2459#issuecomment-586066908 https://api.github.com/repos/pydata/xarray/issues/2459 MDEyOklzc3VlQ29tbWVudDU4NjA2NjkwOA== tqfjo 40251676 2020-02-14T02:25:25Z 2020-02-14T02:25:25Z NONE

I've run into this twice. This time I'm seeing a difference of very roughly 100x or more just using a transpose -- I can't test or time it properly right now, but this is what it looks like:

```
ipdb> df
x         a    b  ...    c    d
y         0    0  ...    7    7
z                 ...
0  0.000000  0.0  ...  0.0  0.0
1 -0.000416  0.0  ...  0.0  0.0

[2 rows x 2932 columns]
ipdb> df.to_xarray()

ipdb> df.T.to_xarray()

<Finishes instantly>
```
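A rough way to time the two calls properly is sketched below. It is an assumption-laden stand-in: `time_to_xarray` is a hypothetical helper, and the frame uses flat string column labels, not the three-level column index shown above, so the numbers will not match the original case and may differ a lot between xarray versions.

```python
import time

import numpy as np
import pandas as pd
import xarray as xr


def time_to_xarray(n_columns: int):
    """Time to_xarray on a wide frame and on its transpose (hypothetical helper)."""
    wide = pd.DataFrame(
        np.zeros((2, n_columns)),
        index=["r0", "r1"],
        columns=[f"c{i}" for i in range(n_columns)],
    )
    t0 = time.perf_counter()
    wide_ds = wide.to_xarray()    # one variable per column
    t1 = time.perf_counter()
    tall_ds = wide.T.to_xarray()  # one variable per original row
    t2 = time.perf_counter()
    return wide_ds, tall_ds, t1 - t0, t2 - t1


wide_ds, tall_ds, wide_seconds, tall_seconds = time_to_xarray(200)
print(f"wide: {wide_seconds:.4f} s, transposed: {tall_seconds:.4f} s")
```

Note that the two results are laid out differently: the wide conversion creates one variable per column, while the transposed one creates only two variables along a long `index` dimension.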

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Stack + to_array before to_xarray is much faster that a simple to_xarray 365973662


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 13.187ms · About: xarray-datasette