
issue_comments: 752154350


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/4738#issuecomment-752154350 https://api.github.com/repos/pydata/xarray/issues/4738 752154350 MDEyOklzc3VlQ29tbWVudDc1MjE1NDM1MA== 13301940 2020-12-29T16:47:03Z 2020-12-29T16:47:03Z MEMBER

Pandas has a built-in utility function pd.util.hash_pandas_object:

```python
In [1]: import pandas as pd

In [3]: df = pd.DataFrame({'A': [4, 5, 6, 7], 'B': [10, 20, 30, 40], 'C': [100, 50, -30, -50]})

In [4]: df
Out[4]:
   A   B    C
0  4  10  100
1  5  20   50
2  6  30  -30
3  7  40  -50

In [6]: row_hashes = pd.util.hash_pandas_object(df)

In [7]: row_hashes
Out[7]:
0    14190898035981950066
1    16858535338008670510
2     1055569624497948892
3     5944630256416341839
dtype: uint64
```

Feeding the per-row hashes returned by hash_pandas_object() into Python's hashlib yields a single digest for the whole DataFrame:

```python
In [8]: import hashlib

In [10]: hashlib.sha1(row_hashes.values).hexdigest()  # Compute overall hash of all rows.
Out[10]: '1e1244d9b0489e1f479271f147025956d4994f67'
```
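The two steps can be wrapped into one function. This is only a minimal sketch: `dataframe_digest` and its `include_index` parameter are hypothetical names for illustration, not part of pandas or xarray, and it assumes the whole DataFrame fits in memory.

```python
import hashlib

import pandas as pd


def dataframe_digest(df: pd.DataFrame, include_index: bool = True) -> str:
    """Return a SHA-1 hex digest summarizing every row of ``df``.

    Hypothetical helper, not part of pandas.
    """
    # One uint64 hash per row; index=True folds the index labels into each hash.
    row_hashes = pd.util.hash_pandas_object(df, index=include_index)
    # Hash the raw bytes of the row hashes, so row order affects the result.
    return hashlib.sha1(row_hashes.to_numpy().tobytes()).hexdigest()


df = pd.DataFrame({'A': [4, 5, 6, 7], 'B': [10, 20, 30, 40], 'C': [100, 50, -30, -50]})
same = df.copy()
changed = df.copy()
changed.loc[0, 'A'] = 99  # Alter a single cell.

digest = dataframe_digest(df)
print(digest == dataframe_digest(same))     # Equal content gives an equal digest.
print(digest == dataframe_digest(changed))  # A changed cell gives a different digest.
```

Because the digest is computed over the row hashes in order, two DataFrames with the same rows in a different order will not compare equal; sort first if order should not matter.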

Regarding dask, I have no idea :) cc @TomAugspurger

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  775502974