issue_comments: 654568514

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/4203#issuecomment-654568514 https://api.github.com/repos/pydata/xarray/issues/4203 654568514 MDEyOklzc3VlQ29tbWVudDY1NDU2ODUxNA== 15280721 2020-07-07T02:56:22Z 2020-07-07T02:56:22Z NONE

thanks, that helps. First of all (unless I did something wrong with the `read_csv` call), there's an `Unnamed: 0` column that has to be removed.

Other than that, your data seems to be quite sparse so that's an ideal fit for sparse:

```python
In [38]: %%time
    ...: df = pd.read_csv("/tmp/data.csv")
    ...: a = df.drop("Unnamed: 0", axis=1).set_index(["lat", "lon"])
    ...: a = a.stack()
    ...: a.index.names = ["lat", "lon", "time"]
    ...: a = a.sort_index()
    ...: a.name = "T"
    ...: xr.DataArray.from_series(a, sparse=True)
    ...:
    ...:
CPU times: user 606 ms, sys: 63.9 ms, total: 670 ms
Wall time: 670 ms
Out[38]:
<xarray.DataArray 'T' (lat: 16100, lon: 29959, time: 31)>
<COO: shape=(16100, 29959, 31), dtype=float64, nnz=1003191, fill_value=nan>
Coordinates:
  * lat      (lat) float64 37.5 37.5 37.5 37.5 37.5 ... 43.1 43.1 43.1 43.1 43.1
  * lon      (lon) float64 96.46 96.46 96.46 96.47 ... 102.6 102.6 102.6 102.6
  * time     (time) object '2011-01-01 00:00:00' ... '2011-01-31 00:00:00'
```
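To make the reshaping steps above easier to follow, here is a minimal sketch on a tiny made-up table. The column layout (one row per lat/lon point, one column per time stamp, plus a stray `Unnamed: 0` index column) is an assumption about what `/tmp/data.csv` looks like, and the values are invented. The explicit `dropna()` is added here so the result does not depend on how a given pandas version treats missing values in `stack()`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for /tmp/data.csv (values are invented).
df = pd.DataFrame(
    {
        "Unnamed: 0": [0, 1, 2],
        "lat": [37.500, 37.500, 37.505],
        "lon": [96.460, 96.465, 96.460],
        "2011-01-01 00:00:00": [1.0, np.nan, 3.0],
        "2011-01-02 00:00:00": [4.0, 5.0, np.nan],
    }
)

# Drop the stray index column and index the rows by position.
a = df.drop("Unnamed: 0", axis=1).set_index(["lat", "lon"])

# Wide -> long: the time columns become a third index level.
# dropna() keeps only the points that actually have data.
a = a.stack().dropna()
a.index.names = ["lat", "lon", "time"]
a = a.sort_index()
a.name = "T"

# The session above then calls xr.DataArray.from_series(a, sparse=True),
# which needs the xarray and sparse packages and is not run here.
print(a)
```

Only the observed (lat, lon, time) combinations end up in the series, which is why a sparse array is a natural target for it.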

Thanks for your code! I noticed that only two decimal places are shown for lat and lon. Does that mean the data was resampled? The data is on a grid with 0.005 degree resolution; can I keep that resolution in the result?
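One way to check whether the coordinates were actually rounded, rather than just displayed with fewer digits, is to look at the stored values directly. The sketch below uses made-up coordinates on a 0.005 degree grid and assumes the display shortens floats to about four significant digits, which is consistent with the output above (`37.5`, `96.46`, `102.6`) but is not confirmed here.

```python
import numpy as np
import pandas as pd

# Hypothetical coordinates on a 0.005 degree grid (not the real data).
lat = np.round(37.5 + 0.005 * np.arange(5), 3)
idx = pd.Index(lat, name="lat")

# A short display format hides digits without changing the stored values
# (assumed format: four significant digits).
shown = [f"{v:.4g}" for v in idx]

# The stored float64 values still carry the full 0.005 spacing.
spacing = np.diff(idx.to_numpy())

print(shown)
print(spacing)
```

If the spacing of the real `lat` and `lon` coordinates comes out as 0.005, the resolution is intact and only the printed representation is shortened.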

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  651101286