home / github / issue_comments

Menu
  • GraphQL API
  • Search all tables

issue_comments: 273529203

This data as json

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/1217#issuecomment-273529203 https://api.github.com/repos/pydata/xarray/issues/1217 273529203 MDEyOklzc3VlQ29tbWVudDI3MzUyOTIwMw== 4849151 2017-01-18T16:43:03Z 2017-01-19T05:15:52Z NONE

The problem isn't as bad with a smaller example (though the runtime is doubled). I've attached a minimum working example, which seems to suggest that maybe there was a problem with xarray creating a MultiIndex and duplicating all the data? (I've left in input() to allow checking memory usage before the program exists, but there isn't much difference in this example). xrmin.py.txt

Edit by @shoyer: added code from attachment inline: ```python

!/usr/bin/env python3

import time import sys import numpy as np import xarray as xr

ds = xr.Dataset() ds['data1'] = xr.DataArray(np.arange(1000), coords={'t1': np.linspace(0, 1, 1000)}) ds['data1b'] = xr.DataArray(np.arange(1000, 2000), coords={'t1': np.linspace(0, 1, 1000)}) ds['data2'] = xr.DataArray(np.arange(2000, 5000), coords={'t2': np.linspace(0, 1, 3000)}) ds['data2b'] = xr.DataArray(np.arange(6000, 9000), coords={'t2': np.linspace(0, 1, 3000)}) if sys.argv[1] == "nodrop": now = time.time() print(ds.where(ds.data1 < 50, drop=True)) print("Took {} seconds".format(time.time() - now)) elif sys.argv[1] == "drop": ds1 = ds.drop('t2') now = time.time() print(ds1.where(ds1.data1 < 50, drop=True)) print("Took {} seconds".format(time.time() - now)) input("Press return to exit") ```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  201617371
Powered by Datasette · Queries took 0.574ms · About: xarray-datasette