html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1217#issuecomment-273529203,https://api.github.com/repos/pydata/xarray/issues/1217,273529203,MDEyOklzc3VlQ29tbWVudDI3MzUyOTIwMw==,4849151,2017-01-18T16:43:03Z,2017-01-19T05:15:52Z,NONE,"The problem isn't as bad with a smaller example (though the runtime is doubled). I've attached a minimum working example, which seems to suggest that maybe there was a problem with xarray creating a MultiIndex and duplicating all the data?
(I've left in input() to allow checking memory usage before the program exits, but there isn't much difference in this example).
[xrmin.py.txt](https://github.com/pydata/xarray/files/714442/xrmin.py.txt)
Edit by @shoyer: added code from attachment inline:
```python
#!/usr/bin/env python3
import time
import sys
import numpy as np
import xarray as xr
ds = xr.Dataset()
ds['data1'] = xr.DataArray(np.arange(1000), coords={'t1': np.linspace(0, 1, 1000)})
ds['data1b'] = xr.DataArray(np.arange(1000, 2000), coords={'t1': np.linspace(0, 1, 1000)})
ds['data2'] = xr.DataArray(np.arange(2000, 5000), coords={'t2': np.linspace(0, 1, 3000)})
ds['data2b'] = xr.DataArray(np.arange(6000, 9000), coords={'t2': np.linspace(0, 1, 3000)})
if sys.argv[1] == ""nodrop"":
    now = time.time()
    print(ds.where(ds.data1 < 50, drop=True))
    print(""Took {} seconds"".format(time.time() - now))
elif sys.argv[1] == ""drop"":
    ds1 = ds.drop('t2')
    now = time.time()
    print(ds1.where(ds1.data1 < 50, drop=True))
    print(""Took {} seconds"".format(time.time() - now))
input(""Press return to exit"")
```
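As an alternative to pausing on input() and reading memory usage from outside the process, the two code paths could be compared from inside Python. This is only a rough sketch, not part of the original attachment: it assumes the standard-library tracemalloc module sees the array buffers (which should hold on reasonably recent NumPy versions), the helper name peak_bytes is made up for illustration, the dataset is a smaller stand-in with dims spelled out explicitly, and the t1-only subset is selected by variable name rather than with ds.drop().

```python
import tracemalloc

import numpy as np
import xarray as xr


def peak_bytes(func):
    # Hypothetical helper: run func() and return the peak traced allocation in bytes.
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Smaller stand-in for the dataset above, with dims given explicitly.
ds = xr.Dataset()
ds['data1'] = xr.DataArray(np.arange(100), dims=['t1'],
                           coords={'t1': np.linspace(0, 1, 100)})
ds['data2'] = xr.DataArray(np.arange(300), dims=['t2'],
                           coords={'t2': np.linspace(0, 1, 300)})

# Subset holding only the variable on t1 (selected by name, an assumption).
ds1 = ds[['data1']]

peak_full = peak_bytes(lambda: ds.where(ds.data1 < 50, drop=True))
peak_sub = peak_bytes(lambda: ds1.where(ds1.data1 < 50, drop=True))
print('full dataset peak: {} bytes'.format(peak_full))
print('t1-only peak: {} bytes'.format(peak_sub))
```

The absolute numbers depend on the xarray and NumPy versions installed, so only the relative difference between the two printed values is meaningful.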
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,201617371
https://github.com/pydata/xarray/issues/1217#issuecomment-273523770,https://api.github.com/repos/pydata/xarray/issues/1217,273523770,MDEyOklzc3VlQ29tbWVudDI3MzUyMzc3MA==,4849151,2017-01-18T16:25:19Z,2017-01-18T16:25:19Z,NONE,"data1 and data2 represent two stages of data acquisition within one ""shot"" of our experiment. I'd like to be able to group each shot's data into a single dataset.
I want to extract from the dataset only the values for which my where() condition is true, and I'll only be using DataArrays which share the same dimension as the one in the condition. For example, if I do:
`ds_low = ds.where(ds.data1 < 0.1, drop=True)`
I'll only use stuff in ds_low with the same dimension as ds.data1. So in my case extracting the data with the shared dimension using ds.drop() is appropriate.
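The idea of keeping only the variables that share the condition's dimension could be sketched roughly as below. This is an illustration under assumptions, not an xarray built-in: the helper name select_by_dim is made up, the dataset is a stand-in with dims spelled out explicitly, and the threshold of 50 is chosen only so the result is easy to check.

```python
import numpy as np
import xarray as xr


def select_by_dim(ds, dim):
    # Hypothetical helper: keep only the data variables that have `dim`
    # among their dimensions.
    names = [name for name, var in ds.data_vars.items() if dim in var.dims]
    return ds[names]


# Stand-in dataset with two acquisition stages on different dimensions.
ds = xr.Dataset()
ds['data1'] = xr.DataArray(np.arange(1000), dims=['t1'],
                           coords={'t1': np.linspace(0, 1, 1000)})
ds['data1b'] = xr.DataArray(np.arange(1000, 2000), dims=['t1'],
                            coords={'t1': np.linspace(0, 1, 1000)})
ds['data2'] = xr.DataArray(np.arange(2000, 5000), dims=['t2'],
                           coords={'t2': np.linspace(0, 1, 3000)})

# Restrict to the variables on t1 first, then apply the condition.
sub = select_by_dim(ds, 't1')
ds_low = sub.where(sub.data1 < 50, drop=True)
print(ds_low)
```

Selecting by dimension first means the condition is only ever applied to arrays of the same shape as the condition itself.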
It would be nice if xarray raised a warning or error to stop me from eating up all the RAM on my system when I do try this sort of thing, though. Alternatively, it could simply mask with NaN everything in the DataArrays that have a different dimension.
Give me a second to provide a minimal working example.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,201617371