**#8919: Using the xarray.Dataset.where() function takes up a lot of memory**
pydata/xarray · issue · closed (completed) · opened 2024-04-08T09:15:49Z · closed 2024-04-09T02:45:08Z · 4 comments

### What is your issue?

My Python script was killed because it used too much memory. After investigating, I found that the problem is the `ds.where()` function. The original netCDF file is only about 10 MB on disk, but after I mask out the data that falls outside the given latitude/longitude region, the variable **ds** takes up a dozen GB of memory. When I delete this variable with `del ds`, the script's memory usage immediately returns to normal.

```
import numpy as np
import xarray as xr

# Open the netCDF file.
ds = xr.open_dataset(track)

# If the longitude range is [-180, 180], convert it to [0, 360].
if np.any(ds[var_lon] < 0):
    ds[var_lon] = ds[var_lon] % 360

# Extract data by longitude and latitude.
ds = ds.where((ds[var_lon] >= region[0]) & (ds[var_lon] <= region[1])
              & (ds[var_lat] >= region[2]) & (ds[var_lat] <= region[3]))

# Select data by the range and value of some variables.
for key, value in range_select.items():
    ds = ds.where((ds[key] >= value[0]) & (ds[key] <= value[1]))
for key, value in value_select.items():
    ds = ds.where(ds[key].isin(value))
```
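The memory pattern reported here is consistent with how `Dataset.where()` behaves: the boolean mask is broadcast against every variable in the dataset and non-matching cells are filled with NaN, so a small on-disk file can expand dramatically in memory (integer variables are also promoted to float to hold NaN). Below is a minimal sketch of a lower-memory alternative; the dataset, variable name `sst`, region values, and threshold are hypothetical, and it assumes longitude/latitude are one-dimensional dimension coordinates so that label-based slicing can replace the broadcast mask.

```
import numpy as np
import xarray as xr

# Hypothetical stand-in for the dataset in the report: one variable on a
# (lat, lon) grid with one-dimensional dimension coordinates.
ds = xr.Dataset(
    {"sst": (("lat", "lon"), np.random.rand(180, 360).astype("float32"))},
    coords={"lat": np.arange(-89.5, 90.0), "lon": np.arange(0.5, 360.0)},
)

# lon_min, lon_max, lat_min, lat_max (hypothetical region).
region = (100.0, 150.0, -10.0, 40.0)

# Label-based selection slices along the dimensions instead of broadcasting
# a full-size boolean mask, so the result holds only the selected region
# and keeps the original dtype.
subset = ds.sel(lon=slice(region[0], region[1]),
                lat=slice(region[2], region[3]))

# Where a mask is unavoidable, drop=True trims slices that are all-NaN
# after masking, so the result does not retain the full grid shape.
subset = subset.where(subset["sst"] >= 0.5, drop=True)
```

Opening the file lazily, e.g. `xr.open_dataset(path, chunks={})`, is another option: the variables are then backed by dask arrays, so the masked result is not materialized in memory until it is actually computed.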