
pydata/xarray issue #8919: Using the xarray.Dataset.where() function takes up a lot of memory

Issue id: 2230680765 · State: closed · Comments: 4 · Author association: NONE
Opened: 2024-04-08T09:15:49Z · Closed: 2024-04-09T02:45:08Z

What is your issue?

My Python script was killed because it used too much memory. After investigating, I found that the problem is the ds.where() function.

The original NetCDF file opened from disk takes up only about 10 MB of storage, but after I mask out the data that falls outside a latitude/longitude box, the variable ds occupies more than a dozen GB of memory. As soon as I delete the variable with del ds, the script's memory usage immediately returns to normal.
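A plain-NumPy sketch of why this happens (synthetic data, not the issue author's file): NaN-masking must materialize the full array and upcast compact integer data to float64, whereas boolean subsetting keeps only the matching elements. NetCDF data are also often stored packed or compressed, so the in-memory footprint can dwarf the on-disk size.

```
import numpy as np

# Compact integer array standing in for one variable of the dataset.
data = np.arange(12, dtype=np.int16).reshape(3, 4)
mask = data % 2 == 0

# NaN-masking (what Dataset.where(cond) does): the result must be able to
# hold NaN, so the data is upcast to float64 and the FULL array is kept.
masked = np.where(mask, data, np.nan)
assert masked.dtype == np.float64
assert masked.nbytes == data.nbytes * 4  # int16 -> float64 is 4x per element

# Boolean subsetting keeps only the matching elements instead.
subset = data[mask]
assert subset.nbytes < masked.nbytes
```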

```
import numpy as np
import xarray as xr

# Open this netcdf file.
ds = xr.open_dataset(track)

# If the longitude range is [-180, 180], convert it to [0, 360].
if np.any(ds[var_lon] < 0):
    ds[var_lon] = ds[var_lon] % 360

# Extract data by longitude and latitude.
ds = ds.where(
    (ds[var_lon] >= region[0]) & (ds[var_lon] <= region[1])
    & (ds[var_lat] >= region[2]) & (ds[var_lat] <= region[3])
)

# Select data by range and by value for some variables.
for key, value in range_select.items():
    ds = ds.where((ds[key] >= value[0]) & (ds[key] <= value[1]))
for key, value in value_select.items():
    ds = ds.where(ds[key].isin(value))
```
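Two changes usually tame the memory use of a pipeline like this (a minimal sketch on synthetic data; the variable and coordinate names are assumptions, not the issue author's): combine the conditions into one mask so only a single float copy is materialized instead of one per chained `.where()` call, and pass `drop=True` so all-False slices are trimmed rather than kept as NaN padding. Where the bounds line up with coordinate labels, `.sel` with a `slice` avoids the NaN fill (and the float upcast) entirely.

```
import numpy as np
import xarray as xr

# Synthetic 2 x 4 dataset standing in for the real file.
ds = xr.Dataset(
    {"t": (("lat", "lon"), np.arange(8, dtype=np.int32).reshape(2, 4))},
    coords={"lat": [10.0, 20.0], "lon": [0.0, 90.0, 180.0, 270.0]},
)

# One combined condition instead of several chained .where() calls.
cond = (
    (ds["lon"] >= 90) & (ds["lon"] <= 270)
    & (ds["lat"] >= 10) & (ds["lat"] <= 20)
)

# drop=True trims coordinates where the condition is all False,
# instead of returning a full-size NaN-padded copy.
small = ds.where(cond, drop=True)
assert small.sizes["lon"] == 3  # lon = 0.0 was dropped

# Label-based slicing never touches values outside the range at all,
# so there is no NaN fill and no upcast to float.
sliced = ds.sel(lon=slice(90, 270))
assert sliced["t"].dtype == np.int32
```

For files that are large even after subsetting, opening with `xr.open_dataset(track, chunks={...})` keeps the data lazy (dask-backed) so the mask is evaluated block by block rather than all at once.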

State reason: completed
