issues: 1676561243
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1676561243 | I_kwDOAMm_X85j7ktb | 7772 | Process getting killed due to high memory consumption of xarray's nbytes method | 123355381 | closed | 0 | 6 | 2023-04-20T11:46:02Z | 2023-04-24T10:51:16Z | 2023-04-24T10:50:02Z | NONE | What is your issue? The When I call The code to generate the sample dataset is below:
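After running the above code(

    import xarray as xa
    from memory_profiler import profile

    @profile
    def get_dataset_size():
        dataset = xa.open_dataset("test_1.nc")
        # Reading nbytes here is the step reported to use a lot of memory.
        print(dataset.nbytes)

    if __name__ == "__main__":
        get_dataset_size()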
Note: if the machine has only 8 GB of memory, this process will be killed. Instead, the size of the dataset can be calculated another way, which uses far less memory and gives the same result as the nbytes method:

Other method's code:

```
import xarray as xa
from memory_profiler import profile


@profile
def get_dataset_size():
    dataset = xa.open_dataset("test_1.nc")
    # Uses only each variable's element count and dtype item size.
    print(sum(v.size * v.dtype.itemsize for v in dataset.variables.values()))


if __name__ == "__main__":
    get_dataset_size()
```
Line # Mem usage Increment Occurrences Line Contents
So why have that |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7772/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
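
The alternative calculation in the issue body, `sum(v.size * v.dtype.itemsize ...)`, is plain arithmetic on each variable's shape and element width. A minimal sketch of that arithmetic follows, using only the standard library. The helper name `estimate_nbytes` and the variable names and shapes are hypothetical, chosen for illustration; they are not taken from the issue's `test_1.nc`.

```python
from math import prod


def estimate_nbytes(variables):
    """Sum element_count * bytes_per_element over all variables.

    `variables` maps a name to (shape, itemsize). This is a hypothetical
    stand-in for the shape and dtype metadata a dataset exposes.
    """
    return sum(prod(shape) * itemsize for shape, itemsize in variables.values())


# Hypothetical metadata, for illustration only: name -> (shape, bytes per element)
example_variables = {
    "temperature": ((365, 721, 1440), 4),  # e.g. a float32 data variable
    "time": ((365,), 8),                   # e.g. an int64 coordinate
    "lat": ((721,), 4),
    "lon": ((1440,), 4),
}

total = estimate_nbytes(example_variables)
print(f"{total} bytes ({total / 1024**3:.2f} GiB)")
```

Because only shapes and item sizes enter the sum, the estimate needs no array values in memory, which is consistent with the lower memory use reported for the second method.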