issue_comments: 1061602285

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/2186#issuecomment-1061602285 https://api.github.com/repos/pydata/xarray/issues/2186 1061602285 IC_kwDOAMm_X84_RsPt 17830036 2022-03-08T10:00:07Z 2022-03-08T10:00:07Z NONE

Hello,

I am facing the same memory leak issue. I am using mpirun and dask-mpi in a Slurm batch submission (see below). I loop over time steps to perform some computations. After a few iterations, the job crashes with an out-of-memory error. This does not happen if I run the same code as a serial job.

```
import xarray as xr

from dask_mpi import initialize
initialize()

from dask.distributed import Client
client = Client()

# main code goes here
ds = xr.open_mfdataset("*nc")

for i in range(0, len(ds.time)):
    ds1 = ds.isel(time=i)
    # perform some computations here

    ds1.close()

ds.close()
```

I have tried the following:

- explicit `ds.close()` calls on datasets
- `gc.collect()`
- `client.cancel(vars)`
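Roughly, the three attempts above sit inside the time loop as sketched here. This is only an illustration of what was tried, not a fix (none of it helped). `process_time_steps` and `compute` are placeholder names introduced for this sketch, and `compute` is assumed to return whatever futures or results the real computation produces.

```python
import gc


def process_time_steps(ds, client=None, compute=lambda step: None):
    """Loop over time steps, applying the cleanup attempts listed above.

    `ds` is assumed to behave like an xarray Dataset with a `time`
    dimension; `client` is assumed to behave like a dask.distributed Client.
    """
    n_done = 0
    for i in range(len(ds.time)):
        ds1 = ds.isel(time=i)
        result = compute(ds1)  # placeholder for the real computation

        # Attempt 1: close the per-step dataset explicitly.
        ds1.close()

        # Attempt 3: ask the scheduler to drop anything still held.
        if client is not None and result is not None:
            client.cancel(result)

        # Attempt 2: drop local references and force a collection pass.
        del ds1, result
        gc.collect()
        n_done += 1
    return n_done
```

With the real objects this would be called as `process_time_steps(ds, client, compute=...)` after `ds = xr.open_mfdataset("*nc")`.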

None of these worked for me. I have also tried increasing the available RAM, but that didn't help either. Has anyone found a workaround for this problem? @lumbric @shoyer @lkilcher

I am using:

- dask 2022.2.0
- dask-mpi 2021.11.0
- xarray 0.21.1

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  326533369