html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6904#issuecomment-1210976795,https://api.github.com/repos/pydata/xarray/issues/6904,1210976795,IC_kwDOAMm_X85ILgob,1217238,2022-08-10T16:43:36Z,2022-08-10T16:43:36Z,MEMBER,"You might look into the different multiprocessing start methods: https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods

It may also be that the NetCDF or HDF5 libraries were simply not written in a way that can support multi-processing. This would not surprise me.

> BTW is there any advantage or difference in terms of cpu and memory consumption in opening the file only once or letting it be opened by every process? I'm asking because I thought opening in every process was just plain stupid but it seems to perform exactly the same, so maybe I'm just creating a problem where there is none

I agree, maybe this isn't worth the trouble. I have not seen it done successfully before.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1333650265
https://github.com/pydata/xarray/issues/6904#issuecomment-1210255676,https://api.github.com/repos/pydata/xarray/issues/6904,1210255676,IC_kwDOAMm_X85IIwk8,1217238,2022-08-10T07:10:41Z,2022-08-10T07:10:41Z,MEMBER,"> Will that work in the same way if I still use `process_map`, which uses `concurrent.futures` under the hood?

Yes, it should, as long as you're using multi-processing under the covers. If you do multi-threading, then you would want to use `threading.Lock()`. But I believe we already apply a thread lock by default.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1333650265
https://github.com/pydata/xarray/issues/6904#issuecomment-1210233503,https://api.github.com/repos/pydata/xarray/issues/6904,1210233503,IC_kwDOAMm_X85IIrKf,1217238,2022-08-10T06:45:06Z,2022-08-10T06:45:06Z,MEMBER,"Can you try explicitly passing a multiprocessing lock into the `open_dataset()` constructor? Something like:

```python
from multiprocessing import Lock

ds = xarray.open_dataset(file, lock=Lock())
```

(We automatically select appropriate locks if using Dask, but I'm not sure how we would do that more generally...)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1333650265
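The comments above discuss sharing one `multiprocessing.Lock` across the worker processes behind `process_map` (which wraps `concurrent.futures.ProcessPoolExecutor`). A minimal stdlib-only sketch of that pattern is below, assuming the default fork start method on Linux; the lock is handed to each worker once via the pool's `initializer`, because a lock cannot be pickled into task arguments. The actual xarray/NetCDF read is replaced by a placeholder (`read_slice`), and all names here are illustrative, not part of the xarray API.

```python
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import Lock

_lock = None  # set in each worker process by the initializer


def _init_worker(lock):
    # A multiprocessing.Lock cannot be sent as a task argument
    # (it is not picklable that way), so it is installed once per
    # worker at pool start-up via initializer/initargs.
    global _lock
    _lock = lock


def read_slice(i):
    # Placeholder for the section that must not run concurrently,
    # e.g. the actual xarray/NetCDF read in the discussion above.
    with _lock:
        return i * i


def run_locked(n, workers=2):
    # Create one lock in the parent and share it with every worker.
    lock = Lock()
    with ProcessPoolExecutor(
        max_workers=workers, initializer=_init_worker, initargs=(lock,)
    ) as pool:
        return list(pool.map(read_slice, range(n)))


if __name__ == "__main__":
    print(run_locked(8))  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The same shape works for multi-threading by swapping in `ThreadPoolExecutor` and `threading.Lock()`, per the comment above; in that case the initializer indirection is unnecessary since threads share memory.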