Thanks, what you said seems to be right. I just checked and it is solved now.
It might have been caused by a conflict between the Open MPI and MPICH libraries.
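For anyone hitting the same error, a quick way to check for such a mismatch is to compare the MPI library mpi4py was built against with the mpirun found on PATH. A minimal sketch (assuming mpi4py is importable in the same environment used to launch the job):

# Report which MPI implementation mpi4py is linked against,
# to compare with the output of `mpirun --version` in the shell.
import mpi4py
from mpi4py import MPI

print(MPI.Get_library_version())  # e.g. "Open MPI v4.x ..." or "MPICH Version: 3.x"
print(mpi4py.get_config())        # build-time configuration (which mpicc was used, etc.)

If this prints an MPICH version string while the mpirun on PATH comes from Open MPI (or the other way around), the launcher and the library need to be brought into agreement.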
On 2022/11/2 02:06, Pritchard Jr., Howard wrote:
Hi,
You are using MPICH or a vendor derivative of MPICH. You probably
want to resend this email to the MPICH users/help mailing list.
Howard
From: users <users-boun...@lists.open-mpi.org> on behalf of mrlong via users <users@lists.open-mpi.org>
Reply-To: Open MPI Users <users@lists.open-mpi.org>
Date: Tuesday, November 1, 2022 at 11:26 AM
To: "de...@lists.open-mpi.org" <de...@lists.open-mpi.org>, "users@lists.open-mpi.org" <users@lists.open-mpi.org>
Cc: mrlong <mrlong...@gmail.com>
Subject: [EXTERNAL] [OMPI users] OFI, destroy_vni_context(1137).......: OFI domain close failed (ofi_init.c:1137:destroy_vni_context:Device or resource busy)
Hi all,
code:
import mpi4py
import time
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
print("rank", rank)

if __name__ == '__main__':
    if rank == 0:
        mem = np.array([0], dtype='i')
        win = MPI.Win.Create(mem, comm=comm)
    else:
        win = MPI.Win.Create(None, comm=comm)
    print(rank, "end")
(py3.6.8) ➜ ~ mpirun -n 2 python -u test.py
rank 0
rank 1
0 end
1 end
Abort(806449679): Fatal error in internal_Finalize: Other MPI error,
error stack:
internal_Finalize(50)...........: MPI_Finalize failed
MPII_Finalize(345)..............:
MPID_Finalize(511)..............:
MPIDI_OFI_mpi_finalize_hook(895):
destroy_vni_context(1137).......: OFI domain close failed
(ofi_init.c:1137:destroy_vni_context:Device or resource busy)
Why is this happening? How can I debug it? This error does not occur on
the other machine.
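One thing that may be worth ruling out when "Device or resource busy" appears at finalize is an RMA window that is never freed. The sketch below is an assumption on my part, not a confirmed cause; it is the same test.py with the window released explicitly before the script exits:

# Variant of test.py that explicitly frees the RMA window before exit,
# so nothing is still attached to the OFI domain when MPI_Finalize runs.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if __name__ == '__main__':
    if rank == 0:
        mem = np.array([0], dtype='i')
        win = MPI.Win.Create(mem, comm=comm)
    else:
        win = MPI.Win.Create(None, comm=comm)
    win.Free()  # collective call; releases the window before finalize
    print(rank, "end")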