> Even inside MPICH2, I have given little attention to threadsafety and
> the MPI-IO routines. In MPICH2, each MPI_File* function grabs the big
> critical section lock -- not pretty but it gets the job done.
> When ported to OpenMPI, I don't know how the locking works.
> Furthermore, the MPI-IO library inside OpenMPI-1.4.3 is pretty old. I
> wonder if the locking we added over the years will help? Can you try
> openmpi-1.5.3 and report what happens?
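(For readers following along: the "big critical section" approach described above boils down to funneling every MPI_File* entry point through one global mutex. A minimal C sketch of that pattern, assuming a pthread mutex -- the names big_lock and file_write_locked are hypothetical illustrations, not MPICH2's actual internals:)

#include <pthread.h>
#include <mpi.h>

/* One global lock serializes all MPI-IO entry points, so the library
   never executes two MPI_File* calls concurrently. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

int file_write_locked(MPI_File fh, void *buf, int count,
                      MPI_Datatype type, MPI_Status *status)
{
    pthread_mutex_lock(&big_lock);   /* enter the big critical section */
    int err = MPI_File_write(fh, buf, count, type, status);
    pthread_mutex_unlock(&big_lock); /* leave it before returning */
    return err;
}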
In Open MPI 1.5.3 with threading support enabled, the MPI-IO routines work without any problems. However, a deadlock now occurs when calling mpi_finalize, with the backtrace given below. This deadlock is independent of the number of MPI tasks. The deadlock during mpi_finalize does not occur when no MPI-IO routines were called beforehand; unfortunately, in that case the program terminates with a segfault after returning from mpi_finalize (at the end of the program).

Fabian

opal_mutex_lock(): Resource deadlock avoided
#0  0x0012e416 in __kernel_vsyscall ()
#1  0x01035941 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2  0x01038e42 in abort () at abort.c:92
#3  0x00d9da68 in ompi_attr_free_keyval (type=COMM_ATTR, key=0xbffda0e4, predefined=0 '\000') at attribute/attribute.c:656
#4  0x00dd8aa2 in PMPI_Keyval_free (keyval=0xbffda0e4) at pkeyval_free.c:52
#5  0x01bf3e6a in ADIOI_End_call (comm=0xf1c0c0, keyval=10, attribute_val=0x0, extra_state=0x0) at ad_end.c:82
#6  0x00da01bb in ompi_attr_delete (type=UNUSED_ATTR, object=0x6, attr_hash=0x2c64, key=14285602, predefined=232 '\350', need_lock=128 '\200') at attribute/attribute.c:726
#7  0x00d9fb22 in ompi_attr_delete_all (type=COMM_ATTR, object=0xf1c0c0, attr_hash=0x8d0fee8) at attribute/attribute.c:1043
#8  0x00dbda65 in ompi_mpi_finalize () at runtime/ompi_mpi_finalize.c:133
#9  0x00dd12c2 in PMPI_Finalize () at pfinalize.c:46
#10 0x00d6b515 in mpi_finalize_f (ierr=0xbffda2b8) at pfinalize_f.c:62
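For reference, a minimal C sketch of the ingredients involved (a reconstruction for illustration, not the actual test program; the filename "testfile" is arbitrary). Any MPI_File* call registers ROMIO's cleanup callback (ADIOI_End_call in the backtrace above), which then runs from the attribute-deletion path inside MPI_Finalize:

#include <mpi.h>

int main(int argc, char **argv)
{
    int provided, rank;
    MPI_File fh;

    /* Initialize with full thread support, as in the failing runs. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* A single MPI-IO open/write/close is enough to register
       ROMIO's finalize-time cleanup. */
    MPI_File_open(MPI_COMM_WORLD, "testfile",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write(fh, &rank, 1, MPI_INT, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Finalize();   /* deadlock reported here when MPI-IO was used */
    return 0;
}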