Hi,

I have a question about suspending and resuming a process started with "mpirun". I have already found the FAQ entry on this (http://www.open-mpi.de/faq/?category=running#suspend-resume), but I still have a question.

For now, let's assume that all MPI processes run on a single host with one multi-core CPU, so I could send SIGSTOP directly to the other processes if I wanted to.

I want to start multiple (let's say four) instances of my program with "mpirun -np 4 ./mybinary". At some point during execution I want to suspend two of those four processes, which at that point are waiting in an MPI_Barrier(). The goal is that those processes use no CPU at all while they are suspended; as far as I understand, that is not the case when they simply wait in MPI_Barrier.

So my question is: if I send SIGSTOP from my MPI rank 0 process to these two processes while they are waiting in the MPI_Barrier, will that work, and will those two processes then stop using the CPU? Later, once the other two processes have also arrived at this MPI_Barrier, I want to resume them with SIGCONT. The performance of the barrier does not matter here; what matters to me is that the suspended processes cause no CPU usage.

I have never used SIGSTOP before, so I'm not sure whether this will work. Before I start coding the logic for this into my program, I thought I would ask here first whether it will work at all :).
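
To make the question concrete, here is a minimal sketch of just the signalling part, with no MPI in it. Everything in it is an assumption for illustration: the helper names (spawn_busy_child, suspend_pid, resume_pid, reap_pid) are made up, and a forked child that busy-loops stands in for a rank polling inside MPI_Barrier. In the real program rank 0 is not the parent of the other ranks, so it could not use waitpid() to confirm the stop; it would only call kill() on PIDs that the other ranks reported to it (for example via getpid()). The sketch says nothing about how the MPI library itself reacts to a stopped peer.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a rank spinning in MPI_Barrier():
 * fork a child that burns CPU in a busy loop.
 * Returns the child's PID in the parent, or -1 if fork() failed. */
static pid_t spawn_busy_child(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        for (;;) { /* busy-wait, never returns */ }
    }
    return pid;
}

/* Send SIGSTOP and wait until the kernel reports the child as stopped.
 * Returns 1 if the child is stopped, 0 otherwise.
 * pid must be > 0: kill() treats 0 and negative values as process groups. */
static int suspend_pid(pid_t pid)
{
    int status = 0;
    assert(pid > 0);
    if (kill(pid, SIGSTOP) != 0)
        return 0;
    if (waitpid(pid, &status, WUNTRACED) != pid)
        return 0;
    return WIFSTOPPED(status) ? 1 : 0;
}

/* Send SIGCONT and wait until the kernel reports the child as continued.
 * Returns 1 if the child was resumed, 0 otherwise. */
static int resume_pid(pid_t pid)
{
    int status = 0;
    assert(pid > 0);
    if (kill(pid, SIGCONT) != 0)
        return 0;
    if (waitpid(pid, &status, WCONTINUED) != pid)
        return 0;
    return WIFCONTINUED(status) ? 1 : 0;
}

/* Clean up the stand-in child: kill it and collect its exit status.
 * Returns 1 if the child was terminated by a signal, 0 otherwise. */
static int reap_pid(pid_t pid)
{
    int status = 0;
    assert(pid > 0);
    if (kill(pid, SIGKILL) != 0)
        return 0;
    if (waitpid(pid, &status, 0) != pid)
        return 0;
    return WIFSIGNALED(status) ? 1 : 0;
}
```

The intended order of calls would be spawn_busy_child(), then suspend_pid() while the other work is still running, then resume_pid() once everyone has caught up, and finally reap_pid() to clean up. Between suspend_pid() and resume_pid() the stopped child is not scheduled, which is the effect I am after for the two waiting ranks.
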
Frank