[OMPI users] How to cease the process triggered by OPENMPI
Dear all,

I have been enjoying Open MPI for a couple of days. With its help I can run ESPRESSO efficiently. I start the MPI job with a command like this:

nohup mpirun -hostfile ~/hostfile -np 64 pw.x < input > output &

When I want to stop the job before it finishes, I find it hard to stop all the processes manually. When I kill the process on one node of the cluster, the processes on the other nodes keep running, so I must ssh to every node, find the process ID, and kill the process. With 100 or more processors in one MPI job, the situation is even worse.

Is there an Open MPI command that forces all the processes to stop, either across the whole cluster or on a given list of nodes?

vega

Vega Lew (weijia liu)
PH.D Candidate in Chemical Engineering
State Key Laboratory of Materials-oriented Chemical Engineering
College of Chemistry and Chemical Engineering
Nanjing University of Technology, 210009, Nanjing, Jiangsu, China
Re: [OMPI users] How to cease the process triggered by OPENMPI
Does the cluster you're using have a batch system, such as SLURM or PBS? If so, many of them have native ways to launch jobs that OMPI can use, so that when the job is killed, all of its child processes are killed as well.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985

On Jul 26, 2008, at 12:25 PM, vega lew wrote:

> When I killed the process in one node of the cluster, the processes in other nodes were still running. So I must ssh to every node, find the process id and kill the process. [...] Is there a command for openmpi to force all the process to stop in the cluster or a list of nodes to stop.

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] how to make a process start and then join a MPI group
Okay, so I've gotten a little bit closer. I'm using MPI_Comm_spawn to start several child processes. The problem is that the children are in their own group, separate from the parent (just like the documentation says). I want to merge the children's group with the parent's group so I can efficiently Send/Recv data between them. Is this possible?

Plan B: if there is no elegant way to merge all those processes into one group, I guess I can connect sockets and make intercomms to talk from the parent directly to each child.

-- Mark

Mark Borgerding wrote:

> I am writing a code module that plugs into a larger application framework. That framework loads my code module as a shared object. So I do not control how the first process gets started, but I still want it to be able to start and participate in an MPI group.
>
> Here's roughly what I want to happen (I think):
>
> framework app running (not under my control)
> -> framework loads mycode.so shared object into its process
> -> mycode.so starts mpi programs on several hosts (e.g. via system call to mpiexec)
> -> initial mycode.so process participates in the group he just started (e.g. he shows up in MPI_Comm_group, can use MPI_Send, MPI_Recv, etc.)
>
> Can this be done?
>
> I am running under Centos 5.2
>
> Thanks,
> Mark
Re: [OMPI users] how to make a process start and then join a MPI group
MPI_Intercomm_merge is what you are looking for.

Aurelien

On Jul 26, 2008, at 13:23, Mark Borgerding wrote:

> I'm using MPI_Comm_spawn to start several children processes. The problem is that the children are in their own group, separate from the parent (just the like the documentation says). I want to merge the children's group with the parent group so I can efficiently Send/Recv data between them. Is this possible?