Thanks a lot, Gus, for your help again. I only have one CPU per node. The -n X option (no matter what the value of X is) runs all X processes on one node only (the other one stays free). If I add the machinefile option with WN1 and WN2 in it, the right behavior appears and the processes are spread across both nodes (a sketch of the modified script is below). According to the documentation, mpirun should pick up PBS_NODEFILE automatically from PBS, so I should not need to use a machinefile.
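Roughly, the working version of my_script.sh looks like this (just a sketch of my workaround; since $PBS_NODEFILE already lists WN1 and WN2, I pass it directly as the machinefile instead of writing a separate hosts file, and -np 2 matches the two nodes I requested):

========================
#!/bin/sh -l
#PBS -N helloTest
#PBS -j eo
echo `cat $PBS_NODEFILE`   # shows the two nodes: WN1 WN2
cd $PBS_O_WORKDIR
# Workaround: hand mpirun the node list explicitly instead of
# relying on it to read the PBS node file on its own.
/usr/local/bin/mpirun -machinefile $PBS_NODEFILE -np 2 hello
========================

With this script, one process starts on each of WN1 and WN2, which is what I expected to happen without the machinefile option.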
Any ideas? Thanks a lot in advance.

~Belaid.

> Date: Tue, 1 Dec 2009 15:42:30 -0500
> From: g...@ldeo.columbia.edu
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] mpirun is using one PBS node only
>
> Hi Belaid Moa
>
> Belaid MOA wrote:
> > Hi everyone,
> > Here is another elementary question. I tried the following steps found
> > in the FAQ section of www.open-mpi.org with a simple hello world example
> > (with PBS/torque):
> >
> > $ qsub -l nodes=2 my_script.sh
> >
> > my_script.sh is pasted below:
> > ========================
> > #!/bin/sh -l
> > #PBS -N helloTest
> > #PBS -j eo
> > echo `cat $PBS_NODEFILE` # shows two nodes: WN1 WN2
> > cd $PBS_O_WORKDIR
> > /usr/local/bin/mpirun hello
> > ========================
> >
> > When the job is submitted, only one process is run. When I add the -n 2
> > option to the mpirun command, two processes are run, but on one node only.
>
> Do you have a single CPU/core per node?
> Or are they multi-socket/multi-core?
>
> Check "man mpiexec" for the options that control on which nodes and
> slots, etc. your program will run.
> ("Man mpiexec" will tell you more than I possibly can.)
>
> The default option is "-byslot",
> which will use all "slots" (actually cores or CPUs) available on a node
> before it moves to the next node.
> Reading your question and your surprise with the result,
> I would guess what you want is "-bynode" (not the default).
>
> Also, if you have more than one CPU/core per node,
> you need to put this information in your Torque/PBS "nodes" file
> (and restart your pbs_server daemon).
> Something like this (for 2 CPUs/cores per node):
>
> WN1 np=2
> WN2 np=2
>
> I hope this helps,
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
> > Note that echo `cat $PBS_NODEFILE` outputs
> > the two nodes I am using: WN1 and WN2.
> >
> > The output from ompi_info is shown below:
> >
> > $ ompi_info | grep tm
> >     MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.3.3)
> >     MCA ras: tm (MCA v2.0, API v2.0, Component v1.3.3)
> >     MCA plm: tm (MCA v2.0, API v2.0, Component v1.3.3)
> >
> > Any help on why openMPI/mpirun is using only one PBS node is very
> > appreciated.
> >
> > Thanks a lot in advance and sorry for bothering you guys with my
> > elementary questions!
> >
> > ~Belaid.