Thanks a lot, Gus, for your help again. I only have one CPU per node. 
With the -n X option (no matter what the value of X is), all X processes run on 
one node only (the other one stays free).
If I add the -machinefile option, with WN1 and WN2 listed in the file, the right 
behavior appears. According to the documentation, however,
mpirun should pick up PBS_NODEFILE automatically from PBS, so I should not 
need to use a machinefile at all.
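
For reference, here is roughly what the working workaround looks like (the file
name machines.txt below is just a placeholder; it simply lists the two nodes):

$ cat machines.txt
WN1
WN2

# and in my_script.sh:
/usr/local/bin/mpirun -machinefile machines.txt -n 2 hello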

Any ideas?

Thanks a lot in advance.
~Belaid. 


> Date: Tue, 1 Dec 2009 15:42:30 -0500
> From: g...@ldeo.columbia.edu
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] mpirun is using one PBS node only
> 
> Hi Belaid Moa
> 
> Belaid MOA wrote:
> > Hi everyone,
> >  Here is another elementary question. I tried the following steps found 
> > in the FAQ section of www.open-mpi.org with a simple hello world example 
> > (with PBS/torque):
> >  $  qsub -l nodes=2 my_script.sh
> > 
> > my_script.sh is pasted below:
> > ========================
> > #!/bin/sh -l
> > #PBS -N helloTest
> > #PBS -j eo
> > echo `cat $PBS_NODEFILE` # shows two nodes: WN1 WN2
> > cd $PBS_O_WORKDIR
> > /usr/local/bin/mpirun hello
> > ========================
> > 
> > When the job is submitted, only one process is run. When I add the -n 2 
> > option to the mpirun command,
> > two processes are run, but on one node only. 
> 
> Do you have a single CPU/core per node?
> Or are they multi-socket/multi-core?
> 
> Check "man mpiexec" for the options that control on which nodes and
> slots, etc your program will run.
> ("Man mpiexec" will tell you more than I possibly can.)
> 
> The default option is "-byslot",
> which will use all "slots" (actually cores
> or CPUs) available on a node before it moves to the next node.
> Reading your question and your surprise at the result,
> I would guess that what you want is "-bynode" (not the default).
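> For example, something along these lines on the mpirun line (an untested
> sketch; adjust the path to mpirun as needed):
> 
>   /usr/local/bin/mpirun -bynode -np 2 hello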
> 
> Also, if you have more than one CPU/core per node,
> you need to put this information in your Torque/PBS "nodes" file
> (and restart your pbs_server daemon).
> Something like this (for 2 CPUs/cores per node):
> 
> WN1 np=2
> WN2 np=2
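> 
> A quick way to double-check after the restart (assuming Torque's usual
> pbsnodes output format) is:
> 
>   pbsnodes -a | grep np      # should now show "np = 2" for WN1 and WN2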
> 
> I hope this helps,
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
> 
> 
> > Note that  echo `cat $PBS_NODEFILE`  outputs
> > the two nodes I am using: WN1 and WN2.
> > 
> > The output from ompi_info is shown below:
> > 
> > $ ompi_info | grep tm
> >               MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.3.3)
> >                  MCA ras: tm (MCA v2.0, API v2.0, Component v1.3.3)
> >                  MCA plm: tm (MCA v2.0, API v2.0, Component v1.3.3)
> > 
> >  Any help on why Open MPI/mpirun is using only one PBS node is very much 
> > appreciated.
> > 
> > Thanks a lot in advance and sorry for bothering you guys with my 
> > elementary questions!
> > 
> > ~Belaid. 
> > 
> > 
> > 
                                          