Hi Belaid Moa

Belaid MOA wrote:
Thank you very very much Gus. Does this mean that OpenMPI does not copy the executable from the master node to the worker nodes?

Not that I know of.
Making the executable, and any input files the program may need,
available on the nodes is the user's responsibility,
not mpiexec's.
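
If you do want to copy the executable over yourself, here is a minimal
sketch of how it could be done from inside the PBS script.
It assumes passwordless scp between the nodes and that the destination
directory already exists on each of them; I have not tested it on your setup.
The names copy_to_nodes and DRY_RUN are made up for this example,
while $PBS_NODEFILE is the host list Torque/PBS hands to the job.

```shell
#!/bin/sh
# Copy an executable to every node listed in a PBS-style nodefile.
# Assumptions (not from this thread): passwordless scp works between
# the nodes, and the destination directory already exists on each node.
# copy_to_nodes and DRY_RUN are hypothetical names used only in this sketch.
copy_to_nodes() {
    exe=$1        # path to the executable, e.g. ./hello
    nodefile=$2   # one hostname per line, possibly repeated per slot
    dest=$3       # destination directory on the nodes
    # A host can appear once per allocated slot, so de-duplicate first.
    sort -u "$nodefile" | while read -r host; do
        [ -n "$host" ] || continue
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # Dry run: only print the command that would be executed.
            echo "scp $exe $host:$dest/"
        else
            # Redirect stdin so scp cannot swallow the rest of the host list.
            scp "$exe" "$host:$dest/" < /dev/null
        fi
    done
}
```

In a job script you would call it before mpiexec, for example
copy_to_nodes ./hello "$PBS_NODEFILE" "$PBS_O_WORKDIR",
and setting DRY_RUN=1 first only prints the scp commands.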

On the other hand,
Torque/PBS has a "stage_in/stage_out" feature that is supposed to
copy files over to the nodes, if you want to give it a shot.
See "man qsub" and look under the "-W" option (it has numerous
sub-options) for "stagein=file_list" and "stageout=file_list".
This is a relic from the old days when everything had to be on
local disks on the nodes, and NFS ran over 10/100 Ethernet,
but it is still used by people who
run MPI programs with heavy I/O, to avoid pounding on NFS or
even on parallel file systems.
I tried the stage_in/out feature a loooong time ago
(old PBS, before Torque), but it had issues.
It probably works now with the newer/better
versions of Torque.
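
If you try it, the directive could look roughly like the line below.
The local_file@hostname:remote_file form is how I remember it from
"man qsub"; the paths are placeholders, and HN is the head node from
your message. I have not checked whether current Torque stages the
file to every node of the job or only to the first one, so test it
before relying on it.

```shell
# Sketch only: ask Torque/PBS to copy /home/user/hello from HN
# to /tmp/hello on the execution host before the job starts.
# Both paths are hypothetical placeholders.
#PBS -W stagein=/tmp/hello@HN:/home/user/hello
```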

However, the easy way to get this right is just to use an NFS mounted
directory.

If that's the case, I will go ahead and NFS mount my working directory.


This would make your life much easier.

My $0.02.
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------




~Belaid.


 > Date: Tue, 1 Dec 2009 13:50:57 -0500
 > From: g...@ldeo.columbia.edu
 > To: us...@open-mpi.org
 > Subject: Re: [OMPI users] Elementary question on openMPI application location when using PBS submission
 >
 > Hi Belaid MOA
 >
 > See this FAQ:
 > http://www.open-mpi.org/faq/?category=running#do-i-need-a-common-filesystem
 > http://www.open-mpi.org/faq/?category=building#where-to-install
 > http://www.open-mpi.org/faq/?category=tm#tm-obtain-host
 >
 > Your executable needs to be on a directory that is accessible
 > by all nodes in your node pool.
 > An easy way to achieve this is to put it in a directory that
 > is NFS mounted on all nodes, and launch your pbs script from there.
 >
 > A less convenient alternative, if no NFS directory is available,
 > is to copy the executable over to the nodes.
 >
 > I also find it easier to write a PBS script instead of putting
 > all the PBS directives in the command line.
 > In this case you can put the lines below in your PBS script,
 > to ensure the job starts in your work directory (cd $PBS_O_WORKDIR):
 >
 > ########
 >
 > #PBS ... (PBS directives)
 > ...
 > cd $PBS_O_WORKDIR
 > mpiexec -n ....
 >
 > ########
 >
 > IIRR, by default Torque/PBS puts you in your home directory on
 > the nodes, which may or may not be the location of your executable.
 >
 > I hope this helps,
 > Gus Correa
 > ---------------------------------------------------------------------
 > Gustavo Correa
 > Lamont-Doherty Earth Observatory - Columbia University
 > Palisades, NY, 10964-8000 - USA
 > ---------------------------------------------------------------------
 >
 > Belaid MOA wrote:
 > > Hello everyone,
 > > I am new to this list and I have a very elementary question: suppose we
 > > have three machines, HN (Head Node hosting the pbs server), WN1 (a
 > > worker node) and WN2 (another worker node). The PBS nodefile has WN1 and
 > > WN2 in it (DOES NOT HAVE HN).
 > > My openMPI program (hello) and PBS script (my_script.sh) reside on the
 > > HN. When I submit my PBS script using qsub -l nodes=2 my_script.sh, I
 > > get the following error:
 > >
 > > --------------------------------------------------------------------------
 > > mpirun was unable to launch the specified application as it could not
 > > find an executable:
 > >
 > > Executable: hello
 > > Node: WN2
 > >
 > > while attempting to start process rank 0.
 > > --------------------------------------------------------------------------
 > >
 > > How come my hello program is not copied automatically to the worker
 > > nodes? This leads to my elementary question:
 > > where should the application be when using a PBS submission?
 > >
 > > Note that when I run mpirun from HN with machinefile containing WN1 and
 > > WN2, I get the right output.
 > >
 > > Any help on this is much appreciated.
 > >
 > > ~Belaid.
 > >
 > >
 > > _______________________________________________
 > > users mailing list
 > > us...@open-mpi.org
 > > http://www.open-mpi.org/mailman/listinfo.cgi/users
