Thanks, but the code is too long.
Jack

Oct. 25 2010

> Date: Mon, 25 Oct 2010 14:08:54 -0400
> From: g...@ldeo.columbia.edu
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] Open MPI program cannot complete
>
> Your job may be queued, not executing, because there are no
> resources available: all nodes are busy.
> Try qstat -a.
>
> Posting a code snippet with all your MPI calls may prove effective.
> You might get a trove of advice for a thrift of effort.
>
> Jeff Squyres wrote:
> > Check the man page for qsub for proper use.
> >
> > On Oct 25, 2010, at 1:49 PM, Jack Bryan wrote:
> >
> >> Thanks.
> >>
> >> I used
> >>
> >>   qsub -I nsga2_job.sh
> >>   qsub: waiting for job 48270.clusterName to start
> >>
> >> With qstat I found the job name is "none" and no results show up.
> >>
> >> No shell prompt appears; the command line just hangs there with no response.
> >>
> >> Any help is appreciated.
> >>
> >> Thanks
> >>
> >> Jack
> >>
> >> Oct. 25 2010
> >>
> >>> From: jsquy...@cisco.com
> >>> Date: Mon, 25 Oct 2010 13:39:30 -0400
> >>> To: us...@open-mpi.org
> >>> Subject: Re: [OMPI users] Open MPI program cannot complete
> >>>
> >>> Can you use the interactive mode of PBS to get 5 cores on 1 node? IIRC,
> >>> "qsub -I ..."?
> >>>
> >>> Then you get a shell prompt with your allocated cores and can run stuff
> >>> interactively. I don't know if your site allows this, but interactive
> >>> debugging here might be *significantly* easier than trying to automate
> >>> some debugging.
> >>>
> >>> On Oct 25, 2010, at 1:35 PM, Jack Bryan wrote:
> >>>
> >>>> Thanks.
> >>>>
> >>>> I have to use #PBS to submit any jobs on my cluster.
> >>>> I cannot run a job directly from the command line on my cluster.
> >>>>
> >>>> This is my script:
> >>>> --------------------------------------
> >>>> #!/bin/bash
> >>>> #PBS -N jobname
> >>>> #PBS -l walltime=00:08:00,nodes=1
> >>>> #PBS -q queuename
> >>>> COMMAND=/mypath/myprog
> >>>> NCORES=5
> >>>>
> >>>> cd $PBS_O_WORKDIR
> >>>> NODES=`cat $PBS_NODEFILE | wc -l`
> >>>> NPROC=$(( $NCORES * $NODES ))
> >>>>
> >>>> mpirun -np $NPROC --mca btl self,sm,openib $COMMAND
> >>>> -------------------------------------------
> >>>>
> >>>> Where should I put the (gdb --batch -ex 'bt full' -ex 'info reg' -pid
> >>>> ZOMBIE_PID) in the script?
> >>>> And how do I get ZOMBIE_PID from the script?
> >>>>
> >>>> Any help is appreciated.
> >>>>
> >>>> Thanks
> >>>>
> >>>> Oct. 25 2010
> >>>>
> >>>> Date: Mon, 25 Oct 2010 19:24:35 +0200
> >>>> From: j...@59a2.org
> >>>> To: us...@open-mpi.org
> >>>> Subject: Re: [OMPI users] Open MPI program cannot complete
> >>>>
> >>>> On Mon, Oct 25, 2010 at 19:07, Jack Bryan <dtustud...@hotmail.com> wrote:
> >>>> I need to use a #PBS parallel job script to submit a job on the MPI cluster.
> >>>>
> >>>> Is it not possible to reproduce locally? Most clusters have a way to
> >>>> submit an interactive job (which would let you start this thing and then
> >>>> inspect individual processes). Ashley's Padb suggestion will certainly
> >>>> be better in a non-interactive environment.
> >>>>
> >>>> Where should I put the (gdb --batch -ex 'bt full' -ex 'info reg' -pid
> >>>> ZOMBIE_PID) in the script?
> >>>>
> >>>> Is control returning to your script after rank 0 has exited? In that
> >>>> case, you can just put this on the next line.
> >>>>
> >>>> How do I get the ZOMBIE_PID?
> >>>>
> >>>> "ps" from the command line, or getpid() from C code.
> >>>>
> >>>> Jed
> >>>
> >>> --
> >>> Jeff Squyres
> >>> jsquy...@cisco.com
> >>> For corporate legal information go to:
> >>> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
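To make Jeff's suggestion concrete, here is a minimal sketch of an interactive session, assuming the site permits qsub -I and reusing the queue and resource values from the script above (queuename, 1 node, and 5 cores are placeholders taken from the thread, not verified settings for this cluster):

----------------------------------------
# Request an interactive shell with the same resources the batch
# script asks for. The nodes=1:ppn=5 syntax is a PBS/Torque
# assumption; PBS Pro would use -l select=1:ncpus=5 instead.
qsub -I -q queuename -l walltime=00:08:00,nodes=1:ppn=5

# Once the prompt appears the allocation is active; run the
# program by hand and watch it directly.
cd $PBS_O_WORKDIR
mpirun -np 5 --mca btl self,sm,openib /mypath/myprog
----------------------------------------

Note that "qsub -I nsga2_job.sh" does not execute the script: on PBS/Torque, -I makes qsub read the script only for its #PBS directives, which would explain why no output appeared, and the long wait at "waiting for job ... to start" is consistent with Gus's point that the job was queued behind busy nodes.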
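And a sketch of where Jed's gdb attach could sit in the batch script, under the thread's assumptions: control returns to the script after rank 0 exits, the stuck ranks are still running (a true zombie, i.e. a defunct process, has no state left to attach to), and pgrep is available on the compute node. The pattern passed to pgrep is simply the program path from the script above:

----------------------------------------
#!/bin/bash
#PBS -N jobname
#PBS -l walltime=00:08:00,nodes=1
#PBS -q queuename
COMMAND=/mypath/myprog
NCORES=5

cd $PBS_O_WORKDIR
NODES=`cat $PBS_NODEFILE | wc -l`
NPROC=$(( $NCORES * $NODES ))

mpirun -np $NPROC --mca btl self,sm,openib $COMMAND

# These lines run only after mpirun returns ("put this on the next
# line"). Any rank still alive at this point is found by name -- the
# scripted equivalent of Jed's "ps from the command line" -- and a
# full backtrace plus register dump for each one lands in the job's
# output file.
for ZOMBIE_PID in $(pgrep -f "$COMMAND"); do
    gdb --batch -ex 'bt full' -ex 'info reg' -pid $ZOMBIE_PID
done
----------------------------------------

With nodes=1 everything runs on one host, so a local pgrep sees all the ranks; across multiple nodes, Ashley's Padb suggestion is the easier route, since it gathers stacks from every node at once.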