Thanks.
But the code is too long to post.

Jack Oct. 25 2010
> Date: Mon, 25 Oct 2010 14:08:54 -0400
> From: g...@ldeo.columbia.edu
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] Open MPI program cannot complete
> 
> Your job may be queued, not executing, because there are no
> resources available (all nodes are busy).
> Try qstat -a.
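For instance, a minimal check (the job id 48270.clusterName is taken from the qsub output quoted further down; the exact qstat options your site supports are an assumption):

        qstat -a                      # list all jobs; state Q = queued, R = running
        qstat -f 48270.clusterName    # full details for that particular job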
> 
> Posting a code snippet with all your MPI calls may prove effective.
> You might get a trove of advice for a thrift of effort.
> 
> Jeff Squyres wrote:
> > Check the man page for qsub for proper use.
> > 
> > 
> > On Oct 25, 2010, at 1:49 PM, Jack Bryan wrote:
> > 
> >> thanks
> >>
> >> I use 
> >>         qsub -I nsga2_job.sh
> >>         qsub: waiting for job 48270.clusterName to start
> >>
> >> By qstat, I found that the job name is none and no results show up. 
> >>
> >> No shell prompt appears; the command line just hangs there with no response. 
> >>
> >> Any help is appreciated. 
> >>
> >> Thanks
> >>
> >> Jack 
> >>
> >> Oct. 25 2010
> >>
> >>> From: jsquy...@cisco.com
> >>> Date: Mon, 25 Oct 2010 13:39:30 -0400
> >>> To: us...@open-mpi.org
> >>> Subject: Re: [OMPI users] Open MPI program cannot complete
> >>>
> >>> Can you use the interactive mode of PBS to get 5 cores on 1 node? IIRC, 
> >>> "qsub -I ..." ?
> >>>
> >>> Then you get a shell prompt with your allocated cores and can run stuff 
> >>> interactively. I don't know if your site allows this, but interactive 
> >>> debugging here might be *significantly* easier than trying to automate some 
> >>> debugging.
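A minimal sketch of what this could look like for the job below (the queue name, walltime, and program path are taken from the quoted script; ppn=5 is an assumption matching its NCORES=5):

        qsub -I -q queuename -l walltime=00:08:00,nodes=1:ppn=5
        # once the interactive shell opens on the allocated node:
        cd $PBS_O_WORKDIR
        mpirun -np 5 --mca btl self,sm,openib /mypath/myprog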
> >>>
> >>>
> >>> On Oct 25, 2010, at 1:35 PM, Jack Bryan wrote:
> >>>
> >>>> thanks
> >>>>
> >>>> I have to use a #PBS script to submit any job on my cluster. 
> >>>> I cannot run a job directly from the command line on my cluster. 
> >>>>
> >>>> This is my script: 
> >>>> --------------------------------------
> >>>> #!/bin/bash
> >>>> #PBS -N jobname
> >>>> #PBS -l walltime=00:08:00,nodes=1
> >>>> #PBS -q queuename
> >>>> COMMAND=/mypath/myprog
> >>>> NCORES=5
> >>>>
> >>>> cd $PBS_O_WORKDIR
> >>>> NODES=`cat $PBS_NODEFILE | wc -l`
> >>>> NPROC=$(( $NCORES * $NODES ))
> >>>>
> >>>> mpirun -np $NPROC --mca btl self,sm,openib $COMMAND
> >>>>
> >>>> -------------------------------------------
> >>>>
> >>>> Where should I put the (gdb --batch -ex 'bt full' -ex 'info reg' -pid 
> >>>> ZOMBIE_PID) in the script? 
> >>>> And how do I get ZOMBIE_PID from the script? 
> >>>>
> >>>> Any help is appreciated. 
> >>>>
> >>>> thanks
> >>>>
> >>>> Oct. 25 2010
> >>>>
> >>>> Date: Mon, 25 Oct 2010 19:24:35 +0200
> >>>> From: j...@59a2.org
> >>>> To: us...@open-mpi.org
> >>>> Subject: Re: [OMPI users] Open MPI program cannot complete
> >>>>
> >>>> On Mon, Oct 25, 2010 at 19:07, Jack Bryan <dtustud...@hotmail.com> wrote:
> >>>> I need to use a #PBS parallel job script to submit a job on the MPI cluster. 
> >>>>
> >>>> Is it not possible to reproduce locally? Most clusters have a way to 
> >>>> submit an interactive job (which would let you start this thing and then 
> >>>> inspect individual processes). Ashley's Padb suggestion will certainly 
> >>>> be better in a non-interactive environment.
> >>>>
> >>>> Where should I put the (gdb --batch -ex 'bt full' -ex 'info reg' -pid 
> >>>> ZOMBIE_PID) in the script ? 
> >>>>
> >>>> Is control returning to your script after rank 0 has exited? In that 
> >>>> case, you can just put this on the next line.
> >>>>
> >>>> How to get the ZOMBIE_PID ? 
> >>>>
> >>>> "ps" from the command line, or getpid() from C code.
> >>>>
> >>>> Jed
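Putting the two answers together, a minimal sketch of how the end of the job script might look (pgrep, the binary name myprog, and the log file name are assumptions here, not part of the original script):

-------------------------------------------
mpirun -np $NPROC --mca btl self,sm,openib $COMMAND

# mpirun has returned at this point (e.g. after rank 0 exited); look for a
# rank that is still alive and dump its stack before walltime runs out.
ZOMBIE_PID=$(pgrep -u "$USER" myprog | head -n 1)
if [ -n "$ZOMBIE_PID" ]; then
    gdb --batch -ex 'bt full' -ex 'info reg' -pid "$ZOMBIE_PID" > stuck_rank_backtrace.txt 2>&1
fi
-------------------------------------------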
> >>>>
> >>>
> >>> -- 
> >>> Jeff Squyres
> >>> jsquy...@cisco.com
> >>> For corporate legal information go to:
> >>> http://www.cisco.com/web/about/doing_business/legal/cri/
> >>>
> >>>
> > 
> > 
> 
