Hi,

Am 09.04.2010 um 23:48 schrieb Cristobal Navarro:

> Thanks,
> now I get mixed results and everything seems to be working OK with mixed MPI
> execution.
> 
> Is it normal that, after receiving the results, the hosts remain busy for about 15
> seconds?
> Example:

Yes. This is the time SGE needs for housekeeping; it can even take some 
minutes (especially if you kill a parallel job).
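
If you want to see when the slots are actually released, you can simply poll
qstat, e.g. (assuming the watch utility is available on your submit host):

  watch -n 5 qstat -f

or just re-run qstat -f by hand until the used slot count drops back to 0.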

-- Reuti


> master:common master$ qrsh -verbose -pe orte 10 /opt/openmpi-1.4.1/bin/mpirun 
> -np 10 hostname
> Your job 65 ("mpirun") has been submitted
> waiting for interactive job to be scheduled ...
> Your interactive job 65 has been successfully scheduled.
> Establishing builtin session to host worker00.local ...
> worker00.local
> worker00.local
> worker00.local
> worker00.local
> worker00.local
> master.local
> master.local
> master.local
> master.local
> master.local
> # after some seconds, I query the hosts' status and the slots are still in use
> master:common master$ qstat -f
> queuename                      qtype resv/used/tot. load_avg arch          states
> ---------------------------------------------------------------------------------
> all.q@master.local             BIP   0/5/16         0.02     darwin-x86
>      65 0.55500 mpirun     master       r     04/09/2010 17:44:36     5
> ---------------------------------------------------------------------------------
> all.q@worker00.local           BIP   0/5/16         0.01     darwin-x86
>      65 0.55500 mpirun     master       r     04/09/2010 17:44:36     5
> master:common master$ 
> 
> but after waiting a bit longer, they become free again:
> master:common master$ qstat -f
> queuename                      qtype resv/used/tot. load_avg arch          states
> ---------------------------------------------------------------------------------
> all.q@master.local             BIP   0/0/16         0.01     darwin-x86
> ---------------------------------------------------------------------------------
> all.q@worker00.local           BIP   0/0/16         0.01     darwin-x86
> 
> Anyway, these are just details; thanks to your help, the important aspects are
> working.
> Cristobal
> 
> 
> 
> 
> On Fri, Apr 9, 2010 at 1:34 PM, Reuti <re...@staff.uni-marburg.de> wrote:
> Am 09.04.2010 um 18:57 schrieb Cristobal Navarro:
> 
> > sorry, the command was missing a number.
> >
> > As you said, it should be:
> >
> > qrsh -verbose -pe pempi 6 mpirun -np 6 hostname
> > waiting for interactive job to be scheduled ...
> >
> > Your "qrsh" request could not be scheduled, try again later.
> > ---
> > This is my parallel environment:
> > qconf -sp pempi
> > pe_name            pempi
> > slots              210
> > user_lists         NONE
> > xuser_lists        NONE
> > start_proc_args    /usr/bin/true
> > stop_proc_args     /usr/bin/true
> > allocation_rule    $pe_slots
> 
> $pe_slots means that all slots must come from one and the same machine (e.g. 
> for SMP jobs). You can try $round_robin instead.
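> 
> A rough sketch of the change (using your PE name): edit the PE with
> 
>   qconf -mp pempi
> 
> and set
> 
>   allocation_rule    $round_robin
> 
> $round_robin spreads the granted slots across the hosts one by one, while
> $fill_up fills one host before moving on to the next. With that in place, a
> request like -pe pempi 6 together with mpirun -np 6 should give you hostnames
> from both machines.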
> 
> -- Reuti
> 
> 
> > control_slaves     TRUE
> > job_is_first_task  FALSE
> > urgency_slots      min
> > accounting_summary TRUE
> >
> > This is the queue:
> > qconf -sq cola.q
> > qname                 cola.q
> > hostlist              @allhosts
> > seq_no                0
> > load_thresholds       np_load_avg=1.75
> > suspend_thresholds    NONE
> > nsuspend              1
> > suspend_interval      00:05:00
> > priority              0
> > min_cpu_interval      00:05:00
> > processors            UNDEFINED
> > qtype                 BATCH INTERACTIVE
> > ckpt_list             NONE
> > pe_list               make pempi
> > rerun                 FALSE
> > slots                 2
> > tmpdir                /tmp
> > shell                 /bin/csh
> >
> > I noticed that if I put 2 slots (since the queue has 2 slots) in the -pe
> > pempi N argument and also give the full path to mpirun as you guys pointed out, it
> > works!!!
> > cristobal@neoideo:~$ qrsh -verbose -pe pempi 2 
> > /opt/openmpi-1.4.1/bin/mpirun -np 6 hostname
> > Your job 125 ("mpirun") has been submitted
> > waiting for interactive job to be scheduled ...
> > Your interactive job 125 has been successfully scheduled.
> > Establishing builtin session to host ijorge.local ...
> > ijorge.local
> > ijorge.local
> > ijorge.local
> > ijorge.local
> > ijorge.local
> > ijorge.local
> > cristobal@neoideo:~$ qrsh -verbose -pe pempi 2 
> > /opt/openmpi-1.4.1/bin/mpirun -np 6 hostname
> > Your job 126 ("mpirun") has been submitted
> > waiting for interactive job to be scheduled ...
> > Your interactive job 126 has been successfully scheduled.
> > Establishing builtin session to host neoideo ...
> > neoideo
> > neoideo
> > neoideo
> > neoideo
> > neoideo
> > neoideo
> > cristobal@neoideo:~$
> >
> > I just wonder why I didn't get mixed hostnames, like:
> > neoideo
> > neoideo
> > ijorge.local
> > ijorge.local
> > neoideo
> > ijorge.local
> >
> > ??
> >
> > thanks for the help already!!!
> >
> > Cristobal
> >
> >
> >
> >
> > On Fri, Apr 9, 2010 at 8:58 AM, Huynh Thuc Cuoc <htc...@gmail.com> wrote:
> > Dear friend,
> > 1.
> > I prefer to use the SGE qsub command, for example:
> >
> > [huong@ioitg2 MyPhylo]$ qsub -pe orte 3 myphylo.qsub
> > Your job 35 ("myphylo.qsub") has been submitted
> > [huong@ioitg2 MyPhylo]$ qstat
> > job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
> > -----------------------------------------------------------------------------------------------------------------
> >      35 0.55500 myphylo.qs huong        r     04/09/2010 19:28:59 al...@node2.ioit-grid.ac.vn        3
> > [huong@ioitg2 MyPhylo]$ qstat
> > [huong@ioitg2 MyPhylo]$
> >
> > This job is running on node2 of my cluster.
> > My software setup is as follows:
> > headnode: 4 CPUs, 4 GB RAM, CentOS 5.4 + SGE 6.2u4 (qmaster and also execd
> > host) + Open MPI 1.4.1
> > nodes: 4 CPUs, 1 GB RAM, CentOS 5.4 + sgeexecd + Open MPI 1.4.1
> > PE = orte, set to 4 slots.
> > The myphylo.qsub script contains the long command:
> > /opt/openmpi/bin/mpirun -np 10 $HOME/MyPhylo/bin/par-phylo-builder --data . . . .
> > Try setting the PE to orte, or use the default PE (make) instead.
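> >
> > A minimal sketch of such a submission script (names, paths and slot count
> > are just from my setup, adjust to yours):
> >
> > #!/bin/sh
> > #$ -N myphylo
> > #$ -cwd
> > #$ -pe orte 4
> > /opt/openmpi/bin/mpirun -np $NSLOTS $HOME/MyPhylo/bin/par-phylo-builder --data ...
> >
> > With the gridengine support in Open MPI, mpirun picks up the slots granted
> > by the PE, and -np $NSLOTS keeps the process count in line with what SGE
> > handed out.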
> >
> > 2. I tested your command on my system:
> > a.
> > [huong@ioitg2 MyPhylo]$ qrsh -verbose -pe make mpirun -np 6 hostname
> > error: Numerical value invalid!
> > The initial portion of string "mpirun" contains no decimal number
> > [huong@ioitg2 MyPhylo]$ qrsh -verbose -pe orte 2 mpirun -np 6 hostname
> > Your job 36 ("mpirun") has been submitted
> >
> > waiting for interactive job to be scheduled ...
> > Your interactive job 36 has been successfully scheduled.
> > Establishing builtin session to host ioitg2.ioit-grid.ac.vn ...
> > bash: mpirun: command not found
> > [huong@ioitg2 MyPhylo]$
> >
> > Error! So I tried:
> > [huong@ioitg2 MyPhylo]$ qrsh -verbose -pe orte 2 /opt/openmpi/bin/mpirun 
> > -np 6 hostname
> > Your job 38 ("mpirun") has been submitted
> >
> > waiting for interactive job to be scheduled ...
> > Your interactive job 38 has been successfully scheduled.
> > Establishing builtin session to host ioitg2.ioit-grid.ac.vn ...
> > ioitg2.ioit-grid.ac.vn
> > ioitg2.ioit-grid.ac.vn
> > ioitg2.ioit-grid.ac.vn
> > ioitg2.ioit-grid.ac.vn
> > ioitg2.ioit-grid.ac.vn
> > ioitg2.ioit-grid.ac.vn
> > [huong@ioitg2 MyPhylo]$
> >
> > This is OK.
> > The point is: the PATH must point to where mpirun is located (or you give the full path).
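> >
> > A minimal sketch for a Bourne-type shell startup file or the job script
> > (paths from my install, adjust as needed; use setenv instead in csh):
> >
> > export PATH=/opt/openmpi/bin:$PATH
> > export LD_LIBRARY_PATH=/opt/openmpi/lib:$LD_LIBRARY_PATH
> >
> > so that a plain "mpirun" is also found on the execution hosts.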
> >
> > Try it.
> >
> > Good luck,
> > HT Cuoc
> >
> >
> > On Fri, Apr 9, 2010 at 11:02 AM, Cristobal Navarro <axisch...@gmail.com> 
> > wrote:
> > Hello,
> >
> > After some days of work and testing, I managed to install SGE on two
> > machines, and also installed Open MPI 1.4.1 on each one.
> >
> > SGE is working: I can submit jobs and it schedules them to the
> > available cores (6 in total).
> >
> > My problem is that I'm trying to run an Open MPI job and I can't.
> >
> > This is an example of what I am trying:
> >
> >
> > $qrsh -verbose -pe pempi mpirun -np 6 hostname
> > Your job 105 ("mpirun") has been submitted
> > waiting for interactive job to be scheduled ...
> >
> > Your "qrsh" request could not be scheduled, try again later.
> >
> > I'm not sure what the cause could be;
> > in the ompi_info output I have gridengine support.
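> > (a quick way to check that, assuming a standard build, is something like
> >   ompi_info | grep gridengine
> > which should list the gridengine ras component if support was compiled in)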
> >
> > Where do you recommend I look?
> > thanks in advance
> >
> > Cristobal
> >
> >
> >
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

