Please keep the list posted.

On 29.10.2014 at 18:47, Disny Disny wrote:

> Hello Reuti,
> this is the output of qhost and qstat -f, but I don't know what it means, so 
> I'm hoping you can help.
> 
> Kind regards
> 
> root@sgemstr:~# qhost
> HOSTNAME                ARCH         NCPU NSOC NCOR NTHR NLOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
> ----------------------------------------------------------------------------------------------
> global                  -               -    -    -    -     -       -       -       -       -
> gcl1                    lx-amd64        4    1    4    4     -    3.8G       -    6.7G       -
> gcl2                    lx-amd64        4    1    4    4     -    3.7G       -    3.8G       -
> gcl3                    lx-amd64        4    1    4    4     -    1.9G       -    6.7G       -
> shdwgcl4                lx-amd64        4    1    4    4     -    3.8G       -    3.8G       -
> root@sgemstr:~# qstat -f
> queuename                      qtype resv/used/tot. np_load  arch          states
> ---------------------------------------------------------------------------------
> all.q@gcl1                     BIP   0/0/4          -NA-     lx-amd64      au
> ---------------------------------------------------------------------------------
> all.q@gcl2                     BIP   0/0/4          -NA-     lx-amd64      au
> ---------------------------------------------------------------------------------
> all.q@gcl3                     BIP   0/0/4          -NA-     lx-amd64      au
> ---------------------------------------------------------------------------------
> all.q@shdwgcl4                 BIP   0/0/4          -NA-     lx-amd64      au

This looks like there is no communication between the qmaster and the execds: 
every queue instance is in state `au` (alarm, unknown) and no load is reported. 
Does the output of:

$ ps -e f

show `sge_qmaster` running on the master and `sge_execd` running on the nodes? 
Do you have a firewall in place? Ports 6444 (qmaster) and 6445 (execd) may need 
to be opened.
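
A rough sketch of these checks in bash, in case it helps. The function names `unreachable_queues` and `port_open` are made up for illustration, and the script assumes `awk`, `timeout` and bash's `/dev/tcp` feature are available. The daemons normally show up in the process list as `sge_qmaster` and `sge_execd`; the ports are the ones mentioned above and may be configured differently on your installation.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read `qstat -f` output on stdin and print every queue
# instance whose states column contains "u" (unknown), i.e. the qmaster
# currently has no contact with the execd on that host.
unreachable_queues() {
    # Queue instance lines start with "queue@host"; the states column is the
    # sixth field and is missing entirely when the instance is healthy.
    awk 'NF >= 6 && $1 ~ /@/ && $6 ~ /u/ { print $1 }'
}

# Hypothetical helper: return 0 if a TCP connection to host $1, port $2 can be
# opened within 3 seconds, non-zero otherwise (e.g. blocked by a firewall).
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}
```

On the master, `qstat -f | unreachable_queues` would list the affected queue instances. `pgrep -l sge_` on each machine shows whether a daemon is running there, and `port_open gcl1 6445` (from the master) or `port_open sgemstr 6444` (from a node) tests whether the port can be reached.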

-- Reuti


>  
> ############################################################################
>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
>       4 0.00000 Sleeper    root         qw    10/23/2014 09:20:09     1       
> root@sgemstr:~# 
> 
> 
> On Thursday, October 23, 2014 6:38 PM, Reuti <re...@staff.uni-marburg.de> 
> wrote:
> 
> 
> Please check the state of the machines in `qhost` and `qstat -f`, i.e. 
> whether the execd on each machine can be reached and reports load values. 
> - Reuti
> 
> On 23.10.2014 at 17:35, Disny Disny wrote:
> 
> > Yes, during the execd installation it added a startup script, but is there 
> > another startup entry I need to add manually?
> > 
> > 
> > From: Reuti <re...@staff.uni-marburg.de>; 
> > To: Disny Disny <disny.wo...@yahoo.com>; 
> > Cc: grid Engine Mailing List <users@gridengine.org>; 
> > Subject: Re: Queue instances dropped 
> > Sent: Thu, Oct 23, 2014 3:29:58 PM 
> > 
> > On 23.10.2014 at 17:23, Disny Disny wrote:
> > 
> > 
> > > I have a problem with SGE. After installing the cluster everything 
> > > worked fine, but after shutting down the PCs and starting them again 
> > > later, I get this message when I try to submit a job:
> > > queue instance "all.q@gcl2" dropped because it is temporarily not available
> > > 
> > > queue instance "all.q@gcl3" dropped because it is temporarily not available
> > > 
> > > queue instance "all.q@shdwgcl4" dropped because it is temporarily not available
> > > 
> > > queue instance "all.q@gcl1" dropped because it is temporarily not available
> > > all queues are dropped because of overload or full.
> > > I appreciate any help.
> > 
> > 
> > Are the execds running on the nodes? Maybe they need to be added to your 
> > startup mechanism so that they start automatically after the machines are 
> > shut down and rebooted.
> > 
> > -- Reuti
> 
> 
> 


_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
