Hi,

On 05.09.2014 at 18:13, Dan Hyatt wrote:

> Question: I do not seem to be removing servers from the queue list correctly. 
> What is the best way to do it?

Are they attached to the queue through a hostgroup or by individual hostname? 
If you remove a machine from a queue while jobs are still running on it, the 
queue instance may become "orphaned", shown as an "o" in `qstat -f`.
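As a rough sketch of one way to take a host out without orphaning the queue 
instance: disable the instance first, let the running jobs finish, then remove 
the host from the hostgroup or from the queue's hostlist. The names node01, 
all.q and @allhosts are placeholders, and the `run` and `drain_host` helpers 
are only illustrative. By default the commands are printed rather than 
executed.

```shell
#!/bin/sh
# Sketch only: node01, all.q and @allhosts are placeholder names.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# drain_host HOST QUEUE [HOSTGROUP]
drain_host() {
    host=$1
    queue=$2
    group=${3:-}

    # Disable the queue instance so no new jobs are dispatched to it;
    # jobs that are already running are left alone.
    run qmod -d "$queue@$host"

    # Once the running jobs have finished, remove the host from wherever
    # the queue picks it up: the hostgroup, or the queue's own hostlist.
    if [ -n "$group" ]; then
        run qconf -dattr hostgroup hostlist "$host" "$group"
    else
        run qconf -dattr queue hostlist "$host" "$queue"
    fi
}
```

For example, `drain_host node01 all.q @allhosts` prints the two commands for a 
host that is attached through a hostgroup. In a real run you would wait for 
the jobs to drain between the two steps.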


> Question 2: Shouldn't Grid Engine stop sending jobs to a server it cannot 
> talk to, such as a server that is down?
> I have 3 blades which should not be accepting jobs.
> (OK, I am tracking this using qmon... I know, go command line like I do for 
> everything else.)

You mean they don't appear in `qhost` with a valid load, but SGE schedules jobs 
to them anyway? Do the jobs then hang there, or what happens next?
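To check this from the command line instead of qmon, something like the 
following sketch could list the queue instances that `qstat -f` reports with a 
"u" (unknown, i.e. the execd cannot be reached) in the states column. The 
function name `unknown_instances` is made up for illustration, and the filter 
assumes the usual six-column `qstat -f` layout with the states in the last 
column.

```shell
#!/bin/sh
# Sketch only: reads `qstat -f` output on stdin and prints the queue
# instances whose states column contains "u" (unknown).
# Assumes the six-column layout: queuename, qtype, slots, load_avg,
# arch, states. Lines without a states entry have only five fields.
unknown_instances() {
    awk '$1 ~ /@/ && NF >= 6 && $6 ~ /u/ { print $1 }'
}
```

Usage would be `qstat -f | unknown_instances`. Comparing that list with the 
hosts that still receive jobs should show whether jobs really land on 
instances in the "au" state.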

-- Reuti


> Why is the queue still sending jobs to the blades which are "down"?
> 
> 
> But under cluster queue control, on the "HOSTS" tab, 
> loadAvg/CPU/MemUsed/Swap used show dashes, which I expect because the blades 
> are not online.
> The queue instances have AU under "states", which I thought indicated 
> "not accepting jobs".
> 
> One of the blades was actually removed from all.q, which is the queue 
> normally used to schedule jobs.
> 
> 
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users

