Dear all

I have installed a Rocks cluster of 2 nodes with SGE for a course. Each node has 4 cores.
I shut down one node, so only 4 cores are available:

$ qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@compute-0-0.local        BIP   0/0/4          0.00     linux-x64
---------------------------------------------------------------------------------
all.q@compute-0-1.local        BIP   0/0/4          -NA-     linux-x64     au



But I have run into a strange issue that I can't explain yet:
My user submits a parallel job requesting 8 cores.
When I check the job, which is in the "qw" state, I get this message:

$ qstat -j 58
 ../..

scheduling info: queue instance "all.q@compute-0-1.local" dropped because it is temporarily not available
                 cannot run in PE "orte" because it only offers 7 slots

If I power the second node back on, the message is the same:

$ qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@compute-0-0.local        BIP   0/0/4          0.00     linux-x64
---------------------------------------------------------------------------------
all.q@compute-0-1.local        BIP   0/0/4          0.10     linux-x64


$ qstat -j 58

../..

parallel environment:  orte range: 8
version:                    3
scheduling info: cannot run in PE "orte" because it only offers 7 slots


I have searched through the whole SGE configuration, and I have also reinstalled both nodes. But the same message still appears, saying that only 7 slots are available!
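
For reference, here is a minimal sketch of how the slot settings can be read back from SGE, assuming the PE is named "orte" and the queue "all.q" as in the output above. The helper name sge_slots is hypothetical (it is not an SGE tool); qconf -sp and qconf -sq are the standard commands for showing a PE and a queue configuration. Whether either of these values explains the "7 slots" figure is not something I can confirm.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the "slots" attribute from an
# SGE configuration dump read on stdin ("attribute value" line layout).
sge_slots() {
    awk '$1 == "slots" { print $2; exit }'
}

# Only query the cluster when the SGE tools are actually installed.
if command -v qconf >/dev/null 2>&1; then
    # Total slots the parallel environment is allowed to hand out.
    echo "PE orte slots: $(qconf -sp orte | sge_slots)"
    # Per-queue slot setting (may include per-host overrides).
    echo "all.q slots:   $(qconf -sq all.q | sge_slots)"
fi
```

The guard around qconf only keeps the script harmless on a machine without SGE; on the frontend it prints both values so they can be compared with the 8 slots requested by the job.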

Can someone help me?

Regards


--
-- Jérôme
No one has ever seen a blind man in a nudist camp.
       (Woody Allen)
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
