Dear Reuti
I've got it! I had been checking only the slot and core definitions.
The error was that h_vmem=7G was defined in the "global" entry of the
Execution Hosts instead of in each compute node's definition. So, as my
job was asking for 1G per core, the limit was 7 slots.
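The arithmetic behind the 7-slot limit can be sketched as below. This is only an illustration, assuming h_vmem is set up as a consumable complex so that every granted slot subtracts the per-slot request from the value defined on the "global" host. The function name max_slots is hypothetical and not part of SGE.

```python
def max_slots(limit_gb: int, request_per_slot_gb: int) -> int:
    """Number of slots that fit under a consumable memory limit (whole GB)."""
    return limit_gb // request_per_slot_gb


# Values from this thread: h_vmem=7G on the "global" host, 1G requested per slot.
offered = max_slots(7, 1)
requested = 8

print(f"slots offered: {offered}, slots requested: {requested}")
print("job can start" if offered >= requested else "job stays in qw")
```

With these numbers only 7 slots fit under the limit, which is fewer than the 8 the job requests, regardless of how many cores are free.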
Thanks a lot, Reuti, for making me check where I hadn't!
Regards
On 24/10/2016 at 12:34, Reuti wrote:
Hi,
On 24.10.2016 at 19:15, Jerome <jer...@ibt.unam.mx> wrote:
Dear all
I've installed a Rocks Cluster of 2 nodes, with SGE, for a course. Each node
has 4 cores.
I shut down one node, so only 4 cores are available:
$ qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@compute-0-0.local        BIP   0/0/4          0.00     linux-x64
---------------------------------------------------------------------------------
all.q@compute-0-1.local        BIP   0/0/4          -NA-     linux-x64     au
But I've run into a strange issue that I can't explain yet:
My user submits a parallel job with 8 cores.
When I check the job, which is in the "qw" state, I get this message:
$ qstat -j 58
../..
scheduling info: queue instance "all.q@compute-0-1.local" dropped because it is temporarily not available
                 cannot run in PE "orte" because it only offers 7 slots
The error message does not always reflect the correct cause and further
investigation is necessary. Nevertheless, with only 4 slots available, the job
can't start as you request 8.
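The slot count can be checked mechanically from the `qstat -f` listing. The sketch below is a simplified illustration, not an SGE tool: the function name usable_slots is hypothetical, and it assumes that any queue instance with an entry in the states column (such as "au") cannot take jobs.

```python
def usable_slots(qstat_f_output: str) -> int:
    """Sum free slots over queue instances that show no state flags."""
    total = 0
    for line in qstat_f_output.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # separator lines and blank lines
        try:
            resv, used, tot = (int(x) for x in fields[2].split("/"))
        except ValueError:
            continue  # header row ("resv/used/tot.") or unrelated text
        if len(fields) > 5:
            continue  # a states entry such as "au" is present (assumed unusable)
        total += tot - used - resv
    return total


# Listing copied from this thread, with the second node powered off.
ONE_NODE_DOWN = """\
queuename qtype resv/used/tot. load_avg arch states
all.q@compute-0-0.local BIP 0/0/4 0.00 linux-x64
all.q@compute-0-1.local BIP 0/0/4 -NA- linux-x64 au
"""

print(usable_slots(ONE_NODE_DOWN), "slots usable, 8 requested")
```

For the listing above this counts only the 4 slots of compute-0-0, so an 8-slot request cannot be satisfied while the second node is down.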
If I power on the second node, the message is the same:
$ qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@compute-0-0.local        BIP   0/0/4          0.00     linux-x64
---------------------------------------------------------------------------------
all.q@compute-0-1.local        BIP   0/0/4          0.10     linux-x64
$ qstat -j 58
../..
parallel environment: orte range: 8
version: 3
scheduling info: cannot run in PE "orte" because it only offers 7 slots
Now the job should start, and something else is blocking it, not the number of
free slots.
I've searched through the whole SGE configuration. I also reinstalled
the 2 nodes. But the same message appears, saying that only 7 slots are free!
Did you request any resources for the job like memory? Any RQS in place? Do you use
"job_load_adjustments" in the scheduler configuration?
-- Reuti
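The questions above map onto a few read-only SGE commands. The helper below is a rough sketch for running them in one go. The names CHECKS and run_checks are hypothetical, job id 58 is the one from this thread, and the last entry (the "global" execution host) is an extra check that the questions do not name explicitly.

```python
import shutil
import subprocess

# Read-only commands; none of them changes the cluster configuration.
CHECKS = {
    "resource requests of the job": ["qstat", "-j", "58"],
    "resource quota sets (RQS)": ["qconf", "-srqsl"],
    "scheduler configuration (job_load_adjustments)": ["qconf", "-ssconf"],
    "complex_values on the global host": ["qconf", "-se", "global"],
}


def run_checks(checks=CHECKS):
    """Run each command if it is on PATH; return {label: output or None}."""
    results = {}
    for label, cmd in checks.items():
        if shutil.which(cmd[0]) is None:
            results[label] = None  # SGE client tools are not installed here
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results[label] = proc.stdout + proc.stderr
    return results


if __name__ == "__main__":
    for label, output in run_checks().items():
        print(f"== {label} ==")
        print(output if output is not None else "(command not found)")
```

In the output, look for memory requests such as h_vmem in the job's resource list and for the same name under complex_values of a host.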
Can someone give me some help?
Regards
--
-- Jérôme
No one has ever seen a blind man in a nudist camp.
(Woody Allen)
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
--
-- Jérôme
- Mon dévouement vous est acquéris.
- Acquis, acquis ! souffle une voix charitable.
- À qui ? Mais à tous, citoyens.