On 15 May 2012 14:39, Arturo <[email protected]> wrote:
> Hi William,
>
> you were right, it was running on various nodes:
>
>   74545 0.60500 test     arturo       r     05/15/2012 15:17:46 [email protected]  MASTER
>                                                                 [email protected]  SLAVE
>   74545 0.60500 test     arturo       r     05/15/2012 15:17:46 [email protected]  SLAVE
>   74545 0.60500 test     arturo       r     05/15/2012 15:17:46 [email protected]  SLAVE
>
> Well, looking deeply, the problem is that I created a complex value
> "slotsfree" consumable and requestable and I assigned it to the node045
> with the value:
> slotsfree=8 (for example).
>
> If I submit a job using a parallel environment to this node without
> configuring this complex_value, it works perfectly.
> And when I submit a job without using a PE to this node, but with this
> complex_value configured, it also works,
> but when I submit the same job using a PE and the complex_value, it
> doesn't work, and the output only says this:
What output are you talking about here?
>
> cannot run in PE "openmpi" because it only offers 2 slots


>
>
> Is it more clear now? Why does it not work if the PE is configured
> without a slot limitation, the node has 64 slots, and the slotsfree
> value is greater than 4?
Depends how you configured slotsfree.  The node may have 64 slots, but
it is possible the queue instance only has 2.  If there are only 8
slotsfree available in total and you request 2 slots of the PE and 4
slotsfree per slot, then your job should just start; any more of
either and you would exceed the slotsfree amount.
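The arithmetic above can be sketched like this. It is a simplified
illustration of how a per-slot consumable request scales with the
number of PE slots, not actual scheduler code; the function name
job_fits is made up for the example:

```python
# Simplified sketch of SGE's per-slot consumable check: a per-slot
# consumable request is multiplied by the number of PE slots granted
# on the host, and the product must fit in the remaining capacity.
def job_fits(pe_slots, per_slot_request, available):
    """Return True if pe_slots * per_slot_request fits in `available`."""
    return pe_slots * per_slot_request <= available

# With slotsfree=8 on the host: 2 PE slots at 4 slotsfree each exactly fit,
print(job_fits(2, 4, 8))   # True
# but 4 PE slots at 4 slotsfree each would need 16 and cannot start.
print(job_fits(4, 4, 8))   # False
```

Note that this only applies to per-slot consumables; a JOB consumable
would be charged once per job rather than once per slot.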


Calling a consumable slotsfree is not a great idea IMHO; it makes for
confusion with the built-in slots complex.
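For reference, a consumable like the one described is usually set up
in two places and requested at submit time. This is a sketch using the
names from the thread; the exact relop, default, and per-slot request
value are assumptions, so check the real definition with qconf -sc:

```
# Complex definition (edit with qconf -mc), one line per complex:
#name       shortcut  type  relop  requestable  consumable  default  urgency
slotsfree   sf        INT   <=     YES          YES         0        0

# Capacity attached to the exec host (edit with qconf -me node045):
complex_values    slotsfree=8

# Request at submit time; a per-slot consumable is multiplied by the
# PE slot count, so this asks for 4 x 2 = 8 slotsfree in total:
qsub -pe openmpi 4 -l slotsfree=2 script.sh
```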



I think to get any further we need a bit more info:
qconf -sq test
qconf -se node045
qconf -se node046
qconf -se node047

Full output of qalter -w v and qstat -j run on a job that won't start.

>
> Thanks for your help.
>
> Regards
> Arturo
>
>
> On 15/05/12 14:33, William Hay wrote:
>> On 15 May 2012 13:05, Arturo<[email protected]>  wrote:
>>> Hi,
>>>
>>> I have a very strange behaviour when I try to use a parallel environment
>>> with hard_queue_list option.
>>>
>>> In my script I have a parallel configuration:
>>>
>>>      #$ -pe openmpi 4
>>>
>>> and if submit the script in the following way it works and runs in node
>>> test@node045
>>>
>>>      qsub script.sh
>>>
>>> But If I submit the script using the hard_queue_list it doesn't run:
>>>
>>>      qsub -q test script.sh
>>>
>>> With this error:
>>>
>>>      cannot run in PE "openmpi" because it only offers 2 slots
>>>
>>> Obviously, the node is always empty. What may be wrong?
>> It's hard to diagnose what's going on without knowing more about your
>> configuration.
>> Are you certain the entire job is running in the queue instance
>> test@node045 when you submit without a queue list?
>> One possibility is that queue test@node045 has only two slots.  The
>> master slot of the job plus one slave runs
>> in test@node045 while the remaining slots run elsewhere.
>>
>> When the job is running what output do you get from qstat -g t?
>>
>> William
>
>
>
>

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
