On 15.05.2012 at 16:23, Arturo wrote:

> <snip>
> 
> 
> It doesn't matter to which queue I submit the script. 
> 
> I would use the built-in slots complex, but when I do, I get this error:
> 
> qsub -q conmat -l slots=5 submit.sh 
> Unable to run job: "job" denied: use parallel environments instead of 
> requesting slots explicitly.
> Exiting.

As the message says, you need to request a PE with the proper slot count instead of requesting slots directly (foobar stands for the name of your PE):

$ qsub -q conmat -pe foobar 5 submit.sh
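
A minimal sketch of the same idea as a reusable helper. The function name build_pe_qsub is made up for illustration, and "openmpi" is simply the PE mentioned elsewhere in this thread; whether it is attached to the conmat queue is an assumption you would need to check on your own cluster. The helper only prints the command line, it does not submit anything.

```shell
#!/bin/sh
# Build (but do not run) a qsub command line that requests slots through a
# parallel environment instead of "-l slots=N".
# Arguments: queue name, PE name, slot count, job script.
build_pe_qsub() {
    queue=$1; pe=$2; slots=$3; script=$4
    printf 'qsub -q %s -pe %s %s %s\n' "$queue" "$pe" "$slots" "$script"
}

# List the parallel environments defined on the cluster, if the Grid Engine
# client tools are installed on this machine.
if command -v qconf >/dev/null 2>&1; then
    qconf -spl
fi

# "openmpi" is an assumed PE name; substitute one that "qconf -spl" lists.
build_pe_qsub conmat openmpi 5 submit.sh
```

Printing the command first makes it easy to review the request before pasting it into the shell.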

-- Reuti


> Regards
> 
> On 15/05/12 16:12, William Hay wrote:
>> OK, that makes more sense. The queue instance on node045 is called
>> conmat, not test. If test only exists as a single slot on each of
>> node046 and node047, then when you request -q test you are restricting
>> the job to those two slots, which isn't enough for a 4-slot job.
>> We would really need the full output of qstat -f to be sure though.
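
The slot arithmetic described above can be checked with a small filter over qstat -f output. This is only a sketch: the function name total_slots and the sample host names are hypothetical, and it assumes the third column has the resv/used/total form seen in the qstat -f output quoted further down in this thread.

```shell
#!/bin/sh
# Sum the total slots of every instance of one cluster queue.
# Reads "qstat -f"-style text on stdin; $1 is the cluster queue name.
# Assumes the third column has the form resv/used/total.
total_slots() {
    awk -v q="$1" '
        index($1, q "@") == 1 && split($3, s, "/") == 3 { total += s[3] }
        END { print total + 0 }
    '
}

# Hypothetical sample matching the scenario described above; on a real
# cluster you would run:  qstat -f | total_slots test
sample='test@node046    BIP   0/0/1    0.00   lx26-amd64
test@node047    BIP   0/0/1    0.00   lx26-amd64
conmat@node045  BIP   0/0/64   0.00   lx26-amd64'

printf '%s\n' "$sample" | total_slots test     # slots reachable with -q test
printf '%s\n' "$sample" | total_slots conmat   # slots reachable with -q conmat
```

With this sample, -q test can reach only two slots in total, fewer than the four the job requests, while the 64 slots on node045 belong to conmat.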
>> 
>> 
>> William
>> On 15 May 2012 14:42, Arturo <[email protected]> wrote:
>> 
>>> More info:
>>> 
>>> output of qstat -f
>>> 
>>> ---------------------------------------------------------------------------------
>>> [email protected]     BIP   0/0/64         0.00     lx26-amd64
>>> ---------------------------------------------------------------------------------
>>> 
>>> ############################################################################
>>>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
>>> ############################################################################
>>>   74550 0.60500 test     arturo       qw    05/15/2012 15:26:50     4
>>> 
>>> qconf -sq test |grep slot
>>> 
>>>     slots                 64
>>> 
>>> 
>>> qconf -sp openmpi |grep slots
>>> 
>>> slots              99999
>>> urgency_slots      min
>>> 
>>> Regards
>>> 
>>> On 15/05/12 15:39, Arturo wrote:
>>> 
>>> Hi William,
>>> 
>>> You were right, it was running on several nodes:
>>> 
>>>   74545 0.60500 test     arturo       r     05/15/2012 15:17:46 [email protected] MASTER
>>>                                                                 [email protected] SLAVE
>>>   74545 0.60500 test     arturo       r     05/15/2012 15:17:46 [email protected] SLAVE
>>>   74545 0.60500 test     arturo       r     05/15/2012 15:17:46 [email protected] SLAVE
>>> 
>>> Well, looking more closely, the problem is that I created a complex value
>>> "slotsfree", consumable and requestable, and assigned it to node045 with
>>> the value slotsfree=8 (for example).
>>> 
>>> If I submit a job using a parallel environment to this node without
>>> configuring this complex_value, it works perfectly.
>>> When I submit a job without a PE to this node, with this
>>> complex_value configured, it also works.
>>> But when I submit the same job using both a PE and the complex_value, it
>>> doesn't work, and the output only says this:
>>> 
>>> cannot run in PE "openmpi" because it only offers 2 slots
>>> 
>>> 
>>> Is it clearer now? Why doesn't it work if the PE is configured without a slot
>>> limit, the node has 64 slots, and the slotsfree value is greater than 4?
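
One possible explanation, not confirmed anywhere in this thread: for a complex declared with consumable YES, Grid Engine debits the requested amount once per granted slot of a parallel job, not once per job. The sketch below works through that arithmetic. The function name slots_offered is made up, and the per-slot request of 4 is purely an assumption, since the actual -l slotsfree=... request is not shown.

```shell
#!/bin/sh
# Number of slots a host can offer when every granted slot debits a
# per-slot consumable.
# $1 = consumable capacity on the host (e.g. slotsfree=8)
# $2 = amount requested per slot (assumed; not shown in the thread)
slots_offered() {
    echo $(( $1 / $2 ))
}

# Capacity of 8 as in the message above, with a hypothetical request of
# "-l slotsfree=4" combined with "-pe openmpi 4".
slots_offered 8 4
```

Under these assumptions the host can offer only two slots, which would match the "only offers 2 slots" message. Requesting -l slotsfree=1, or declaring the consumable as JOB on versions that support it, would change the result.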
>>> 
>>> Thanks for your help.
>>> 
>>> Regards
>>> Arturo
>>> 
>>> 
>>> On 15/05/12 14:33, William Hay wrote:
>>> 
>>> On 15 May 2012 13:05, Arturo <[email protected]> wrote:
>>> 
>>> Hi,
>>> 
>>> I see very strange behaviour when I try to use a parallel environment
>>> with the hard_queue_list option.
>>> 
>>> In my script I have a parallel configuration:
>>> 
>>>      #$ -pe openmpi 4
>>> 
>>> and if I submit the script in the following way, it works and runs on node
>>> test@node045:
>>> 
>>>      qsub script.sh
>>> 
>>> But if I submit the script using the hard_queue_list, it doesn't run:
>>> 
>>>      qsub -q test script.sh
>>> 
>>> With this error:
>>> 
>>>      cannot run in PE "openmpi" because it only offers 2 slots
>>> 
>>> Obviously, the node is always empty. What may be wrong?
>>> 
>>> It's hard to diagnose what's going on without knowing more about your
>>> configuration.
>>> Are you certain the entire job is running in the queue instance
>>> test@node045 when you submit without a queue list?
>>> One possibility is that queue test@node045 has only two slots.  The
>>> master slot of the job plus one slave runs
>>> in test@node045 while the remaining slots run elsewhere.
>>> 
>>> When the job is running what output do you get from qstat -g t?
>>> 
>>> William
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Arturo Giner Gracia
>>> HPC research group System Administrator
>>> Instituto de Biocomputación y Física de Sistemas Complejos (BIFI)
>>> Universidad de Zaragoza
>>> e-mail: [email protected]
>>> 
>>> phone: (+34) 976762992
>>> 
> 
> 
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users

