OK, that makes more sense. The queue instance on node045 is called conmat, not test. If test only exists as a single slot on each of node046 and node047, then requesting -q test restricts the job to those two slots, which isn't enough for a 4-slot job. We would really need the full output of qstat -f to be sure, though.
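To check where the slots for queue test actually live, something along these lines should do (queue and host names taken from the thread; the per-host override in the comment is an illustrative assumption):

```shell
# List every queue instance of "test" with its reserved/used/total slot counts
qstat -f -q test

# Show how "slots" is defined for the queue; a per-host list such as
# "slots 1,[node045=64]" would leave only one slot on the other hosts
qconf -sq test | grep slots
```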
William

On 15 May 2012 14:42, Arturo <[email protected]> wrote:
> More info:
>
> output of qstat -f
>
> ---------------------------------------------------------------------------------
> [email protected]          BIP   0/0/64         0.00     lx26-amd64
> ---------------------------------------------------------------------------------
>
> ############################################################################
>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
>   74550 0.60500 test       arturo       qw    05/15/2012 15:26:50     4
>
> qconf -sq test | grep slot
>
> slots                 64
>
> qconf -sp openmpi | grep slots
>
> slots              99999
> urgency_slots      min
>
> Regards
>
> On 15/05/12 15:39, Arturo wrote:
>
> Hi William,
>
> you were right, it was running on various nodes:
>
> 74545 0.60500 test arturo r 05/15/2012 15:17:46 [email protected] MASTER
> [email protected] SLAVE
> 74545 0.60500 test arturo r 05/15/2012 15:17:46 [email protected] SLAVE
> 74545 0.60500 test arturo r 05/15/2012 15:17:46 [email protected] SLAVE
>
> Well, looking more deeply, the problem is that I created a complex value
> "slotsfree", consumable and requestable, and I assigned it to node045 with
> the value slotsfree=8 (for example).
>
> If I submit a job using a parallel environment to this node without
> requesting this complex value, it works perfectly.
> And when I submit a job without using a PE to this node, but with this
> complex value requested, it also works,
> but when I submit the same job using both a PE and the complex value,
> it doesn't work, and the output only says this:
>
> cannot run in PE "openmpi" because it only offers 2 slots
>
> Is it clearer now? Why does it not work if the PE is configured without
> a slot limitation, the node has 64 slots, and the slotsfree value is
> greater than 4?
>
> Thanks for your help.
> Regards
> Arturo
>
> On 15/05/12 14:33, William Hay wrote:
>
> On 15 May 2012 13:05, Arturo <[email protected]> wrote:
>
> Hi,
>
> I have a very strange behaviour when I try to use a parallel environment
> with the hard_queue_list option.
>
> In my script I have a parallel configuration:
>
> #$ -pe openmpi 4
>
> and if I submit the script in the following way it works and runs in
> test@node045:
>
> qsub script.sh
>
> But if I submit the script using the hard_queue_list it doesn't run:
>
> qsub -q test script.sh
>
> With this error:
>
> cannot run in PE "openmpi" because it only offers 2 slots
>
> Obviously, the node is always empty. What may be wrong?
>
> It's hard to diagnose what's going on without knowing more about your
> configuration.
> Are you certain the entire job is running in the queue instance
> test@node045 when you submit without a queue list?
> One possibility is that queue test@node045 has only two slots. The
> master slot of the job plus one slave run in test@node045 while the
> remaining slots run elsewhere.
>
> When the job is running, what output do you get from qstat -g t?
>
> William
>
> --
> Arturo Giner Gracia
> HPC research group System Administrator
> Instituto de Biocomputación y Física de Sistemas Complejos (BIFI)
> Universidad de Zaragoza
> e-mail: [email protected]
> phone: (+34) 976762992

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
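One way to get exactly this error message on an otherwise empty node: Grid Engine treats a consumable request as a per-slot request by default, so a PE job multiplies it by the number of slots granted. A quick arithmetic sketch of that multiplication (the per-slot request of 4 is an assumption for illustration, not something stated in the thread):

```python
def offerable_slots(available: int, per_slot_request: int) -> int:
    """Number of PE slots the scheduler can offer when each slot
    consumes per_slot_request units of a consumable and only
    `available` units exist on the host."""
    return available // per_slot_request

# node045 advertises slotsfree=8; if the job also asked for a
# hypothetical -l slotsfree=4, each granted slot would consume 4,
# so only 2 slots fit, matching "it only offers 2 slots".
print(offerable_slots(8, 4))
```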
