I've set up a PE, but I'm having problems submitting jobs.

- Here's the PE I created:

# qconf -sp cores
pe_name            cores
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
qsort_args         NONE

- I've then added this to all.q:

qconf -aattr queue pe_list cores all.q



- Now I submit a job:

# qsub -V -b y -cwd -now n -pe cores 2 -q all.q@ibm038 xclock
Your job 89 ("xclock") has been submitted
# qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
     89 0.00000 xclock     johnt        qw    12/09/2016 15:14:25                                    2
# qalter -w p 89
Job 89 cannot run in PE "cores" because it only offers 0 slots
verification: no suitable queues
# qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@ibm038                   BIP   0/0/8          0.00     lx-amd64

############################################################################
 - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
     89 0.55500 xclock     johnt        qw    12/09/2016 15:14:25     2


----------------------------------------------------

It looks like all.q@ibm038 should have 8 free slots, so why is it only offering 
0?

Hope you can help me.
Thanks
John






-----Original Message-----
From: Reuti [mailto:re...@staff.uni-marburg.de]
Sent: Monday, December 05, 2016 6:32
To: John_Tai
Cc: users@gridengine.org
Subject: Re: [gridengine users] CPU complex

Hi,

> On 05.12.2016 at 09:36, John_Tai <john_...@smics.com> wrote:
>
> Thank you so much for your reply!
>
>>> Will you use the consumable virtual_free here instead of mem?
>
> Yes I meant to write virtual_free, not mem. Apologies.
>
>>> For parallel jobs you need to configure one or more so-called PEs (Parallel 
>>> Environments).
>
> My jobs are actually just one process which uses multiple cores, so for 
> example in top one process "simv" is currently using 2 cpu cores (200%).

Yes, then it's a parallel job for SGE. Although the entries for start_proc_args 
and stop_proc_args can be left at their defaults, a PE is the paradigm in SGE 
for a parallel job.


>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 3017 kelly     20   0 3353m 3.0g 165m R 200.0  0.6  15645:46 simv
>
> So I'm not sure PE is suitable for my case, since it is not multiple parallel 
> processes running at the same time. Am I correct?
>
> If so, I am trying to find a way to get SGE to keep track of the number of 
> cores used, but I believe it only keeps track of the total CPU usage in %. I 
> guess I could use this and the <total num cores> to get the <num of cores 
> in use>, but how to integrate it in SGE?

You can specify the number of cores your job needs with the -pe parameter, 
which can also be a range. Inside the job script you can check the allocation 
granted by SGE via $NHOSTS, $NSLOTS, and $PE_HOSTFILE.
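
As a minimal sketch of such a check (the function name report_allocation is 
only illustrative, and the fallbacks to "unset" are an assumption so that the 
script also runs outside of SGE), a job script could print what was granted. 
It relies on the $PE_HOSTFILE layout described in `man sge_pe`: one line per 
host with the hostname, the slot count, the queue, and a processor range.

```shell
#!/bin/sh
# Hypothetical job script fragment: report the allocation granted by SGE.
# Outside of SGE the variables are unset, so placeholders are printed instead.
report_allocation() {
    echo "hosts=${NHOSTS:-unset} slots=${NSLOTS:-unset}"
    # Each line of $PE_HOSTFILE: hostname, slot count, queue, processor range.
    if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
        while read -r host slots queue range; do
            echo "host=$host slots=$slots queue=$queue"
        done < "$PE_HOSTFILE"
    fi
}

report_allocation
```

With a PE using allocation_rule $pe_slots, all slots are on one host, so the 
hostfile should contain a single line.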

With this setup, SGE will track the number of used cores per machine. You 
define the available ones in the queue definition. If you have more than one 
queue per exechost, you additionally need to set up an overall limit on the 
cores that can be used at the same time, to avoid oversubscription.
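
To illustrate why that overall limit matters, here is a small arithmetic 
sketch. The helper name max_oversubscription and the second queue are 
hypothetical, and the script does not query SGE; it only sums the slots that 
several queues on one host could grant together and compares the total with 
the core count.

```shell
#!/bin/sh
# Sum the slots that several queues on one host could grant at the same time
# and print how many of them exceed the host's core count (0 if none).
# usage: max_oversubscription CORES SLOTS_Q1 [SLOTS_Q2 ...]
max_oversubscription() {
    cores=$1
    shift
    total=0
    for s in "$@"; do
        total=$((total + s))
    done
    excess=$((total - cores))
    if [ "$excess" -lt 0 ]; then
        excess=0
    fi
    echo "$excess"
}

# Hypothetical host: 8 cores, two queues that each offer 8 slots.
max_oversubscription 8 8 8
```

Without an overall limit, the two queues in this example could together hand 
out 16 slots on an 8-core machine, i.e. 8 more than there are cores.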

-- Reuti

> Thank you again for your help.
>
> John
>
> -----Original Message-----
> From: Reuti [mailto:re...@staff.uni-marburg.de]
> Sent: Monday, December 05, 2016 4:21
> To: John_Tai
> Cc: users@gridengine.org
> Subject: Re: [gridengine users] CPU complex
>
> Hi,
>
> On 05.12.2016 at 08:00, John_Tai wrote:
>
>> Newbie here, hope to understand SGE usage.
>>
>> I've successfully configured virtual_free as a complex for telling SGE how 
>> much memory is needed when submitting a job, as described here:
>>
>> https://docs.oracle.com/cd/E19957-01/820-0698/6ncdvjclk/index.html#i1000029
>>
>> How do I do the same for telling SGE how many CPU cores a job needs? For 
>> example:
>>
>>                qsub -l mem=24G,cpu=4 myjob
>
> Will you use the consumable virtual_free here instead of mem?
>
>
>> Obviously I'd need for SGE to keep track of the actual CPU utilization in 
>> the host, just as virtual_free is being tracked independently of the SGE 
>> jobs.
>
> For parallel jobs you need to configure one or more so-called PEs (Parallel 
> Environments). Their purpose is to make preparations for the parallel jobs, 
> such as rearranging the list of granted slots or preparing shared directories 
> between the nodes.
>
> These PEs were of higher importance in former times, when parallel libraries 
> were not programmed to integrate automatically in SGE for a tight 
> integration. Your submissions could read:
>
>    qsub -pe smp 4 myjob         # allocation_rule $pe_slots, control_slaves true
>    qsub -pe orte 16 myjob       # allocation_rule $round_robin, control_slaves true
>
> where smp and orte are the chosen parallel environments for OpenMP and Open 
> MPI, respectively. Their settings are explained in `man sge_pe`, and the "-pe" 
> parameter of the submission command in `man qsub`.
>
> -- Reuti
> ________________________________
>
> This email (including its attachments, if any) may be confidential and 
> proprietary information of SMIC, and intended only for the use of the named 
> recipient(s) above. Any unauthorized use or disclosure of this email is 
> strictly prohibited. If you are not the intended recipient(s), please notify 
> the sender immediately and delete this email from your computer.
>

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
