Hi,

On 09.12.2016 at 08:20, John_Tai wrote:

> I've set up a PE but I'm having problems submitting jobs.
> 
> - Here's the PE I created:
> 
> # qconf -sp cores
> pe_name            cores
> slots              999
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    /bin/true
> stop_proc_args     /bin/true
> allocation_rule    $pe_slots
> control_slaves     FALSE
> job_is_first_task  TRUE
> urgency_slots      min
> accounting_summary FALSE
> qsort_args         NONE
> 
> - I've then added this to all.q:
> 
> qconf -aattr queue pe_list cores all.q

How many "slots" were defined in the queue definition for all.q?
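
The value in question is the "slots" line of `qconf -sq all.q`. A minimal sketch for pulling it out, assuming the usual layout of that output (attribute name, whitespace, value, long values continued with a trailing backslash). The function name `queue_slots` and the sample text are made up for illustration and are not taken from your cluster:

```shell
# Print the value of the "slots" attribute from `qconf -sq <queue>` output on stdin.
# A value wrapped over several lines with a trailing backslash is joined.
queue_slots() {
    awk '
        $1 == "slots" { collecting = 1; sub(/^slots[ \t]+/, "") }
        collecting {
            cont = sub(/[ \t]*\\$/, "")   # 1 if a continuation backslash was stripped
            gsub(/^[ \t]+/, "")           # drop indentation of continued lines
            printf "%s", $0
            if (!cont) { print ""; exit }
        }
    '
}

# Typical use on a host with SGE installed (not run here):
#   qconf -sq all.q | queue_slots

# Illustrative input only; these values are hypothetical.
sample='qname                 all.q
hostlist              @allhosts
slots                 1,[ibm038=8]
pe_list               make cores'
slots=$(printf '%s\n' "$sample" | queue_slots)
echo "$slots"
```

A plain number applies to every host in the queue; entries in square brackets override it for single hosts.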

-- Reuti


> - Now I submit a job:
> 
> # qsub -V -b y -cwd -now n -pe cores 2 -q all.q@ibm038 xclock
> Your job 89 ("xclock") has been submitted
> # qstat
> job-ID  prior   name       user         state submit/start at     queue       slots ja-task-ID
> -----------------------------------------------------------------------------------------------
>     89 0.00000 xclock     johnt        qw    12/09/2016 15:14:25                  2
> # qalter -w p 89
> Job 89 cannot run in PE "cores" because it only offers 0 slots
> verification: no suitable queues
> # qstat -f
> queuename                      qtype resv/used/tot. load_avg arch          states
> ---------------------------------------------------------------------------------
> all.q@ibm038                   BIP   0/0/8          0.00     lx-amd64
> 
> ############################################################################
> - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
>     89 0.55500 xclock     johnt        qw    12/09/2016 15:14:25     2
> 
> 
> ----------------------------------------------------
> 
> It looks like all.q@ibm038 should have 8 free slots, so why is it only 
> offering 0?
> 
> Hope you can help me.
> Thanks
> John
> 
> 
> 
> 
> 
> 
> -----Original Message-----
> From: Reuti [mailto:re...@staff.uni-marburg.de]
> Sent: Monday, December 05, 2016 6:32
> To: John_Tai
> Cc: users@gridengine.org
> Subject: Re: [gridengine users] CPU complex
> 
> Hi,
> 
>> On 05.12.2016 at 09:36, John_Tai <john_...@smics.com> wrote:
>> 
>> Thank you so much for your reply!
>> 
>>>> Will you use the consumable virtual_free here instead of mem?
>> 
>> Yes I meant to write virtual_free, not mem. Apologies.
>> 
>>>> For parallel jobs you need to configure one or more so-called PEs (Parallel 
>>>> Environments).
>> 
>> My jobs are actually just one process which uses multiple cores, so for 
>> example in top one process "simv" is currently using 2 cpu cores (200%).
> 
> Yes, then it's a parallel job for SGE. Although the entries for 
> start_proc_args and stop_proc_args can be left at their defaults, a PE is 
> the paradigm in SGE for a parallel job.
> 
> 
>> PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 3017 kelly     20   0 3353m 3.0g 165m R 200.0  0.6  15645:46 simv
>> 
>> So I'm not sure PE is suitable for my case, since it is not multiple 
>> parallel processes running at the same time. Am I correct?
>> 
>> If so, I am trying to find a way to get SGE to keep track of the number of 
>> cores used, but I believe it only keeps track of the total CPU usage in %. I 
>> guess I could use this and the <total num cores> to get the <num of 
>> cores in use>, but how do I integrate it in SGE?
> 
> You can specify the number of cores your job needs in the -pe parameter, 
> which can also be a range. Inside the job script you can check the allocation 
> granted by SGE via $NHOSTS, $NSLOTS and $PE_HOSTFILE.
> 
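A minimal sketch of a job script that uses these variables, assuming the PE hostfile has one line per host in the form "host slots queue processors". Whether "simv" honours OMP_NUM_THREADS is an assumption, and the helper name `total_granted_slots` is made up for illustration:

```shell
#!/bin/sh
# NSLOTS and PE_HOSTFILE are set by SGE for jobs submitted with -pe.
# Fall back to 1 slot so the script also runs outside the scheduler.
slots="${NSLOTS:-1}"

# Many threaded programs read OMP_NUM_THREADS; that "simv" does is an assumption.
export OMP_NUM_THREADS="$slots"

# Sum the slot column (2nd field) of a PE hostfile.
total_granted_slots() {
    awk '{ sum += $2 } END { print sum + 0 }' "$1"
}

if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    echo "granted slots: $(total_granted_slots "$PE_HOSTFILE")"
fi
echo "running with $OMP_NUM_THREADS thread(s)"
```

With allocation_rule $pe_slots all slots land on one host, so the hostfile has a single line and its slot count should match $NSLOTS.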
> With this setup, SGE will track the number of used cores per machine. The 
> available ones you define in the queue definition. In case you have more than 
> one queue per exechost, you also need to set up an overall limit on the 
> cores that can be used at the same time to avoid oversubscription.
> 
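One common way to set such a per-host limit is the "slots" consumable in the exechost's complex_values. That this is the limit meant above is an assumption, and a resource quota set would be another route. The sketch below only prints the qconf command instead of running it, and the helper name `host_slot_limit_cmd` is made up for illustration:

```shell
# Print (do not run) the qconf command that caps the total slots on one host.
# Arguments: host name, number of cores. Both are supplied by the caller.
host_slot_limit_cmd() {
    printf 'qconf -mattr exechost complex_values slots=%s %s\n' "$2" "$1"
}

# "ibm038" and 8 are taken from the qstat -f output quoted in this thread.
host_slot_limit_cmd ibm038 8
```

Review the printed line and run it as an SGE manager once it looks right for the host.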
> -- Reuti
> 
>> Thank you again for your help.
>> 
>> John
>> 
>> -----Original Message-----
>> From: Reuti [mailto:re...@staff.uni-marburg.de]
>> Sent: Monday, December 05, 2016 4:21
>> To: John_Tai
>> Cc: users@gridengine.org
>> Subject: Re: [gridengine users] CPU complex
>> 
>> Hi,
>> 
>> On 05.12.2016 at 08:00, John_Tai wrote:
>> 
>>> Newbie here, hope to understand SGE usage.
>>> 
>>> I've successfully configured virtual_free as a complex for telling SGE how 
>>> much memory is needed when submitting a job, as described here:
>>> 
>>> https://docs.oracle.com/cd/E19957-01/820-0698/6ncdvjclk/index.html#i1000029
>>> 
>>> How do I do the same for telling SGE how many CPU cores a job needs? For 
>>> example:
>>> 
>>>               qsub -l mem=24G,cpu=4 myjob
>> 
>> Will you use the consumable virtual_free here instead of mem?
>> 
>> 
>>> Obviously I'd need for SGE to keep track of the actual CPU utilization in 
>>> the host, just as virtual_free is being tracked independently of the SGE 
>>> jobs.
>> 
>> For parallel jobs you need to configure one or more so-called PEs (Parallel 
>> Environments). Their purpose is to make preparations for the parallel jobs, 
>> such as rearranging the list of granted slots or preparing shared directories 
>> between the nodes.
>> 
>> These PEs were of higher importance in former times, when parallel libraries 
>> were not programmed to integrate automatically in SGE for a tight 
>> integration. Your submissions could read:
>> 
>>   qsub -pe smp 4 myjob        # allocation_rule $pe_slots, control_slaves true
>>   qsub -pe orte 16 myjob      # allocation_rule $round_robin, control_slaves true
>> 
>> where smp and orte are the chosen parallel environments for OpenMP and 
>> Open MPI, respectively. The PE settings are explained in `man sge_pe`, the 
>> "-pe" parameter of the submission command in `man qsub`.
>> 
>> -- Reuti
>> ________________________________
>> 
>> This email (including its attachments, if any) may be confidential and 
>> proprietary information of SMIC, and intended only for the use of the named 
>> recipient(s) above. Any unauthorized use or disclosure of this email is 
>> strictly prohibited. If you are not the intended recipient(s), please notify 
>> the sender immediately and delete this email from your computer.
>> 
> 
> 


_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
