Hi,

On 09.12.2016 at 08:20, John_Tai wrote:
> I've set up a PE but I'm having problems submitting jobs.
>
> - Here's the PE I created:
>
> # qconf -sp cores
> pe_name            cores
> slots              999
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    /bin/true
> stop_proc_args     /bin/true
> allocation_rule    $pe_slots
> control_slaves     FALSE
> job_is_first_task  TRUE
> urgency_slots      min
> accounting_summary FALSE
> qsort_args         NONE
>
> - I've then added this to all.q:
>
> qconf -aattr queue pe_list cores all.q

How many "slots" were defined in the queue definition for all.q?

-- Reuti

> - Now I submit a job:
>
> # qsub -V -b y -cwd -now n -pe cores 2 -q all.q@ibm038 xclock
> Your job 89 ("xclock") has been submitted
> # qstat
> job-ID  prior    name    user   state  submit/start at      queue  slots  ja-task-ID
> ------------------------------------------------------------------------------------
>     89  0.00000  xclock  johnt  qw     12/09/2016 15:14:25         2
> # qalter -w p 89
> Job 89 cannot run in PE "cores" because it only offers 0 slots
> verification: no suitable queues
> # qstat -f
> queuename      qtype  resv/used/tot.  load_avg  arch      states
> ---------------------------------------------------------------------------------
> all.q@ibm038   BIP    0/0/8           0.00      lx-amd64
>
> ############################################################################
>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
>     89  0.55500  xclock  johnt  qw     12/09/2016 15:14:25  2
>
> ----------------------------------------------------
>
> It looks like all.q@ibm038 should have 8 free slots, so why is it only
> offering 0?
>
> Hope you can help me.
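[Editor's note: Reuti's question targets the queue's own "slots" entry. The PE's "slots 999" is only a global cap; what all.q@ibm038 can actually offer comes from the queue definition, and the PE must also appear in the queue's pe_list. A minimal sketch of the check, parsing a captured sample of `qconf -sq all.q` output with hypothetical values, since a live cluster is needed to run qconf itself:]

```shell
# On a live cluster the check would be:
#   qconf -sq all.q | grep -E '^(slots|pe_list)'
# Here we parse a captured sample of that output (hypothetical values):
sample='slots                 8
pe_list               make cores'
queue_slots=$(printf '%s\n' "$sample" | awk '/^slots/ {print $2}')
echo "queue slots: $queue_slots"
# If slots were 0, or "cores" were missing from pe_list, then
# "qalter -w p" would report that the PE offers 0 slots.
```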
> Thanks
> John
>
> -----Original Message-----
> From: Reuti [mailto:re...@staff.uni-marburg.de]
> Sent: Monday, December 05, 2016 6:32
> To: John_Tai
> Cc: users@gridengine.org
> Subject: Re: [gridengine users] CPU complex
>
> Hi,
>
>> On 05.12.2016 at 09:36, John_Tai <john_...@smics.com> wrote:
>>
>> Thank you so much for your reply!
>>
>>>> Will you use the consumable virtual_free here instead of mem?
>>
>> Yes, I meant to write virtual_free, not mem. Apologies.
>>
>>>> For parallel jobs you need to configure one (or more) so-called PEs
>>>> (Parallel Environments).
>>
>> My jobs are actually just one process which uses multiple cores; for
>> example, in top one process "simv" is currently using 2 CPU cores (200%).
>
> Yes, then it's a parallel job for SGE. Although the entries for
> start_proc_args resp. stop_proc_args can be left at their defaults, a PE
> is the paradigm in SGE for a parallel job.
>
>> PID   USER   PR  NI  VIRT   RES   SHR   S  %CPU   %MEM  TIME+     COMMAND
>> 3017  kelly  20  0   3353m  3.0g  165m  R  200.0  0.6   15645:46  simv
>>
>> So I'm not sure a PE is suitable for my case, since it is not multiple
>> parallel processes running at the same time. Am I correct?
>>
>> If so, I am trying to find a way to get SGE to keep track of the number
>> of cores used, but I believe it only keeps track of the total CPU usage
>> in %. I guess I could use this and the <total num cores> to get the
>> <num of cores in use>, but how do I integrate it in SGE?
>
> You can specify the necessary number of cores for your job with the -pe
> parameter, which can also be a range. The allocation granted by SGE can
> be checked in the job script via $NHOSTS, $NSLOTS and $PE_HOSTFILE.
>
> With this setup, SGE will track the number of used cores per machine. The
> available ones you define in the queue definition. In case you have more
> than one queue per exechost, we need to set up in addition an overall
> limit of cores which can be used at the same time, to avoid
> oversubscription.
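[Editor's note: a minimal job-script sketch using the variables Reuti names. NSLOTS is set by SGE to the number of granted slots; it is defaulted here only so the sketch runs outside SGE, and forwarding it via OMP_NUM_THREADS assumes the threaded program honors that variable:]

```shell
#!/bin/sh
# Inside a real SGE job, NSLOTS holds the granted core count;
# default it here so the sketch also runs outside the scheduler.
NSLOTS=${NSLOTS:-2}
export OMP_NUM_THREADS="$NSLOTS"   # forward the core count to a threaded program
msg="running on $NSLOTS cores"
echo "$msg"
# exec simv ...                    # the real multi-threaded binary would start here
```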
>
> -- Reuti
>
>> Thank you again for your help.
>>
>> John
>>
>> -----Original Message-----
>> From: Reuti [mailto:re...@staff.uni-marburg.de]
>> Sent: Monday, December 05, 2016 4:21
>> To: John_Tai
>> Cc: users@gridengine.org
>> Subject: Re: [gridengine users] CPU complex
>>
>> Hi,
>>
>> On 05.12.2016 at 08:00, John_Tai wrote:
>>
>>> Newbie here, hoping to understand SGE usage.
>>>
>>> I've successfully configured virtual_free as a complex for telling SGE
>>> how much memory is needed when submitting a job, as described here:
>>>
>>> https://docs.oracle.com/cd/E19957-01/820-0698/6ncdvjclk/index.html#i1000029
>>>
>>> How do I do the same for telling SGE how many CPU cores a job needs?
>>> For example:
>>>
>>> qsub -l mem=24G,cpu=4 myjob
>>
>> Will you use the consumable virtual_free here instead of mem?
>>
>>> Obviously I'd need SGE to keep track of the actual CPU utilization on
>>> the host, just as virtual_free is being tracked independently of the
>>> SGE jobs.
>>
>> For parallel jobs you need to configure one (or more) so-called PEs
>> (Parallel Environments). Their purpose is to make preparations for
>> parallel jobs, like rearranging the list of granted slots, preparing
>> shared directories between the nodes, ...
>>
>> These PEs were of higher importance in former times, when parallel
>> libraries were not programmed to integrate automatically into SGE for a
>> tight integration. Your submissions could read:
>>
>> qsub -pe smp 4 myjob     # allocation_rule $pe_slots, control_slaves true
>> qsub -pe orte 16 myjob   # allocation_rule $round_robin, control_slaves true
>>
>> where smp resp. orte is the chosen parallel environment for OpenMP resp.
>> Open MPI. Their settings are explained in `man sge_pe`, the "-pe"
>> parameter to the submission command in `man qsub`.
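[Editor's note: a sketch of what the "smp" PE mentioned above might look like, written to a file that an administrator would load with `qconf -Ap`. The exact field values are illustrative, not a definitive configuration; `$pe_slots` is the key setting, since it keeps all granted slots on one host, matching a multi-threaded (non-MPI) job like simv:]

```shell
# Hypothetical PE definition for single-node threaded jobs.
# On a live cluster:  qconf -Ap smp_pe.txt
cat > smp_pe.txt <<'EOF'
pe_name            smp
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     TRUE
job_is_first_task  TRUE
urgency_slots      min
EOF
# Show the allocation rule we just wrote:
awk '$1 == "allocation_rule" {print $2}' smp_pe.txt
```

After loading the PE, it still has to be attached to a queue's pe_list before `qsub -pe smp 4` can be scheduled.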
>>
>> -- Reuti
>>
>> ________________________________
>>
>> This email (including its attachments, if any) may be confidential and
>> proprietary information of SMIC, and intended only for the use of the
>> named recipient(s) above. Any unauthorized use or disclosure of this
>> email is strictly prohibited. If you are not the intended recipient(s),
>> please notify the sender immediately and delete this email from your
>> computer.

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users