Re: [SGE-discuss] Resource Reservation

Marco Donauer Tue, 05 Apr 2016 04:28:29 -0700

Dear Narsimha,

Explicitly disabling the backfilling mechanism is not possible in the free Grid Engine version.
That option was added some time ago to the commercial version, which lets you set a parameter explicitly to false/true
to disable/enable backfilling.
In all other versions you can work around this by setting the default duration to an extremely high value, so that
no job fits into the smaller gap in front of the reserving jobs; this more or less disables backfilling.
Reuti already mentioned this workaround two posts earlier.
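
To make the gap argument concrete, here is a toy model of the backfill decision, not Grid Engine's actual scheduler code. It treats the cluster as a single pool of cores (ignoring hosts, queues and PE allocation rules), and the names Running, reservation_start and can_backfill are made up for this sketch. The numbers come from the 7-core test quoted below. As far as I understand sched_conf(5), default_duration is only assumed for jobs that request neither h_rt nor s_rt, so the calls using DEFAULT_DURATION model a job submitted without a runtime limit.

```python
from dataclasses import dataclass


@dataclass
class Running:
    cores: int
    remaining: int  # minutes until the job's estimated end


def reservation_start(total, running, head_cores):
    """Minutes from now until head_cores are free for the reserving job."""
    free = total - sum(r.cores for r in running)
    if free >= head_cores:
        return 0
    for r in sorted(running, key=lambda r: r.remaining):
        free += r.cores  # cores come back as running jobs end
        if free >= head_cores:
            return r.remaining
    raise ValueError("reserving job can never fit into this pool")


def can_backfill(total, running, head_cores, cand_cores, cand_runtime):
    """True if the candidate may start now without delaying the reserving job."""
    free_now = total - sum(r.cores for r in running)
    if cand_cores > free_now:
        return False
    start = reservation_start(total, running, head_cores)
    if cand_runtime <= start:
        return True  # candidate is gone before the reservation begins
    # Candidate is still running at the reserved start: it may only use
    # cores that the reserving job does not need at that moment.
    free_at_start = total - sum(r.cores for r in running if r.remaining > start)
    return free_at_start - cand_cores >= head_cores


DEFAULT_DURATION = 8760 * 60  # default_duration 8760:00:00 in minutes

# 4-core job running, 4-core job reserving, 3-core candidate (jobs 192/193/194)
print(can_backfill(7, [Running(4, 60)], 4, 3, 60))                # True
# 5-core job reserving, 4-core job has 50 min left, candidate needs 60 min
print(can_backfill(7, [Running(4, 50)], 5, 3, 60))                # False
# same, but the candidate only needs 30 min
print(can_backfill(7, [Running(4, 50)], 5, 3, 30))                # True
# candidate without a runtime limit, assumed to run for default_duration
print(can_backfill(7, [Running(4, 50)], 5, 3, DEFAULT_DURATION))  # False
print(can_backfill(7, [Running(4, 60)], 4, 3, DEFAULT_DURATION))  # True
```

In this model a job is only held back when it would still occupy cores the reserving job needs at its reserved start time. That is why a very large duration only more or less disables backfilling: the last call still succeeds, because 4 cores remain free for the reserving job either way.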

Regarding your FIFO setup: Grid Engine can be configured to give FIFO behaviour. The job order depends on the policies and on the waiting-time and urgency weights you have set up.
For instance, a PE job will always get a higher urgency because of the slots urgency that is set up by default; the order is also affected if a share tree, functional share or override ticket
policy is set up. As long as any of these is in use, the job priorities are influenced and the FIFO order is broken.

So you can get this behaviour by disabling all factors that influence the job priority.
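
The effect of the slots urgency can be reproduced with a small calculation. The sketch below assumes the combination prio = weight_priority * npprio + weight_urgency * nurg + weight_ticket * ntckts described in sge_priority(5), a slots urgency of 1000 per slot (the value I believe is the default in the complex configuration), min-max normalization across the listed jobs, and a normalized value of 0.5 both for the default POSIX priority and whenever all jobs are equal. The normalization details are assumptions, chosen because they reproduce the 0.60500 and 0.50500 priorities in the qstat output quoted below; the function names are illustrative only.

```python
def normalize(values):
    """Min-max normalize to 0..1; assumed to yield 0.5 when all values are equal."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def priorities(slots_by_job, weight_priority=1.0, weight_urgency=0.1,
               weight_ticket=0.01, slots_urgency=1000):
    """Toy version of the priority shown by qstat, for jobs that differ only in slots."""
    job_ids = list(slots_by_job)
    # Urgency here is only the resource contribution: slots * slots_urgency.
    nurg = normalize([slots_by_job[j] * slots_urgency for j in job_ids])
    # No ticket policy in use, so every job holds 0 tickets.
    ntckts = normalize([0] * len(job_ids))
    npprio = 0.5  # assumed normalized value of the default POSIX priority 0
    return {
        j: round(weight_priority * npprio
                 + weight_urgency * nurg[i]
                 + weight_ticket * ntckts[i], 5)
        for i, j in enumerate(job_ids)
    }


jobs = {192: 4, 193: 4, 194: 3}  # job id -> requested slots

print(priorities(jobs))
# {192: 0.605, 193: 0.605, 194: 0.505}

print(priorities(jobs, weight_urgency=0.0))
# {192: 0.505, 193: 0.505, 194: 0.505}
```

With weight_urgency set to 0 all three jobs end up with the same priority, so the slot count no longer reorders them. The weight itself is changed in the scheduler configuration with qconf -msconf.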

Regards,
Marco

On 04/05/2016 06:19 AM, Narsimha Reddy wrote:

Dear Sir,

Thank you for the reply.

But my requirement is to stop backfilling and enable a pure FIFO model in Grid Engine. In the above case it should block job 194 and allow job 193 to get the 1 more core it needs to run.

As mentioned in the mail, I have tested jobs with 4, 5 and 3 cores, but after submission the 4-core and 3-core jobs start at the same time and the 5-core job stays in qw until it gets 5 cores. This is normal Grid Engine behaviour; can we change how this works?

Can you let me know how to get this done?

Thanks & Regards,

A.Narsimha Reddy.

> Subject: Re: [SGE-discuss] Resource Reservation
> From: re...@staff.uni-marburg.de
> Date: Mon, 4 Apr 2016 19:14:08 +0200
> CC: sge-disc...@liverpool.ac.uk
> To: narsimha....@outlook.com
>
> You see the backfilling working. There is nothing to prevent this*. You have to interpret it this way:
>
> - Job 192 uses 4 cores, leaving 3 cores idling
> - To allow job 193 to start, it's necessary that job 192 exits as the remaining cores won't satisfy the 4 cores requirement
> - Job 194 uses three cores which can fit in the remaining resources
>
> I.e.: whether job 194 starts or not does not influence the start of job 193 at all.
>
> ===
>
> You can try this setup with jobs requesting 4, 5 and 3 cores. The 5 slot job should block the last job requesting 3 cores in this case.
>
> ===
>
> As another test, you can then lower the runtime of the 3 core job, so that it fits in the remaining time of the 4 core job. In this case backfilling should occur again, as the start of the 5 core job won't be delayed by the 3 core job (as long as the runtimes are estimated as accurately as possible).
>
> -- Reuti
>
> *) Backfilling won't happen if resource reservation is switched off. But then jobs can starve in the queue, as smaller jobs may always slip in.
>
>
> > On 04.04.2016 at 18:59, Narsimha Reddy <narsimha....@outlook.com> wrote:
> >
> > Dear Sir,
> >
> > Thank you for the reply.
> >
> > I currently have a total of 7 cores. I have created a sample job as shown below:
> > #!/bin/bash
> > #$ -S /bin/bash
> > #$ -l h_rt=01:00:00
> > #$ -R y
> > #$ -j y
> > #$ -e /home/chem/test/$JOB_ID.error
> > #$ -o /home/chem/test/$JOB_ID.out
> > #$ -pe orte 4
> > # print date and time
> > date
> > mpirun -np $NSLOTS hostname
> >
> > So I have created 2 sample scripts, one requesting 4 cores and one requesting 3 cores, and the jobs are submitted as
> > qsub sample4.sh; qsub sample4.sh; qsub sample2.sh
> >
> > As per my requirement, the first 4-core job has to run and the next 4-core job has to wait until the first 4-core job has finished, as per FIFO; but here the first 4-core job and the 3-core job are running. Kindly help me to get a FIFO-based setup.
> >
> > My qconf -ssconf output is:
> > [locuz@ge ~]$ qconf -ssconf
> > algorithm default
> > schedule_interval 0:0:5
> > maxujobs 0
> > queue_sort_method load
> > job_load_adjustments np_load_avg=0.50
> > load_adjustment_decay_time 0:7:30
> > load_formula np_load_avg
> > schedd_job_info true
> > flush_submit_sec 0
> > flush_finish_sec 0
> > params none
> > reprioritize_interval 0:0:0
> > halftime 168
> > usage_weight_list cpu=1.000000,mem=0.000000,io=0.000000
> > compensation_factor 5.000000
> > weight_user 0.250000
> > weight_project 0.250000
> > weight_department 0.250000
> > weight_job 0.250000
> > weight_tickets_functional 1
> > weight_tickets_share 1
> > share_override_tickets TRUE
> > share_functional_shares TRUE
> > max_functional_jobs_to_schedule 200
> > report_pjob_tickets TRUE
> > max_pending_tasks_per_job 50
> > halflife_decay_list none
> > policy_hierarchy NONE
> > weight_ticket 0.010000
> > weight_waiting_time 0.000000
> > weight_deadline 3600000.000000
> > weight_urgency 0.100000
> > weight_priority 1.000000
> > max_reservation 20
> > default_duration 8760:00:00
> >
> >
> > Also, I have changed the h_rt value to
> > h_rt 2:00:00
> >
> > queuename                      qtype resv/used/tot. load_avg arch          states
> > ---------------------------------------------------------------------------------
> > al...@c1.test.server           BIP   0/4/4          0.00     linux-x64
> >     192 0.60500 sample4.sh chem         r     04/04/2016 22:33:50     1
> >     194 0.50500 sample2.sh chem         r     04/04/2016 22:33:50     3
> > ---------------------------------------------------------------------------------
> > al...@ge.test.server           BIP   0/3/3          0.02     linux-x64
> >     192 0.60500 sample4.sh chem         r     04/04/2016 22:33:50     3
> >
> > ############################################################################
> >  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> > ############################################################################
> >     193 0.60500 sample4.sh chem         qw    04/04/2016 22:33:47     4
> >
> > Username:- chem; queue:- all.q; nodes:- ge.test.server(3 core),c1.test.server(4 core)
> > Kindly help me to resolve the issue.
> >
> >
> > Thanks & Regards,
> > A.Narsimha Reddy.
> >
> >
> > > Subject: Re: [SGE-discuss] Resource Reservation
> > > From: re...@staff.uni-marburg.de
> > > Date: Mon, 4 Apr 2016 17:25:09 +0200
> > > CC: sge-disc...@liverpool.ac.uk
> > > To: narsimha....@outlook.com
> > >
> > >
> > > > On 02.04.2016 at 19:01, Narsimha Reddy <narsimha....@outlook.com> wrote:
> > > >
> > > > Dear Team,
> > > >
> > > > Kindly help me with implementing resource reservation in Grid Engine.
> > > >
> > > > I want this in order to apply a pure FIFO model to Grid Engine jobs, ordered by job ID.
> > >
> > > Mostly you need these settings in the scheduler configuration:
> > >
> > > $ qconf -ssconf
> > > ...
> > > policy_hierarchy NONE
> > > ...
> > > max_reservation 20
> > > default_duration 8760:00:00
> > >
> > > and jobs should be submitted with `qsub -R y ...` (could be defined as default in "sge_request"). It's advisable to replace the default runtime with the real expected one. Despite the FIFO scheduling, backfilling could still occur as this would use resources which would idle otherwise and won't influence the FIFO of other jobs.
> > >
> > > -- Reuti
>
_______________________________________________
SGE-discuss mailing list
SGE-discuss@liv.ac.uk
https://arc.liv.ac.uk/mailman/listinfo/sge-discuss
