> On Aug 20, 2020, at 3:35 PM, Vincent Guittot <vincent.guit...@linaro.org> wrote:
>
> On Thu, 20 Aug 2020 at 02:13, benbjiang(蒋彪) <benbji...@tencent.com> wrote:
>>
>>> On Aug 19, 2020, at 10:55 PM, Vincent Guittot <vincent.guit...@linaro.org> wrote:
>>>
>>> On Wed, 19 Aug 2020 at 16:27, benbjiang(蒋彪) <benbji...@tencent.com> wrote:
>>>>
>>>>> On Aug 19, 2020, at 7:55 PM, Dietmar Eggemann <dietmar.eggem...@arm.com> wrote:
>>>>>
>>>>> On 19/08/2020 13:05, Vincent Guittot wrote:
>>>>>> On Wed, 19 Aug 2020 at 12:46, Dietmar Eggemann <dietmar.eggem...@arm.com> wrote:
>>>>>>>
>>>>>>> On 17/08/2020 14:05, benbjiang(蒋彪) wrote:
>>>>>>>>
>>>>>>>>> On Aug 17, 2020, at 4:57 PM, Dietmar Eggemann <dietmar.eggem...@arm.com> wrote:
>>>>>>>>>
>>>>>>>>> On 14/08/2020 01:55, benbjiang(蒋彪) wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>>> On Aug 13, 2020, at 2:39 AM, Dietmar Eggemann <dietmar.eggem...@arm.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 12/08/2020 05:19, benbjiang(蒋彪) wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>>> On Aug 11, 2020, at 11:54 PM, Dietmar Eggemann <dietmar.eggem...@arm.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/08/2020 02:41, benbjiang(蒋彪) wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Aug 10, 2020, at 9:24 PM, Dietmar Eggemann <dietmar.eggem...@arm.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 06/08/2020 17:52, benbjiang(蒋彪) wrote:
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Aug 6, 2020, at 9:29 PM, Dietmar Eggemann <dietmar.eggem...@arm.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 03/08/2020 13:26, benbjiang(蒋彪) wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Aug 3, 2020, at 4:16 PM, Dietmar Eggemann <dietmar.eggem...@arm.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 01/08/2020 04:32, Jiang Biao wrote:
>>>>>>>>>>>>>>>>>>>> From: Jiang Biao <benbji...@tencent.com>
>>>>>>>
>>>>>>> [...]
>>>>>>>
>>>>>>>>> Are you sure about this?
>>>>>>>> Yes. :)
>>>>>>>>>
>>>>>>>>> The math is telling me for the:
>>>>>>>>>
>>>>>>>>> idle task:   (3 / (1024 + 1024 + 3))^(-1) * 4ms = 2735ms
>>>>>>>>>
>>>>>>>>> normal task: (1024 / (1024 + 1024 + 3))^(-1) * 4ms = 8ms
>>>>>>>>>
>>>>>>>>> (4ms tick - 250 Hz)
>>>>>>>> My tick is 1ms - 1000Hz, which seems reasonable for 600ms? :)
>>>>>>>
>>>>>>> OK, I see.
>>>>>>>
>>>>>>> But here the different sched slices (check_preempt_tick()->
>>>>>>> sched_slice()) between normal tasks and the idle task play a role too.
>>>>>>>
>>>>>>> Normal tasks get ~3ms whereas the idle task gets <0.01ms.
>>>>>>
>>>>>> In fact that depends on the number of CPUs on the system:
>>>>>> sysctl_sched_latency = 6ms * (1 + ilog(ncpus)). On an 8-core system,
>>>>>> a normal task will run around 12ms in one shot and the idle task still
>>>>>> one tick period.
>>>>>
>>>>> True. This is on a single CPU.
>>>> Agree. :)
>>>>
>>>>>
>>>>>> Also, you can increase even more the period between 2 runs of the idle
>>>>>> task by using cgroups and the min shares value: 2.
>>>>>
>>>>> Ah yes, maybe this is what Jiang wants to do then? If his runtime does
>>>>> not have other requirements preventing this.
>>>> That could work for increasing the period between 2 runs. But it could not
>>>> reduce the single runtime of the idle task I guess, which means a normal task
>>>> could have 1-tick scheduling latency because of the idle task.
>>>
>>> Yes. An idle task will preempt an always running task during 1 tick
>>> every 680ms.
>>> But also you should keep in mind that a waking normal task will
>>> preempt the idle task immediately, which means that it will not add
>>> scheduling latency to a normal task but "steal" 0.14% of normal task
>>> throughput (1/680) at most.
>> That’s true. But in the VM case, when VMs are busy (MWAIT passthrough
>> or running CPU-eating workloads), the 1-tick scheduling latency can be
>> detected by cyclictest running in the VM.
>>
>> OTOH, we compensate vruntime in place_entity() to boost waking tasks
>> without distinguishing SCHED_IDLE tasks. Do you think it’s necessary to
>> distinguish them there? Like:
>>
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4115,7 +4115,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>>                 vruntime += sched_vslice(cfs_rq, se);
>>
>>         /* sleeps up to a single latency don't count. */
>> -       if (!initial) {
>> +       if (!initial && likely(!task_has_idle_policy(task_of(se)))) {
>>                 unsigned long thresh = sysctl_sched_latency;
>
> Yeah, this is a good improvement.
Thanks, I’ll send a patch for that. :)
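
For completeness, the arithmetic behind Dietmar's 2735ms (4ms tick) and the
~680ms / 0.14% above (1ms tick) can be written out as a tiny standalone
program. This is only a sketch of the math, not kernel code; it assumes the
CFS weights (1024 for nice-0, 3 for SCHED_IDLE) and the example above of two
always-running normal tasks plus one idle task on a single CPU:

/*
 * How often does a single SCHED_IDLE task get a tick, and how much
 * throughput does it take from always-running normal tasks?
 * Plain userspace arithmetic, not kernel code.
 */
#include <stdio.h>

int main(void)
{
        const double w_normal  = 1024.0;         /* nice-0 task weight */
        const double w_idle    = 3.0;            /* SCHED_IDLE task weight */
        const int    nr_normal = 2;              /* two always-running normal tasks */
        const double tick_ms[] = { 4.0, 1.0 };   /* 250 Hz and 1000 Hz ticks */

        double total = nr_normal * w_normal + w_idle;
        double idle_share = w_idle / total;      /* CPU fraction of the idle task */

        for (int i = 0; i < 2; i++) {
                /* idle task runs ~1 tick, then waits until its share catches up */
                double period_ms = tick_ms[i] / idle_share;

                printf("tick %.0fms: idle task runs 1 tick every ~%.0fms, "
                       "stealing ~%.3f%% of normal-task throughput\n",
                       tick_ms[i], period_ms, idle_share * 100.0);
        }
        return 0;
}

For the 1ms tick this prints ~684ms between idle-task ticks and ~0.146% stolen
throughput, i.e. the ~680ms / 0.14% figures above; for the 4ms tick it gives
the ~2735ms from Dietmar's math.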
> Does it solve your problem ?
>
Not exactly. :)
I wonder if we can make SCHED_IDLE more pure (harmless)? Or introduce a
switch (or flag) to control it, and make it available for cases like ours.
Thanks a lot.
Regards,
Jiang

>>
>>>
>>>> OTOH, cgroups (shares) could introduce extra complexity. :)
>>>>
>>>> I wonder if there’s any possibility to make SCHED_IDLE tasks’ priorities
>>>> absolutely lower than SCHED_NORMAL (OTHER), which means no weights/shares
>>>> for them, and they run only when no other task is runnable.
>>>> I guess there may be a priority inversion issue if we do that. But maybe we
>>>
>>> Exactly, that's why we must ensure a minimum running time for sched_idle
>>> tasks.
>>
>> Still, for the VM case, different VMs are well isolated from each other, so
>> the priority inversion issue should be very rare; we’re trying to make offline
>> tasks absolutely harmless to online tasks. :)
>>
>> Thanks a lot for your time.
>> Regards,
>> Jiang
>>
>>>
>>>> could avoid it by load-balancing more aggressively, or it (priority
>>>> inversion) could be ignored in some special cases.
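
P.S. For reference, the "offline" tasks discussed here are plain SCHED_IDLE
tasks; classifying a task as offline is nothing more than switching its policy
from userspace, roughly as in the minimal sketch below (error handling trimmed,
pid 0 means the calling task):

/* Put the calling task into SCHED_IDLE. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

int main(void)
{
        struct sched_param param = { .sched_priority = 0 }; /* must be 0 for SCHED_IDLE */

        if (sched_setscheduler(0, SCHED_IDLE, &param) == -1) {
                fprintf(stderr, "sched_setscheduler: %s\n", strerror(errno));
                return 1;
        }

        printf("now running with policy %d (SCHED_IDLE)\n", sched_getscheduler(0));
        return 0;
}

Whether such a task can ever disturb the online (normal) tasks then comes down
to how CFS treats its weight-3 sched entity, which is what the discussion above
is about.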