On Thu, 2018-03-22 at 18:09 +0100, Rafael J. Wysocki wrote:
>
> +++ linux-pm/drivers/cpuidle/poll_state.c
> @@ -10,6 +10,7 @@
>  #include <linux/sched/idle.h>
>
>  #define POLL_IDLE_TIME_LIMIT	(TICK_NSEC / 16)
> +#define POLL_IDLE_COUNT		1000
>
>  static int __cpuidle poll_idle(struct cpuidle_device *dev,
>  			       struct cpuidle_driver *drv, int index)
> @@ -18,9 +19,14 @@ static int __cpuidle poll_idle(struct cp
>
>  	local_irq_enable();
>  	if (!current_set_polling_and_test()) {
> +		unsigned int loop_count = 0;
> +
>  		while (!need_resched()) {
>  			cpu_relax();
> +			if (loop_count++ < POLL_IDLE_COUNT)
> +				continue;
>
> +			loop_count = 0;
>  			if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
>  				break;
>  		}

OK, I am still seeing a performance degradation with the above,
though not throughout the entire workload.

It appears that making the idle loop do anything besides
cpu_relax() for a significant amount of time slows things
down.

I plan to try two more things:

1) Disable polling on SMT systems, with the idea that putting
   one thread to sleep with monitor/mwait in C1 will allow the
   other thread to run faster.

2) Insert more cpu_relax() calls into the main loop, so the
   CPU core spends more of its time in cpu_relax() and less
   time doing other things:

static int __cpuidle poll_idle(struct cpuidle_device *dev,
			       struct cpuidle_driver *drv, int index)
{
	u64 time_start = local_clock();

	local_irq_enable();
	if (!current_set_polling_and_test()) {
		unsigned int loop_count = 0;

		while (!need_resched()) {
			cpu_relax();
			cpu_relax();
			cpu_relax();
			cpu_relax();
			cpu_relax();
			cpu_relax();
			cpu_relax();
			cpu_relax();
			if (loop_count++ < POLL_IDLE_COUNT)
				continue;

			loop_count = 0;
			if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
				break;
		}
	}
	current_clr_polling();

	return index;
}

I will let you know how they perform.

-- 
All Rights Reversed.
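
For experimenting with the loop shape outside the kernel, here is a
rough userspace sketch of the same pattern: spin on a flag, issue a
configurable number of relax hints per iteration, and read the clock
only once every POLL_COUNT iterations. It is an approximation under
stated assumptions, not the kernel code: now_ns(), relax(),
poll_until(), struct poll_result and POLL_COUNT are hypothetical names
standing in for local_clock(), cpu_relax(), poll_idle() and
POLL_IDLE_COUNT, and an atomic flag stands in for need_resched().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define POLL_COUNT 1000u	/* assumption: mirrors POLL_IDLE_COUNT */

/* Hypothetical stand-in for local_clock(): monotonic time in ns. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical stand-in for cpu_relax(): PAUSE on x86, else a barrier. */
static inline void relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_ia32_pause();
#else
	__asm__ __volatile__("" ::: "memory");
#endif
}

struct poll_result {
	bool timed_out;		/* true if the time limit ended the loop */
	uint64_t clock_reads;	/* number of clock reads inside the loop */
};

/*
 * Spin until *stop becomes true or limit_ns has elapsed. The clock is
 * only consulted after the counter has passed POLL_COUNT, so most
 * iterations consist of nothing but relax() calls.
 */
static struct poll_result poll_until(atomic_bool *stop, uint64_t limit_ns,
				     unsigned int relax_per_iter)
{
	struct poll_result res = { false, 0 };
	uint64_t start = now_ns();
	unsigned int loop_count = 0;

	assert(relax_per_iter > 0);

	while (!atomic_load_explicit(stop, memory_order_acquire)) {
		for (unsigned int i = 0; i < relax_per_iter; i++)
			relax();
		if (loop_count++ < POLL_COUNT)
			continue;

		loop_count = 0;
		res.clock_reads++;
		if (now_ns() - start > limit_ns) {
			res.timed_out = true;
			break;
		}
	}
	return res;
}
```

Calling poll_until(&flag, limit, 1) corresponds roughly to the quoted
patch and poll_until(&flag, limit, 8) to variant 2) above. Userspace
timings will not match in-kernel idle behaviour, so this is only
useful for comparing the relative cost of the loop bodies.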