Hi experts, do you have any clue on this? thanks.
On Tue, Dec 20, 2016 at 06:21:28PM +0800, Chen Yu wrote: > This is a debug patch to descibe/workaround the issue > we encountered recently. > > Problem and the cause: > Currently we are suffering from *extremely* slow CPU online > speed during system resuming from S3. Say, the MacBookPro 2015 > has 4 CPUs, and it took more than 1 second each for both CPU1 > and CPU3 to be brought back to idle thread again. Further ftrace > result showed that, *each* instruction the CPU1 and CPU3 execute > will take much longer time than it will take during normal cpu > online via sysfs(without S3 involved). And more interesting > thing was found that after resumed back, every instruction CPU1 and > CPU3 execute is back to its normal speed(unixbench has the same score > before/after S3). So it smells like there is something wrong with > the cache/tlb settings only during resuming back from S3. > Finally we have found this might be related to BIOS who has > scribbled the mtrr/pat before it resumed back to the OS, and every > instruction seems to be run in an uncached behavior, fortunately > later after all the APs have been brought up again, mtrr_aps_init() > will be invoked to synchronize the mtrr on these APs to the value > once saved by CPU0 before suspended, thus everything is back > to normal after resumed. > > Workaround: > So it turns out to be that if we can synchronize the APs with boot CPU > ASAP, rather than waiting till all CPUS online, it might reduce the > impact of the bogus BIOS who scribbled the mtrr/pat. So here is the > hack patch to let the users to synchronize mtrr on APs earlier. > With the following debug patch applied, the resume time for CPU1 and > CPU3 have dropped a lot. > > (Notice, the following result were tested with ftrace function_graph enabled > during suspend/resume, by this tool: > https://01.org/suspendresume > > > Before patch applied: > [ 619.810899] Enabling non-boot CPUs ... > [ 619.825528] x86: Booting SMP configuration: > [ 619.825537] smpboot: Booting Node 0 Processor 1 APIC 0x2 > -------skip-------- > [ 621.723809] CPU1 is up > [ 621.762843] smpboot: Booting Node 0 Processor 2 APIC 0x1 > -------skip-------- > [ 621.766679] CPU2 is up > [ 621.840228] smpboot: Booting Node 0 Processor 3 APIC 0x3 > -------skip-------- > [ 626.690900] CPU3 is up > > So it took CPU1 621.723809 - 619.825537 = 1898.272 ms, and > CPU3 626.690900 - 621.840228 = 4850.672 ms ! > > > After patch applied: > [ 106.931790] smpboot: Booting Node 0 Processor 1 APIC 0x2 > -------skip-------- > [ 106.948360] CPU1 is up > [ 106.963975] smpboot: Booting Node 0 Processor 2 APIC 0x1 > -------skip-------- > [ 106.968087] CPU2 is up > [ 106.986534] smpboot: Booting Node 0 Processor 3 APIC 0x3 > -------skip-------- > [ 106.990702] CPU3 is up > > It took CPU1 106.948360 - 106.931790 = 16.57 ms, and > CPU3 106.990702 - 106.986534 = 4.16 ms > > Question: > So it turns out to be a BIOS issue, but Linux should also deal with > this bogus BIOS, right? I studied the commit we delay the synchronization > until all the APs are brought up, and according to: > > Commit d0af9eed5aa9 ("x86, pat/mtrr: Rendezvous all the cpus > for MTRR/PAT init") > > It seems that there would be problem if we do not sync APs at the same > time(some CPUs run with cache disabled will hang the system, because its > sibling is trying to adjust the mtrr which might disable its cache) on > some special platforms? But I have a question that, even in our current > solution which defers the synchronization, the scenario mentioned above > can not be avoided because at the time CPU3 is trying to restore mtrr, > its sibling CPU1 might also be doing some kworker or ticking tasks, > the CPU1 might also run with cache disabled? > I'm not sure if I understand the code correctly, and it would be > appreciated if people could give any comments/suggestions on how to deal > with this situation found on MacProBook, and if you need me to do any test > please feel free to let me know. > > Thanks, > Yu > > Cc: Thomas Gleixner <t...@linutronix.de> > Cc: Ingo Molnar <mi...@redhat.com> > Cc: "H. Peter Anvin" <h...@zytor.com> > Cc: Len Brown <len.br...@intel.com> > Cc: "Rafael J. Wysocki" <r...@rjwysocki.net> > Cc: Suresh Siddha <sbsid...@gmail.com> > Cc: Borislav Petkov <b...@suse.de> > Cc: Lukas Wunner <lu...@wunner.de> > Cc: "Brandt, Todd E" <todd.e.bra...@intel.com> > Cc: Rui Zhang <rui.zh...@intel.com> > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Chen Yu <yu.c.c...@intel.com> > --- > arch/x86/kernel/cpu/mtrr/main.c | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c > index 24e87e7..eddaa89 100644 > --- a/arch/x86/kernel/cpu/mtrr/main.c > +++ b/arch/x86/kernel/cpu/mtrr/main.c > @@ -813,15 +813,28 @@ void mtrr_save_state(void) > put_online_cpus(); > } > > +static bool __read_mostly no_aps_delay; > + > +static int __init no_aps_setup(char *str) > +{ > + no_aps_delay = true; > + pr_info("hack: do not delay aps mtrr/pat initialization.\n"); > + > + return 0; > +} > + > void set_mtrr_aps_delayed_init(void) > { > if (!mtrr_enabled()) > return; > if (!use_intel()) > return; > + if (no_aps_delay) > + return; > > mtrr_aps_delayed_init = true; > } > +early_param("no_aps_delay", no_aps_setup); > > /* > * Delayed MTRR initialization for all AP's > -- > 2.7.4 >