Hi!

> >> +       u64 p0, p1;
> >>        int ret;
> >> 
> >>        atomic_set(&late_cpus_in,  0);
> >>        atomic_set(&late_cpus_out, 0);
> >> 
> >> +       p0 = rdtsc_ordered();
> >> +
> >>        ret = stop_machine_cpuslocked(__reload_late, NULL, cpu_online_mask);
> >> +
> >> +       p1 = rdtsc_ordered();
> >> +
> >>        if (ret > 0)
> >>                microcode_check();
> >> 
> >>        pr_info("Reload completed, microcode revision: 0x%x\n",
> >>                boot_cpu_data.microcode);
> >> 
> >> +       pr_info("p0: %lld, p1: %lld, diff: %lld\n", p0, p1, p1 - p0);
> >> +
> >>        return ret;
> >> }
> >> 
> >> We have used a machine with a broken microcode in BIOS and no microcode in
> >> initramfs (to bypass early loading).
> >> 
> >> Here are the results for parallel loading (we made two measurements):
> > 
> >> [   18.197760] microcode: updated to revision 0x200005e, date = 2019-04-02
> >> [   18.201225] x86/CPU: CPU features have changed after loading microcode, but might not take effect.
> >> [   18.201230] microcode: Reload completed, microcode revision: 0x200005e
> >> [   18.201232] microcode: p0: 118138123843052, p1: 118138153732656, diff: 29889604
> > 
> >> Here are the results of serial loading:
> >> 
> >> [   17.542518] microcode: updated to revision 0x200005e, date = 2019-04-02
> >> [   17.898365] x86/CPU: CPU features have changed after loading microcode, but might not take effect.
> >> [   17.898370] microcode: Reload completed, microcode revision: 0x200005e
> >> [   17.898372] microcode: p0: 149220216047388, p1: 149221058945422, diff: 842898034
> >> 
> >> One can see that the difference is more than an order of magnitude.
> > 
> > Well, that's impressive, but it seems to finish 300 msec later? Where does
> > that difference come from / how much real time do you gain by this?
> 
> The difference comes from the large number of cores/threads the machine has:
> 72 in this case, but there are machines with more. As the commit message says,
> the microcode was initially applied serially, one core at a time; now it is
> updated in parallel on all cores.
> 
> 300ms may seem like nothing, but it is enough to cause disruption in some
> critical services (e.g. storage): 300ms in which we do not execute anything
> on the CPUs. This 300ms also increases when the machine is fully loaded with
> guests.
> 

Yes, but if you look at the dmesgs I quoted, the parallel microcode update
actually finished 300msec _later_.
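
To make the two readings of the logs explicit, here is a rough sketch of the
arithmetic using only the numbers quoted above. It assumes the bracketed dmesg
values are seconds since boot; the variable and function names are made up for
illustration. It computes the serial/parallel ratio of the TSC diffs, the offset
between the final log lines of the two runs (the "300 msec" I am asking about),
and the time between the first and last quoted line within each run.

```python
# Values copied from the quoted dmesg output; names are illustrative only.
parallel = {"first": 18.197760, "last": 18.201232, "tsc_diff": 29889604}
serial = {"first": 17.542518, "last": 17.898372, "tsc_diff": 842898034}

# Ratio of TSC cycles measured around stop_machine_cpuslocked().
cycle_ratio = serial["tsc_diff"] / parallel["tsc_diff"]

# Offset between the last log line of each run, in milliseconds.
# Assumption: timestamps are seconds since boot.
end_offset_ms = (parallel["last"] - serial["last"]) * 1000


def span_ms(run):
    """Time between the first and last quoted log line of one run, in ms."""
    return (run["last"] - run["first"]) * 1000


print(f"cycle ratio (serial/parallel): {cycle_ratio:.1f}x")
print(f"parallel run's last line is {end_offset_ms:.1f} ms later than serial's")
print(f"parallel span: {span_ms(parallel):.3f} ms")
print(f"serial span:   {span_ms(serial):.3f} ms")
```

If the two runs come from separate boots, the absolute timestamps are not
directly comparable, and the per-run spans and the TSC diffs are the more
meaningful numbers.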
                                                                        Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
