On 27/11/2018 10:15, Jan Beulich wrote:
>>>> On 27.11.18 at 11:10, <julien.gr...@arm.com> wrote:
>> Hi,
>>
>> On 11/27/18 10:00 AM, Sergey Dyasli wrote:
>>> Some x86 CPUs have errata regarding microcode updates. The most notorious
>>> is Broadwell's BDX90: "Loading Microcode ... May Result in a System Hang".
>>> (URL:
>>> https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e7-v4-spec-update.pdf)
>>>
>>> CPUs are supposed to be idle during the initial microcode update. Idle-scrub
>>> changes this, making a CPU start scrubbing (memset) right after it is
>>> brought up. This can get in the way of the microcode update for other CPUs,
>>> which results in a system hang:
>>>
>>> [ 0.000000] CPU Vendor: Intel, Family 6 (0x6), Model 71 (0x47), Stepping 1 (raw 00040671)
>>> ...
>>> [ 2.598813] HVM: Hardware Assisted Paging (HAP) detected
>>> [ 2.600211] HVM: HAP page sizes: 4kB, 2MB, 1GB
>>> [ 0.000000] microcode: CPU2 updated from revision 0x11 to 0x1e, date = 2018-04-03
>>> [ 0.000000] microcode: CPU4 updated from revision 0x11 to 0x1e, d
>>>
>>> Prevent this situation by disabling idle scrubbing until
>>> SYS_STATE_smp_booted is reached.
>>
>> I am not aware of any issue on Arm that requires delaying the idle
>> scrubbing. It is actually probably better to avoid delaying it, as it may
>> take a long time to boot all CPUs on platforms with a high number of
>> cores (48 cores or more).
>
> And even on x86 it would perhaps be better to delay things only
> if there really is a problem (i.e. Broadwell with too low a ucode
> revision).

Except this happens on a different CPU model (still Broadwell, though).
BDX90 applies to model 0x4F (INTEL_FAM6_BROADWELL_X), but this bug occurs
on model 0x47 (INTEL_FAM6_BROADWELL_GT3E). I have yet to hear from Intel
whether this is the BDX90 erratum or not.

--
Thanks,
Sergey
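
The raw signature in the quoted boot log (00040671) can be checked against the two models named above. What follows is only a minimal sketch, assuming the standard CPUID leaf 1 EAX layout. `decode_cpu_sig` and `model_named_in_thread` are hypothetical helpers, not Xen functions, and the two macros simply mirror the names and values quoted in this thread. Whether model 0x47 is actually covered by BDX90 is still unconfirmed, so the predicate only says "this model was mentioned here", not "this model has the erratum".

```c
#include <stdbool.h>
#include <stdint.h>

/* Model numbers as quoted in this thread (family 6). */
#define INTEL_FAM6_BROADWELL_GT3E 0x47
#define INTEL_FAM6_BROADWELL_X    0x4F

struct cpu_sig {
    unsigned int family;
    unsigned int model;
    unsigned int stepping;
};

/* Decode a raw CPUID.1:EAX signature into family/model/stepping. */
struct cpu_sig decode_cpu_sig(uint32_t eax)
{
    struct cpu_sig sig;
    unsigned int base_family = (eax >> 8) & 0xf;
    unsigned int base_model  = (eax >> 4) & 0xf;
    unsigned int ext_model   = (eax >> 16) & 0xf;
    unsigned int ext_family  = (eax >> 20) & 0xff;

    sig.stepping = eax & 0xf;

    /* The extended family is only added when the base family is 0xF. */
    sig.family = base_family;
    if (base_family == 0xf)
        sig.family += ext_family;

    /* The extended model supplies the high nibble for families 6 and 0xF. */
    sig.model = base_model;
    if (base_family == 0x6 || base_family == 0xf)
        sig.model |= ext_model << 4;

    return sig;
}

/*
 * Hypothetical predicate: true for the two Broadwell models discussed in
 * this thread. It does not check the microcode revision, and it does not
 * claim that either model is affected by a particular erratum.
 */
bool model_named_in_thread(struct cpu_sig sig)
{
    return sig.family == 6 &&
           (sig.model == INTEL_FAM6_BROADWELL_X ||
            sig.model == INTEL_FAM6_BROADWELL_GT3E);
}
```

Passing 0x00040671 to `decode_cpu_sig` gives family 6, model 0x47, stepping 1, which matches the "CPU Vendor" line in the log and confirms the affected machine is not a model 0x4F part.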