tag 800574 + patch
thanks

Attached updated version of amd64/local-blacklist-on-TSX-Haswell.diff.
I believe it should be renamed to
"amd64/local-blacklist-for-Intel-TSX.diff" as it is not just about
Intel Haswell anymore.

The updated patch has been package-compile-tested on glibc 2.19-22.

This new version of the blacklist patch has its patch header text and
blacklist code comments updated. It doesn't change anything for
Haswell. It adds the current Broadwell CPU models and steppings to the
blacklist.

Broadwell-H with a very recent microcode update (rev 0x12, from
2015-06-04) was confirmed to have broken TSX-NI (RTM) and to _leave it
enabled_ in CPUID, causing glibc with lock elision enabled to SIGSEGV.
An even more recent Broadwell-H microcode update, rev 0x13 from
2015-08-03, is confirmed to (finally) disable the HLE and RTM CPUID
bits. This should make blacklisting signature 0x40671 uncontroversial.
Refer to https://bugzilla.kernel.org/show_bug.cgi?id=103351 for
details.

This version of the blacklist patch leaves the upcoming Broadwell-E
unblacklisted. It also leaves Skylake unblacklisted, as I have not been
able to confirm whether the newest Skylake-S microcode updates have
working Intel TSX-NI, or have it disabled.

I propose that the updated blacklist patch be added to glibc in
unstable, and that, after it spends a few weeks in testing, it also be
added to stable through a stable update.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique de Moraes Holschuh <h...@debian.org>
Intel TSX is broken on Haswell based processors (erratum HSD136/HSW136)
and a microcode update is available to simply disable the corresponding
instructions.

A live microcode update will disable the TSX instructions, causing
already started binaries to segfault. This patch simply disables Intel
TSX (HLE and RTM) on processors which might receive a microcode update,
so that this doesn't happen. We might expect newer steppings to fix the
issue (e.g. as Haswell-EX did).

Intel TSX-NI is also broken on Broadwell systems, and documented as
being unavailable in the errata lists of their specification updates.
However, some end-user systems were shipped with old microcode that
left Intel TSX-NI still enabled in CPUID on these processors. We must
not allow RTM to be used by glibc on these systems, due to runtime
system misbehavior and the hazards of live microcode updates.

Author: Henrique de Moraes Holschuh <h...@debian.org>

Index: glibc-2.19/sysdeps/x86_64/multiarch/init-arch.c
===================================================================
--- glibc-2.19.orig/sysdeps/x86_64/multiarch/init-arch.c 2014-02-07 07:04:38.000000000 -0200
+++ glibc-2.19/sysdeps/x86_64/multiarch/init-arch.c 2015-10-07 09:07:59.272156212 -0300
@@ -26,7 +26,7 @@
 
 
 static void
-get_common_indeces (unsigned int *family, unsigned int *model)
+get_common_indeces (unsigned int *family, unsigned int *model, unsigned int *stepping)
 {
   __cpuid (1, __cpu_features.cpuid[COMMON_CPUID_INDEX_1].eax,
            __cpu_features.cpuid[COMMON_CPUID_INDEX_1].ebx,
@@ -36,6 +36,7 @@
   unsigned int eax = __cpu_features.cpuid[COMMON_CPUID_INDEX_1].eax;
   *family = (eax >> 8) & 0x0f;
   *model = (eax >> 4) & 0x0f;
+  *stepping = eax & 0x0f;
 }
 
 
@@ -47,6 +48,7 @@
   unsigned int edx;
   unsigned int family = 0;
   unsigned int model = 0;
+  unsigned int stepping = 0;
   enum cpu_features_kind kind;
 
   __cpuid (0, __cpu_features.max_cpuid, ebx, ecx, edx);
@@ -56,7 +58,7 @@
     {
       kind = arch_kind_intel;
 
-      get_common_indeces (&family, &model);
+      get_common_indeces (&family, &model, &stepping);
 
       unsigned int eax = __cpu_features.cpuid[COMMON_CPUID_INDEX_1].eax;
       unsigned int extended_family = (eax >> 20) & 0xff;
@@ -131,7 +133,7 @@
     {
       kind = arch_kind_amd;
 
-      get_common_indeces (&family, &model);
+      get_common_indeces (&family, &model, &stepping);
 
       ecx = __cpu_features.cpuid[COMMON_CPUID_INDEX_1].ecx;
 
@@ -176,6 +178,24 @@
         }
     }
 
+  /* Disable Intel TSX (HLE and RTM) due to erratum HSD136/HSW136
+     on all Haswell processors, except Haswell-EX/Xeon E7-v3 (306F4),
+     to work around outdated microcode that doesn't disable the
+     broken feature by default.
+
+     Disable TSX on Broadwell, due to errata BDM53/BDW51/BDD51/
+     BDE42.  The errata documentation states that RTM is unusable,
+     and that it should not be advertised by CPUID at all on any
+     such processors.  Unfortunately, it _is_ advertised in some
+     (older) microcode versions.  Exceptions: Broadwell-E (406Fx),
+     likely already fixed at launch */
+  if (kind == arch_kind_intel && family == 6 &&
+      ((model == 63 && stepping <= 2) || (model == 60 && stepping <= 3) ||
+       (model == 69 && stepping <= 1) || (model == 70 && stepping <= 1) ||
+       (model == 61 && stepping <= 4) || (model == 71 && stepping <= 1) ||
+       (model == 86 && stepping <= 2) ))
+    __cpu_features.cpuid[COMMON_CPUID_INDEX_7].ebx &= ~(bit_RTM | bit_HLE);
+
   __cpu_features.family = family;
   __cpu_features.model = model;
   atomic_write_barrier ();
Index: glibc-2.19/sysdeps/x86_64/multiarch/init-arch.h
===================================================================
--- glibc-2.19.orig/sysdeps/x86_64/multiarch/init-arch.h 2014-02-07 07:04:38.000000000 -0200
+++ glibc-2.19/sysdeps/x86_64/multiarch/init-arch.h 2015-10-06 09:43:18.000000000 -0300
@@ -40,6 +40,7 @@
 
 /* COMMON_CPUID_INDEX_7.  */
 #define bit_RTM (1 << 11)
+#define bit_HLE (1 << 4)
 
 /* XCR0 Feature flags.  */
 #define bit_XMM_state (1 << 1)
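
For anyone who wants to check a given CPUID signature against the blacklist in the patch, here is a minimal standalone sketch of the same decode-and-match logic. It is not part of the patch or of glibc: struct cpu_sig, decode_signature, tsx_blacklisted and mask_tsx_bits are hypothetical names made up for illustration. It also assumes the caller has already established that the vendor is Intel (the patch checks kind == arch_kind_intel), and it folds the extended family/model fields into the decode step instead of doing so in the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in CPUID.(EAX=7,ECX=0):EBX, same values as in the patch.  */
#define bit_HLE (1u << 4)
#define bit_RTM (1u << 11)

/* Hypothetical helper type, not a glibc structure.  */
struct cpu_sig
{
  unsigned int family;
  unsigned int model;
  unsigned int stepping;
};

/* Decode the CPUID leaf 1 EAX signature (e.g. 0x40671).  The extended
   model is added for families 6 and 15, the extended family only for
   family 15, following the usual Intel rules.  */
static struct cpu_sig
decode_signature (uint32_t eax)
{
  struct cpu_sig sig;
  unsigned int extended_family = (eax >> 20) & 0xff;
  unsigned int extended_model = (eax >> 12) & 0xf0;

  sig.family = (eax >> 8) & 0x0f;
  sig.model = (eax >> 4) & 0x0f;
  sig.stepping = eax & 0x0f;

  if (sig.family == 0x0f)
    {
      sig.family += extended_family;
      sig.model += extended_model;
    }
  else if (sig.family == 0x06)
    sig.model += extended_model;

  return sig;
}

/* Same model/stepping list as the patch.  Assumes an Intel CPU.  */
static bool
tsx_blacklisted (struct cpu_sig sig)
{
  if (sig.family != 6)
    return false;

  return (sig.model == 63 && sig.stepping <= 2)
         || (sig.model == 60 && sig.stepping <= 3)
         || (sig.model == 69 && sig.stepping <= 1)
         || (sig.model == 70 && sig.stepping <= 1)
         || (sig.model == 61 && sig.stepping <= 4)
         || (sig.model == 71 && sig.stepping <= 1)
         || (sig.model == 86 && sig.stepping <= 2);
}

/* Clear the RTM and HLE feature bits from a leaf 7 EBX value.  */
static uint32_t
mask_tsx_bits (uint32_t ebx)
{
  return ebx & ~(uint32_t) (bit_RTM | bit_HLE);
}
```

With these helpers, signature 0x40671 (Broadwell-H) decodes to family 6, model 71, stepping 1 and is matched by the list, while 0x306F4 (Haswell-EX) decodes to model 63, stepping 4 and is left alone.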