Add buslock.rst to explain the bus lock problem and how to detect and handle it.

Signed-off-by: Fenghua Yu <fenghua...@intel.com>
Reviewed-by: Tony Luck <tony.l...@intel.com>
---
 Documentation/x86/buslock.rst | 110 ++++++++++++++++++++++++++++++++++
 1 file changed, 110 insertions(+)
 create mode 100644 Documentation/x86/buslock.rst

diff --git a/Documentation/x86/buslock.rst b/Documentation/x86/buslock.rst
new file mode 100644
index 000000000000..4deaf8b82338
--- /dev/null
+++ b/Documentation/x86/buslock.rst
@@ -0,0 +1,110 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+Bus lock detection and handling
+===============================
+
+:Copyright: |copy| 2021 Intel Corporation
+:Authors: - Fenghua Yu <fenghua...@intel.com>
+          - Tony Luck <tony.l...@intel.com>
+
+Problem
+=======
+
+A split lock is any atomic operation whose operand crosses two cache lines.
+Since the operand spans two cache lines and the operation must be atomic,
+the system locks the bus while the CPU accesses the two cache lines.
+
+A bus lock is acquired through either a split locked access to writeback (WB)
+memory or any locked access to non-WB memory. This is typically thousands of
+cycles slower than an atomic operation within a cache line. It also disrupts
+performance on other cores and brings the whole system to its knees.
+
+Detection
+=========
+
+Intel processors may support either or both of the following hardware
+mechanisms to detect split locks and bus locks.
+
+#AC exception for split lock detection
+--------------------------------------
+
+Beginning with the Tremont Atom CPU, an attempted split lock operation may
+raise an Alignment Check (#AC) exception.
+
+#DB exception for bus lock detection
+------------------------------------
+
+Some CPUs have the ability to notify the kernel with a #DB trap after a user
+instruction acquires a bus lock and is executed. This allows the kernel
+to enforce user application throttling or mitigation.
+
+Software handling
+=================
+
+The kernel #AC and #DB handlers handle bus locks based on the kernel parameter
+"split_lock_detect". Here is a summary of the different options:
+
++------------------+----------------------------+-----------------------+
+|split_lock_detect=|#AC for split lock          |#DB for bus lock       |
++------------------+----------------------------+-----------------------+
+|off               |Do nothing                  |Do nothing             |
++------------------+----------------------------+-----------------------+
+|warn              |Kernel OOPs                 |Warn once per task and |
+|(default)         |Warn once per task and      |continue to run.       |
+|                  |disable future checking     |                       |
+|                  |When both features are      |                       |
+|                  |supported, warn in #AC      |                       |
++------------------+----------------------------+-----------------------+
+|fatal             |Kernel OOPs                 |Send SIGBUS to user    |
+|                  |Send SIGBUS to user         |                       |
+|                  |When both features are      |                       |
+|                  |supported, fatal in #AC     |                       |
++------------------+----------------------------+-----------------------+
+
+Usages
+======
+
+Detecting and handling bus locks may be useful in various areas:
+
+It is critical for real time system designers who build consolidated real
+time systems. These systems run hard real time code on some cores and
+run "untrusted" user processes on some other cores. The hard real time
+code cannot afford to have any bus lock from the untrusted processes hurt
+real time performance. To date the designers have been unable to deploy
+these solutions as they have no way to prevent the "untrusted" user code
+from generating split locks and bus locks that block the hard real time code
+from accessing memory during bus locking.
+
+It may also be useful in the cloud. A user process with a bus lock running
+in one guest can block other cores from accessing shared memory.
+
+A bus lock may open a security hole where malicious user code can slow
+down the overall system by executing instructions that take bus locks.
+
+
+Guidance
+========
+off
+---
+
+Disable checking for split lock and bus lock. This option may be
+useful if there are legacy applications that trigger these events
+at a low rate so that mitigation is not needed.
+
+warn
+----
+
+The kernel warns about the bus lock so that it can be found and fixed. This
+is the default behavior.
+
+It may be useful to find and fix bus locks. The warning includes the
+process id and faulting instruction address to help pinpoint the bus lock
+and fix it.
+
+fatal
+-----
+
+In this case, the bus lock is not tolerated and the process is killed.
+
+It is useful in hard real time systems.
-- 
2.31.1
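
Illustration (not part of the patch): the "Problem" section defines a split
lock as an atomic operation whose operand crosses two cache lines. A minimal
sketch of that address check follows, assuming 64-byte cache lines. The names
CACHE_LINE_SIZE and crosses_cache_line() are hypothetical and are not kernel
interfaces; the helper only inspects an address range and does not itself
perform a locked access.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Hypothetical constant, not a kernel macro. */
#define CACHE_LINE_SIZE 64u

/*
 * Hypothetical helper: return true if the operand occupying
 * [addr, addr + size) touches more than one cache line. An atomic
 * operation on such an operand would be a split lock as defined above.
 */
static bool crosses_cache_line(uintptr_t addr, size_t size)
{
	if (size == 0)
		return false;

	/* Compare the cache line index of the first and last byte. */
	return (addr / CACHE_LINE_SIZE) !=
	       ((addr + size - 1) / CACHE_LINE_SIZE);
}
```

For example, under the 64-byte assumption an 8-byte operand at offset 60
spans bytes 60..67 and crosses a line boundary, while the same operand at
offset 56 spans bytes 56..63 and stays within one line.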