This patch set enables the Intel flexible return and event delivery
(FRED) architecture for x86-64.
The FRED architecture defines simple new transitions that change
privilege level (ring transitions). The FRED architecture was
designed with the following goals:
1) Improve overall performance and
Add an always inline API __wrmsrns() to embed the WRMSRNS instruction
into the code.
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/include/asm/msr.h | 18 ++
1 file changed, 18 insertions(+)
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 65
Add the opcode used by WRMSRNS, which is the non-serializing version of
WRMSR and may replace it to improve performance, to the x86 opcode map.
Tested-by: Shan Kang
Signed-off-by: Xin Li
Acked-by: Masami Hiramatsu (Google)
---
arch/x86/lib/x86-opcode-map.txt | 2 +-
tools/arch/x86/lib/x8
Intel VT-x classifies events into eight different types, which is
inherited by FRED for event identification. As such, event type
becomes a common x86 concept, and should be defined in a common x86
header.
Add event type macros to , and use it in .
Suggested-by: H. Peter Anvin (Intel)
Tested-by:
From: "H. Peter Anvin (Intel)"
Add the configuration option CONFIG_X86_FRED to enable FRED.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/Kconfig | 9 +
1 file changed, 9 insertions(+)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
i
idtentry_sysvec is really just DECLARE_IDTENTRY defined in
, no need to define it separately.
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/entry/entry_32.S | 4
arch/x86/entry/entry_64.S | 8
arch/x86/include/asm/idtentry.h | 2 +-
3 files changed, 1 inserti
From: "H. Peter Anvin (Intel)"
ERETU returns from an event handler while making a transition to ring 3,
and ERETS returns from an event handler while staying in ring 0.
Add instruction opcodes used by ERET[US] to the x86 opcode map; opcode
numbers are per FRED spec v5.0.
Signed-off-by: H. Peter
From: "H. Peter Anvin (Intel)"
Add MSR numbers for the FRED configuration registers per FRED spec 5.0.
Originally-by: Megha Dey
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/include/asm/msr-index.h | 13 -
tools/arch/x86/incl
From: "H. Peter Anvin (Intel)"
Update the objtool decoder to know about the ERET[US] instructions
(type INSN_CONTEXT_SWITCH).
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
tools/objtool/arch/x86/decode.c | 19 ++-
1 file changed, 14 inse
Briefly introduce FRED, and its advantages compared to IDT.
Signed-off-by: Xin Li
---
Changes since v10:
* Reword a sentence to improve readability (Nikolay Borisov).
---
Documentation/arch/x86/x86_64/fred.rst | 96 +
Documentation/arch/x86/x86_64/index.rst | 1 +
2 fi
From: "H. Peter Anvin (Intel)"
Add CONFIG_X86_FRED to to make
cpu_feature_enabled() work correctly with FRED.
Originally-by: Megha Dey
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
Changes since v10:
* FRED feature is defined in cpuid word 12, not 13
WRMSRNS is an instruction that behaves exactly like WRMSR, with
the only difference being that it is not a serializing instruction
by default. Under certain conditions, WRMSRNS may replace WRMSR to
improve performance.
Add the CPU feature bit for WRMSRNS.
Tested-by: Shan Kang
Signed-off-by: Xin
From: "H. Peter Anvin (Intel)"
Any FRED CPU will always have the following features as its baseline:
1) LKGS, load attributes of the GS segment but the base address into
the IA32_KERNEL_GS_BASE MSR instead of the GS segment’s descriptor
cache.
2) WRMSRNS, non-serializing WRMSR for f
struct pt_regs is hard to read because the member or section related
comments are not aligned with the members.
The 'cs' and 'ss' members of pt_regs are type of 'unsigned long' while
in reality they are only 16-bit wide. This works so far as the
remaining space is unused, but FRED will use the rem
From: "H. Peter Anvin (Intel)"
Add X86_CR4_FRED macro for the FRED bit in %cr4. This bit must not be
changed after initialization, so add it to the pinned CR4 bits.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
Changes since v9:
* Avoid a type cast by d
From: "H. Peter Anvin (Intel)"
Because FRED always restores the full value of %rsp, ESPFIX is
no longer needed when it's enabled.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/kernel/espfix_64.c | 8
1 file changed, 8 insertions(+)
di
From: "H. Peter Anvin (Intel)"
Add a header file for FRED prototypes and definitions.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
Changes since v6:
* Replace pt_regs csx flags prefix FRED_CSL_ with FRED_CSX_.
---
arch/x86/include/asm/fred.h | 68
From: "H. Peter Anvin (Intel)"
SWAPGS is no longer needed thus NOT allowed with FRED because FRED
transitions ensure that an operating system can _always_ operate
with its own GS base address:
- For events that occur in ring 3, FRED event delivery swaps the GS
base address with the IA32_KERNEL_
From: "H. Peter Anvin (Intel)"
Entering a new task is logically speaking a return from a system call
(exec, fork, clone, etc.). As such, if ptrace enables single stepping
a single step exception should be allowed to trigger immediately upon
entering user space. This is not optional.
NMI should *
From: "H. Peter Anvin (Intel)"
MSR_IA32_FRED_RSP0 is used during ring 3 event delivery, and needs to
be updated to point to the top of next task stack during task switch.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/include/asm/switch_to.h | 8
To enable FRED, a new kernel command line option "fred" needs to be added.
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
Documentation/admin-guide/kernel-parameters.txt | 3 +++
arch/x86/kernel/cpu/common.c| 3 +++
2 files changed, 6 insertions(+)
diff --git a/Documentatio
FRED and IDT can share most of the definitions and declarations so
that in the majority of cases the actual handler implementation is the
same.
The differences are the exceptions where FRED stores exception related
information on the stack and the sysvec implementations as FRED can
handle irqentry
From: "H. Peter Anvin (Intel)"
On a FRED system, NMIs nest both with themselves and faults, transient
information is saved into the stack frame, and NMI unblocking only
happens when the stack frame indicates that so should happen.
Thus, the NMI entry stub for FRED is really quite small...
Signe
From: "H. Peter Anvin (Intel)"
When using FRED, reserve space at the top of the stack frame, just
like i386 does.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/include/asm/thread_info.h | 12 +---
1 file changed, 9 insertions(+), 3 dele
Like #DB, when occurred on different ring level, i.e., from user or kernel
context, #MCE needs to be handled on different stack: User #MCE on current
task stack, while kernel #MCE on a dedicated stack.
This is exactly how FRED event delivery invokes an exception handler: ring
3 event on level 0 st
From: "H. Peter Anvin (Intel)"
On a FRED system, the faulting address (CR2) is passed on the stack,
to avoid the problem of transient state. Thus we get the page fault
address from the stack instead of CR2.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Thomas Gleixn
FRED defines additional information in the upper 48 bits of cs/ss
fields. Therefore add the information definitions into the pt_regs
structure.
Specially introduce a new structure fred_ss to denote the FRED flags
above SS selector, which avoids FRED_SSX_ macros and makes the code
simpler and easie
From: "H. Peter Anvin (Intel)"
Let ret_from_fork_asm() jmp to asm_fred_exit_user when FRED is enabled,
otherwise the existing IDT code is chosen.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/entry/entry_64.S | 6 ++
arch/x86/entry/ent
From: "H. Peter Anvin (Intel)"
The code to actually handle kernel and event entry/exit using
FRED. It is split up into two files thus:
- entry_64_fred.S contains the actual entrypoints and exit code, and
saves and restores registers.
- entry_fred.c contains the two-level event dispatch code fo
From: "H. Peter Anvin (Intel)"
When occurred on different ring level, i.e., from user or kernel context,
#DB needs to be handled on different stack: User #DB on current task
stack, while kernel #DB on a dedicated stack. This is exactly how FRED
event delivery invokes an exception handler: ring 3
Add sysvec_install() to install a system interrupt handler into the IDT
or the FRED system interrupt handler table.
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
Changes since v8:
* Introduce a macro sysvec_install() to derive the asm handler name from
a C handler, which simplifies the code
From: "Peter Zijlstra (Intel)"
PUSH_AND_CLEAR_REGS could be used besides actual entry code; in that case
%rbp shouldn't be cleared (otherwise the frame pointer is destroyed) and
UNWIND_HINT shouldn't be added.
Signed-off-by: Peter Zijlstra (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
--
In IRQ/NMI induced VM exits, KVM VMX needs to execute the respective
handlers, which requires the software to create a FRED stack frame,
and use it to invoke the handlers. Add fred_irq_entry_from_kvm() for
this job.
Export fred_entry_from_kvm() because VMX can be compiled as a module.
Suggested-b
If the stack frame contains an invalid user context (e.g. due to invalid SS,
a non-canonical RIP, etc.) the ERETU instruction will trap (#SS or #GP).
>From a Linux point of view, this really should be considered a user space
failure, so use the standard fault fixup mechanism to intercept the fault
When FRED is enabled, call fred_entry_from_kvm() to handle IRQ/NMI in
IRQ/NMI induced VM exits.
Tested-by: Shan Kang
Signed-off-by: Xin Li
Acked-by: Paolo Bonzini
---
arch/x86/kvm/vmx/vmx.c | 12 +---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c
Because FRED uses the ring 3 FRED entrypoint for SYSCALL and SYSENTER and
ERETU is the only legit instruction to return to ring 3, there is NO need
to setup SYSCALL and SYSENTER MSRs for FRED, except the IA32_STAR MSR.
Split IDT syscall setup code into idt_syscall_init() to make it easy to
skip sy
From: "H. Peter Anvin (Intel)"
Add cpu_init_fred_exceptions() to:
- Set FRED entrypoints for events happening in ring 0 and 3.
- Specify the stack level for IRQs occurred ring 0.
- Specify dedicated event stacks for #DB/NMI/#MCE/#DF.
- Enable FRED and invalidtes IDT.
- Force 32-bit syst
From: "H. Peter Anvin (Intel)"
Let cpu_init_exception_handling() call cpu_init_fred_exceptions() to
initialize FRED. However if FRED is unavailable or disabled, it falls
back to set up TSS IST and initialize IDT.
Signed-off-by: H. Peter Anvin (Intel)
Co-developed-by: Xin Li
Tested-by: Shan Kan
38 matches
Mail list logo