On Sun, 01 Oct 2023, Christoph Hellwig wrote:
> On Fri, Sep 29, 2023 at 09:24:57AM +0200, Thomas Weißschuh wrote:
>> > This does not scale.
>>
>> Could you elaborate in which way it doesn't scale?
>
> If I send a modest cross-subsystem series it often touches 20+
> subsystems. Between mailing li
On Mon, Oct 02, 2023 at 10:36:01AM +0300, Jani Nikula wrote:
> On Sun, 01 Oct 2023, Christoph Hellwig wrote:
> > On Fri, Sep 29, 2023 at 09:24:57AM +0200, Thomas Weißschuh wrote:
> >> > This does not scale.
> >>
> >> Could you elaborate in which way it doesn't scale?
> >
> > If I send a modest cr
On Sat, Sep 30, 2023 at 08:26:38AM -0600, Jonathan Corbet wrote:
> Conor Dooley writes:
>
> > On Thu, Sep 28, 2023 at 01:29:42PM +0300, Costa Shulyupin wrote:
> >> and fix all in-tree references.
> >>
> >> Architecture-specific documentation is being moved into Documentation/arch/
> >> as a way
Hey,
On Sat, Sep 30, 2023 at 09:52:00PM +0300, Costa Shulyupin wrote:
> and fix all in-tree references.
>
> Architecture-specific documentation is being moved into Documentation/arch/
> as a way of cleaning up the top-level documentation directory and making
> the docs hierarchy more closely matc
On Tue, 26 Sep 2023 11:38:14 +0100
Christian Loehle wrote:
> >> @@ -191,7 +191,7 @@ of ftrace. Here is a list of some of the key files:
> >> A few extra pages may be allocated to accommodate buffer management
> >> meta-data. If the last page allocated has room for more bytes
> >>
Conor Dooley writes:
> On Sat, Sep 30, 2023 at 08:26:38AM -0600, Jonathan Corbet wrote:
>> Conor Dooley writes:
>>
>> > On Thu, Sep 28, 2023 at 01:29:42PM +0300, Costa Shulyupin wrote:
>> >> and fix all in-tree references.
>> >>
>> >> Architecture-specific documentation is being moved into
>>
On 2023-10-02 01:50:11-0700, Christoph Hellwig wrote:
> On Mon, Oct 02, 2023 at 10:36:01AM +0300, Jani Nikula wrote:
> > On Sun, 01 Oct 2023, Christoph Hellwig wrote:
> > > On Fri, Sep 29, 2023 at 09:24:57AM +0200, Thomas Weißschuh wrote:
> > >> > This does not scale.
> > >>
> > >> Could you elab
Commenters may not receive new versions of patches via the lists.
Without a directed notification to them they might miss those new
versions.
This is frustrating for the patch developers as they don't receive their
earned Reviewed-by.
It is also frustrating for the commenters, as they might think
WRMSRNS is an instruction that behaves exactly like WRMSR, with
the only difference being that it is not a serializing instruction
by default. Under certain conditions, WRMSRNS may replace WRMSR to
improve performance.
Add the CPU feature bit for WRMSRNS.
Tested-by: Shan Kang
Signed-off-by: Xin
This patch set enables the Intel flexible return and event delivery
(FRED) architecture for x86-64.
The FRED architecture defines simple new transitions that change
privilege level (ring transitions). The FRED architecture was
designed with the following goals:
1) Improve overall performance and
Add an always inline API __wrmsrns() to embed the WRMSRNS instruction
into the code.
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/include/asm/msr.h | 18 ++
1 file changed, 18 insertions(+)
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 65
idtentry_sysvec is really just DECLARE_IDTENTRY defined in
, no need to define it separately.
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/entry/entry_32.S | 4
arch/x86/entry/entry_64.S | 8
arch/x86/include/asm/idtentry.h | 2 +-
3 files changed, 1 inserti
Briefly introduce FRED, and its advantages compared to IDT.
Signed-off-by: Xin Li
---
Changes since v10:
* Reword a sentence to improve readability (Nikolay Borisov).
---
Documentation/arch/x86/x86_64/fred.rst | 96 +
Documentation/arch/x86/x86_64/index.rst | 1 +
2 fi
From: "H. Peter Anvin (Intel)"
Any FRED CPU will always have the following features as its baseline:
1) LKGS, which loads the attributes of the GS segment but places the
base address into the IA32_KERNEL_GS_BASE MSR instead of the GS
segment’s descriptor cache.
2) WRMSRNS, non-serializing WRMSR for f
Add the opcode used by WRMSRNS, which is the non-serializing version of
WRMSR and may replace it to improve performance, to the x86 opcode map.
Tested-by: Shan Kang
Signed-off-by: Xin Li
Acked-by: Masami Hiramatsu (Google)
---
arch/x86/lib/x86-opcode-map.txt | 2 +-
tools/arch/x86/lib/x8
To enable FRED, a new kernel command line option "fred" needs to be added.
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
Documentation/admin-guide/kernel-parameters.txt | 3 +++
arch/x86/kernel/cpu/common.c| 3 +++
2 files changed, 6 insertions(+)
diff --git a/Documentatio
From: "H. Peter Anvin (Intel)"
Add the configuration option CONFIG_X86_FRED to enable FRED.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/Kconfig | 9 +
1 file changed, 9 insertions(+)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
i
struct pt_regs is hard to read because the member or section related
comments are not aligned with the members.
The 'cs' and 'ss' members of pt_regs are of type 'unsigned long' while
in reality they are only 16 bits wide. This works so far as the
remaining space is unused, but FRED will use the rem
From: "H. Peter Anvin (Intel)"
Add X86_CR4_FRED macro for the FRED bit in %cr4. This bit must not be
changed after initialization, so add it to the pinned CR4 bits.
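The pinning idea described above can be illustrated with a minimal userspace sketch. Everything named demo_* here is hypothetical and is not the kernel's implementation; the bit position is an assumption for illustration only. The sketch latches the pinned bits at init time and forces them back on every later write:

```c
#include <stdint.h>

/* Assumed bit position, for illustration only; consult the FRED spec. */
#define DEMO_CR4_FRED (1ULL << 32)

static const uint64_t demo_pinned_mask = DEMO_CR4_FRED;
static uint64_t demo_pinned_bits; /* values latched at init time */
static uint64_t demo_cr4;         /* stand-in for the real register */

/* Record the initial value and latch the state of the pinned bits. */
static void demo_cr4_init(uint64_t val)
{
	demo_cr4 = val;
	demo_pinned_bits = val & demo_pinned_mask;
}

/*
 * Write a new value, forcing pinned bits back to their latched state.
 * Returns 1 if the caller tried to change a pinned bit, 0 otherwise.
 */
static int demo_cr4_write(uint64_t val)
{
	int tampered = (val & demo_pinned_mask) != demo_pinned_bits;

	demo_cr4 = (val & ~demo_pinned_mask) | demo_pinned_bits;
	return tampered;
}
```

A write that drops the pinned bit is reported and corrected, while unrelated bits change freely.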
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
Changes since v9:
* Avoid a type cast by d
Intel VT-x classifies events into eight different types, which is
inherited by FRED for event identification. As such, event type
becomes a common x86 concept, and should be defined in a common x86
header.
Add event type macros to , and use it in .
Suggested-by: H. Peter Anvin (Intel)
Tested-by:
From: "H. Peter Anvin (Intel)"
Add CONFIG_X86_FRED to to make
cpu_feature_enabled() work correctly with FRED.
Originally-by: Megha Dey
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
Changes since v10:
* FRED feature is defined in cpuid word 12, not 13
From: "H. Peter Anvin (Intel)"
Update the objtool decoder to know about the ERET[US] instructions
(type INSN_CONTEXT_SWITCH).
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
tools/objtool/arch/x86/decode.c | 19 ++-
1 file changed, 14 inse
From: "H. Peter Anvin (Intel)"
When using FRED, reserve space at the top of the stack frame, just
like i386 does.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/include/asm/thread_info.h | 12 +---
1 file changed, 9 insertions(+), 3 dele
From: "H. Peter Anvin (Intel)"
ERETU returns from an event handler while making a transition to ring 3,
and ERETS returns from an event handler while staying in ring 0.
Add instruction opcodes used by ERET[US] to the x86 opcode map; opcode
numbers are per FRED spec v5.0.
Signed-off-by: H. Peter
From: "H. Peter Anvin (Intel)"
MSR_IA32_FRED_RSP0 is used during ring 3 event delivery, and needs to
be updated to point to the top of next task stack during task switch.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/include/asm/switch_to.h | 8
From: "H. Peter Anvin (Intel)"
Add MSR numbers for the FRED configuration registers per FRED spec 5.0.
Originally-by: Megha Dey
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/include/asm/msr-index.h | 13 -
tools/arch/x86/incl
From: "H. Peter Anvin (Intel)"
Add a header file for FRED prototypes and definitions.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
Changes since v6:
* Replace pt_regs csx flags prefix FRED_CSL_ with FRED_CSX_.
---
arch/x86/include/asm/fred.h | 68
FRED defines additional information in the upper 48 bits of cs/ss
fields. Therefore add the information definitions into the pt_regs
structure.
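As a rough illustration of the layout described above (a 16-bit selector in the low bits of a 64-bit slot, with the upper 48 bits carrying extra information), here is a small sketch using explicit masks and shifts. The demo_* helpers are hypothetical and are not the definitions added by the patch:

```c
#include <stdint.h>

/* Low 16 bits of the 64-bit cs/ss slot hold the segment selector. */
static inline uint16_t demo_selector(uint64_t slot)
{
	return (uint16_t)(slot & 0xffff);
}

/* The remaining upper 48 bits are available for additional information. */
static inline uint64_t demo_upper48(uint64_t slot)
{
	return slot >> 16;
}

/* Combine a selector and 48 bits of extra information into one slot. */
static inline uint64_t demo_pack(uint16_t sel, uint64_t upper48)
{
	return ((upper48 & 0xffffffffffffULL) << 16) | sel;
}
```

Masks and shifts are used instead of bitfields here only to keep the sketch independent of compiler-specific bitfield layout.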
In particular, introduce a new structure fred_ss to denote the FRED
flags above the SS selector, which avoids FRED_SSX_ macros and makes the
code simpler and easie
From: "H. Peter Anvin (Intel)"
SWAPGS is no longer needed, and thus NOT allowed, with FRED because FRED
transitions ensure that an operating system can _always_ operate
with its own GS base address:
- For events that occur in ring 3, FRED event delivery swaps the GS
base address with the IA32_KERNEL_
From: "H. Peter Anvin (Intel)"
Because FRED always restores the full value of %rsp, ESPFIX is
no longer needed when it's enabled.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/kernel/espfix_64.c | 8
1 file changed, 8 insertions(+)
di
From: "H. Peter Anvin (Intel)"
Entering a new task is logically speaking a return from a system call
(exec, fork, clone, etc.). As such, if ptrace enables single stepping,
a single step exception should be allowed to trigger immediately upon
entering user space. This is not optional.
NMI should *
From: "H. Peter Anvin (Intel)"
On a FRED system, the faulting address (CR2) is passed on the stack,
to avoid the problem of transient state. Thus we get the page fault
address from the stack instead of CR2.
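The choice of address source described above might be sketched as follows. The structure and function names are hypothetical stand-ins, not the kernel's; the sketch only shows that with FRED the address comes from data saved in the stack frame, and otherwise from a CR2 snapshot:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the event data saved in a FRED stack frame. */
struct demo_frame {
	uint64_t event_data;
};

/*
 * Pick the faulting address: from the stack frame when FRED is enabled,
 * otherwise from a previously read CR2 value.
 */
static inline uint64_t demo_fault_address(bool fred_enabled,
					  const struct demo_frame *frame,
					  uint64_t cr2_snapshot)
{
	return fred_enabled ? frame->event_data : cr2_snapshot;
}
```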
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Thomas Gleixn
FRED and IDT can share most of the definitions and declarations so
that in the majority of cases the actual handler implementation is the
same.
The differences are the exceptions where FRED stores exception related
information on the stack and the sysvec implementations as FRED can
handle irqentry
From: "H. Peter Anvin (Intel)"
On a FRED system, NMIs nest both with themselves and with faults,
transient information is saved into the stack frame, and NMI unblocking
only happens when the stack frame indicates that it should.
Thus, the NMI entry stub for FRED is really quite small...
Signe
From: "H. Peter Anvin (Intel)"
Depending on the ring level at which it occurs, i.e., user or kernel
context, #DB needs to be handled on a different stack: user #DB on the
current task stack, and kernel #DB on a dedicated stack. This is exactly how FRED
event delivery invokes an exception handler: ring 3
From: "H. Peter Anvin (Intel)"
Add the code that actually handles kernel and event entry/exit using
FRED. It is split into two files:
- entry_64_fred.S contains the actual entrypoints and exit code, and
saves and restores registers.
- entry_fred.c contains the two-level event dispatch code fo
Add sysvec_install() to install a system interrupt handler into the IDT
or the FRED system interrupt handler table.
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
Changes since v8:
* Introduce a macro sysvec_install() to derive the asm handler name from
a C handler, which simplifies the code
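Deriving an asm entry name from a C handler name, as the changelog above describes, can be done with preprocessor token pasting. The following is only a sketch under assumptions: the demo_* table, install function, and handler pair are hypothetical and do not reproduce the macro from the patch.

```c
#include <stddef.h>

typedef void (*demo_handler_t)(void);

/* Hypothetical stand-in for a vector-indexed handler table. */
static demo_handler_t demo_table[256];

static void demo_install(unsigned int vector, demo_handler_t asm_entry)
{
	demo_table[vector & 0xff] = asm_entry;
}

/* Token pasting turns the C handler name into its asm_-prefixed entry. */
#define demo_sysvec_install(vector, function) \
	demo_install((vector), asm_##function)

/* Example pair: a C handler and its asm_-prefixed entry stub. */
static int demo_hits;
static void sysvec_demo_timer(void) { demo_hits++; }
static void asm_sysvec_demo_timer(void) { sysvec_demo_timer(); }
```

A caller then writes demo_sysvec_install(0xec, sysvec_demo_timer) and never spells out the asm_ name by hand.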
If the stack frame contains an invalid user context (e.g. due to invalid SS,
a non-canonical RIP, etc.) the ERETU instruction will trap (#SS or #GP).
From a Linux point of view, this really should be considered a user space
failure, so use the standard fault fixup mechanism to intercept the fault
In IRQ/NMI induced VM exits, KVM VMX needs to execute the respective
handlers, which requires the software to create a FRED stack frame,
and use it to invoke the handlers. Add fred_irq_entry_from_kvm() for
this job.
Export fred_entry_from_kvm() because VMX can be compiled as a module.
Suggested-b
From: "H. Peter Anvin (Intel)"
Add cpu_init_fred_exceptions() to:
- Set FRED entrypoints for events happening in ring 0 and 3.
- Specify the stack level for IRQs that occur in ring 0.
- Specify dedicated event stacks for #DB/NMI/#MCE/#DF.
- Enable FRED and invalidate the IDT.
- Force 32-bit syst
When FRED is enabled, call fred_entry_from_kvm() to handle IRQ/NMI in
IRQ/NMI induced VM exits.
Tested-by: Shan Kang
Signed-off-by: Xin Li
Acked-by: Paolo Bonzini
---
arch/x86/kvm/vmx/vmx.c | 12 +---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c
Like #DB, depending on the ring level at which it occurs, i.e., user or
kernel context, #MCE needs to be handled on a different stack: user #MCE
on the current task stack, and kernel #MCE on a dedicated stack.
This is exactly how FRED event delivery invokes an exception handler: ring
3 event on level 0 st
From: "H. Peter Anvin (Intel)"
Let ret_from_fork_asm() jmp to asm_fred_exit_user when FRED is enabled,
otherwise the existing IDT code is chosen.
Signed-off-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---
arch/x86/entry/entry_64.S | 6 ++
arch/x86/entry/ent
From: "Peter Zijlstra (Intel)"
PUSH_AND_CLEAR_REGS could be used besides actual entry code; in that case
%rbp shouldn't be cleared (otherwise the frame pointer is destroyed) and
UNWIND_HINT shouldn't be added.
Signed-off-by: Peter Zijlstra (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
--
From: "H. Peter Anvin (Intel)"
Let cpu_init_exception_handling() call cpu_init_fred_exceptions() to
initialize FRED. However, if FRED is unavailable or disabled, it falls
back to setting up the TSS IST and initializing the IDT.
Signed-off-by: H. Peter Anvin (Intel)
Co-developed-by: Xin Li
Tested-by: Shan Kan
Because FRED uses the ring 3 FRED entrypoint for SYSCALL and SYSENTER and
ERETU is the only legitimate instruction for returning to ring 3, there is
NO need to set up SYSCALL and SYSENTER MSRs for FRED, except the IA32_STAR MSR.
Split IDT syscall setup code into idt_syscall_init() to make it easy to
skip sy