arch_prctl(ARCH_X86_CET_STATUS, unsigned long *addr)
Return CET feature status.
The parameter 'addr' is a pointer to a user buffer.
On returning to the caller, the kernel fills the following
information:
*addr = SHSTK/IBT status
*(addr + 1) = SHSTK base address
*(addr
To prevent function call/return spills into the next shadow stack
area, do not merge shadow stack areas.
Signed-off-by: Yu-cheng Yu
---
mm/mmap.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/mm/mmap.c b/mm/mmap.c
index 6c04292e16a7..30836512ca79 100644
--- a/mm/mmap.c
+++ b/mm/mmap.
From: "H.J. Lu"
Add ENDBR32 to vsyscall entry point.
Signed-off-by: H.J. Lu
---
arch/x86/entry/vdso/vdso32/system_call.S | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/entry/vdso/vdso32/system_call.S
b/arch/x86/entry/vdso/vdso32/system_call.S
index 263d7433dea8..2fc8141fff4e
From: "H.J. Lu"
When Intel indirect branch tracking is enabled, functions in vDSO which
may be called indirectly must have endbr32 or endbr64 as the first
instruction. Compiler must support -fcf-protection=branch so that it
can be used to compile vDSO.
Signed-off-by: H.J. Lu
---
arch/x86/entr
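
As a hedged illustration of the -fcf-protection requirement above: GCC and Clang define the __CET__ macro when a translation unit is built with -fcf-protection, with bit 0 indicating branch protection (ENDBR emitted) and bit 1 indicating return protection (shadow stack). The helper names below are hypothetical and not part of the patch.

```c
#include <stdio.h>

/*
 * Hypothetical helper: report which -fcf-protection features this
 * translation unit was compiled with. __CET__ is only defined when
 * -fcf-protection is active: bit 0 = branch (IBT), bit 1 = return (SHSTK).
 */
unsigned int cet_build_flags(void)
{
#ifdef __CET__
	return (unsigned int)__CET__;
#else
	return 0;
#endif
}

/* Print the build-time setting in human-readable form. */
void print_cet_build_flags(void)
{
	unsigned int f = cet_build_flags();

	printf("branch protection (IBT):   %s\n", (f & 1u) ? "yes" : "no");
	printf("return protection (SHSTK): %s\n", (f & 2u) ? "yes" : "no");
}
```

Building the same file once with -fcf-protection=branch and once with -fcf-protection=none and calling print_cet_build_flags() should show the difference. This only reports how the code was compiled, not whether the CPU or kernel enforces CET.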
Add REGSET_CET64/REGSET_CET32 to get/set CET MSRs:
IA32_U_CET (user-mode CET settings) and
IA32_PL3_SSP (user-mode shadow stack)
Signed-off-by: Yu-cheng Yu
---
arch/x86/include/asm/fpu/regset.h | 7 +++---
arch/x86/kernel/fpu/regset.c | 41 +++
arch/x86
From: "H.J. Lu"
Add ENDBR64 to vsyscall entry points.
Signed-off-by: H.J. Lu
---
arch/x86/entry/vsyscall/vsyscall_emu_64.S | 9 +
1 file changed, 9 insertions(+)
diff --git a/arch/x86/entry/vsyscall/vsyscall_emu_64.S
b/arch/x86/entry/vsyscall/vsyscall_emu_64.S
index c9596a9af159..085
Look in .note.gnu.property of an ELF file and check if Indirect
Branch Tracking needs to be enabled for the task.
Signed-off-by: H.J. Lu
Signed-off-by: Yu-cheng Yu
---
arch/x86/include/uapi/asm/elf_property.h | 1 +
arch/x86/kernel/elf.c | 5 +
2 files changed, 6 insertio
Update ARCH_X86_CET_STATUS and ARCH_X86_CET_DISABLE to include
Indirect Branch Tracking features.
Introduce:
arch_prctl(ARCH_X86_CET_SET_LEGACY_BITMAP, unsigned long *addr)
Enable the Indirect Branch Tracking legacy code bitmap.
The parameter 'addr' is a pointer to a user buffer that has
Add control transfer terminating instructions:
ENDBR64/ENDBR32:
Mark a valid 64/32-bit control transfer endpoint.
Signed-off-by: Yu-cheng Yu
---
arch/x86/lib/x86-opcode-map.txt | 13 +++--
tools/objtool/arch/x86/lib/x86-opcode-map.txt | 13 +++--
2 files change
The user-mode indirect branch tracking support is done mostly by GCC
to insert ENDBR64/ENDBR32 instructions at branch targets. The kernel
provides CPUID enumeration and feature setup.
Signed-off-by: Yu-cheng Yu
---
arch/x86/Kconfig | 16
arch/x86/Makefile | 7 +++
2 files
can_follow_write_pte/pmd look for the (RO & DIRTY) PTE/PMD to
verify an exclusive RO page still exists after a broken COW.
A shadow stack PTE is RO & PAGE_DIRTY_SW when it is shared,
otherwise RO & PAGE_DIRTY_HW.
Introduce pte_exclusive() and pmd_exclusive() to also verify a
shadow stack PTE is e
The indirect branch tracking legacy bitmap takes a large address
space. This causes may_expand_vm() failure on the address limit
check. For an IBT-enabled task, add the bitmap size to the
address limit.
Signed-off-by: Yu-cheng Yu
---
arch/x86/include/asm/mmu_context.h | 10 ++
mm/mmap.c
This patch adds basic shadow stack enabling/disabling routines.
A task's shadow stack is allocated from memory with VM_SHSTK flag set
and read-only protection. It has a fixed size of RLIMIT_STACK.
Signed-off-by: Yu-cheng Yu
---
arch/x86/include/asm/cet.h | 34 ++
arch/x8
The shadow stack for clone/fork is handled as follows:
(1) If ((clone_flags & (CLONE_VFORK | CLONE_VM)) == CLONE_VM),
the kernel allocates (and frees on thread exit) a new SHSTK
for the child.
It is possible for the kernel to complete the clone syscall
and set the child's SH
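
The flag test in case (1) above can be sketched as a small predicate. This is only an illustration of the condition as quoted; the function name child_needs_new_shstk is hypothetical, and the fallback constant definitions are there only so the sketch builds where <sched.h> does not expose them.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdbool.h>

/* Fallbacks (values as in the Linux UAPI headers) if <sched.h> hides them. */
#ifndef CLONE_VM
#define CLONE_VM    0x00000100
#endif
#ifndef CLONE_VFORK
#define CLONE_VFORK 0x00004000
#endif

/*
 * Hypothetical helper mirroring the quoted condition: a new shadow stack
 * is allocated for the child only when it shares the address space
 * (CLONE_VM) and is not a vfork child (CLONE_VFORK clear).
 */
bool child_needs_new_shstk(unsigned long clone_flags)
{
	return (clone_flags & (CLONE_VFORK | CLONE_VM)) == CLONE_VM;
}
```

With this predicate, a thread-style clone (CLONE_VM set, CLONE_VFORK clear) returns true, while a vfork-style clone (both set) and a plain fork (neither set) return false.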
WRUSS is a new kernel-mode instruction that writes directly to user
shadow stack memory. This is used to construct a return address on
the shadow stack for the signal handler.
This instruction can fault if the user shadow stack is invalid shadow
stack memory. In that case, the kernel does a fixup
Indirect Branch Tracking (IBT) provides an optional legacy code bitmap
that allows execution of legacy, non-IBT compatible library by an
IBT-enabled application. When set, each bit in the bitmap indicates
one page of legacy code.
The bitmap is allocated and set up by the application.
Signed-off
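
A minimal sketch of the "one bit per page" layout described above, assuming 4 KiB pages and a bitmap indexed directly by page number. The function names and the SKETCH_PAGE_SHIFT constant are hypothetical; the patch's actual setup path is not shown in this excerpt.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Bit N of the bitmap covers addresses [N << 12, (N + 1) << 12). */
static inline size_t page_index(uintptr_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* Mark every page overlapping [start, start + len) as legacy code. */
void legacy_bitmap_mark(uint8_t *bitmap, size_t nbits,
			uintptr_t start, size_t len)
{
	if (len == 0)
		return;

	size_t first = page_index(start);
	size_t last = page_index(start + len - 1);

	assert(last < nbits);	/* caller must size the bitmap to cover the range */
	for (size_t i = first; i <= last; i++)
		bitmap[i / CHAR_BIT] |= (uint8_t)(1u << (i % CHAR_BIT));
}

/* Return true if the page containing addr is marked as legacy code. */
bool legacy_bitmap_test(const uint8_t *bitmap, size_t nbits, uintptr_t addr)
{
	size_t i = page_index(addr);

	if (i >= nbits)
		return false;
	return (bitmap[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1u;
}
```

Because one bit covers one page, a bitmap spanning a full user address space is itself large, which is why a later patch in the series accounts for the bitmap size in the address limit check.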
This patch implements THP shadow stack (SHSTK) copying in the same
way as in the previous patch for regular PTE.
In copy_huge_pmd(), clear the dirty bit from the PMD to cause a page
fault upon the next SHSTK access to the PMD. At that time, fix the
PMD and copy/re-use the page.
Signed-off-by: Yu
Add user-mode indirect branch tracking enabling/disabling and
supporting routines.
Signed-off-by: H.J. Lu
Signed-off-by: Yu-cheng Yu
---
arch/x86/include/asm/cet.h | 7 +
arch/x86/include/asm/disabled-features.h | 8 -
arch/x86/kernel/cet.c
When setting up a signal, the kernel creates a shadow stack restore
token at the current SHSTK address and then stores the token's
address in the signal frame, right after the FPU state. Before
restoring a signal, the kernel verifies and then uses the restore
token to set the SHSTK pointer.
Signe
There are a few places that need do_mmap() with mm->mmap_sem held.
Create an in-line function for that.
Signed-off-by: Yu-cheng Yu
---
include/linux/mm.h | 18 ++
1 file changed, 18 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7873ac3635a7..36f72c4441
The previous version of CET Branch Tracking/PTRACE patches is at the following
link:
https://lkml.org/lkml/2018/10/11/662
Summary of changes from v5:
Remove the legacy code bitmap allocation from kernel. Now GLIBC
allocates the bitmap and passes it to the kernel.
Some small fixes.
H.J
Look in .note.gnu.property of an ELF file and check if Shadow Stack needs
to be enabled for the task.
Signed-off-by: H.J. Lu
Signed-off-by: Yu-cheng Yu
---
arch/x86/Kconfig | 4 +
arch/x86/include/asm/elf.h | 5 +
arch/x86/include/uapi/asm/elf_property.
When a task does fork(), its shadow stack (SHSTK) must be duplicated
for the child. This patch implements a flow similar to copy-on-write
of an anonymous page, but for SHSTK.
A SHSTK PTE must be RO and dirty. This dirty bit requirement is used
to effect the copying. In copy_one_pte(), clear the
A control protection exception is triggered when a control flow transfer
attempt violates shadow stack or indirect branch tracking constraints.
For example, the return address for a RET instruction differs from the
safe copy on the shadow stack; or a JMP instruction arrives at a non-
ENDBR instruct
Add CPU feature flags for Control-flow Enforcement Technology (CET).
CPUID.(EAX=7,ECX=0):ECX[bit 7] Shadow stack
CPUID.(EAX=7,ECX=0):EDX[bit 20] Indirect branch tracking
Signed-off-by: Yu-cheng Yu
Reviewed-by: Borislav Petkov
---
arch/x86/include/asm/cpufeatures.h | 2 ++
1 file changed, 2 ins
A RO and dirty PTE exists in the following cases:
(a) A page is modified and then shared with a fork()'ed child;
(b) An R/O page that has been COW'ed;
(c) A SHSTK page.
The processor does not read the dirty bit for (a) and (b), but
checks the dirty bit for (c). To prevent the use of non-SHSTK
mem
If a page fault is triggered by a shadow stack access (e.g. call/ret)
or shadow stack management instructions (e.g. wrussq), then bit[6] of
the page fault error code is set.
In access_error(), verify a shadow stack page fault is within a
shadow stack memory area. It is always an error otherwise.
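
The check described above can be sketched in user-space C. This is a simplified model, not the kernel's access_error(): struct sketch_vma, SKETCH_PF_SHSTK, and shstk_fault_is_error are hypothetical names, and only the shadow stack rule is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PF_SHSTK (1ul << 6)	/* bit[6] of the page fault error code */

/* Hypothetical stand-in for a memory area: [start, end) plus a SHSTK flag. */
struct sketch_vma {
	uintptr_t start;
	uintptr_t end;
	bool is_shstk;	/* models VM_SHSTK */
};

/*
 * Return true when a shadow stack fault must be treated as an error:
 * the fault was caused by a shadow stack access, but the address is not
 * inside a shadow stack memory area. Faults without bit[6] set are left
 * to the other checks, which this sketch does not model.
 */
bool shstk_fault_is_error(unsigned long error_code,
			  const struct sketch_vma *vma, uintptr_t addr)
{
	if (!(error_code & SKETCH_PF_SHSTK))
		return false;
	if (vma == NULL)
		return true;

	assert(vma->start < vma->end);
	if (!vma->is_shstk)
		return true;
	return addr < vma->start || addr >= vma->end;
}
```

For example, a fault with bit[6] set at an address inside an area flagged is_shstk is not rejected by this rule, while the same fault inside an ordinary data area is.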
Control-flow Enforcement (CET) MSR contents are XSAVES system states.
To support CET, introduce XSAVES system states first.
Signed-off-by: Yu-cheng Yu
---
arch/x86/include/asm/fpu/internal.h | 3 +-
arch/x86/include/asm/fpu/xstate.h | 4 +-
arch/x86/kernel/fpu/core.c | 6 +-
arch/x
Intel Control-flow Enforcement Technology (CET) introduces the
following MSRs.
MSR_IA32_U_CET (user-mode CET settings),
MSR_IA32_PL3_SSP (user-mode shadow stack),
MSR_IA32_PL0_SSP (kernel-mode shadow stack),
MSR_IA32_PL1_SSP (Privilege Level 1 shadow stack),
MSR_IA32_PL2_SSP (P
Update _PAGE_DIRTY to _PAGE_DIRTY_BITS in split_2MB_gtt_entry().
In order to support Control-flow Enforcement (CET), _PAGE_DIRTY is
now _PAGE_DIRTY_HW or _PAGE_DIRTY_SW.
Signed-off-by: Yu-cheng Yu
---
drivers/gpu/drm/i915/gvt/gtt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --g
When Shadow Stack is enabled, the [R/O + PAGE_DIRTY_HW] setting is
reserved only for the Shadow Stack. Non-Shadow Stack R/O PTEs use
[R/O + PAGE_DIRTY_SW].
When a PTE goes from [R/W + PAGE_DIRTY_HW] to [R/O + PAGE_DIRTY_SW],
it could become a transient Shadow Stack PTE in two cases.
The first ca
Before introducing _PAGE_DIRTY_SW for non-hardware, memory management
purposes in the next patch, rename _PAGE_DIRTY to _PAGE_DIRTY_HW and
_PAGE_BIT_DIRTY to _PAGE_BIT_DIRTY_HW to make these PTE dirty bits
clearer. There are no functional changes in this patch.
Signed-off-by: Yu-cheng Yu
---
Introduce Kconfig option X86_INTEL_SHADOW_STACK_USER.
An application has shadow stack protection when all the following are
true:
(1) The kernel has X86_INTEL_SHADOW_STACK_USER enabled,
(2) The running processor supports the shadow stack,
(3) The application is built with shadow stack enabl
VM_SHSTK indicates a shadow stack memory area.
The shadow stack is implemented only for the 64-bit kernel.
Signed-off-by: Yu-cheng Yu
---
fs/proc/task_mmu.c | 3 +++
include/linux/mm.h | 8
2 files changed, 11 insertions(+)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 47c
Explain how CET works and the no_cet_shstk/no_cet_ibt kernel
parameters.
Signed-off-by: Yu-cheng Yu
---
.../admin-guide/kernel-parameters.txt | 6 +
Documentation/index.rst | 1 +
Documentation/x86/index.rst | 13 +
Documentation/x86/intel_cet
Control-flow Enforcement (CET) MSR contents are XSAVES system states.
To support CET, introduce XSAVES system states first.
XSAVES is a "supervisor" instruction and, compared to XSAVE, saves
additional "supervisor" states that can be modified only from CPL 0.
However, these states are per-task an
The previous version of CET Shadow Stack patches is at the following
link:
https://lkml.org/lkml/2018/10/11/642
Summary of changes from v5:
To support more threads, change compat-mode thread shadow stack to
a fixed size from RLIMIT_STACK to RLIMIT_STACK / 4. This change
applies only to
On Mon, Nov 19, 2018 at 1:55 PM Yu-cheng Yu wrote:
>
> From: "H.J. Lu"
>
> When Intel indirect branch tracking is enabled, functions in vDSO which
> may be called indirectly must have endbr32 or endbr64 as the first
> instruction. Compiler must support -fcf-protection=branch so that it
> can be
On Mon, Nov 19, 2018 at 09:45:59AM -0800, Guenter Roeck wrote:
> > In short, other than exposing it via a generic ABI to the user
> > space, how about defining some policy to maintaining it within
> > the driver?
> I think that would be a bad idea. It changes timing for everyone
> currently using t
On Mon, Nov 19, 2018 at 1:55 PM Yu-cheng Yu wrote:
>
> From: "H.J. Lu"
>
> Add ENDBR64 to vsyscall entry points.
>
> Signed-off-by: H.J. Lu
Acked-by: Andy Lutomirski
although the scenarios where this matters will be extremely rare,
given that this code is mapped NX :) Tools like 'pin' may ca
On Mon, Nov 19, 2018 at 1:55 PM Yu-cheng Yu wrote:
>
> From: "H.J. Lu"
>
> Add ENDBR32 to vsyscall entry point.
$SUBJECT should be "x86/vdso/32: Add ENDBR32 to __kernel_vsyscall entry point".
--Andy
On Mon, 2018-11-19 at 14:23 -0800, Andy Lutomirski wrote:
> On Mon, Nov 19, 2018 at 1:55 PM Yu-cheng Yu wrote:
> >
> > From: "H.J. Lu"
> >
> > Add ENDBR32 to vsyscall entry point.
>
> $SUBJECT should be "x86/vdso/32: Add ENDBR32 to __kernel_vsyscall entry
> point".
I will fix it.
Yu-cheng
On Mon, 2018-11-19 at 14:17 -0800, Andy Lutomirski wrote:
> On Mon, Nov 19, 2018 at 1:55 PM Yu-cheng Yu wrote:
> >
> > From: "H.J. Lu"
> >
> > When Intel indirect branch tracking is enabled, functions in vDSO which
> > may be called indirectly must have endbr32 or endbr64 as the first
> > instr
Hi!
> rename Documentation/x86/{intel_rdt_ui.txt => resctrl_ui.txt} (99%)
Rest of files in that directory use - as a separator; and maybe
qos.txt would be a better name than this?
Thanks,
Pavel
--
(english) http://www.liv
> From: Pavel Machek [mailto:pa...@ucw.cz]
> > rename Documentation/x86/{intel_rdt_ui.txt => resctrl_ui.txt} (99%)
>
> Rest of files in that directory use - as a separator; and maybe qos.txt would
> be a better name than this?
Actually a few other files in the directory use "_" as a separator (e
On Mon, Nov 19, 2018 at 12:48:01PM +0200, Leon Romanovsky wrote:
> Date: Mon, 19 Nov 2018 12:48:01 +0200
> From: Leon Romanovsky
> To: Kenneth Lee
> CC: Tim Sell , linux-doc@vger.kernel.org,
> Alexander Shishkin , Zaibo Xu
> , zhangfei@foxmail.com, linux...@huawei.com,
> haojian.zhu...@lin
On Mon, Nov 19, 2018 at 11:49:54AM -0700, Jason Gunthorpe wrote:
> Date: Mon, 19 Nov 2018 11:49:54 -0700
> From: Jason Gunthorpe
> To: Kenneth Lee
> CC: Leon Romanovsky , Kenneth Lee ,
> Tim Sell , linux-doc@vger.kernel.org, Alexander
> Shishkin , Zaibo Xu
> , zhangfei@foxmail.com, linux..
On Tue, Nov 20, 2018 at 11:07:02AM +0800, Kenneth Lee wrote:
> On Mon, Nov 19, 2018 at 11:49:54AM -0700, Jason Gunthorpe wrote:
> > Date: Mon, 19 Nov 2018 11:49:54 -0700
> > From: Jason Gunthorpe
> > To: Kenneth Lee
> > CC: Leon Romanovsky , Kenneth Lee ,
> > Tim Sell , linux-doc@vger.kernel.org
Hi Suzuki,
Thanks for the comments, will update next version with your comments.
On Fri, Nov 16, 2018 at 4:14 AM Suzuki K Poulose wrote:
>
> Hi,
>
> On 10/25/2018 06:59 AM, Kulkarni, Ganapatrao wrote:
> > This patch adds a perf driver for the PMU UNCORE devices DDR4 Memory
> > Controller(DMC) an
I just went looking for the memory allocation guide in the MM docs instead
of in the core API. For the benefit of the next person who makes that
mistake, link to it from the MM docs.
Signed-off-by: Matthew Wilcox
diff --git a/Documentation/core-api/memory-allocation.rst
b/Documentation/core-ap
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 50ce1bddaf56..f91da3d0a67e 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -670,7 +670,7 @@ PAGEFLAG_FALSE(DoubleMap)
> #define PAGE_TYPE_BASE 0xf000
> /* Reserve
On Mon, Nov 19, 2018 at 2:54 AM, Pavel Machek wrote:
> On Mon 2018-11-05 13:22:05, Daniel Colascione wrote:
>> State explicitly that holding a /proc/pid file descriptor open does
>> not reserve the PID. Also note that in the event of PID reuse, these
>> open file descriptors refer to the old, now-
On Mon, Nov 19, 2018 at 08:00:49AM -0800, Matthew Wilcox wrote:
> I just went looking for the memory allocation guide in the MM docs instead
> of in the core API. For the benefit of the next person who makes that
> mistake, link to it from the MM docs.
>
> Signed-off-by: Matthew Wilcox
Acked-by