BookE KVM is in a deep maintenance state, and I'm not sure how much testing
it gets. I don't have a test setup, and it does not look like QEMU has
any HV architecture enabled. It hasn't been too painful, but there are
some cases where not being able to test causes a bit of a problem, e.g.,
https://lis
File read/write is reimplemented in about 5 different ways in the
various PowerPC selftests. This indicates it should be a common util.
Add a common read_file / write_file implementation and convert users
to it where (easily) possible.
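As a rough illustration of what such a common util could look like, here is
a minimal sketch. The names, signatures and error convention (0 on success,
a positive errno value on failure) are assumptions for this example, not
necessarily what the patch implements.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to count bytes from path into buf.
 * Returns 0 on success or a positive errno value on failure. If len is
 * non-NULL it receives the number of bytes actually read.
 */
int read_file(const char *path, char *buf, size_t count, size_t *len)
{
	size_t total = 0;
	ssize_t n;
	int fd, err = 0;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return errno;

	/* read() may return short counts, so loop until EOF or buf is full */
	while (total < count) {
		n = read(fd, buf + total, count - total);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			err = errno;
			break;
		}
		if (n == 0)
			break;
		total += n;
	}

	close(fd);
	if (!err && len)
		*len = total;
	return err;
}

/*
 * Hypothetical helper: write exactly count bytes from buf to path.
 * O_CREAT and O_TRUNC are an assumption made so the sketch also works
 * on regular files.
 */
int write_file(const char *path, const char *buf, size_t count)
{
	size_t total = 0;
	ssize_t n;
	int fd, err = 0;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return errno;

	while (total < count) {
		n = write(fd, buf + total, count - total);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			err = errno;
			break;
		}
		if (n == 0) {
			/* no progress; bail out rather than spin */
			err = EIO;
			break;
		}
		total += n;
	}

	if (close(fd) && !err)
		err = errno;
	return err;
}
```

Note that read_file() in this sketch does not null-terminate buf, so the
caller has to use the returned length rather than string functions.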
Signed-off-by: Benjamin Gray
---
tools/testing/selftests/po
Add helper functions to read and write (unsigned) long values directly
from/to files. One of the kernel interfaces uses hex strings, so we need
to allow passing a base too.
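To illustrate why the base needs to be a parameter, here is a minimal
sketch of just the parsing step, built on strtol(). The name parse_long and
its error convention are assumptions for this example; the helpers in the
patch may look different.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the whole of str as a long in the given
 * base. Returns 0 on success, EINVAL if the base is invalid, there are
 * no digits, or there is trailing junk (one trailing newline is
 * tolerated, as sysfs files usually end with one), and ERANGE if the
 * value does not fit in a long. Leading whitespace is accepted because
 * strtol() skips it.
 */
int parse_long(const char *str, int base, long *result)
{
	char *end;
	long val;

	/* strtol() does not promise to set end for an invalid base */
	if (base != 0 && (base < 2 || base > 36))
		return EINVAL;

	errno = 0;
	val = strtol(str, &end, base);
	if (end == str)
		return EINVAL;
	if (errno == ERANGE)
		return ERANGE;

	if (*end == '\n')
		end++;
	if (*end != '\0')
		return EINVAL;

	*result = val;
	return 0;
}
```

With base 16, strtol() accepts the digits with or without a leading "0x",
which covers an interface that reports hex strings.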
Signed-off-by: Benjamin Gray
---
tools/testing/selftests/powerpc/dscr/dscr.h | 9 +--
.../selftests/powerpc/dscr/dscr_sy
A couple of tests roll their own auto-allocating file read logic.
Add a generic implementation and convert them to use it.
Signed-off-by: Benjamin Gray
---
.../testing/selftests/powerpc/include/utils.h | 1 +
.../selftests/powerpc/nx-gzip/gzfht_test.c| 37 +
.../selftests/powerpc/s
Often a file is expected to hold an integral value. Existing functions
will use a C stdlib function like atoi or strtol to parse the file.
These operations are error prone, with complicated error conditions
(atoi returns 0 if not a number, and is undefined behaviour if not in
range. strtol returns
Debugfs files are not always integers, so make *_file return/write a
byte buffer, and *_int deal with int values specifically. This increases
consistency with the other file read/write helpers.
Signed-off-by: Benjamin Gray
---
.../testing/selftests/powerpc/include/utils.h | 6 ++--
.../selftest
No need to write inline asm for mtspr/mfspr, we have macros for this
in reg.h
Signed-off-by: Benjamin Gray
Reviewed-by: Andrew Donnellan
---
tools/testing/selftests/powerpc/dscr/dscr.h | 17 +
.../selftests/powerpc/ptrace/ptrace-hwbreak.c | 6 ++
tools/testing/selftes
Started this when writing tests for a feature I'm working on, needing a way to
read/write numbers to system files. After writing some utils to safely handle
file IO and parsing, I realised I'd made the ~6th file read/write implementation
and only(?) number parser that checks all the failure modes w
- malloc() does not zero the buffer,
- fread() does not null-terminate its output,
- `cat /proc/sys/kernel/core_pattern | hexdump -C` shows the file is
not inherently null-terminated
So using string operations on the buffer is risky. Explicitly add a null
character to the end to make it safer.
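A minimal sketch of the idea, assuming a hypothetical auto-allocating
helper called read_file_alloc (the name and signature are not taken from
the patch): always keep one spare byte in the buffer and write the
terminator after the last byte read.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: read all of path into a freshly allocated,
 * null-terminated buffer. Returns 0 on success or a positive errno
 * value. On success the caller owns *out and must free() it.
 */
int read_file_alloc(const char *path, char **out, size_t *out_len)
{
	size_t cap = 4096, len = 0, n;
	char *buf, *tmp;
	int err = 0;
	FILE *f;

	f = fopen(path, "r");
	if (!f)
		return errno;

	buf = malloc(cap);	/* contents are indeterminate, not zeroed */
	if (!buf) {
		fclose(f);
		return ENOMEM;
	}

	/* always leave one byte spare for the terminator */
	while ((n = fread(buf + len, 1, cap - len - 1, f)) > 0) {
		len += n;
		if (len == cap - 1) {
			tmp = realloc(buf, cap * 2);
			if (!tmp) {
				err = ENOMEM;
				break;
			}
			buf = tmp;
			cap *= 2;
		}
	}

	if (!err && ferror(f))
		err = EIO;
	fclose(f);

	if (err) {
		free(buf);
		return err;
	}

	/* fread() does not terminate its output, so do it explicitly */
	buf[len] = '\0';
	*out = buf;
	if (out_len)
		*out_len = len;
	return 0;
}
```

The returned length is still the reliable way to know how much was read,
since the file contents may themselves contain null bytes.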
Provide an option to build big-endian kernels using the ELFv2 ABI. This
works on GCC only for now. Clang is rumored to support this, but core
build files need updating first, at least.
This gives big-endian kernels useful advantages of the ELFv2 ABI, e.g.,
less stack usage, -mprofile-kernel suppor
This allows asm generation for big-endian ELFv2 builds.
Signed-off-by: Nicholas Piggin
---
drivers/crypto/vmx/Makefile | 12 +++-
drivers/crypto/vmx/ppc-xlate.pl | 10 ++
2 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/drivers/crypto/vmx/Makefile b/drivers/cry
The elf_check_arch() function is also used to test compatibility of
usermode binaries. Kernel modules may have more specific requirements,
for example powerpc would like to test for ABI version compatibility.
Add a weak module_elf_check_arch() that defaults to true, and call it
from elf_validity_c
Override the generic module ELF check to provide a check for the ELF ABI
version. This becomes important if we allow big-endian ELF ABI V2 builds
but it doesn't hurt to check now.
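For illustration only, the kind of check involved can be sketched in plain
C. On 64-bit PowerPC the low two bits of the ELF header's e_flags carry the
ABI version (1 for ELFv1, 2 for ELFv2). The function and macro names below
are assumptions for this example, not the ones used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low two bits of e_flags hold the ABI version on 64-bit PowerPC */
#define PPC64_ABI_MASK	0x3u

/*
 * Hypothetical check: does an ELF header with these e_flags claim the
 * ABI version we were built for (1 = ELFv1, 2 = ELFv2)?
 */
bool ppc64_abi_matches(uint32_t e_flags, unsigned int want_abi)
{
	return (e_flags & PPC64_ABI_MASK) == want_abi;
}
```

An ELFv2 kernel would then reject a module whose header reports ABI
version 1, and vice versa.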
Cc: Jessica Yu
Signed-off-by: Michael Ellerman
[np: split patch, added changelog, adjust to Jessica's proposal]
Sign
This is hopefully the final attempt. Luis was happy for the module
patch to go via the powerpc tree, so I've put the ELFv2 for big
endian build patches into the series. Hopefully we can deprecate
the ELFv1 ABI
Since v5, I cleaned up patch 2 as per Christophe's review. And patch
4 I removed th
On Mon, 2022-11-28 at 13:44 +1100, Benjamin Gray wrote:
> This series is based on initial work by Chris Riedl that was not sent
> to the list.
>
> Adds a kernel interface for userspace to interact with the DEXCR.
> The DEXCR is a SPR that allows control over various execution
> 'aspects', such as
On Sat Nov 26, 2022 at 3:32 AM AEST, Laurent Dufour wrote:
> The RCU watchdog timer should be reset when restarting the CPU after a Live
> Partition Mobility operation.
>
> Signed-off-by: Laurent Dufour
Looks okay to me. xmon touches the softlockup watchdog explicitly but
is that for architecture
Add a powerpc specific implementation of queued spinlocks. This is the
build framework with a very simple (non-queued) spinlock implementation
to begin with. Later changes add queueing, and other features and
optimisations one-at-a-time. It is done this way to more easily see how
the queued spinloc
On Sat Nov 19, 2022 at 1:07 AM AEST, Nathan Lynch wrote:
> Call the just-added rtas tracepoints in do_enter_rtas(), taking care
> to avoid function name lookups in the CPU offline path.
>
> Signed-off-by: Nathan Lynch
> ---
> arch/powerpc/kernel/rtas.c | 23 +++
> 1 file chang
Thank you all for your guidance and encouragement!
I learned how to construct a commit message properly, and how important
a role the torture test framework plays for the Linux kernel. I hope my
work can be of benefit to the community.
I am going to continue to study this topic and stu
On Sat Nov 19, 2022 at 1:07 AM AEST, Nathan Lynch wrote:
> Add two sets of tracepoints to be used around RTAS entry:
>
> * rtas_input/rtas_output, which emit the function name, its inputs,
> the returned status, and any other outputs. These produce an API-level
> record of OS<->RTAS activity.
>
The DEXCR Speculative Branch Hint Enable (SBHE) aspect controls whether
the hints provided by BO field of Branch instructions are obeyed during
speculative execution.
SBHE behaviour per ISA 3.1B:
0: The hints provided by BO field of Branch instructions may be
ignored during speculati
Adds the definitions and generic handler for prctl control of the
PowerPC Dynamic Execution Control Register (DEXCR).
Signed-off-by: Benjamin Gray
---
include/uapi/linux/prctl.h | 14 ++
kernel/sys.c | 16
2 files changed, 30 insertions(+)
diff --git a
Test the kernel DEXCR[NPHIE] interface and hashchk exception handling.
Introduces with it a DEXCR utils library for common DEXCR operations.
Signed-off-by: Benjamin Gray
---
tools/testing/selftests/powerpc/Makefile | 1 +
.../selftests/powerpc/dexcr/.gitignore| 1 +
.../testing
Describe the DEXCR and document how to interact with it via the
prctl and sysctl interfaces.
Signed-off-by: Benjamin Gray
---
Documentation/powerpc/dexcr.rst | 183
Documentation/powerpc/index.rst | 1 +
2 files changed, 184 insertions(+)
create mode 100644 Do
This series is based on initial work by Chris Riedl that was not sent
to the list.
Adds a kernel interface for userspace to interact with the DEXCR.
The DEXCR is a SPR that allows control over various execution
'aspects', such as indirect branch prediction and enabling the
hashst/hashchk instructi
Add a utility 'lsdexcr' to print the current DEXCR status. Useful for
quickly checking the status when debugging test failures, using the
sysctl interfaces manually, or just wanting to check it.
Example output:
Requested: 8400 (SBHE, NPHIE)
Hypervisor enforced:
Ef
Test the prctl and sysctl interfaces of the DEXCR.
This adds a new capabilities util for getting and setting CAP_SYS_ADMIN.
Adding this avoids depending on an external libcap package. There is a
similar implementation (and reason) in the tools/testing/selftests/bpf
subtree but there's no obvious p
Adds an initial prctl interface implementation. Unprivileged processes
can query the current prctl setting, including whether an aspect is
implemented by the hardware or is permitted to be modified by a setter
prctl. Editable aspects can be changed by a CAP_SYS_ADMIN privileged
process.
The prctl
Adds more assertion variants to provide more context behind why a
failure occurred.
The SIGSAFE_FAIL_* variants are to allow safely asserting conditions
in a signal handler (though we are about to exit, so it's unlikely to
run into an issue with regular FAIL_IF_EXIT).
Also adds an ARRAY_SIZE macr
ISA 3.1B introduces the Dynamic Execution Control Register (DEXCR). It
is a per-cpu register that allows control over various CPU behaviours
including branch hint usage, indirect branch speculation, and
hashst/hashchk support.
Though introduced in 3.1B, no CPUs using 3.1 were released, so
CPU_FTR_
The functions here use struct thread_struct fields, so need to import
the full definition from . The header
that defines current only forward declares struct thread_struct.
Failing to include this header leads to a compilation
error when a translation unit does not also include
indirectly.
Sig
Recognise and pass the appropriate signal to the user program when a
hashchk instruction triggers. This is independent of allowing
configuration of DEXCR[NPHIE], as a hypervisor can enforce this aspect
regardless of the kernel.
Signed-off-by: Benjamin Gray
---
arch/powerpc/include/asm/ppc-opcode
The ISA 3.1B hashst and hashchk instructions use a per-cpu SPR HASHKEYR
to hold a key used in the hash calculation. This key should be different
for each process to make it harder for a malicious process to recreate
valid hash values for a victim process.
Add support for storing a per-thread hash
The DEXCR Non-Privileged Hash Instruction Enable (NPHIE) aspect controls
whether the hashst and hashchk instructions are treated as no-ops by the
CPU.
NPHIE behaviour per ISA 3.1B:
0: hashst and hashchk instructions are executed as no-ops
(even when allowed by PCR)
1: hashst an
On Sat Nov 19, 2022 at 1:07 AM AEST, Nathan Lynch wrote:
> Make do_enter_rtas() take a pointer to struct rtas_args and do the
> __pa() conversion in one place instead of leaving it to callers. This
> also makes it possible to introduce enter/exit tracepoints that access
> the rtas_args struct field
On Sat Nov 19, 2022 at 1:07 AM AEST, Nathan Lynch wrote:
> It's unsafe to use rtas_busy_delay() to handle a busy status from
> the ibm,os-term RTAS function in rtas_os_term():
>
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
> BUG: sleeping function called from invalid co
On Sat Nov 19, 2022 at 1:07 AM AEST, Nathan Lynch wrote:
> rtas_os_term() is called during panic. Its behavior depends on a
> couple of conditions in the /rtas node of the device tree, the
> traversal of which entails locking and local IRQ state changes. If the
> kernel panics while devtree_lock is
On Mon Nov 7, 2022 at 1:32 PM AEST, Rohan McLure wrote:
> Cause pseries platforms to default to zeroising all potentially user-defined
> registers when entering the kernel by means of any interrupt source,
> reducing user-influence of the kernel and the likelihood of producing
> speculation gadgets
On Mon Nov 7, 2022 at 1:32 PM AEST, Rohan McLure wrote:
> Zero GPRS r14-r31 on entry into the kernel for interrupt sources to
> limit influence of user-space values in potential speculation gadgets.
> Prior to this commit, all other GPRS are reassigned during the common
> prologue to interrupt hand
On Mon Nov 7, 2022 at 1:32 PM AEST, Rohan McLure wrote:
> Zero user state in gprs (assign to zero) to reduce the influence of user
> registers on speculation within kernel syscall handlers. Clears occur
> at the very beginning of the sc and scv 0 interrupt handlers, with
> restores occurring follow
On Tue Nov 8, 2022 at 12:28 AM AEST, Christophe Leroy wrote:
>
>
> Le 07/11/2022 à 04:31, Rohan McLure a écrit :
> > Add Kconfig option for enabling clearing of registers on arrival in an
> > interrupt handler. This reduces the speculation influence of registers
> > on kernel internals. The option
On Fri Nov 25, 2022 at 11:25 PM AEST, Michael Ellerman wrote:
> There's no declaration for machine_check_early_boot(), which leads to a
> build failure with W=1. Add one.
>
> Fixes: 2f5182cffa43 ("powerpc/64s: early boot machine check handler")
> Signed-off-by: Michael Ellerman
Acked-by: Nicholas
Thomas Weißschuh writes:
> On 2022-11-26 07:36+, Christophe Leroy wrote:
>> Le 26/11/2022 à 06:10, Thomas Weißschuh a écrit :
>>> Commit 7ad4bd887d27 ("powerpc/book3e: get rid of #include
>>> ")
>>> removed the usage of the define UTS_VERSION but forgot to drop the
>>> include.
>>
>> What ab
randconfig-r043-20221127
m68k allmodconfig
powerpc allnoconfig
arc allyesconfig
i386 allyesconfig
x86_64 randconfig-a002
alpha allyesconfig
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git
topic/ppc-kvm
branch HEAD: a96b20758b23be7e9f693218908228d6100c3c26 KVM: PPC: Book3S HV: Use
the bitmap API to allocate bitmaps
elapsed time: 743m
configs tested: 2
configs skipped: 100
The following configs have b
randconfig-r043-20221127
i386 randconfig-a001
riscv randconfig-r042-20221127
x86_64 randconfig-a015
x86_64 allyesconfig
i386 randconfig-a003
i386 randconfig-a005
On Sun, Nov 27, 2022 at 01:40:28PM +0100, Thomas Gleixner wrote:
[ . . . ]
> >> No. We are not exporting this just to make a bogus test case happy.
> >>
> >> Fix the torture code to handle -EBUSY correctly.
> > I am going to do a study on this, for now, I do a grep in the kernel tree:
> > find .
This is equal to STACK_FRAME_MIN_SIZE on 32-bit and 64-bit ELFv1, and no
longer used in 64-bit ELFv2, so replace STACK_FRAME_OVERHEAD occurrences
with STACK_FRAME_MIN_SIZE.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/ptrace.h | 24 +++-
1 file changed, 11 inse
Adjust the ELFv2 interrupt and switch frames to the minimum C ABI size,
plus pt_regs, plus 16 bytes for the aligned regs marker for the int
frame (and the switch frame needs to match that because it uses the same
regs offset as the int frame).
This saves 80 bytes of kernel stack per interrupt. It'
This affects only 64-bit ELFv2 kernels, and reduces the minimum
asm-created stack frame size from 112 to 32 bytes on those kernels.
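The two numbers follow from the respective ABI frame layouts. As a sketch
(the constant names are made up for this example and are not the kernel's
definitions): ELFv1 has a six-doubleword frame header plus an
eight-doubleword parameter save area, while ELFv2 needs only a
four-doubleword header.

```c
/* Size of one doubleword on 64-bit PowerPC */
#define DWORD	8

enum {
	/* ELFv1: back chain, CR save, LR save, two reserved, TOC save */
	ELFV1_HEADER		= 6 * DWORD,
	/* ELFv1: parameter save area for the eight register arguments */
	ELFV1_PARAM_SAVE	= 8 * DWORD,
	ELFV1_MIN_FRAME		= ELFV1_HEADER + ELFV1_PARAM_SAVE,

	/* ELFv2: back chain, CR save, LR save, TOC save */
	ELFV2_MIN_FRAME		= 4 * DWORD,
};

/* Compile-time checks that the arithmetic matches the changelog */
_Static_assert(ELFV1_MIN_FRAME == 112, "ELFv1 minimum frame is 112 bytes");
_Static_assert(ELFV2_MIN_FRAME == 32, "ELFv2 minimum frame is 32 bytes");
```

The difference between the two is 80 bytes per frame.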
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kernel/head_40x.S | 2 +-
arch/powerpc/kernel/head_44x.S | 6 +++---
arch/powerpc/kernel/head_64.S
Most callers just want to validate an arbitrary kernel stack pointer,
some need a particular size. Make the size case the exceptional one
with an extra function.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/processor.h | 15 ---
arch/powerpc/kernel/process.c| 2
Stack unwinders need LR and the back chain as a minimum. The switch
stack uses regs->nip for its return pointer rather than lrsave, so
that was not set in the fork frame, and neither was the back chain.
This change sets those fields in the stack.
With this and the previous change, a stack trace in
Backtraces will not recognise the fork system call interrupt without
the regs marker. And regular interrupt entry from userspace creates
the back chain to the user stack, so do this for the initial fork
frame too, to be consistent.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kernel/process.c
This is open-coded in process.c, ppc32 uses a different define with the
same value, and the C definition is named differently which makes it an
extra indirection to grep for.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/ptrace.h | 6 --
arch/powerpc/kernel/asm-offsets.c | 2 +
The user interrupt frame is a different size from the kernel frame, so
give it its own name.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/ptrace.h | 6 +++---
arch/powerpc/kernel/process.c | 6 +++---
arch/powerpc/kernel/stacktrace.c | 4 ++--
3 files changed, 8 insertions(+)
This is a count of longs from the stack pointer to the regs marker.
Rename it to make it more distinct from the other byte offsets. It
can be derived from the byte offset definitions just added.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/ptrace.h | 4 ++--
arch/powerpc/kernel/pr
Define a constant rather than open-code the offset for the
"regs" marker.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/ptrace.h | 2 ++
arch/powerpc/kernel/entry_32.S | 2 +-
arch/powerpc/kernel/exceptions-64e.S| 2 +-
arch/powerpc/kernel/exceptions-64s.S
This is a common offset that currently uses the overloaded
STACK_FRAME_OVERHEAD constant. It's easier to read and more
flexible to use a specific regs offset for this.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/ptrace.h | 2 +
arch/powerpc/kernel/asm-offsets.c
Adjust the pt_regs pointer so the interrupt frame offsets can be used
to save registers.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kernel/ppc_save_regs.S | 57 -
1 file changed, 15 insertions(+), 42 deletions(-)
diff --git a/arch/powerpc/kernel/ppc_save_regs.S
This call may use the min size stack frame. The scratch space used is
in the caller's parameter area frame, not this function's frame.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/platforms/pseries/hvCall.S | 38 +
1 file changed, 20 insertions(+), 18 deletions(-)
dif
This makes it a bit clearer where the stack frame is created, and will
allow easier use of some of the stack offset constants in a later
change.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kernel/process.c | 11 ++-
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/arch/p
The interrupt frame detection and loads from the hypothetical pt_regs
are not bounds-checked. The next-frame validation only bounds-checks
STACK_FRAME_OVERHEAD, which does not include the pt_regs. Add another
test for this.
The user could set r1 to be equal to the address matching the first
interr
These are now unused. Remove.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/irqflags.h | 58 -
1 file changed, 58 deletions(-)
diff --git a/arch/powerpc/include/asm/irqflags.h
b/arch/powerpc/include/asm/irqflags.h
index 1a6c1ce17735..47d46712928a 10064
32-bit does not call trace_irqs_off() to match the trace_irqs_on() call in
kvmppc_fix_ee_before_entry(). This can lead to irqs being enabled twice
in the trace, and the irqs-off region between guest exit and the host
enabling local irqs again is not properly traced.
64-bit code does call this, but from
Since RFC:
- Fix a compile bug.
- Fix BookE KVM properly. Hopefully -- I don't have a BookE
KVM environment to test. Can QEMU do it? Is it still tested?
- Drop the last two patches that changed the stack layout, they
can be done later.
- Drop the load/store-multiple change to 32-bit.
Thanks,
N
Zhouyi,
On Sun, Nov 27 2022 at 10:45, Zhouyi Zhou wrote:
> On Sun, Nov 27, 2022 at 1:05 AM Thomas Gleixner wrote:
>
> So, I should construct my patch as:
> We avoid ... by ...
Not "We avoid".
Avoid this behaviour by
>> No. We are not exporting this just to make a bogus test case happy.
>>
On 16.11.22 11:26, David Hildenbrand wrote:
FOLL_FORCE is really only for ptrace access. According to commit
707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always
writable"), get_vaddr_frames() currently pins all pages writable as a
workaround for issues with read-only buffers.