date:20231004

Re: [PATCH 0/5] KVM: x86: Fix breakage in KVM_SET_XSAVE's ABI

2023-10-04 Thread Leonardo Bras

On Wed, Sep 27, 2023 at 05:19:51PM -0700, Sean Christopherson wrote:
> Rework how KVM limits guest-unsupported xfeatures to effectively hide them
> only when saving state for userspace (KVM_GET_XSAVE), i.e. to let userspace
> load all host-supported xfeatures (via KVM_SET_XSAVE) irrespective of
> what features have been exposed to the guest.

Ok, IIUC your changes provide:
- KVM_GET_XSAVE will return only guest-supported xfeatures
- KVM_SET_XSAVE will allow user to set any xfeatures supported by host
Is that correct?
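
The two points above can be sketched as a small bitmask model. This is only an
illustration under stated assumptions: the struct, the function names and the
bit positions are hypothetical and chosen for the example, and none of it is
KVM code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; all names here are hypothetical. */
#define XF_X87 (1ULL << 0)
#define XF_SSE (1ULL << 1)
#define XF_AVX (1ULL << 2)
#define XF_F   (1ULL << 9)   /* host-supported, not exposed to the guest */

struct xsave_model {
	uint64_t host_supported;   /* xfeatures the host can save/restore */
	uint64_t guest_supported;  /* xfeatures exposed to the guest */
};

/* GET side: report only guest-supported xfeatures to userspace. */
static uint64_t model_get_xsave(const struct xsave_model *m, uint64_t present)
{
	return present & m->guest_supported;
}

/* SET side: accept anything the host supports, reject everything else. */
static bool model_set_xsave_ok(const struct xsave_model *m, uint64_t requested)
{
	return (requested & ~m->host_supported) == 0;
}
```

In this model a snapshot that still carries xfeature F can be loaded on a host
that supports F, while a later save no longer reports F.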

> 
> The effect on KVM_SET_XSAVE was knowingly done by commit ad856280ddea
> ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0"):
> 
> As a bonus, it will also fail if userspace tries to set fpu features
> (with the KVM_SET_XSAVE ioctl) that are not compatible to the guest
> configuration.  Such features will never be returned by KVM_GET_XSAVE
> or KVM_GET_XSAVE2.
> 
> Preventing userspace from doing stupid things is usually a good idea, but in
> this case restricting KVM_SET_XSAVE actually exacerbated the problem that
> commit ad856280ddea was fixing.  As reported by Tyler, rejecting KVM_SET_XSAVE
> for guest-unsupported xfeatures breaks live migration from a kernel without
> commit ad856280ddea, to a kernel with ad856280ddea.  I.e. from a kernel that
> saves guest-unsupported xfeatures to a kernel that doesn't allow loading
> guest-unsupported xfeatures.

So this patch is supposed to fix migration of VM from a host with
pre-ad856280ddea (OLD) kernel to a host with ad856280ddea + your set(NEW).
Right?

Let's get the scenario here, where all machines are the same:
1 - VM created on OLD kernel with a host-supported xfeature F, which is not
guest supported.
2 - VM is migrated to a NEW kernel/host, and KVM_SET_XSAVE xfeature F.
3 - VM will be migrated to another host, qemu requests KVM_GET_XSAVE, which
returns only guest-supported xfeatures, and this is passed to next host
4 - VM will be started on 3rd host with guest-supported xfeatures, meaning
xfeature F is filtered-out, which is not good, because the VM will have
less features compared to boot.

In fact, I notice something would possibly happen between 2 and 3, since
qemu will run KVM_GET_XSAVE at kvm_cpu_synchronize_state() and
KVM_SET_XSAVE at kvm_cpu_exec(), which happens quite often (when vcpu stops
/ resumes for some reason).

Also, even if I got something wrong, and for some reason qemu will be able
to store the original VM xfeatures between migrations, we have the original
issue ad856280ddea was dealing with: newer machines -> older machines
migration:

1 - User gets a VM from an OLD kernel, with a newer host (more xfeatures).
2 - User migrates VM to NEW kernel, and we suppose qemu stores  original
xfeatures (it works). Migration can occur to newer or same gen hosts.
3 - At some point, if migration is attempted to an older host (less
xfeatures), qemu will abort the VM.

> 
> To make matters even worse, QEMU doesn't terminate if KVM_SET_XSAVE fails,
> and so the end result is that the live migration results in (possibly silent)
> guest data corruption instead of a failed migration.

And this is something that really needs to be fixed on the QEMU side.

> 
> Patch 1 refactors the FPU code to let KVM pass in a mask of which xfeatures
> to save, patch 2 fixes KVM by passing in guest_supported_xcr0 instead of
> modifying user_xfeatures directly.

With my current understanding of this patchset, I would not recommend merging
it, as it would introduce a lot of undesired behaviors.

Please let me know if I got something wrong, so I can review it again.

Thanks!
Leo

> 
> Patches 3-5 are regression tests.
> 
> I have no objection if anyone wants patches 1 and 2 squashed together, I
> split them purely to make review easier.
> 
> Note, this doesn't fix the scenario where a guest is migrated from a "bad"
> to a "good" kernel and the target host doesn't support the over-saved set
> of xfeatures.  I don't see a way to safely handle that in the kernel without
> an opt-in, which more or less defeats the purpose of handling it in KVM.
> 
> Sean Christopherson (5):
>   x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer
>   KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
>   KVM: selftests: Touch relevant XSAVE state in guest for state test
>   KVM: selftests: Load XSAVE state into untouched vCPU during state test
>   KVM: selftests: Force load all supported XSAVE state in state test
> 
>  arch/x86/include/asm/fpu/api.h                |   3 +-
>  arch/x86/kernel/fpu/core.c                    |   5 +-
>  arch/x86/kernel/fpu/xstate.c                  |  12 +-
>  arch/x86/kernel/fpu/xstate.h                  |   3 +-
>  arch/x86/kvm/cpuid.c                          |   8 --
>  arch/x86/kvm/x86.c                            |  37 +++---
>  .../selftests/kvm/include/x86_64/processor.h  |  23
>  .../testing/selftests/kvm/x86_64/state_test.c | 110 +++

[PATCH bpf 0/3] libbpf/selftests syscall wrapper fixes for RISC-V

2023-10-04 Thread Björn Töpel

From: Björn Töpel 

Commit 08d0ce30e0e4 ("riscv: Implement syscall wrappers") introduced
some regressions in libbpf, and the kselftests BPF suite, which are
fixed with these three patches.

Note that there's an outstanding fix [1] for ftrace syscall tracing
which is also a fallout from the commit above.


Björn

[1] https://lore.kernel.org/linux-riscv/20231003182407.32198-1-alexgh...@rivosinc.com/

Alexandre Ghiti (1):
  libbpf: Fix syscall access arguments on riscv

Björn Töpel (2):
  selftests/bpf: Define SYS_PREFIX for riscv
  selftests/bpf: Define SYS_NANOSLEEP_KPROBE_NAME for riscv

 tools/lib/bpf/bpf_tracing.h  | 2 --
 tools/testing/selftests/bpf/progs/bpf_misc.h | 3 +++
 tools/testing/selftests/bpf/test_progs.h | 2 ++
 3 files changed, 5 insertions(+), 2 deletions(-)


base-commit: 9077fc228f09c9f975c498c55f5d2e882cd0da59
-- 
2.39.2

[PATCH bpf 1/3] libbpf: Fix syscall access arguments on riscv

2023-10-04 Thread Björn Töpel

From: Alexandre Ghiti 

Since commit 08d0ce30e0e4 ("riscv: Implement syscall wrappers"), riscv
selects ARCH_HAS_SYSCALL_WRAPPER so let's use the generic implementation
of PT_REGS_SYSCALL_REGS().

Fixes: 08d0ce30e0e4 ("riscv: Implement syscall wrappers")
Signed-off-by: Alexandre Ghiti 
---
 tools/lib/bpf/bpf_tracing.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index 3803479dbe10..1c13f8e88833 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -362,8 +362,6 @@ struct pt_regs___arm64 {
 #define __PT_PARM7_REG a6
 #define __PT_PARM8_REG a7
 
-/* riscv does not select ARCH_HAS_SYSCALL_WRAPPER. */
-#define PT_REGS_SYSCALL_REGS(ctx) ctx
 #define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG
 #define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG
 #define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG
-- 
2.39.2
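
A minimal sketch of why the riscv-specific override had to go, using a mock
register struct instead of the real struct pt_regs. With syscall wrappers, the
probed entry function receives one argument, a pointer to the registers that
hold the actual syscall arguments, so a probe has to follow that pointer. The
names mock_pt_regs, MOCK_SYSCALL_REGS_* and first_syscall_arg_wrapped() are
hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Mock register file; only the first two argument registers are modeled. */
struct mock_pt_regs {
	uintptr_t a0;
	uintptr_t a1;
};

/* Without wrappers: the probe context already holds the syscall arguments. */
#define MOCK_SYSCALL_REGS_NO_WRAPPER(ctx) (ctx)

/*
 * With wrappers: the first parameter of the wrapper is a pointer to the
 * registers that hold the syscall arguments.
 */
#define MOCK_SYSCALL_REGS_WRAPPER(ctx) ((struct mock_pt_regs *)(ctx)->a0)

/* Fetch the first syscall argument as seen by a probe on the wrapper. */
static uintptr_t first_syscall_arg_wrapped(struct mock_pt_regs *ctx)
{
	return MOCK_SYSCALL_REGS_WRAPPER(ctx)->a0;
}
```

Keeping the no-wrapper definition on an architecture that now uses wrappers
would make a probe read the pointer itself as if it were the first argument.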

[PATCH bpf 2/3] selftests/bpf: Define SYS_PREFIX for riscv

2023-10-04 Thread Björn Töpel

From: Björn Töpel 

SYS_PREFIX was missing for RISC-V, which made a couple of kprobe
tests fail.

Add missing SYS_PREFIX for RISC-V.

Fixes: 08d0ce30e0e4 ("riscv: Implement syscall wrappers")
Signed-off-by: Björn Töpel 
---
 tools/testing/selftests/bpf/progs/bpf_misc.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
index 38a57a2e70db..799fff4995d8 100644
--- a/tools/testing/selftests/bpf/progs/bpf_misc.h
+++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
@@ -99,6 +99,9 @@
 #elif defined(__TARGET_ARCH_arm64)
 #define SYSCALL_WRAPPER 1
 #define SYS_PREFIX "__arm64_"
+#elif defined(__TARGET_ARCH_riscv)
+#define SYSCALL_WRAPPER 1
+#define SYS_PREFIX "__riscv_"
 #else
 #define SYSCALL_WRAPPER 0
 #define SYS_PREFIX "__se_"
-- 
2.39.2
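
A small sketch of how an arch prefix such as "__riscv_" ends up in the symbol
a kprobe attaches to. The helper build_syscall_symbol() is hypothetical and
only illustrates the naming scheme; the selftests themselves compose the name
at compile time from SYS_PREFIX.

```c
#include <stdio.h>

/*
 * Build "<prefix>sys_<name>" into buf.
 * Returns 0 on success, -1 if the result does not fit.
 */
static int build_syscall_symbol(char *buf, size_t len,
				const char *prefix, const char *name)
{
	int n = snprintf(buf, len, "%ssys_%s", prefix, name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With the prefix "__riscv_" and the name "nanosleep" this yields
"__riscv_sys_nanosleep", the symbol name used for RISC-V in patch 3/3.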

[PATCH bpf 3/3] selftests/bpf: Define SYS_NANOSLEEP_KPROBE_NAME for riscv

2023-10-04 Thread Björn Töpel

From: Björn Töpel 

Add missing sys_nanosleep name for RISC-V, which is used by some tests
(e.g. attach_probe).

Fixes: 08d0ce30e0e4 ("riscv: Implement syscall wrappers")
Signed-off-by: Björn Töpel 
---
 tools/testing/selftests/bpf/test_progs.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
index 77bd492c6024..2f9f6f250f17 100644
--- a/tools/testing/selftests/bpf/test_progs.h
+++ b/tools/testing/selftests/bpf/test_progs.h
@@ -417,6 +417,8 @@ int get_bpf_max_tramp_links(void);
 #define SYS_NANOSLEEP_KPROBE_NAME "__s390x_sys_nanosleep"
 #elif defined(__aarch64__)
 #define SYS_NANOSLEEP_KPROBE_NAME "__arm64_sys_nanosleep"
+#elif defined(__riscv)
+#define SYS_NANOSLEEP_KPROBE_NAME "__riscv_sys_nanosleep"
 #else
 #define SYS_NANOSLEEP_KPROBE_NAME "sys_nanosleep"
 #endif
-- 
2.39.2

Re: [PATCH 0/5] KVM: x86: Fix breakage in KVM_SET_XSAVE's ABI

2023-10-04 Thread Tyler Stachecki

On Wed, Oct 04, 2023 at 04:11:52AM -0300, Leonardo Bras wrote:
> So this patch is supposed to fix migration of VM from a host with
> pre-ad856280ddea (OLD) kernel to a host with ad856280ddea + your set(NEW).
> Right?
> 
> Let's get the scenario here, where all machines are the same:
> 1 - VM created on OLD kernel with a host-supported xfeature F, which is not
> guest supported.
> 2 - VM is migrated to a NEW kernel/host, and KVM_SET_XSAVE xfeature F.
> 3 - VM will be migrated to another host, qemu requests KVM_GET_XSAVE, which
> returns only guest-supported xfeatures, and this is passed to next host
> 4 - VM will be started on 3rd host with guest-supported xfeatures, meaning
> xfeature F is filtered-out, which is not good, because the VM will have
> less features compared to boot.

This is what I was (trying) to convey earlier...

See Sean's response here:
https://lore.kernel.org/all/zrmhy83w%2fvpjy...@google.com/

I'll copy the pertinent part of his very detailed response inline:
> KVM *must* "trim" features when servicing KVM_GET_XSAVE{2}, because that's been
> KVM's ABI for a very long time, and userspace absolutely relies on that
> functionality to ensure that a VM can be migrated within a pool of heterogeneous
> systems so long as the features that are *exposed* to the guest are supported
> on all platforms.

My 2 cents: as an outsider with less familiarity of the KVM code, it is hard
to understand the contract here with the guest/userspace. It seems there is a
fundamental question of whether or not "superfluous" features, those being
host-supported features which extend that which the guest is actually capable
of, can be removed between the time that the guest boots and when it
terminates, through however many live-migrations that may be.

Ultimately, this problem is not really fixable if said features cannot be
removed.

Is there an RFC or document which captures expectations of this form?

[PATCH bpf-next 0/3] selftest/bpf, riscv: Improved cross-building support

2023-10-04 Thread Björn Töpel

From: Björn Töpel 

Yet another "more cross-building support for RISC-V" series.

An example of how to invoke a gen_tar build:

  | make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- CC=riscv64-linux-gnu-gcc \
  |HOSTCC=gcc O=/workspace/kbuild FORMAT= \
  |SKIP_TARGETS="arm64 ia64 powerpc sparc64 x86 sgx" -j $(($(nproc)-1)) \
  |-C tools/testing/selftests gen_tar


Björn

Björn Töpel (3):
  selftests/bpf: Add cross-build support for urandom_read et al
  selftests/bpf: Enable lld usage for RISC-V
  selftests/bpf: Add uprobe_multi to gen_tar target

 tools/testing/selftests/bpf/Makefile | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)


base-commit: 2147c8d07e1abc8dfc3433ca18eed5295e230ede
-- 
2.39.2

[PATCH bpf-next 1/3] selftests/bpf: Add cross-build support for urandom_read et al

2023-10-04 Thread Björn Töpel

From: Björn Töpel 

Some userland programs in the BPF test suite, e.g. urandom_read, are
missing cross-build support. Add cross-build support for these
programs.

Signed-off-by: Björn Töpel 
---
 tools/testing/selftests/bpf/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 47365161b6fc..a9cbb85fa180 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -198,7 +198,7 @@ endif
 # do not fail. Static builds leave urandom_read relying on system-wide shared libraries.
 $(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c liburandom_read.map
$(call msg,LIB,,$@)
-   $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS))   \
+   $(Q)$(CLANG) $(CLANG_TARGET_ARCH) $(filter-out -static,$(CFLAGS) $(LDFLAGS))   \
 $(filter %.c,$^) $(filter-out -static,$(LDLIBS)) \
 -fuse-ld=$(LLD) -Wl,-znoseparate-code -Wl,--build-id=sha1 \
 -Wl,--version-script=liburandom_read.map \
@@ -206,7 +206,7 @@ $(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c liburandom
 
 $(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c $(OUTPUT)/liburandom_read.so
$(call msg,BINARY,,$@)
-   $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \
+   $(Q)$(CLANG) $(CLANG_TARGET_ARCH) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \
 -lurandom_read $(filter-out -static,$(LDLIBS)) -L$(OUTPUT)  \
 -fuse-ld=$(LLD) -Wl,-znoseparate-code -Wl,--build-id=sha1 \
 -Wl,-rpath=. -o $@
-- 
2.39.2

[PATCH bpf-next 3/3] selftests/bpf: Add uprobe_multi to gen_tar target

2023-10-04 Thread Björn Töpel

From: Björn Töpel 

The uprobe_multi program was not picked up for the gen_tar target. Fix
by adding it to TEST_GEN_FILES.

Signed-off-by: Björn Töpel 
---
 tools/testing/selftests/bpf/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 098e32c684d5..07ac73cc339d 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -104,7 +104,7 @@ TEST_GEN_PROGS_EXTENDED = test_sock_addr test_skb_cgroup_id_user \
xskxceiver xdp_redirect_multi xdp_synproxy veristat xdp_hw_metadata \
xdp_features
 
-TEST_GEN_FILES += liburandom_read.so urandom_read sign-file
+TEST_GEN_FILES += liburandom_read.so urandom_read sign-file uprobe_multi
 
 # Emit succinct information message describing current building step
 # $1 - generic step name (e.g., CC, LINK, etc);
-- 
2.39.2

[PATCH bpf-next 2/3] selftests/bpf: Enable lld usage for RISC-V

2023-10-04 Thread Björn Töpel

From: Björn Töpel 

RISC-V has proper lld support. Use that, similar to what x86 does, for
urandom_read et al.

Signed-off-by: Björn Töpel 
---
 tools/testing/selftests/bpf/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index a9cbb85fa180..098e32c684d5 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -188,7 +188,7 @@ $(OUTPUT)/%:%.c
$(Q)$(LINK.c) $^ $(LDLIBS) -o $@
 
 # LLVM's ld.lld doesn't support all the architectures, so use it only on x86
-ifeq ($(SRCARCH),x86)
+ifeq ($(SRCARCH),$(filter $(SRCARCH),x86 riscv))
 LLD := lld
 else
 LLD := ld
-- 
2.39.2

[PATCH 0/2] kbuild: kselftest-merge target improvements

2023-10-04 Thread Björn Töpel

From: Björn Töpel 

Two minor changes to the kselftest-merge target:

1. Let builtin have precedence over modules when merging configs
2. Merge per-arch configs, if available


Björn

Björn Töpel (2):
  kbuild: Let builtin have precedence over modules for kselftest-merge
  kbuild: Merge per-arch config for kselftest-merge target

 Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


base-commit: cbf3a2cb156a2c911d8f38d8247814b4c07f49a2
-- 
2.39.2

[PATCH 2/2] kbuild: Merge per-arch config for kselftest-merge target

2023-10-04 Thread Björn Töpel

From: Björn Töpel 

Some kselftests have a per-arch config,
e.g. tools/testing/selftests/bpf/config.s390x.

Make sure these configs are picked up by the kselftest-merge target.

Signed-off-by: Björn Töpel 
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 170fb2f5e378..0303acb311cc 100644
--- a/Makefile
+++ b/Makefile
@@ -1367,7 +1367,7 @@ kselftest-%: headers FORCE
 PHONY += kselftest-merge
 kselftest-merge:
$(if $(wildcard $(objtree)/.config),, $(error No .config exists, config your kernel first!))
-   $(Q)find $(srctree)/tools/testing/selftests -name config | \
+   $(Q)find $(srctree)/tools/testing/selftests -name config -o -name config.$(UTS_MACHINE) | \
xargs $(srctree)/scripts/kconfig/merge_config.sh -y -m $(objtree)/.config
$(Q)$(MAKE) -f $(srctree)/Makefile olddefconfig
 
-- 
2.39.2
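
The file selection done by the extended find expression can be sketched as a
predicate on the file's basename. The function is_selftest_config() is a
hypothetical illustration of "-name config -o -name config.$(UTS_MACHINE)",
with the machine name passed in as a plain string.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if basename is "config" or "config.<machine>". */
static bool is_selftest_config(const char *basename, const char *machine)
{
	char per_arch[64];
	int n;

	if (strcmp(basename, "config") == 0)
		return true;

	n = snprintf(per_arch, sizeof(per_arch), "config.%s", machine);
	if (n < 0 || (size_t)n >= sizeof(per_arch))
		return false;   /* machine name too long for this sketch */

	return strcmp(basename, per_arch) == 0;
}
```

For the machine "s390x" this matches both "config" and "config.s390x", and
skips per-arch configs that belong to other machines.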

[PATCH 1/2] kbuild: Let builtin have precedence over modules for kselftest-merge

2023-10-04 Thread Björn Töpel

From: Björn Töpel 

The kselftest-merge target walks all kselftests configs, and merges
them. However, builtin does not have precedence over modules. This
breaks some of the tests, e.g.:

$ grep CONFIG_NF_NAT tools/testing/selftests/{bpf,net}/config
tools/testing/selftests/bpf/config:CONFIG_NF_NAT=y
tools/testing/selftests/net/config:CONFIG_NF_NAT=m

Here, the net config will set NF_NAT to module, which makes it clunky
to run the BPF tests.

Add '-y' to scripts/kconfig/merge_config.sh.

Signed-off-by: Björn Töpel 
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 373649c7374e..170fb2f5e378 100644
--- a/Makefile
+++ b/Makefile
@@ -1368,7 +1368,7 @@ PHONY += kselftest-merge
 kselftest-merge:
$(if $(wildcard $(objtree)/.config),, $(error No .config exists, config your kernel first!))
$(Q)find $(srctree)/tools/testing/selftests -name config | \
-   xargs $(srctree)/scripts/kconfig/merge_config.sh -m $(objtree)/.config
+   xargs $(srctree)/scripts/kconfig/merge_config.sh -y -m $(objtree)/.config
$(Q)$(MAKE) -f $(srctree)/Makefile olddefconfig
 
 # ---
-- 
2.39.2
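
A minimal sketch of the precedence rule that '-y' adds, assuming fragments are
merged in order and a later fragment otherwise overrides an earlier one. The
enum and merge_tristate() are hypothetical names used only to model the
CONFIG_NF_NAT example from the commit message; this is not the script's code.

```c
#include <stdbool.h>

enum tristate { TRI_N, TRI_M, TRI_Y };

/*
 * Merge one option coming from a later fragment (next) into the value
 * accumulated so far (prev). With builtin_wins set, an existing =y is
 * not demoted to =m; every other combination takes the later value.
 */
static enum tristate merge_tristate(enum tristate prev, enum tristate next,
				    bool builtin_wins)
{
	if (builtin_wins && prev == TRI_Y && next == TRI_M)
		return TRI_Y;
	return next;
}
```

With the bpf fragment (=y) merged before the net fragment (=m), the option
stays built in only when builtin_wins is set.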

Re: [PATCH bpf-next 3/3] selftests/bpf: Add uprobe_multi to gen_tar target

2023-10-04 Thread Jiri Olsa

On Wed, Oct 04, 2023 at 02:27:21PM +0200, Björn Töpel wrote:
> From: Björn Töpel 
> 
> The uprobe_multi program was not picked up for the gen_tar target. Fix
> by adding it to TEST_GEN_FILES.
> 
> Signed-off-by: Björn Töpel 

Acked-by: Jiri Olsa 

thanks,
jirka


> ---
>  tools/testing/selftests/bpf/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index 098e32c684d5..07ac73cc339d 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -104,7 +104,7 @@ TEST_GEN_PROGS_EXTENDED = test_sock_addr test_skb_cgroup_id_user \
>   xskxceiver xdp_redirect_multi xdp_synproxy veristat xdp_hw_metadata \
>   xdp_features
>  
> -TEST_GEN_FILES += liburandom_read.so urandom_read sign-file
> +TEST_GEN_FILES += liburandom_read.so urandom_read sign-file uprobe_multi
>  
>  # Emit succinct information message describing current building step
>  # $1 - generic step name (e.g., CC, LINK, etc);
> -- 
> 2.39.2
> 
>

[PATCH 0/3] selftests: add gitignore files to user_events, tdx and dmabuf-heaps

2023-10-04 Thread Javier Carrasco

user_events, tdx and dmabuf-heaps build a series of binaries that can be
safely ignored by git as it is done by other selftests.

Signed-off-by: Javier Carrasco 
---
Javier Carrasco (3):
  selftests/user_events: add gitignore file
  selftests/tdx: add gitignore file
  selftests/dmabuf-heaps: add gitignore file

 tools/testing/selftests/dmabuf-heaps/.gitignore | 1 +
 tools/testing/selftests/tdx/.gitignore  | 1 +
 tools/testing/selftests/user_events/.gitignore  | 4 
 3 files changed, 6 insertions(+)
---
base-commit: cbf3a2cb156a2c911d8f38d8247814b4c07f49a2
change-id: 20231004-topic-selftest_gitignore-3e82f4341001

Best regards,
-- 
Javier Carrasco

[PATCH 1/3] selftests/user_events: add gitignore file

2023-10-04 Thread Javier Carrasco

user_events builds a series of binaries that can be ignored by git.

Signed-off-by: Javier Carrasco 
---
 tools/testing/selftests/user_events/.gitignore | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/testing/selftests/user_events/.gitignore b/tools/testing/selftests/user_events/.gitignore
new file mode 100644
index ..f570febd211b
--- /dev/null
+++ b/tools/testing/selftests/user_events/.gitignore
@@ -0,0 +1,4 @@
+abi_test
+dyn_test
+ftrace_test
+perf_test

-- 
2.39.2

[PATCH 2/3] selftests/tdx: add gitignore file

2023-10-04 Thread Javier Carrasco

tdx builds a tdx_guest_test binary that can be ignored by git.

Signed-off-by: Javier Carrasco 
---
 tools/testing/selftests/tdx/.gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/tdx/.gitignore b/tools/testing/selftests/tdx/.gitignore
new file mode 100644
index ..5db4d15cc673
--- /dev/null
+++ b/tools/testing/selftests/tdx/.gitignore
@@ -0,0 +1 @@
+tdx_guest_test

-- 
2.39.2

[PATCH 3/3] selftests/dmabuf-heaps: add gitignore file

2023-10-04 Thread Javier Carrasco

dmabuf-heaps builds a dmabuf-heap binary that can be ignored by git.

Signed-off-by: Javier Carrasco 
---
 tools/testing/selftests/dmabuf-heaps/.gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/dmabuf-heaps/.gitignore b/tools/testing/selftests/dmabuf-heaps/.gitignore
new file mode 100644
index ..b500e76b9045
--- /dev/null
+++ b/tools/testing/selftests/dmabuf-heaps/.gitignore
@@ -0,0 +1 @@
+dmabuf-heap

-- 
2.39.2

Re: [PATCH v3 2/6] RISC-V: Detect Zicond from ISA string

2023-10-04 Thread Palmer Dabbelt


On Mon, 02 Oct 2023 20:52:22 PDT (-0700), apa...@ventanamicro.com wrote:
> The RISC-V integer conditional (Zicond) operation extension defines
> standard conditional arithmetic and conditional-select/move operations
> which are inspired from the XVentanaCondOps extension. In fact, QEMU
> RISC-V also has support for emulating Zicond extension.
>
> Let us detect Zicond extension from ISA string available through
> DT or ACPI.
>
> Signed-off-by: Anup Patel
> Reviewed-by: Andrew Jones
> Reviewed-by: Conor Dooley
> ---
>  arch/riscv/include/asm/hwcap.h | 1 +
>  arch/riscv/kernel/cpufeature.c | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
> index 0f520f7d058a..6fc51c1b34cf 100644
> --- a/arch/riscv/include/asm/hwcap.h
> +++ b/arch/riscv/include/asm/hwcap.h
> @@ -59,6 +59,7 @@
>  #define RISCV_ISA_EXT_ZIFENCEI 41
>  #define RISCV_ISA_EXT_ZIHPM 42
>  #define RISCV_ISA_EXT_SMSTATEEN 43
> +#define RISCV_ISA_EXT_ZICOND 44
>
>  #define RISCV_ISA_EXT_MAX 64
>
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index 3755a8c2a9de..e3803822ab5a 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -167,6 +167,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
>   __RISCV_ISA_EXT_DATA(zicbom, RISCV_ISA_EXT_ZICBOM),
>   __RISCV_ISA_EXT_DATA(zicboz, RISCV_ISA_EXT_ZICBOZ),
>   __RISCV_ISA_EXT_DATA(zicntr, RISCV_ISA_EXT_ZICNTR),
> + __RISCV_ISA_EXT_DATA(zicond, RISCV_ISA_EXT_ZICOND),
>   __RISCV_ISA_EXT_DATA(zicsr, RISCV_ISA_EXT_ZICSR),
>   __RISCV_ISA_EXT_DATA(zifencei, RISCV_ISA_EXT_ZIFENCEI),
>   __RISCV_ISA_EXT_DATA(zihintpause, RISCV_ISA_EXT_ZIHINTPAUSE),

Acked-by: Palmer Dabbelt 

Can we do a shared tag, though?  These will conflict.
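
For readers unfamiliar with Zicond, the two conditional-zero operations it
defines (czero.eqz and czero.nez) can be modeled in plain C as below. The
function names and the cond_select() helper are illustrative only; they are a
sketch of the instruction semantics, not code from this series.

```c
#include <stdint.h>

/* czero.eqz rd, rs1, rs2: rd = (rs2 == 0) ? 0 : rs1 */
static uint64_t czero_eqz(uint64_t rs1, uint64_t rs2)
{
	return rs2 == 0 ? 0 : rs1;
}

/* czero.nez rd, rs1, rs2: rd = (rs2 != 0) ? 0 : rs1 */
static uint64_t czero_nez(uint64_t rs1, uint64_t rs2)
{
	return rs2 != 0 ? 0 : rs1;
}

/*
 * Branchless select built from the two operations plus an OR:
 * returns a when cond is non-zero, b otherwise.
 */
static uint64_t cond_select(uint64_t cond, uint64_t a, uint64_t b)
{
	return czero_eqz(a, cond) | czero_nez(b, cond);
}
```

Exactly one of the two operands is zeroed for any value of cond, so the OR
always yields the selected value.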

Re: [PATCH 0/5] KVM: x86: Fix breakage in KVM_SET_XSAVE's ABI

2023-10-04 Thread Sean Christopherson

On Wed, Oct 04, 2023, Tyler Stachecki wrote:
> On Wed, Oct 04, 2023 at 04:11:52AM -0300, Leonardo Bras wrote:
> > So this patch is supposed to fix migration of VM from a host with
> > pre-ad856280ddea (OLD) kernel to a host with ad856280ddea + your set(NEW).
> > Right?
> > 
> > Let's get the scenario here, where all machines are the same:
> > 1 - VM created on OLD kernel with a host-supported xfeature F, which is not
> > guest supported.
> > 2 - VM is migrated to a NEW kernel/host, and KVM_SET_XSAVE xfeature F.
> > 3 - VM will be migrated to another host, qemu requests KVM_GET_XSAVE, which
> > returns only guest-supported xfeatures, and this is passed to next host
> > 4 - VM will be started on 3rd host with guest-supported xfeatures, meaning
> > xfeature F is filtered-out, which is not good, because the VM will have
> > less features compared to boot.

No, the VM will not have less features, because KVM_SET_XSAVE loads *data*, not
features.  On a host that supports xfeature F, the VM is running with garbage
data no matter what, which is perfectly fine because from the guest's
perspective, that xfeature and its associated data do not exist.

And in all likelihood, unless QEMU is doing something bizarre, the data that is
loaded via KVM_SET_XSAVE will be the exact same data that is already present in
the guest FPU state, as both will be in the init state.

On top of that, the data that is loaded via KVM_SET_XSAVE may not actually be
loaded into hardware, i.e. may never be exposed to the guest.  E.g. IIRC, the
original issue was with PKRU.  If PKU is supported by the host, but not exposed
to the guest, KVM will run the guest with the *host's* PKRU value.

> This is what I was (trying) to convey earlier...
> 
> See Sean's response here:
> https://lore.kernel.org/all/zrmhy83w%2fvpjy...@google.com/
> 
> I'll copy the pertinent part of his very detailed response inline:
> > KVM *must* "trim" features when servicing KVM_GET_XSAVE{2}, because that's been
> > KVM's ABI for a very long time, and userspace absolutely relies on that
> > functionality to ensure that a VM can be migrated within a pool of heterogeneous
> > systems so long as the features that are *exposed* to the guest are supported
> > on all platforms.
> 
> My 2 cents: as an outsider with less familiarity of the KVM code, it is hard
> to understand the contract here with the guest/userspace. It seems there is a
> fundamental question of whether or not "superfluous" features, those being
> host-supported features which extend that which the guest is actually capable
> of, can be removed between the time that the guest boots and when it
> terminates, through however many live-migrations that may be.

KVM's ABI has no formal notion of guest boot=>shutdown or live migration.  The
myriad KVM_GET_* APIs allow taking a snapshot of guest state, and the KVM_SET_*
APIs allow loading a snapshot of guest state.  Live migration is probably the
most common use of those APIs, but there are other use cases.

That matters because KVM's contract with userspace for KVM_SET_XSAVE (or any
other state save/load ioctl()) doesn't have a holistic view of the guest, e.g.
KVM can't know that userspace is live migrating a VM, and that userspace's
attempt to load data for an unsupported xfeature is ok because the xfeature
isn't exposed to the guest.

In other words, at the time of KVM_SET_XSAVE, KVM has no way of knowing that an
xfeature is superfluous.  Normally, that's a complete non-issue because there is
no superfluous xfeature data, as KVM's contract for KVM_GET_XSAVE{2} is that only
necessary data is saved in the snapshot.

Unfortunately, the original bug that led to this mess broke the contract for
KVM_GET_XSAVE{2}, and I don't see a safe way to work around that bug in KVM
without an opt-in from userspace.

> Ultimately, this problem is not really fixable if said features cannot be
> removed.

It's not about removing features.  The change you're asking for is to have KVM
*silently* drop data.  Aside from the fact that such a change would break KVM's
ABI, silently ignoring data that userspace has explicitly requested be loaded
for a vCPU is incredibly dangerous.

E.g. a not too far fetched scenario would be:

   1. xfeature X is supported on Host A and exposed to a guest 
   2. Host B is upgraded to a new kernel that has a bug that causes the kernel
  to disable support for X, even though X is supported in hardware
   3. The guest is live migrated from Host A to Host B

At step #3, what will currently happen is that KVM_SET_XSAVE will fail with
-EINVAL because userspace is attempting to load data that Host B is incapable
of loading.

The change you're suggesting would result in KVM dropping the data for X and
letting KVM_SET_XSAVE succeed, *for an xfeature that is exposed to the guest*.
I.e. for all intents and purposes, KVM would deliberately corrupt guest data.
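
The difference between the two behaviors in this scenario can be sketched with
a pair of mask-based helpers. Both functions and their names are hypothetical
models for illustration, not KVM code: the strict variant mirrors the current
"fail with -EINVAL" behavior, the lossy variant mirrors the silent-drop
behavior argued against above.

```c
#include <errno.h>
#include <stdint.h>

/* Strict load: refuse any xfeature the host cannot load. */
static int model_set_xsave_strict(uint64_t host_supported, uint64_t requested,
				  uint64_t *loaded)
{
	if (requested & ~host_supported)
		return -EINVAL;
	*loaded = requested;
	return 0;
}

/* Lossy load: silently drop whatever the host cannot load. */
static int model_set_xsave_lossy(uint64_t host_supported, uint64_t requested,
				 uint64_t *loaded)
{
	*loaded = requested & host_supported;
	return 0;
}
```

In the lossy model the caller sees success even though the state for X was
discarded, which is exactly the silent corruption described in the scenario.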

> Is there an RFC or document which captures expectations of this form?

Re: [PATCH 2/2] selftests/user_events: Fix abi_test for BE archs

2023-10-04 Thread Shuah Khan


On 10/3/23 18:59, Steven Rostedt wrote:
>
> Note, this doesn't seem to apply to my tree so I only added the first
> patch. I think this needs to go through Shuah's tree.
>
> -- Steve
>

Yes. I sent a fix up for rc4 - I can pull these two patches into
linux-kselftest next

Steve! Does that work for you?

thanks,
-- Shuah

Re: [PATCH 2/2] selftests/user_events: Fix abi_test for BE archs

2023-10-04 Thread Steven Rostedt

On Wed, 4 Oct 2023 09:10:52 -0600
Shuah Khan  wrote:

> On 10/3/23 18:59, Steven Rostedt wrote:
> > 
> > Note, this doesn't seem to apply to my tree so I only added the first
> > patch. I think this needs to go through Shuah's tree.
> > 
> > -- Steve
> > 
> >   
> 
> Yes. I sent a fix up for rc4 - I can pull these two patches into
> linux-kselftest next
> 
> Steve! Does that work for you?
> 

I applied the first patch to my tree, I think the second patch is fine to go
separately through your tree.

-- Steve

Re: [PATCH 0/5] KVM: x86: Fix breakage in KVM_SET_XSAVE's ABI

2023-10-04 Thread Tyler Stachecki

On Wed, Oct 04, 2023 at 07:51:17AM -0700, Sean Christopherson wrote:
> KVM's ABI has no formal notion of guest boot=>shutdown or live migration.  The
> myriad KVM_GET_* APIs allow taking a snapshot of guest state, and the KVM_SET_*
> APIs allow loading a snapshot of guest state.  Live migration is probably the
> most common use of those APIs, but there are other use cases.

I think the lightbulb just clicked, it is really this:

> No, the VM will not have less features, because KVM_SET_XSAVE loads *data*,
> not features [...]

I think I'm conflating the data vs. features aspect here and will have to
revisit my understanding of the code...

> > Ultimately, this problem is not really fixable if said features cannot be
> > removed.

> It's not about removing features.  The change you're asking for is to have KVM
> *silently* drop data.  Aside from the fact that such a change would break KVM's
> ABI, silently ignoring data that userspace has explicitly requested be loaded
> for a vCPU is incredibly dangerous.

Sorry if it came off that way - I fully understand and am resigned to the "you
break it, you keep both halves" nature of what I had initially proposed and
that it is not a generally tractable solution.

That being said, I genuinely appreciate your jump to action on this problem!

Thanks,
Tyler

[GIT PULL] Kselftest fixes update for Linux 6.6-rc5

2023-10-04 Thread Shuah Khan


Hi Linus,

Please pull the following Kselftest fixes update for Linux 6.6-rc5.

This kselftest fixes update for Linux 6.6-rc5 consists of one single
fix to Makefile to fix the incorrect TARGET name for uevent test.

diff is attached.

thanks.
-- Shuah


The following changes since commit 8ed99af4a266a3492d773b5d85c3f8e9f81254b6:

  selftests/user_events: Fix to unmount tracefs when test created mount (2023-09-18 11:04:52 -0600)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux-kselftest-fixes-6.6-rc5

for you to fetch changes up to 6f874fa021dfc7bf37f4f37da3a5aaa41fe9c39c:

  selftests: Fix wrong TARGET in kselftest top level Makefile (2023-09-26 18:47:37 -0600)


linux-kselftest-fixes-6.6-rc5

This kselftest fixes update for Linux 6.6-rc5 consists of one single
fix to Makefile to fix the incorrect TARGET name for uevent test.


Juntong Deng (1):
  selftests: Fix wrong TARGET in kselftest top level Makefile

 tools/testing/selftests/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 42806add0114..1a21d6beebc6 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -92,7 +92,7 @@ endif
 TARGETS += tmpfs
 TARGETS += tpm2
 TARGETS += tty
-TARGETS += uevents
+TARGETS += uevent
 TARGETS += user
 TARGETS += user_events
 TARGETS += vDSO

Re: [PATCH 2/2] selftests/user_events: Fix abi_test for BE archs

2023-10-04 Thread Shuah Khan


On 10/4/23 09:14, Steven Rostedt wrote:
> On Wed, 4 Oct 2023 09:10:52 -0600
> Shuah Khan wrote:
>
>> On 10/3/23 18:59, Steven Rostedt wrote:
>>>
>>> Note, this doesn't seem to apply to my tree so I only added the first
>>> patch. I think this needs to go through Shuah's tree.
>>>
>>> -- Steve
>>
>> Yes. I sent a fix up for rc4 - I can pull these two patches into
>> linux-kselftest next
>>
>> Steve! Does that work for you?
>
> I applied the first patch to my tree, I think the second patch is fine to go
> separately through your tree.

Yes I will apply this to linux-kselftest fixes branch once my PR
clears.

thanks,
-- Shuah

Re: [PATCH v1 05/20] arm64: context switch POR_EL0 register

2023-10-04 Thread Catalin Marinas

On Wed, Sep 27, 2023 at 03:01:08PM +0100, Joey Gouly wrote:
> +static void permission_overlay_switch(struct task_struct *next)
> +{
> + if (alternative_has_cap_unlikely(ARM64_HAS_S1POE)) {
> + current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
> + write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
> + }
> +}

Does this need an ISB or is the POR_EL0 register write
self-synchronising?

-- 
Catalin

Re: [PATCH bpf 0/3] libbpf/selftests syscall wrapper fixes for RISC-V

2023-10-04 Thread Sami Tolvanen

Hi Björn,

On Wed, Oct 4, 2023 at 4:09 AM Björn Töpel  wrote:
>
> From: Björn Töpel 
>
> Commit 08d0ce30e0e4 ("riscv: Implement syscall wrappers") introduced
> some regressions in libbpf, and the kselftests BPF suite, which are
> fixed with these three patches.

This series looks good to me. Thanks for fixing the issues!

Reviewed-by: Sami Tolvanen 

Sami

Re: [PATCH 0/5] KVM: x86: Fix breakage in KVM_SET_XSAVE's ABI

2023-10-04 Thread Sean Christopherson

On Wed, Oct 04, 2023, Tyler Stachecki wrote:
> On Wed, Oct 04, 2023 at 07:51:17AM -0700, Sean Christopherson wrote:

> > It's not about removing features.  The change you're asking for is to have
> > KVM *silently* drop data.  Aside from the fact that such a change would
> > break KVM's ABI, silently ignoring data that userspace has explicitly
> > requested be loaded for a vCPU is incredibly dangerous.
> 
> Sorry if it came off that way

No need to apologise, you got bit by a nasty kernel bug and are trying to find a
solution.  There's nothing wrong with that.

> I fully understand and am resigned to the "you
> break it, you keep both halves" nature of what I had initially proposed and
> that it is not a generally tractable solution.

Yeah, the crux of the matter is that we have no control or even knowledge of who
all is using KVM, with what userspace VMM, on what hardware, etc.  E.g. if this
bug were affecting our fleet and for some reason we couldn't address the problem
in userspace, carrying a hack in KVM in our internal kernel would probably be a
viable option because we can do a proper risk assessment.  E.g. we know and control
exactly what userspace we're running, the underlying hardware in affected pools,
what features are exposed to the guest, etc.  And we could revert the hack once
all affected VMs had been sanitized.

[PATCH 1/2] selftests/mm: export get_free_hugepages()

2023-10-04 Thread Breno Leitao

get_free_hugepages() is helpful for other hugepage tests. Export it to
the common file (vm_util.c) to be reused.

Signed-off-by: Breno Leitao 
---
 tools/testing/selftests/mm/hugetlb-madvise.c | 19 ---
 tools/testing/selftests/mm/vm_util.c | 19 +++
 tools/testing/selftests/mm/vm_util.h |  1 +
 3 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/tools/testing/selftests/mm/hugetlb-madvise.c b/tools/testing/selftests/mm/hugetlb-madvise.c
index d55322df4b73..f32d99565c5e 100644
--- a/tools/testing/selftests/mm/hugetlb-madvise.c
+++ b/tools/testing/selftests/mm/hugetlb-madvise.c
@@ -36,25 +36,6 @@
 unsigned long huge_page_size;
 unsigned long base_page_size;
 
-unsigned long get_free_hugepages(void)
-{
-	unsigned long fhp = 0;
-	char *line = NULL;
-	size_t linelen = 0;
-	FILE *f = fopen("/proc/meminfo", "r");
-
-	if (!f)
-		return fhp;
-	while (getline(&line, &linelen, f) > 0) {
-		if (sscanf(line, "HugePages_Free:  %lu", &fhp) == 1)
-			break;
-	}
-
-	free(line);
-	fclose(f);
-	return fhp;
-}
-
 void write_fault_pages(void *addr, unsigned long nr_pages)
 {
 	unsigned long i;
diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
index 558c9cd8901c..3082b40492dd 100644
--- a/tools/testing/selftests/mm/vm_util.c
+++ b/tools/testing/selftests/mm/vm_util.c
@@ -269,3 +269,22 @@ int uffd_unregister(int uffd, void *addr, uint64_t len)
 
 	return ret;
 }
+
+unsigned long get_free_hugepages(void)
+{
+	unsigned long fhp = 0;
+	char *line = NULL;
+	size_t linelen = 0;
+	FILE *f = fopen("/proc/meminfo", "r");
+
+	if (!f)
+		return fhp;
+	while (getline(&line, &linelen, f) > 0) {
+		if (sscanf(line, "HugePages_Free:  %lu", &fhp) == 1)
+			break;
+	}
+
+	free(line);
+	fclose(f);
+	return fhp;
+}
diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
index c7fa61f0dff8..c02990bbd56f 100644
--- a/tools/testing/selftests/mm/vm_util.h
+++ b/tools/testing/selftests/mm/vm_util.h
@@ -51,6 +51,7 @@ int uffd_register(int uffd, void *addr, uint64_t len,
 int uffd_unregister(int uffd, void *addr, uint64_t len);
 int uffd_register_with_ioctls(int uffd, void *addr, uint64_t len,
  bool miss, bool wp, bool minor, uint64_t *ioctls);
+unsigned long get_free_hugepages(void);
 
 /*
  * On ppc64 this will only work with radix 2M hugepage size
-- 
2.34.1

[PATCH 2/2] selftests/mm: Add a new test for madv and hugetlb

2023-10-04 Thread Breno Leitao

Create a selftest that exercises the conflict between page faults and
madvise(MADV_DONTNEED) in the same huge page. Do it by running two
threads, one that touches the huge page and one that calls
madvise(MADV_DONTNEED) on it, at the same time.

In case of a SIGBUS coming at pagefault, the test should fail, since we
hit the bug.

The test doesn't have a signal handler, so if it fails, it fails like
the following:

  --
  running ./hugetlb_fault_after_madv
  --
  ./run_vmtests.sh: line 186: 595563 Bus error(core dumped) "$@"
  [FAIL]

This selftest goes together with the fix of the bug[1] itself.

[1] https://lore.kernel.org/all/20231001005659.2185316-1-r...@surriel.com/#r

Signed-off-by: Breno Leitao 
---
 tools/testing/selftests/mm/Makefile   |  1 +
 .../selftests/mm/hugetlb_fault_after_madv.c   | 82 +++
 tools/testing/selftests/mm/run_vmtests.sh |  4 +
 3 files changed, 87 insertions(+)
 create mode 100644 tools/testing/selftests/mm/hugetlb_fault_after_madv.c

diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 6a9fc5693145..e71ec9910c62 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -68,6 +68,7 @@ TEST_GEN_FILES += split_huge_page_test
 TEST_GEN_FILES += ksm_tests
 TEST_GEN_FILES += ksm_functional_tests
 TEST_GEN_FILES += mdwe_test
+TEST_GEN_FILES += hugetlb_fault_after_madv
 
 ifneq ($(ARCH),arm64)
 TEST_GEN_PROGS += soft-dirty
diff --git a/tools/testing/selftests/mm/hugetlb_fault_after_madv.c b/tools/testing/selftests/mm/hugetlb_fault_after_madv.c
new file mode 100644
index ..d6d38d443840
--- /dev/null
+++ b/tools/testing/selftests/mm/hugetlb_fault_after_madv.c
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vm_util.h"
+
+#define MMAP_SIZE (1 << 21)
+#define INLOOP_ITER 100
+
+char *huge_ptr;
+
+/* Touch the memory while it is being madvised() */
+void *touch(void *unused)
+{
+	char *ptr = (char *)huge_ptr;
+
+	if (!ptr) {
+		fprintf(stderr, "Failed to allocate memory\n");
+		perror("");
+	}
+
+	for (int i = 0; i < INLOOP_ITER; i++)
+		ptr[0] = '.';
+
+	return NULL;
+}
+
+void *madv(void *unused)
+{
+	usleep(rand() % 10);
+	if (!huge_ptr)
+		return NULL;
+
+	for (int i = 0; i < INLOOP_ITER; i++)
+		madvise(huge_ptr, MMAP_SIZE, MADV_DONTNEED);
+
+	return NULL;
+}
+
+int main(void)
+{
+	unsigned long free_hugepages;
+	pthread_t thread1, thread2;
+	/*
+	 * On kernel 6.4, we are able to reproduce the problem with ~1000
+	 * interactions
+	 */
+	int max = 1;
+
+	srand(getpid());
+
+	free_hugepages = get_free_hugepages();
+	if (free_hugepages != 1) {
+		fprintf(stderr,
+			"This test needs one and only one page to execute. Got %lu\n",
+			free_hugepages);
+		exit(1);
+	}
+
+	while (max--) {
+		huge_ptr = mmap(NULL, MMAP_SIZE, PROT_READ | PROT_WRITE,
+				MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
+
+		if ((unsigned long)huge_ptr == -1) {
+			perror("Failed to allocate\n");
+			continue;
+		}
+
+		pthread_create(&thread1, NULL, madv, NULL);
+		pthread_create(&thread2, NULL, touch, NULL);
+
+		pthread_join(thread1, NULL);
+		pthread_join(thread2, NULL);
+		munmap(huge_ptr, MMAP_SIZE);
+	}
+
+	return 0;
+}
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index 3e2bc818d566..9f53f7318a38 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -221,6 +221,10 @@ CATEGORY="hugetlb" run_test ./hugepage-mremap
 CATEGORY="hugetlb" run_test ./hugepage-vmemmap
 CATEGORY="hugetlb" run_test ./hugetlb-madvise
 
+# For this test, we need one and just one huge page
+echo 1 > /proc/sys/vm/nr_hugepages
+CATEGORY="hugetlb" run_test ./hugetlb_fault_after_madv
+
 if test_selected "hugetlb"; then
echo "NOTE: These hugetlb tests provide minimal coverage.  Use"
echo "  https://github.com/libhugetlbfs/libhugetlbfs.git for"
-- 
2.34.1

Re: [PATCH v2 1/6] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal

2023-10-04 Thread Reinette Chatre

Hi Shaopeng,

On 9/28/2023 1:10 AM, Shaopeng Tan (Fujitsu) wrote:
>> On 9/15/2023 8:44 AM, Ilpo Järvinen wrote:

...

>>> +static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
>>> +{
>>> +	int res;
>>> +
>>> +	ksft_print_msg("Starting MBM BW change ...\n");
>>> +
>>> +	if (test_prepare())
>>> +		return;
>>>
>>
>> I am not sure about this. With this exit the kselftest machinery is not aware
>> of the test passing or failing. I wonder if there should not rather be a
>> "goto" here that triggers ksft_test_result()? This needs some more thought
>> though. First, with this change test_prepare() officially gains responsibility
>> to determine if a failure is transient (just a single test fails) or permanent
>> (no use trying any other tests if this fails). For the former it would then be
>> up to the caller to call ksft_test_result() and for the latter test_prepare()
>> will call ksft_exit_fail_msg().
>> Second, that SNC warning may be an inconvenience with a new goto. Here it
>> may be ok to print that message before the test failure?
> 
> If a failure may be permanent, it may be best to detect it before running all
> tests, rather than in test_prepare().
> Now some detections are completed before running all tests. For example:
>	if (geteuid() != 0)
>		return ksft_exit_skip("Not running as root. Skipping...\n");
>
>	if (!check_resctrlfs_support())
>		return ksft_exit_skip("resctrl FS does not exist. Enable X86_CPU_RESCTRL config option.\n");
>
>	if (umount_resctrlfs())
>		return ksft_exit_skip("resctrl FS unmount failed.\n");
> 

You are correct that the tests should aim to detect as early as possible if
no test has a chance of succeeding. This is covered in the checks you mention.
The purpose of test_prepare()/test_cleanup() pair is to perform actions that
should be done for every test. For example, resctrl is mounted before each
test and unmounted after each test. Since these actions are required to be done
for every test it cannot be a single call before all tests are run.

It may be possible to add a test_prepare() directly followed by a test_cleanup()
before any test is run to be more explicit about early detection but that
does not seem necessary considering the checks would be done anyway when the
first test is run. Even when doing so it would not eliminate the need for 
test_prepare()/test_cleanup() to form part of every test run and needing to exit
if, for example, a previous test triggered a fault preventing resctrl from
being mounted.
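
As a rough illustration of the per-test prepare/cleanup pattern discussed
above, here is a minimal user-space sketch. It is not the resctrl selftest
code: mount_should_fail, mounted, run_one_test() and dummy_body() are
hypothetical stand-ins, and the printf() merely stands in for a call such as
ksft_test_result(). The point it tries to show is that a transient
test_prepare() failure jumps to the reporting step instead of returning
silently, so every test still yields a result.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the real resctrl mount/unmount state. */
static bool mount_should_fail;
static int mounted;

/* Per-test setup; a non-zero return means only this test is affected. */
static int test_prepare(void)
{
	if (mount_should_fail)
		return -1;
	mounted = 1;
	return 0;
}

/* Per-test teardown, paired with a successful test_prepare(). */
static void test_cleanup(void)
{
	mounted = 0;
}

/* Example test body: passes only if the setup really happened. */
static int dummy_body(void)
{
	return mounted ? 0 : -1;
}

/* Returns 0 on pass and 1 on fail; a result line is always printed. */
static int run_one_test(const char *name, int (*body)(void))
{
	int failed = 1;

	if (test_prepare())
		goto report;	/* report the failure, do not return silently */

	failed = body() != 0;
	test_cleanup();
report:
	printf("%s %s\n", failed ? "not ok" : "ok", name);
	return failed;
}
```

A caller would invoke run_one_test() once per test; a permanent failure
(nothing can ever succeed) would instead be caught by the early checks that
run before any test.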

Reinette

Re: [GIT PULL] Kselftest fixes update for Linux 6.6-rc5

2023-10-04 Thread pr-tracker-bot

The pull request you sent on Wed, 4 Oct 2023 09:46:00 -0600:

> git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux-kselftest-fixes-6.6-rc5

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/ba7d997a2a29ee3fa766fee912c65796e0c21903

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

Re: [PATCH-cgroup v2] cgroup/cpuset: Enable invalid to valid local partition transition

2023-10-04 Thread Tejun Heo

On Tue, Oct 03, 2023 at 10:44:20AM -0400, Waiman Long wrote:
> When a local partition becomes invalid, it won't transition back to
> valid partition automatically if a proper "cpuset.cpus.exclusive" or
> "cpuset.cpus" change is made. Instead, system administrators have to
> explicitly echo "root" or "isolated" into the "cpuset.cpus.partition"
> file at the partition root.
> 
> This patch now enables the automatic transition of an invalid local
> partition back to valid when there is a proper "cpuset.cpus.exclusive"
> or "cpuset.cpus" change.
> 
> Automatic transition of an invalid remote partition to a valid one,
> however, is not covered by this patch. They still need an explicit
> write to "cpuset.cpus.partition" to become valid again.
> 
> The test_cpuset_prs.sh test script is updated to add new test cases to
> test this automatic state transition.
> 
> Reported-by: Pierre Gondois 
> Link: https://lore.kernel.org/lkml/9777f0d2-2fdf-41cb-bd01-19c52939e...@arm.com
> Signed-off-by: Waiman Long 

Applied to cgroup/for-6.7.

Thanks.

-- 
tejun

Re: [PATCH bpf 0/3] libbpf/selftests syscall wrapper fixes for RISC-V

2023-10-04 Thread patchwork-bot+netdevbpf

Hello:

This series was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko :

On Wed,  4 Oct 2023 13:09:02 +0200 you wrote:
> From: Björn Töpel 
> 
> Commit 08d0ce30e0e4 ("riscv: Implement syscall wrappers") introduced
> some regressions in libbpf, and the kselftests BPF suite, which are
> fixed with these three patches.
> 
> Note that there's an outstanding fix [1] for ftrace syscall tracing
> which is also a fallout from the commit above.
> 
> [...]

Here is the summary with links:
  - [bpf,1/3] libbpf: Fix syscall access arguments on riscv
https://git.kernel.org/bpf/bpf-next/c/8a412c5c1cd6
  - [bpf,2/3] selftests/bpf: Define SYS_PREFIX for riscv
https://git.kernel.org/bpf/bpf-next/c/0f2692ee4324
  - [bpf,3/3] selftests/bpf: Define SYS_NANOSLEEP_KPROBE_NAME for riscv
https://git.kernel.org/bpf/bpf-next/c/b55b775f0316

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

Re: [PATCH bpf-next 0/3] selftest/bpf, riscv: Improved cross-building support

2023-10-04 Thread patchwork-bot+netdevbpf

Hello:

This series was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko :

On Wed,  4 Oct 2023 14:27:18 +0200 you wrote:
> From: Björn Töpel 
> 
> Yet another "more cross-building support for RISC-V" series.
> 
> An example how to invoke a gen_tar build:
> 
>   | make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- CC=riscv64-linux-gnu-gcc \
>   |HOSTCC=gcc O=/workspace/kbuild FORMAT= \
>   |SKIP_TARGETS="arm64 ia64 powerpc sparc64 x86 sgx" -j $(($(nproc)-1)) \
>   |-C tools/testing/selftests gen_tar
> 
> [...]

Here is the summary with links:
  - [bpf-next,1/3] selftests/bpf: Add cross-build support for urandom_read et al
https://git.kernel.org/bpf/bpf-next/c/97a79e502e25
  - [bpf-next,2/3] selftests/bpf: Enable lld usage for RISC-V
https://git.kernel.org/bpf/bpf-next/c/72fae6319962
  - [bpf-next,3/3] selftests/bpf: Add uprobe_multi to gen_tar target
https://git.kernel.org/bpf/bpf-next/c/e096ab9d9f45

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

[PATCH] kunit: Reset suite counter right before running tests

2023-10-04 Thread Michal Wajdeczko

Today we reset the suite counter as part of the suite cleanup,
called from the module exit callback, but it might not work that
well as one can try to collect results without unloading a previous
test (either unintentionally or due to dependencies).

For easy reproduction try to load the kunit-test.ko and then
collect and parse results from the kunit-example-test.ko load.
Parser will complain about mismatch of expected test number:

[ ] KTAP version 1
[ ] 1..1
[ ] # example: initializing suite
[ ] KTAP version 1
[ ] # Subtest: example
..
[ ] # example: pass:5 fail:0 skip:4 total:9
[ ] # Totals: pass:6 fail:0 skip:6 total:12
[ ] ok 7 example

[ ] [ERROR] Test: example: Expected test number 1 but found 7
[ ] = [PASSED] example =
[ ] 
[ ] Testing complete. Ran 12 tests: passed: 6, skipped: 6, errors: 1

Since we are now printing suite test plan on every module load,
right before running suite tests, we should make sure that suite
counter will also start from 1. Easiest solution seems to be move
counter reset to the __kunit_test_suites_init() function.
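
The numbering problem can be modelled with a small user-space sketch. This
is a toy model under stated assumptions, not the code in lib/kunit/test.c:
suite_counter, suites_init() and suites_exit_old() are hypothetical names,
and the suite counts are illustrative only.

```c
#include <stdio.h>

/* Toy model of a global suite counter that numbers "ok N" result lines. */
static int suite_counter = 1;

/*
 * Number num_suites suites starting from the current counter and return
 * the first number used. With reset_first set, the counter is reset right
 * before running, which is the approach proposed in this patch.
 */
static int suites_init(int num_suites, int reset_first)
{
	int first;

	if (reset_first)
		suite_counter = 1;

	first = suite_counter;
	for (int i = 0; i < num_suites; i++)
		printf("ok %d suite\n", suite_counter++);
	return first;
}

/* Old behaviour: the counter is only reset when a module is unloaded. */
static void suites_exit_old(void)
{
	suite_counter = 1;
}
```

In this model, loading a second module without unloading the first makes
its numbering continue where the previous one stopped, whereas resetting at
init time always starts again from 1.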

Signed-off-by: Michal Wajdeczko 
Cc: David Gow 
Cc: Rae Moar 
---
 lib/kunit/test.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index f2eb71f1a66c..9325d309ed82 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -670,6 +670,8 @@ int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_
 		return 0;
 	}
 
+	kunit_suite_counter = 1;
+
 	static_branch_inc(&kunit_running);
 
 	for (i = 0; i < num_suites; i++) {
@@ -696,8 +698,6 @@ void __kunit_test_suites_exit(struct kunit_suite **suites, int num_suites)
 
 	for (i = 0; i < num_suites; i++)
 		kunit_exit_suite(suites[i]);
-
-	kunit_suite_counter = 1;
 }
 EXPORT_SYMBOL_GPL(__kunit_test_suites_exit);
 
-- 
2.25.1

Re: [PATCH v2 1/2] kunit: add ability to run tests after boot using debugfs

2023-10-04 Thread Rae Moar

On Thu, Sep 14, 2023 at 5:06 AM David Gow  wrote:
>
> On Sat, 9 Sept 2023 at 05:31, Rae Moar  wrote:
> >
> > Add functionality to run built-in tests after boot by writing to a
> > debugfs file.
> >
> > Add a new debugfs file labeled "run" for each test suite to use for
> > this purpose.
> >
> > As an example, write to the file using the following:
> >
> > echo "any string" > /sys/kernel/debugfs/kunit//run
> >
> > This will trigger the test suite to run and will print results to the
> > kernel log.
> >
> > Note that what you "write" to the debugfs file will not be saved.
> >
> > To guard against running tests concurrently with this feature, add a
> > mutex lock around running kunit. This supports the current practice of
> > not allowing tests to be run concurrently on the same kernel.
> >
> > This functionality may not work for all tests.
> >
> > This new functionality could be used to design a parameter
> > injection feature in the future.
> >
> > Signed-off-by: Rae Moar 
> > ---
>
> This is looking pretty good, but I have a few nitpicks below and one big issue.
>
> The big issue is that this doesn't seem to exclude test suites created
> with kunit_test_init_section_suite{,s}(). The init section versions of
> the suite declarations, by definition, won't work if run after the
> kernel has finished booting. At the moment, these macros just pass
> through to the normal versions (because we've not been able to run
> after boot until now), but we'll need to implement it (maybe as a
> separate linker section, maybe as an attribute, etc) now. I expect
> that the correct solution here would be to not create the 'run'
> debugfs file for these tests. But I could be convinced to have it
> exist, but to just say "this test cannot be run after boot" if you've
> got a good argument. In any case, grep 'test.h' for "NOTE TO KUNIT
> DEVS" and you'll see the details.
>
> My one other not-totally-related thought (and this extends to module
> loading, too, so is possibly more useful as a separate patch) is that
> we're continually incrementing the test number still. This doesn't
> matter if we read the results from debugfs though, and it may make
> more sense to have this continue to increment (and thus treat all of
> dmesg as one long KTAP document). We could always add a reset option
> to debugfs in a follow-up patch if we want. But that's not something
> I'd hold this up with.
>

Hello!

Sorry for the delay in this response. I was working on other items but
I have started working on the next version of this patch.

Thanks for bringing my attention to the init tests. I am currently
working on a draft to remove the run files for these tests. However,
if that does not work, I will resort to outputting the message you
detailed above: "this test cannot be run after boot".

I am currently fine with the test number incrementing. However, I
would also be fine to implement a reset to ensure all the re-run
results have the test number of 1.

> >
> > Changes since v1:
> > - Removed second patch as this problem has been fixed
> > - Added Documentation patch
> > - Made changes to work with new dynamically-extending log feature
> >
> > Note that these patches now rely on (and are rebased on) the patch series:
> > https://lore.kernel.org/all/20230828104111.2394344-1...@opensource.cirrus.com/
> >
> >  lib/kunit/debugfs.c | 66 +
> >  lib/kunit/test.c| 13 +
> >  2 files changed, 79 insertions(+)
> >
> > diff --git a/lib/kunit/debugfs.c b/lib/kunit/debugfs.c
> > index 270d185737e6..8c0a970321ce 100644
> > --- a/lib/kunit/debugfs.c
> > +++ b/lib/kunit/debugfs.c
> > @@ -8,12 +8,14 @@
> >  #include 
> >
> >  #include 
> > +#include 
> >
> >  #include "string-stream.h"
> >  #include "debugfs.h"
> >
> >  #define KUNIT_DEBUGFS_ROOT "kunit"
> >  #define KUNIT_DEBUGFS_RESULTS  "results"
> > +#define KUNIT_DEBUGFS_RUN  "run"
> >
> >  /*
> >   * Create a debugfs representation of test suites:
> > @@ -21,6 +23,8 @@
> >   * Path                                Semantics
> >   * /sys/kernel/debug/kunit//results    Show results of last run for
> >   *                                     testsuite
> > + * /sys/kernel/debug/kunit//run        Write to this file to trigger
> > + *                                     testsuite to run
> >   *
> >   */
> >
> > @@ -99,6 +103,51 @@ static int debugfs_results_open(struct inode *inode, struct file *file)
> > return single_open(file, debugfs_print_results, suite);
> >  }
> >
> > +/*
> > + * Print a usage message to the debugfs "run" file
> > + * (/sys/kernel/debug/kunit//run) if opened.
> > + */
> > +static int debugfs_print_run(struct seq_file *seq, void *v)
> > +{
> > +   struct kunit_suite *suite = (struct kunit_suite *)seq->private;
> > +
> > +   seq_puts(seq, "Write to this file to trigger the test suite to run.\n");
> > +   seq_printf(seq, "usage: echo \"any stri

Re: [PATCH v2 2/2] Documentation: Add debugfs docs with run after boot

2023-10-04 Thread Rae Moar

On Thu, Sep 14, 2023 at 5:06 AM David Gow  wrote:
>
> On Sat, 9 Sept 2023 at 05:32, Rae Moar  wrote:
> >
> > Expand the documentation on the KUnit debugfs filesystem on the
> > run_manual.rst page.
> >
> > Add section describing how to access results using debugfs.
> >
> > Add section describing how to run tests after boot using debugfs.
> >
> > Signed-off-by: Rae Moar 
> > Co-developed-by: Sadiya Kazi 
> > Signed-off-by: Sadiya Kazi 
> > ---
>
> Looks good to me, a few nitpicks, and the fact that we'll probably
> need to add something about init section suites when those are
> implemented.
>
> (Also, since you sent the email, your sign off should be at the bottom
> of the list above.)

Hello!

Thanks for the comments! Sorry about the Signed-off order. I will
change that for next time.

>
> >  Documentation/dev-tools/kunit/run_manual.rst | 45 ++--
> >  1 file changed, 41 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/dev-tools/kunit/run_manual.rst b/Documentation/dev-tools/kunit/run_manual.rst
> > index e7b46421f247..613385c5ba5b 100644
> > --- a/Documentation/dev-tools/kunit/run_manual.rst
> > +++ b/Documentation/dev-tools/kunit/run_manual.rst
> > @@ -49,9 +49,46 @@ loaded.
> >
> >  The results will appear in TAP format in ``dmesg``.
> >
> > +debugfs
> > +=======
> > +
> > +``debugfs`` is a file system that enables user interaction with the files to
> > +make kernel information available to user space. A user can interact with
> > +the debugfs filesystem using a variety of file operations, such as open,
> > +read, and write.
> > +
> > +By default, only the root user has access to the debugfs directory.
>
> These two paragraphs are probably a bit excessive: we want to focus on
> what KUnit can do with debugfs, not describing what debugfs is as a
> whole (which is best left to pages like
> Documentation/filesystems/debugfs.rst )

Got it. Maybe I should just leave the first sentence and then link to
../debugfs.rst.

>
> > +
> > +If ``CONFIG_KUNIT_DEBUGFS`` is enabled, you can use KUnit debugfs
> > +filesystem to perform the following actions.
> > +
> > +Retrieve Test Results
> > +=====================
> > +
> > +You can use debugfs to retrieve KUnit test results. The test results are
> > +accessible from the debugfs filesystem in the following read-only file:
> > +
> > +.. code-block :: bash
> > +
> > +   /sys/kernel/debug/kunit//results
> > +
> > +The test results are available in KTAP format.
>
> We _could_ mention that this is a separate KTAP document (i.e., the
> numbering starts at 1), though it may be obvious.
>
> > +
> > +Run Tests After Kernel Has Booted
> > +=================================
> > +
> > +You can use the debugfs filesystem to trigger built-in tests to run after
> > +boot. To run the test suite, you can use the following command to write to
> > +the ``/sys/kernel/debug/kunit//run`` file:
> > +
> > +.. code-block :: bash
> > +
> > +   echo "any string" > /sys/kernel/debugfs/kunit//run
> > +
> > +As a result, the test suite runs and the results are printed to the kernel
> > +log.
> > +
> >  .. note ::
> >
> > -   If ``CONFIG_KUNIT_DEBUGFS`` is enabled, KUnit test results will
> > -   be accessible from the ``debugfs`` filesystem (if mounted).
> > -   They will be in ``/sys/kernel/debug/kunit//results``, in
> > -   TAP format.
> > +   The contents written to the debugfs file
> > +   ``/sys/kernel/debug/kunit//run`` are not saved.
>
> This is possibly a bit obvious. Maybe it'd be more useful with a bit
> more context, e.g., "The contents written to the file ... are
> discarded; it is the act of writing which triggers the test, not the
> specific contents written."?

I will try to add more context here in the next version.
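
To illustrate the "the act of writing is the trigger" point, a minimal
user-space sketch follows. trigger_run() is a hypothetical helper, not part
of KUnit, and the path is left to the caller. Per the patch description any
string should work, so treating a single byte as sufficient is an
assumption.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Write one arbitrary byte to the file at path.
 * Returns 0 if the byte was written, -1 on any error.
 * For a KUnit "run" file the contents would be discarded;
 * only the write itself matters.
 */
static int trigger_run(const char *path)
{
	int fd = open(path, O_WRONLY);
	ssize_t n;

	if (fd < 0)
		return -1;

	n = write(fd, "1", 1);
	close(fd);
	return n == 1 ? 0 : -1;
}
```

A caller would pass the suite's run file under the KUnit debugfs directory
and treat a -1 return as "could not trigger", for example when debugfs is
not mounted or the caller is not root.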

>
> It might be worth having a note that tests cannot run concurrently, so
> this may block or fail.
>
> Equally, it may be worth having a note for test authors, that their
> tests will need to correctly initialise and/or clean up any data, so
> the test runs correctly a second time.
>

Yes these are two good points. I will add notes on tests not being
able to run concurrently, cleaning up data, and also init tests.

>
> > --
> > 2.42.0.283.g2d96d420d3-goog
> >

Re: [PATCH 2/2] selftests/mm: Add a new test for madv and hugetlb

2023-10-04 Thread Rik van Riel

On Wed, 2023-10-04 at 10:11 -0700, Breno Leitao wrote:
> 
> +char *huge_ptr;
> +
> +/* Touch the memory while it is being madvised() */
> +void *touch(void *unused)
> +{
> +   char *ptr = (char *)huge_ptr;
> +
> +   if (!ptr) {
> +   fprintf(stderr, "Failed to allocate memory\n");
> +   perror("");
> +   }

I'm not sure this error message makes a lot of sense
away from where the huge page gets allocated.

> 
> +   while (max--) {
> +   huge_ptr = mmap(NULL, MMAP_SIZE, PROT_READ |
> PROT_WRITE,
> +   MAP_PRIVATE | MAP_ANONYMOUS |
> MAP_HUGETLB, -1, 0);
> +
> +   if ((unsigned long)huge_ptr == -1) {
> +   perror("Failed to allocate\n");
> +   continue;
> +   }

Should the test case just exit with an error here, when
the allocation fails?

Looping around when it cannot get memory seems pointless,
but telling the user that the allocation fails, when it
should clearly have succeeded could be useful.
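
A minimal sketch of that suggestion, assuming a hypothetical helper named
map_anon() (it is not part of the patch): it reports the mmap() failure
where it happens and compares against MAP_FAILED, so the caller can bail
out instead of looping.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for MAP_ANONYMOUS with strict -std= modes */
#endif
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/*
 * Map len bytes of private anonymous memory, OR-ing extra_flags (for
 * example MAP_HUGETLB) into the mmap() flags. On failure, print the
 * reason at the allocation site and return NULL.
 */
static void *map_anon(size_t len, int extra_flags)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | extra_flags, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return NULL;
	}
	return p;
}
```

In the test's main loop, the caller could then exit with a non-zero status
when map_anon(MMAP_SIZE, MAP_HUGETLB) returns NULL, rather than continuing
to the next iteration.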

This test case certainly seems to do the trick in showing
whether the race between MADV_DONTNEED and page faults
exists in a particular kernel.

-- 
All Rights Reversed.

Re: [PATCH 0/5] KVM: x86: Fix breakage in KVM_SET_XSAVE's ABI

2023-10-04 Thread Sean Christopherson

On Wed, 27 Sep 2023 17:19:51 -0700, Sean Christopherson wrote:
> Rework how KVM limits guest-unsupported xfeatures to effectively hide
> only when saving state for userspace (KVM_GET_XSAVE), i.e. to let userspace
> load all host-supported xfeatures (via KVM_SET_XSAVE) irrespective of
> what features have been exposed to the guest.
> 
> The effect on KVM_SET_XSAVE was knowingly done by commit ad856280ddea
> ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0"):
> 
> [...]

Applied to kvm-x86 fpu, even though there is still ongoing discussion.  I want
to get this exposure in -next sooner than later.  I'll keep this in its own
branch so it'll be easier to rewrite/discard if necessary.

[1/5] x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer
  https://github.com/kvm-x86/linux/commit/2d287ec65e79
[2/5] KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
  https://github.com/kvm-x86/linux/commit/27526efb5cff
[3/5] KVM: selftests: Touch relevant XSAVE state in guest for state test
  https://github.com/kvm-x86/linux/commit/ff0654c71fb6
[4/5] KVM: selftests: Load XSAVE state into untouched vCPU during state test
  https://github.com/kvm-x86/linux/commit/d7b8762ec4a3
[5/5] KVM: selftests: Force load all supported XSAVE state in state test
  https://github.com/kvm-x86/linux/commit/afb2c7e27a7f

--
https://github.com/kvm-x86/linux/tree/next
