On 15.03.2025 01:36, Volodymyr Babchuk wrote:
> LibAFL, which is a part of AFL++ project, is an instrument that allows
> us to perform fuzzing on bare-metal code (Xen hypervisor in this case)
> using QEMU as an emulator. It employs QEMU's ability to create
> snapshots to run many tests relatively quickly: system state is saved
> right before executing a new test and restored after the test is
> finished.
>
> This patch adds all necessary plumbing to run aarch64 build of Xen
> inside that LibAFL-QEMU fuzzer. From the Xen perspective we need to
> do following things:
>
> 1. Able to communicate with LibAFL-QEMU fuzzer. This is done by
>    executing special opcodes, that only LibAFL-QEMU can handle.
>
> 2. Use interface from p.1 to tell the fuzzer about code Xen section,
>    so fuzzer knows which part of code to track and gather coverage data.
>
> 3. Report fuzzer about crash. This is done in panic() function.
>
> 4. Prevent test harness from shooting itself in knee.
>
> Right now test harness is an external component, because we want to
> test external Xen interfaces, but it is possible to fuzz internal code
> if we want to.
>
> Test harness is implemented as XTF-based test-case(s). As test harness
> can issue hypercall that shuts itself down, KConfig option
> CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING was added. It basically tells
> fuzzer that test was completed successfully if Dom0 tries to shut
> itself (or the whole machine) down.
>
> Signed-off-by: Volodymyr Babchuk <volodymyr_babc...@epam.com>
>
> ---
>
> I tried to fuzz the vGIC emulator and hypercall interface. While vGIC
> fuzzing didn't yield any interesting results, hypercall fuzzing found a
> way to crash the hypervisor from Dom0 on aarch64, using
> "XEN_SYSCTL_page_offline_op" with "sysctl_query_page_offline" sub-op,
> because it leads to page_is_ram_type() call which is marked
> UNREACHABLE on ARM.
>
> In v2:
>
>  - Moved to XTF-based test harness
>  - Severely reworked the fuzzer itself. Now it has user-friendly
>    command-line interface and is capable of running in CI, as it now
>    returns an appropriate error code if any faults were found
>  - Also I found, debugged and fixed a nasty bug in LibAFL-QEMU fork,
>    which crashed the whole fuzzer.
>
> Right now the fuzzer is located at Xen Troops repo:
>
> https://github.com/xen-troops/xen-fuzzer-rs
>
> But I believe that it is ready to be included into
> gitlab.com/xen-project/
>
> XTF-based harness is at
>
> https://gitlab.com/vlad.babchuk/xtf/-/tree/mr_libafl
>
> and there is corresponding MR for including it into
>
> https://gitlab.com/xen-project/fusa/xtf/-/tree/xtf-arm
>
> So, to sum up. All components are basically ready for initial
> inclusion. There will be smaller, integration-related changes
> later. For example - we will need to update URLs for various
> components after they are moved to correct places.
> ---
>  docs/hypervisor-guide/fuzzing.rst           |  90 ++++++++++++
>  xen/arch/arm/Kconfig.debug                  |  26 ++++
>  xen/arch/arm/Makefile                       |   1 +
>  xen/arch/arm/include/asm/libafl_qemu.h      |  54 +++++++
>  xen/arch/arm/include/asm/libafl_qemu_defs.h |  37 +++++
>  xen/arch/arm/libafl_qemu.c                  | 152 ++++++++++++++++++++
>  xen/arch/arm/psci.c                         |  13 ++
>  xen/common/sched/core.c                     |  17 +++
>  xen/common/shutdown.c                       |   7 +
>  xen/drivers/char/console.c                  |   8 ++
>  10 files changed, 405 insertions(+)
>  create mode 100644 docs/hypervisor-guide/fuzzing.rst
>  create mode 100644 xen/arch/arm/include/asm/libafl_qemu.h
>  create mode 100644 xen/arch/arm/include/asm/libafl_qemu_defs.h
>  create mode 100644 xen/arch/arm/libafl_qemu.c

This looks to be about Arm only, which would be nice if that was visible
right from the subject. Also, nit: New files' names are to use dashes in
favor of underscores.

> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -47,6 +47,10 @@
>  #define pv_shim false
>  #endif
>
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
> +
>  /* opt_sched: scheduler - default to configured value */
>  static char __initdata opt_sched[10] = CONFIG_SCHED_DEFAULT;
>  string_param("sched", opt_sched);
> @@ -1452,6 +1456,10 @@ static long do_poll(const struct sched_poll *sched_poll)
>      if ( !guest_handle_okay(sched_poll->ports, sched_poll->nr_ports) )
>          return -EFAULT;
>
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> +
>      set_bit(_VPF_blocked, &v->pause_flags);
>      v->poll_evtchn = -1;
>      set_bit(v->vcpu_id, d->poll_mask);
> @@ -1904,12 +1912,18 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>      {
>      case SCHEDOP_yield:
>      {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          ret = vcpu_yield();
>          break;
>      }
>
>      case SCHEDOP_block:
>      {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          vcpu_block_enable_events();
>          break;
>      }
> @@ -1924,6 +1938,9 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>
>          TRACE_TIME(TRC_SCHED_SHUTDOWN, current->domain->domain_id,
>                     current->vcpu_id, sched_shutdown.reason);
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          ret = domain_shutdown(current->domain, (u8)sched_shutdown.reason);
>
>          break;

If I was a scheduler maintainer, I'd likely object to this kind of
#ifdef-ary.
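
One possible way to keep the call sites free of #ifdef-s, purely as a
sketch and not something the patch itself proposes: confine the
conditional to a single inline wrapper that collapses to an empty stub
when the option is off. The name fuzzer_pass_blocking() is hypothetical,
and libafl_qemu_end() together with its status values is mocked here so
that the fragment builds outside of Xen; the real definitions are the
ones in the patch's headers.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Normally generated by Kconfig; defined here only so that the example
 * exercises the "enabled" path.
 */
#define CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING 1

/* Counts notifications, so the effect of the wrapper can be observed. */
static int fuzzer_notifications;

#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING

/* Illustrative values; the real ones come from the patch's headers. */
enum { LIBAFL_QEMU_END_OK, LIBAFL_QEMU_END_CRASH };

/* Mock of libafl_qemu_end(); the real one signals LibAFL-QEMU. */
static void libafl_qemu_end(int status)
{
    assert(status == LIBAFL_QEMU_END_OK || status == LIBAFL_QEMU_END_CRASH);
    fuzzer_notifications++;
    printf("fuzzer: end of test, status %d\n", status);
}

/* Hypothetical wrapper: the only place that needs the #ifdef. */
static inline void fuzzer_pass_blocking(void)
{
    libafl_qemu_end(LIBAFL_QEMU_END_OK);
}

#else

/* Stub: compiles to nothing when the option is off. */
static inline void fuzzer_pass_blocking(void) {}

#endif

/* Stand-in for a call site such as the SCHEDOP_yield case. */
static int sched_yield_example(void)
{
    fuzzer_pass_blocking();
    return 0;
}
```

With such a wrapper in a header, each of the hunks above would shrink to
a single unconditional call, and the header could be included without a
surrounding #ifdef as well.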

> --- a/xen/common/shutdown.c
> +++ b/xen/common/shutdown.c
> @@ -11,6 +11,10 @@
>  #include <xen/kexec.h>
>  #include <public/sched.h>
>
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
> +
>  /* opt_noreboot: If true, machine will need manual reset on error. */
>  bool __ro_after_init opt_noreboot;
>  boolean_param("noreboot", opt_noreboot);
> @@ -32,6 +36,9 @@ static void noreturn reboot_or_halt(void)
>
>  void hwdom_shutdown(unsigned char reason)
>  {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>      switch ( reason )
>      {
>      case SHUTDOWN_poweroff:

It's not as bad here and ...

> --- a/xen/drivers/char/console.c
> +++ b/xen/drivers/char/console.c
> @@ -40,6 +40,9 @@
>  #ifdef CONFIG_SBSA_VUART_CONSOLE
>  #include <asm/vpl011.h>
>  #endif
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
>
>  /* console: comma-separated list of console outputs. */
>  static char __initdata opt_console[30] = OPT_CONSOLE_STR;
> @@ -1289,6 +1292,11 @@ void panic(const char *fmt, ...)
>
>      kexec_crash(CRASHREASON_PANIC);
>
> +    #ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +    /* Tell the fuzzer that we crashed */
> +    libafl_qemu_end(LIBAFL_QEMU_END_CRASH);
> +    #endif

... here, but still. Also, pre-processor directives want their # to live
at the beginning of the line.

Jan