On 15.03.2025 01:36, Volodymyr Babchuk wrote:
> LibAFL, which is a part of AFL++ project is a instrument that allows
> us to perform fuzzing on beremetal code (Xen hypervisor in this case)
> using QEMU as an emulator. It employs QEMU's ability to create
> snapshots to run many tests relatively quickly: system state is saved
> right before executing a new test and restored after the test is
> finished.
> 
> This patch adds all necessary plumbing to run aarch64 build of Xen
> inside that LibAFL-QEMU fuzzer. From the Xen perspective we need to
> do following things:
> 
> 1. Able to communicate with LibAFL-QEMU fuzzer. This is done by
> executing special opcodes, that only LibAFL-QEMU can handle.
> 
> 2. Use interface from p.1 to tell the fuzzer about code Xen section,
> so fuzzer know which part of code to track and gather coverage data.
> 
> 3. Report fuzzer about crash. This is done in panic() function.
> 
> 4. Prevent test harness from shooting itself in knee.
> 
> Right now test harness is an external component, because we want to
> test external Xen interfaces, but it is possible to fuzz internal code
> if we want to.
> 
> Test harness is implemented XTF-based test-case(s). As test harness
> can issue hypercall that shuts itself down, KConfig option
> CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING was added. It basically tells
> fuzzer that test was completed successfully if Dom0 tries to shut
> itself (or the whole machine) down.
> 
> Signed-off-by: Volodymyr Babchuk <volodymyr_babc...@epam.com>
> 
> ---
> 
> I tried to fuzz the vGIC emulator and hypercall interface. While vGIC
> fuzzing didn't yield any interesting results, hypercall fuzzing found a
> way to crash the hypervisor from Dom0 on aarch64, using
> "XEN_SYSCTL_page_offline_op" with "sysctl_query_page_offline" sub-op,
> because it leads to page_is_ram_type() call which is marked
> UNREACHABLE on ARM.
> 
> In v2:
> 
>  - Moved to XTF-based test harness
>  - Severely reworked the fuzzer itself. Now it has user-friendly
>    command-line interface and is capable of running in CI, as it now
>    returns an appropriate error code if any faults were found
>  - Also I found, debugged and fixed a nasty bug in LibAFL-QEMU fork,
>    which crashed the whole fuzzer.
> 
> Right now the fuzzer is lockated at Xen Troops repo:
> 
> https://github.com/xen-troops/xen-fuzzer-rs
> 
> But I believe that it is ready to be included into
> gitlab.com/xen-project/
> 
> XTF-based harness is at
> 
> https://gitlab.com/vlad.babchuk/xtf/-/tree/mr_libafl
> 
> and there is corresponding MR for including it into
> 
> https://gitlab.com/xen-project/fusa/xtf/-/tree/xtf-arm
> 
> So, to sum up. All components are basically ready for initial
> inclusion. There will be smaller, integration-related changes
> later. For example - we will need to update URLs for various
> components after they are moved to correct places.
> ---
>  docs/hypervisor-guide/fuzzing.rst           |  90 ++++++++++++
>  xen/arch/arm/Kconfig.debug                  |  26 ++++
>  xen/arch/arm/Makefile                       |   1 +
>  xen/arch/arm/include/asm/libafl_qemu.h      |  54 +++++++
>  xen/arch/arm/include/asm/libafl_qemu_defs.h |  37 +++++
>  xen/arch/arm/libafl_qemu.c                  | 152 ++++++++++++++++++++
>  xen/arch/arm/psci.c                         |  13 ++
>  xen/common/sched/core.c                     |  17 +++
>  xen/common/shutdown.c                       |   7 +
>  xen/drivers/char/console.c                  |   8 ++
>  10 files changed, 405 insertions(+)
>  create mode 100644 docs/hypervisor-guide/fuzzing.rst
>  create mode 100644 xen/arch/arm/include/asm/libafl_qemu.h
>  create mode 100644 xen/arch/arm/include/asm/libafl_qemu_defs.h
>  create mode 100644 xen/arch/arm/libafl_qemu.c

This looks to be about Arm only, which would be nice if that was visible
right from the subject.

Also, nit: New files' names are to use dashes in favor of underscores.

> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -47,6 +47,10 @@
>  #define pv_shim false
>  #endif
>  
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
> +
>  /* opt_sched: scheduler - default to configured value */
>  static char __initdata opt_sched[10] = CONFIG_SCHED_DEFAULT;
>  string_param("sched", opt_sched);
> @@ -1452,6 +1456,10 @@ static long do_poll(const struct sched_poll 
> *sched_poll)
>      if ( !guest_handle_okay(sched_poll->ports, sched_poll->nr_ports) )
>          return -EFAULT;
>  
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> +
>      set_bit(_VPF_blocked, &v->pause_flags);
>      v->poll_evtchn = -1;
>      set_bit(v->vcpu_id, d->poll_mask);
> @@ -1904,12 +1912,18 @@ ret_t do_sched_op(int cmd, 
> XEN_GUEST_HANDLE_PARAM(void) arg)
>      {
>      case SCHEDOP_yield:
>      {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          ret = vcpu_yield();
>          break;
>      }
>  
>      case SCHEDOP_block:
>      {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          vcpu_block_enable_events();
>          break;
>      }
> @@ -1924,6 +1938,9 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) 
> arg)
>  
>          TRACE_TIME(TRC_SCHED_SHUTDOWN, current->domain->domain_id,
>                     current->vcpu_id, sched_shutdown.reason);
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          ret = domain_shutdown(current->domain, (u8)sched_shutdown.reason);
>  
>          break;

If I was a scheduler maintainer, I'd likely object to this kind of #ifdef-ary.

> --- a/xen/common/shutdown.c
> +++ b/xen/common/shutdown.c
> @@ -11,6 +11,10 @@
>  #include <xen/kexec.h>
>  #include <public/sched.h>
>  
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
> +
>  /* opt_noreboot: If true, machine will need manual reset on error. */
>  bool __ro_after_init opt_noreboot;
>  boolean_param("noreboot", opt_noreboot);
> @@ -32,6 +36,9 @@ static void noreturn reboot_or_halt(void)
>  
>  void hwdom_shutdown(unsigned char reason)
>  {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>      switch ( reason )
>      {
>      case SHUTDOWN_poweroff:

It's not as bad here and ...

> --- a/xen/drivers/char/console.c
> +++ b/xen/drivers/char/console.c
> @@ -40,6 +40,9 @@
>  #ifdef CONFIG_SBSA_VUART_CONSOLE
>  #include <asm/vpl011.h>
>  #endif
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
>  
>  /* console: comma-separated list of console outputs. */
>  static char __initdata opt_console[30] = OPT_CONSOLE_STR;
> @@ -1289,6 +1292,11 @@ void panic(const char *fmt, ...)
>  
>      kexec_crash(CRASHREASON_PANIC);
>  
> +    #ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +    /* Tell the fuzzer that we crashed */
> +    libafl_qemu_end(LIBAFL_QEMU_END_CRASH);
> +    #endif

... here, but still.

Also, pre-processor directives want their # to live at the beginning of the
line.

Jan

Reply via email to