Hi,

here a new version of my proposal for fuzzing the hypervisor. The original
can be found here: [1].

==================================
1. Motivation and Description
==================================
Fuzzing is a recent trend for systematic testing of interfaces by trying
more or less random inputs and iterating over them. A subset of fuzzers
uses code-coverage as feedback when permuting and choosing inputs, among
them the popular user-space fuzzer American Fuzzy Lop. Recently there have
been attempts to port fuzzers to the kernel and in a similar manner should
now the hypercall-interface of Xen be tested.

While this is overall a very comprehensive problem, this project will help
to develop a better understanding of the problem space and make at least
first advances of the source tree into the necessary direction. A generic
mechanism will be implemented allowing fuzzers to obtain feedback on
code-coverage. In the next step this output will be further processed in
order to actually run a fuzzer (in particular AFL), although there might
not be sufficient time to commit this to the source tree.

==================================
2. Fuzzing user-space programs vs. fuzzing a hypervisor
==================================
An iteration in the fuzzing-cycle normally involves the following:

1. Fuzzer creates new test case based on tracing output
2. Fuzzer starts binary with a test case
3. Test case is executed and program counters are traced into shared memory
with fuzzer
4. Repeat from 1.

Being not a user-space program, it is however not possible to instrument
the binary using the methods applied by AFL. Instead, program counters need
to be extracted manually and post-processed into the form expected by AFL.
As Xen doesn't just take binary input there further needs to be some other
format for test cases and executing them.

In order to fuzz the hypervisor, at least the following steps are required:

1. Let the fuzzer create a new test case
2. Drive a domU to execute the test cases of the fuzzer and trace execution
path
3. Parse the execution path into a format consumable by a user-space fuzzer
4. Repeat from 1.

In comparison to AFL, the notion of a test case when fuzzing the hypervisor
is different. In AFL, the program is restarted for every test case (or at
least the initial state is restored using the fork-server), while in Xen,
all hypercalls are made on the same hypervisor (at least until it crashes),
such that even though AFL might run multiple times, the test case would be
the collection of all hypercalls made since booting.

Also, domains can only have a single vCPU as determinism is needed in
tracing output. For the same reason, there will initially also only be a
single AFL-instance run at a time. However, this can easily be extended
later on.

==================================
3. Implementation Plan
==================================

==================================
3.1 Tracing
==================================
The tracing has mostly already been implemented in [2] and has been
discussed in the original version of this proposal [1].

The approach is similar to KCOV, a coverage collection tool in the Linux
kernel and is well-summarised in the following description from the
original patch [3]:

kcov does not aim to collect as much coverage as possible. It aims
to collect more or less stable coverage that is function of syscall
inputs. To achieve this goal it does not collect coverage in
soft/hard interrupts and instrumentation of some inherently
non-deterministic or non-interesting parts of kernel is disbled
(e.g. scheduler, locking).

Some adjustments in regard to what to include are still to be made based on
more extensive testing. Originally supposed to happen on a directory-level,
that turned out to be not granular enough, sucht that coverage was instead
restricted to certain files in the common/-directory.

==================================
3.2 Distinction between executor and fuzzer
==================================
Xen doesn't take any binary input directly, so in order to fuzz the
hypercall-interface a middle-program needs to created, fed with the test
cases produced by the fuzzer and making hypercalls. This is done by a
Xen-guest. In order to increase determinism (by not making hypercalls that
weren't explicitly instructed) and speed (by being lean) XTF was chosen. If
a little more features were needed Mini-OS could be chosen, but for now
this isn't necessary.

As mentioned, test cases are not actually separated in this setup. It would
thus not be helpful to actually have separate XTF files for every iteration
of AFL. Further, it would reduce speed by forcing recompilation in every
iteration. AFL would have troubles with producing valid C syntax anyways,
so this approach isn't taken. Instead, the executor is a server, waiting
for input from the fuzzer about which hypercalls to make with which
arguments, parsing a custom syntax.

==================================
3.3 Fuzzer
==================================
The idea is to create some dictionary from which the fuzzer builds test
cases, each line looking like :

<function name> arguments

Also, the test cases need to be written to a file before actually being
applied in order to survive a crash. As pointed out by Andy, the file
system should have caching disabled.

The binary of the fuzzer also needs to be adjusted in order to accommodate
the fact that not the usual binary is instrumented. The particular code
here for AFL is all found in afl-fuzz.c. The function run_target() is
normally responsible for starting the binary, but instead will have to
communicate with the executor. The buffer trace_bits has to be filled with
the binary tracing output.

The fuzzer will initially be run in deterministic mode.

==================================
3.4 Executor
==================================
The executor will then try to read out hypercalls and execute them. It will
also need some encoded information about the arguments expected for each
hypercall, such that it can pass on valid buffers (if it wants to, one
could also imagine that a test case is to pass in completely invalid data).

==================================
3.5 Communication between executor and fuzzer
==================================
Fuzzer and executor need to communicate at least test cases and tracing
output. For this it was considered to have a shared memory region as in the
original AFL for the tracing output (using grant tables) and to execute the
test case from a file. Signals could be used via event-channels to inform
the executor of the arrival of a new test case and the fuzzer of an ended
test case, allowing a nice event-driven approach without polling. As XTF
doesn't support any of this, this idea was overthrown in favour of just
using the Xen console, which still should allow both.

==================================
4 Flow diagram
==================================

+-----------------------------------------+
            +------------------------------------+

|                                         |
            |                                    |
|               Dom X                     |            +--------+
            |               Dom Y                |
|                                         |            |  Test  |
            |                                    |
|  +-----------------------------------+  | +--------->+ cases
+-------------->   | +-------------------------------+  |
|  |                                   |  |            | SAVED  |
            | |                               |  |
|  |                                   |  |            +--------+
            | |                               |  |
|  |                                   |  |
            | |                               |  |
|  |              Fuzzer               |  |
            | |         Executor              |  |
|  |                                   |  |
            | |                               |  |
|  |                                   |  |
            | |                               |  |
|  |                                   |  |
            | |                               |  |
|  |                                   |  |
            | |                               |  |
|  +-----------------------------------+  |
<----------------------------------+   |
+-----+----------+--------------+  |
|                                         |
            |       ^          |                 |
|                                         |            Tracing
(binary)            |Tracing|          |                 |
|                                         |
            |(program      Hypercalls            |
|                                         |
            |counters)         |                 |
+-----------------------------------------+
            +-------+----------------------------+

                    |          |
+------------------------------------------------------------------------------------------+----------v-----------------+
|
                                                 |
|
                                                 |
|                                                 Xen Hypervisor
                                                 |
|
                                                 |
|
                                                 |
+-----------------------------------------------------------------------------------------------------------------------+

==================================
5. References
==================================
[1] Link to previous version of proposal:
https://lists.xen.org/archives/html/xen-devel/2017-05/threads.html#02210
[2] Patch for tracing:
https://lists.xen.org/archives/html/xen-devel/2017-06/threads.html#02301
[3] KCOV patch for Linux kernel: https://lwn.net/Articles/671640/
[4] Link to GSoC page of project: https://summerofcode.
withgoogle.com/projects/#5585891117498368
[5] Link to originally suggested topic: https://wiki.xenproject.org/
wiki/Outreach_Program_Projects#Fuzzing_Xen_hypercall_interface


Any comments appreciated,

Felix
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Reply via email to