Hi,

I'm Felix Schmoll, one of the GSoC students this year. Go Xen!

To get started, I am posting an implementation proposal for the first part
of the project for comments.

==================================
1. Motivation and Description
==================================
Fuzzing is a recent trend in the systematic testing of interfaces: more or
less random inputs are tried and then iterated on. A subset of fuzzers
uses code coverage as feedback when mutating and choosing inputs, among
them the popular user-space fuzzer American Fuzzy Lop (AFL). Recently there
have been attempts to port fuzzers to the kernel, and the hypercall
interface of Xen should now be tested in a similar manner.

While this is a very broad problem overall, this project will help to
develop a better understanding of the problem space and take at least the
first steps in the source tree in the necessary direction. A generic
mechanism will be implemented that allows fuzzers to obtain feedback on
code coverage. In a next step this output will be processed further in
order to actually run a particular fuzzer (such as AFL), although there
might not be sufficient time to commit this to the source tree.

To sum up, the overall steps to getting a fuzzer running are the following:

1. Extract the execution path from the hypervisor via a hypercall
2. Parse the execution path into a format consumable by a user-space fuzzer
3. Drive a domU to execute the test cases of the fuzzer

This proposal is only concerned with how to extract the execution path.
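
Although step 2 is outside the scope of this proposal, the following rough
sketch shows one way a user-space tool might fold a PC trace into an
AFL-style coverage map. The update rule (xor with the shifted previous
location) and the 64 KiB map size follow AFL's documented scheme. The hash
function and the names pc_to_slot and trace_to_bitmap are assumptions made
for illustration only, since AFL itself uses compile-time random IDs rather
than hashed PCs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_SIZE (1u << 16) /* 64 KiB, the default map size in AFL */

/* Hypothetical mixing function: any hash that spreads PCs over the map
 * would do. Returns a slot in the range [0, MAP_SIZE). */
uint32_t pc_to_slot(uintptr_t pc)
{
    uint64_t x = (uint64_t)pc;
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    return (uint32_t)(x & (MAP_SIZE - 1));
}

/* Fold a sequence of n program counters into an AFL-style map. Each
 * counter is 8 bits wide and wraps on overflow, as in AFL. */
void trace_to_bitmap(const uintptr_t *trace, size_t n, uint8_t *map)
{
    uint32_t prev = 0;

    assert(trace != NULL || n == 0);
    assert(map != NULL);

    for (size_t i = 0; i < n; i++) {
        uint32_t cur = pc_to_slot(trace[i]);

        map[cur ^ prev]++; /* one counter per (previous, current) pair */
        prev = cur >> 1;   /* shift so that A->B and B->A differ */
    }
}
```

The caller is expected to pass a zeroed map of MAP_SIZE bytes for each test
case, so that maps from different runs can be compared directly.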

==================================
2. Implementation Plan
==================================

==================================
2.1 Tracing
==================================
The -fsanitize-coverage=trace-pc feature available since gcc 6 will be the
foundation for the tracing needed by the hypercall. It inserts a call to a
user-defined function, __sanitizer_cov_trace_pc(), into every basic block
of the compiled code. By writing the current program counter to a buffer
passed in from user-space, this function yields a very detailed trace in
the form of a sequence of program counters (PCs).

Care must also be taken that the returned execution path contains only
code executed on behalf of the domain being traced, i.e. the hypercalls it
issues. Thus, only appropriate files will be compiled with the option and,
for example, interrupts will be excluded.

==================================
2.1.1 Function content
==================================
The "struct domain" as defined in xen/include/xen/sched.h should be
extended to include:
    * a pointer to the trace buffer (NULL if domain is not traced)
    * the next position to write to in the trace buffer
    * size of the trace buffer

An alternative considered here was a global array holding the data
relevant for tracing, but this would limit the number of domains that can
be traced.

Pseudo code:

/* Check whether the current domain is being traced and, if so, append the
   program counter to its buffer. */
if (domain is traced && buffer not full) {
    current_domain->trace_buffer[current_domain->trace_buffer_pos++] =
        __builtin_return_address(0);
}
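
To make the pseudo code more concrete, here is a minimal user-space sketch
of the three proposed fields and of the callback. It is not Xen code:
struct traced_domain, current_domain and record_pc are stand-ins invented
for this example, and the element type uintptr_t is an assumption chosen so
that a full program counter fits into one entry. Only the callback name
__sanitizer_cov_trace_pc is given by the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields proposed for struct domain. */
struct traced_domain {
    uintptr_t *trace_buffer;      /* NULL if the domain is not traced */
    size_t     trace_buffer_pos;  /* next index to write to */
    size_t     trace_buffer_size; /* capacity in entries */
};

/* Stand-in for the hypervisor's notion of the currently running domain. */
struct traced_domain *current_domain;

/* Core of the callback, split out so that it can be exercised without
 * compiler instrumentation. Returns 1 if the PC was recorded, else 0. */
int record_pc(struct traced_domain *d, uintptr_t pc)
{
    if (d == NULL || d->trace_buffer == NULL)
        return 0; /* domain is not traced */

    assert(d->trace_buffer_pos <= d->trace_buffer_size);
    if (d->trace_buffer_pos == d->trace_buffer_size)
        return 0; /* buffer full: drop the entry */

    d->trace_buffer[d->trace_buffer_pos++] = pc;
    return 1;
}

/* Called from every instrumented basic block when the calling code is
 * built with -fsanitize-coverage=trace-pc. */
void __sanitizer_cov_trace_pc(void)
{
    record_pc(current_domain, (uintptr_t)__builtin_return_address(0));
}
```

The file defining the callback must itself be built without the
instrumentation option, otherwise the callback would call itself
recursively.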

==================================
2.2 Hypercall-Interface
==================================
As stated in the preceding sections, a hypercall is needed to extract the
execution path. The proposed interface is the following:

/*
 * @brief Traces the execution path of hypercalls executed by a domain.
 * @param domain_id Domain whose execution path is to be traced
 * @param buffer Buffer to write program counters to
 * @param size Size of the buffer
 * @param mode Whether to start or to stop tracing
 * @return Success or error in some form (e.g. number of PCs written on
 *         success)
 */
int trace_execution(int domain_id, int* buffer, int size, int mode);

Together with the snippet in 2.1.1, this interface implies that some
program counters of this hypercall itself might end up in the buffer (if a
domain traces itself, instrumented code runs between setting the buffer and
returning to the kernel). For the purpose of fuzzing this doesn't matter as
long as it is the same for all runs.
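
The start/stop semantics could look roughly like the following user-space
mock. Everything in it is an assumption made for illustration: the mode
values TRACE_START and TRACE_STOP, the fixed-size domains table (used here
only to keep the example self-contained), the error codes, and the use of
uintptr_t instead of int for buffer entries. A real hypercall would also
need to access the guest-supplied buffer safely rather than through a raw
pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DOMAINS 4 /* hypothetical; only bounds the mock table */
#define TRACE_STOP  0 /* hypothetical mode values */
#define TRACE_START 1

struct traced_domain {
    uintptr_t *trace_buffer;      /* NULL if the domain is not traced */
    size_t     trace_buffer_pos;  /* next index to write to */
    size_t     trace_buffer_size; /* capacity in entries */
};

struct traced_domain domains[MAX_DOMAINS];

/* Returns 0 when tracing starts, the number of PCs written when tracing
 * stops, or a negative errno value on failure. */
int trace_execution(int domain_id, uintptr_t *buffer, int size, int mode)
{
    struct traced_domain *d;
    int written;

    if (domain_id < 0 || domain_id >= MAX_DOMAINS)
        return -ESRCH;
    d = &domains[domain_id];

    switch (mode) {
    case TRACE_START:
        if (buffer == NULL || size <= 0)
            return -EINVAL;
        if (d->trace_buffer != NULL)
            return -EBUSY; /* already being traced */
        d->trace_buffer_pos = 0;
        d->trace_buffer_size = (size_t)size;
        d->trace_buffer = buffer; /* set last: tracing is live from here */
        return 0;
    case TRACE_STOP:
        if (d->trace_buffer == NULL)
            return -EINVAL; /* tracing was never started */
        assert(d->trace_buffer_pos <= d->trace_buffer_size);
        written = (int)d->trace_buffer_pos;
        d->trace_buffer = NULL;
        return written;
    default:
        return -EINVAL;
    }
}
```

Because the buffer pointer is set last when starting and cleared when
stopping, whatever instrumented code of the hypercall runs in between is
recorded identically on every run, which matches the observation above.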

==================================
2.3 Adjustments to libxc
==================================
With this interface the only modification to libxc would be to add the new
hypercall.

An alternative considered was to implement an event notification system
which informs the trace hypercall when a hypercall starts and ends. One
could then change the interface to trace only the next hypercall instead of
all hypercalls. This, however, involves changing the xencall functions and
raises some questions regarding multiple hypercalls running at the same
time. As long as the hypercall is used only for fuzzing a single hypercall
at a time, the difference should be irrelevant.

==================================
2.4 Build
==================================
Inserting even a single instruction at every instrumentation point is
rather costly if the feature is never going to be used. Tracing should thus
be an optional build feature that has to be enabled explicitly.

As mentioned before, there are further adjustments needed for the build
system in order to compile only specific files with the option.

==================================
3. Expected Outcomes/Goals
==================================
This proposal outlines the steps for implementing coverage feedback.
Overall the project aims to enable automated fuzzing of the hypervisor,
which requires further steps as outlined in Section 1.

==================================
4. References
==================================
[1] Link to GSoC page of project:
https://summerofcode.withgoogle.com/projects/#5585891117498368
[2] Link to originally suggested topic:
https://wiki.xenproject.org/wiki/Outreach_Program_Projects#Fuzzing_Xen_hypercall_interface

Any comments appreciated,

Felix