On Wed, Dec 2, 2015 at 5:10 PM, Dmitry Vyukov <dvyu...@google.com> wrote:
> ping
>
> The number of bugs found in the kernel with this coverage has already crossed 40:
> https://github.com/google/syzkaller/wiki/Found-Bugs
>
>
>
>
> On Fri, Nov 27, 2015 at 3:30 PM, Dmitry Vyukov <dvyu...@google.com> wrote:
>> +syzkaller group
>>
>> On Fri, Nov 27, 2015 at 3:28 PM, Dmitry Vyukov <dvyu...@google.com> wrote:
>>> Hello,
>>>
>>> This patch adds support for coverage-guided fuzzing:
>>> https://codereview.appspot.com/280140043
>>>
>>> Coverage-guided fuzzing is a powerful randomized testing technique
>>> that uses coverage feedback to determine new interesting inputs to a
>>> program. Some examples of the systems that use it are AFL
>>> (http://lcamtuf.coredump.cx/afl/) and LibFuzzer
>>> (http://llvm.org/docs/LibFuzzer.html). Compiler coverage
>>> instrumentation makes fuzzing more efficient than binary
>>> instrumentation.
>>>
>>> The flag that enables coverage is named -fsanitize-coverage=trace-pc
>>> for consistency with similar functionality in clang, which already
>>> supports a set of fuzzing coverage modes:
>>> http://clang.llvm.org/docs/SanitizerCoverage.html
>>> -fsanitize-coverage=trace-pc is not yet supported in clang, but we
>>> plan to add it.
>>>
>>> This particular coverage mode simply inserts function calls into every
>>> basic block.
>>> I've built syzkaller, a Linux system call fuzzer
>>> (https://github.com/google/syzkaller), using this functionality. The
>>> fuzzer has found 30+ previously unknown bugs in the kernel
>>> (https://github.com/google/syzkaller/wiki/Found-Bugs) in slightly more
>>> than a month, even though the kernel had already been extensively
>>> fuzzed with trinity, a non-guided fuzzer. Quentin has also built a
>>> kernel fuzzer on top of it.
>>>
>>> Why not gcov? A typical fuzzing loop looks as follows: (1) reset
>>> coverage, (2) execute a bit of code, (3) collect coverage, repeat. The
>>> coverage of a single iteration can be just a dozen basic blocks (e.g.
>>> for an invalid input). In this context gcov becomes prohibitively
>>> expensive, because the reset/collect steps depend on the total number
>>> of basic blocks/edges in the program (about 2M for the kernel),
>>> whereas the cost of this "tracing" mode depends only on the number of
>>> basic blocks/edges actually executed.
>>> On top of that, the kernel requires per-thread coverage, because there
>>> are always background threads and unrelated processes that also
>>> produce coverage, and with inlined gcov instrumentation per-thread
>>> coverage is not possible.
>>> Inlined fast paths also make little sense in a fuzzing scenario,
>>> because much of the code is executed only once between resets, and it
>>> is not really possible to inline accesses to per-cpu and per-task data
>>> structures in the kernel.
>>>
>>> OK for trunk?
