ping Number of bugs found with this coverage in kernel already crossed 40: https://github.com/google/syzkaller/wiki/Found-Bugs
On Fri, Nov 27, 2015 at 3:30 PM, Dmitry Vyukov <dvyu...@google.com> wrote: > +syzkaller group > > On Fri, Nov 27, 2015 at 3:28 PM, Dmitry Vyukov <dvyu...@google.com> wrote: >> Hello, >> >> This patch adds support for coverage-guided fuzzing: >> https://codereview.appspot.com/280140043 >> >> Coverage-guided fuzzing is a powerful randomized testing technique >> that uses coverage feedback to determine new interesting inputs to a >> program. Some examples of the systems that use it are AFL >> (http://lcamtuf.coredump.cx/afl/) and LibFuzzer >> (http://llvm.org/docs/LibFuzzer.html). Compiler coverage >> instrumentation allows to make fuzzing more efficient as compared to >> binary instrumentation. >> >> Flag that enables coverage is named -fsanitize-coverage=trace-pc to >> make it consistent with similar functionality in clang, which already >> supports a set of fuzzing coverage modes: >> http://clang.llvm.org/docs/SanitizerCoverage.html >> -fsanitize-coverage=trace-pc is not yet supported in clang, but we >> plan to add it. >> >> This particular coverage mode simply inserts function calls into every >> basic block. >> I've built syzkaller, a Linux system call fuzzer >> (https://github.com/google/syzkaller), using this functionality. The >> fuzzer has found 30+ previously unknown bugs in kernel >> (https://github.com/google/syzkaller/wiki/Found-Bugs) in slightly more >> than a month (while kernel was extensively fuzzed with trinity -- >> non-guided fuzzer). Quentin also built some kernel fuzzer on top of >> it. >> >> Why not gcov. Typical fuzzing loop looks as follows: (1) reset >> coverage, (2) execute a bit of code, (3) collect coverage, repeat. A >> typical coverage can be just a dozen of basic blocks (e.g. invalid >> input). In such context gcov becomes prohibitively expensive as >> reset/collect coverage steps depend on total number of basic >> blocks/edges in program (in case of kernel it is about 2M). Cost of >> this "tracing" mode depends on number of executed basic blocks/edges. >> On top of that, kernel required per-thread coverage because there are >> always background threads and unrelated processes that also produce >> coverage. With inlined gcov instrumentation per-thread coverage is not >> possible. >> Inlined fast paths do not make lots of sense in fuzzing scenario, >> because lots of code is executed just once in between resets. Also it >> is not really possible to inline accesses to per-cpu and per-task data >> structures for kernel. >> >> OK for trunk?