On Fri, Oct 07, 2022 at 01:51:05PM +0100, Richard W.M. Jones wrote:
> > Thinking about ways to expose even more code-paths, I wonder if we
> > could tweak the client along the lines of:
> >
> >   if (rand () & 1)
> >     nbd_set_handshake_flags (nbd, rand ());
> >   if (rand () & 1)
> >     nbd_set_strict_mode (nbd, rand ());
>
> Adding randomization to the fuzzer is a bad idea I'm afraid,
> specifically called out in the docs:
>
> https://aflplus.plus/docs/faq/ (search for "Stability")

Interesting reading, including:

"There are functions that are unstable, but also provide value to
coverage, e.g., init functions that use fuzz data as input. If,
however, a function that has nothing to do with the input data is the
source of instability, e.g., checking jitter, or is a hash map function
etc., then it should not be instrumented."

So using rand() is probably going to hurt more than it helps (too
unpredictable; even if seeded, the fuzzer can only vary the seed, not
the pseudo-random sequence that follows from that seed). But setting
up a mode where we tweak our first few handshaking decisions based on
a few bytes read from a fuzzed file may be worthwhile: the fuzzer can
then mutate that file as a way of deterministically exploring
different initialization paths.

>
> > and so forth, to allow the fuzzer to explore different combinations of
> > settings.
>
> The fuzzer will explore different paths by presenting different
> inputs. In the case of libnbd, "input" means the network data that
> normally libnbd would be reading from the NBD server. As long as
> variations in those replies (inputs) can cause libnbd to take
> different paths then the fuzzer will eventually explore those paths.
>
> > Another idea might be:
> >
> >   static void do_opt_structured_reply (void)
> >   { /* call nbd_opt_structured_reply() */ }
> >   static void do_opt_list_meta_context (void)
> >   { /* call nbd_opt_list_meta_context[_queries]() */ }
> >   ...
> >   void (*opts[])(void) = {
> >     do_opt_structured_reply,
> >     do_opt_list_meta_context,
> >     ...
> >   };
> >
> >   for (i = rand () % 20; i > 0; i--)
> >     opts[i % ARRAY_SIZE (opts)] ();
> >
> > to play with different handshake sequences.
>
> This won't work for the same reason.

Okay, I see that better after looking at README; the fuzzing we are
attempting is based on a two-step process: first we generate an actual
capture of server replies to a valid client session, then we start
fuzzing on a replay of the client dealing with slight variations of
the server's reply (that is, trying to find spots where a
buggy/malicious server can trip up the client). Throwing more
randomness into the initialization would let us create exponentially
more starting-point files in the first step of capturing actual client
sessions, but may not necessarily bring us any closer to the second
step of fuzzing the server's replies into tickling client bugs, unless
we can tightly correlate which input sequence of the first step
determines which output file we should be fuzzing in the second step.
There may still be some fuzzing gains to be had, but they would come
from having yet another file under the fuzzer's control to read from
during initialization, and not from calls to rand().

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
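
For what it's worth, here is a rough, untested sketch of what "read a
few bytes from a fuzzer-controlled file instead of calling rand()"
might look like. Everything in it is my own assumption rather than
anything in the libnbd tree: the struct fuzz_config type, the
decode_fuzz_config() and byte_at() helpers, the byte layout, and the
MAX_OPTS / NR_OPT_KINDS constants are all hypothetical. It only decodes
the bytes; a real harness would then presumably mask the values down to
flags libnbd accepts and pass them to nbd_set_handshake_flags() and
nbd_set_strict_mode(), and use the opts[] entries as indexes into a
table of nbd_opt_*() wrappers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_OPTS 20      /* hypothetical cap, mirrors the "% 20" idea */
#define NR_OPT_KINDS 2   /* hypothetical: structured reply, list meta context */

/* Hypothetical container for the decisions decoded from the fuzz file. */
struct fuzz_config {
  int set_handshake_flags;   /* nonzero: caller should override the flags */
  uint32_t handshake_flags;
  int set_strict_mode;       /* nonzero: caller should override strict mode */
  uint32_t strict_mode;
  size_t nr_opts;            /* number of valid entries in opts[] */
  uint8_t opts[MAX_OPTS];    /* each entry is an index < NR_OPT_KINDS */
};

/* Return byte i of the input, or 0 if the input is too short, so that
 * a truncated file still decodes to one well-defined configuration.
 */
static uint8_t
byte_at (const uint8_t *buf, size_t len, size_t i)
{
  return i < len ? buf[i] : 0;
}

/* Assumed layout: byte 0 = choice bits, byte 1 = handshake flags,
 * byte 2 = strict mode, byte 3 = option count, bytes 4.. = option kinds.
 * The same bytes always give the same result; there is no rand().
 */
static void
decode_fuzz_config (const uint8_t *buf, size_t len, struct fuzz_config *cfg)
{
  uint8_t choice = byte_at (buf, len, 0);
  size_t i;

  assert (cfg != NULL);
  memset (cfg, 0, sizeof *cfg);

  cfg->set_handshake_flags = choice & 1;
  cfg->handshake_flags = byte_at (buf, len, 1);
  cfg->set_strict_mode = (choice >> 1) & 1;
  cfg->strict_mode = byte_at (buf, len, 2);

  cfg->nr_opts = byte_at (buf, len, 3) % (MAX_OPTS + 1);
  for (i = 0; i < cfg->nr_opts; i++)
    cfg->opts[i] = byte_at (buf, len, 4 + i) % NR_OPT_KINDS;
}
```

Since every decision is a pure function of the file contents, replaying
the same file takes the same initialization path each time, which is
the property the AFL++ stability notes care about. It does nothing to
solve the harder problem above of correlating a given initialization
file with the matching server capture.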