On 13/02/2018 05:09, John Fastabend wrote:
To find these we do the following
    1.a: Build the CFG of all the instructions so we can walk around the
         program easily.
    1.b: Find the basic blocks and build the basic block CFG. After this
         with a few helper calls we can move around the CFG both at block
         level and insn level.
    1.c: Build the dom tree and post dom tree. This is a data structure that
         allows us to actually find the natural loops. I can write more about
         this later.
    1.d: Using the post dom tree verify all the CFG loops are in fact
         natural loops. I made some simplifications here and threw
         out loops that have multiple jumps to the header node and nested
         loops for now. Both can be handled with some additional complexity.

   Versions of the above can be found in gcc and llvm so it appears to be
   fairly regular stuff from the compiler side. For LLVM the passes are
   --loops, --postdomtree. I dumped the various CFG into dot format and
   and graphed them for fun.

This part of work are very interesting. It will be very helpful if these CFG
information are kept even after verification pass finished, or if the storage
overhead is a concern, made into reusable functions.

I plan to implement similar graph dump in bpftool, i.e. be able to dump eBPF
CFG into .dot graph and color the graph according to some simple static 
analysis,
these exactly need such CFG information.

Regards,
Jiong

Reply via email to