On 13/02/2018 05:09, John Fastabend wrote:
To find these we do the following
1.a: Build the CFG of all the instructions so we can walk around the
program easily.
1.b: Find the basic blocks and build the basic block CFG. After this
with a few helper calls we can move around the CFG both at block
level and insn level.
1.c: Build the dom tree and post dom tree. This is a data structure that
allows us to actually find the natural loops. I can write more about
this later.
1.d: Using the post dom tree verify all the CFG loops are in fact
natural loops. I made some simplifications here and threw
out loops that have multiple jumps to the header node and nested
loops for now. Both can be handled with some additional complexity.
Versions of the above can be found in gcc and llvm so it appears to be
fairly regular stuff from the compiler side. For LLVM the passes are
--loops, --postdomtree. I dumped the various CFG into dot format and
and graphed them for fun.
This part of work are very interesting. It will be very helpful if these CFG
information are kept even after verification pass finished, or if the storage
overhead is a concern, made into reusable functions.
I plan to implement similar graph dump in bpftool, i.e. be able to dump eBPF
CFG into .dot graph and color the graph according to some simple static
analysis,
these exactly need such CFG information.
Regards,
Jiong