Hi,

I've been working for some time on a prototype of the Pointer Bounds Checker that uses function clones for instrumentation (http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03327.html). After several experiments with this approach, I want to share my results and ask for feedback before deciding on the next steps.
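
To make the term "function clone" concrete, the idea can be pictured roughly as in the C sketch below. This is only an illustration under assumptions: `bounds_t`, `make_bounds`, `read_elem_instrumented` and the `ok` out-parameter are hypothetical names made up for this example, not what the prototype generates. The real instrumentation works on the compiler's internal representation, and a real checker would report a violation rather than return a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds: the valid byte range [lb, ub] of an object. */
typedef struct {
    uintptr_t lb; /* address of the first valid byte */
    uintptr_t ub; /* address of the last valid byte */
} bounds_t;

/* Hypothetical helper: bounds for an object of `size` bytes at `base`. */
bounds_t make_bounds(const void *base, size_t size)
{
    assert(size > 0);
    bounds_t b = { (uintptr_t)base, (uintptr_t)base + size - 1 };
    return b;
}

/* Original function: kept for calls from non-instrumented code. */
int read_elem(const int *p, size_t i)
{
    return p[i];
}

/* Instrumented clone: the same body, plus one explicit bounds argument
   for the pointer parameter and a check before the memory access.
   On violation it sets *ok to 0 and returns 0 without touching memory. */
int read_elem_instrumented(const int *p, size_t i, bounds_t p_bnd, int *ok)
{
    uintptr_t first = (uintptr_t)p + i * sizeof(int);
    uintptr_t last = first + sizeof(int) - 1;

    if (first < p_bnd.lb || last > p_bnd.ub) {
        *ok = 0;
        return 0;
    }
    *ok = 1;
    return p[i];
}

/* Instrumented call site: the call is redirected to the clone and the
   bounds travel as an ordinary argument next to the pointer. */
int caller(size_t i, int *ok)
{
    int buf[4] = { 7, 8, 9, 10 };
    return read_elem_instrumented(buf, i, make_bounds(buf, sizeof buf), ok);
}
```

Since `p_bnd` is an ordinary parameter in this picture, interprocedural passes can in principle treat it like any other argument.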

First, let me recall the reasons for digging in this direction. In the original approach, the bounds of call arguments and input parameters are associated with the arguments via special built-in calls. This creates an implicit data flow that the compiler is not aware of, which confuses some optimizations, resulting in mis-optimizations and a broken bounds data flow. Thus optimizations have to be fixed to get better pointer protection.

The clones approach does not use special built-in function calls to associate bounds with call arguments and input parameters. Each function that should be instrumented gets an additional version, and only this special version is instrumented. The new version gets additional bound arguments to express input bounds. When a function call is instrumented, it is redirected to the instrumented version and all bounds are passed as explicit call arguments. Thus we have an explicit pointer bounds flow, similar to regular function parameters. This should make it possible to avoid changes in optimizations, avoid mis-optimizations, and let existing IPA optimizations work with bound arguments (e.g. propagate constant bounds values and remove checks in the called function).

I made a prototype implementation of this approach in the following way:

- Add a new IPA pass before the early local passes to produce versions of all functions to be instrumented.
- Put the instrumentation pass after the SSA pass.
- Add a new pass after the IPA passes to remove the bodies of functions that have instrumented versions. Function nodes may still be required for calls in non-instrumented code, but we do not emit this code and therefore the function bodies are not needed.

Positive changes are:

- IPA optimizations are not confused by bound parameters.
- Bounds are now more like regular arguments, which makes their processing in expand easier.
- Functions with bounds not attached to any pointer are allowed.

On simple code this approach worked well, but bigger tests revealed some issues.

1. Nodes reachability.

The instrumented version is actually always reachable when the original function is reachable, because it is always emitted instead of the original. Thus I had to fix the reachability analysis to reflect this. Another similar problem is the check of whether a node can be removed after inlining when an instrumented function is inlined. This is not hard to fix, but other similar problems probably exist.

2. Function processing order.

The function processing order is determined before the early local passes. But during function instrumentation the call graph is modified significantly and the topological order in use becomes outdated. That causes some trouble. E.g. a function marked as 'always inline' cannot be inlined because it is not in SSA form yet. Surely the inlining problem may be solved by just putting instrumentation after early inline, but a similar problem may exist in other passes too.

To resolve this problem I tried to split the early local passes into three parts. The first one builds SSA, the second one performs instrumentation, and the last one does the rest. Each part is performed on all functions before the next one starts. Thus I get all functions in SSA form and all instrumentation performed before early optimizations start. Unfortunately, this pass order leads to invalid SSA because the local_pure_const optimization affects the correctness of callers (in case the caller's SSA was built before the optimization revealed the 'pure' or 'const' flag).

In general I feel that having a special function version for instrumentation has better potential and should lead to less intrusive changes in the compiler and better code quality. But before continuing with this implementation I would like to get some feedback, and probably some advice on how to order passes to get the best result. Currently I am inclined to have all functions instrumented before any local optimizations and to solve the pure_const problem by modifying all callers when the attribute is computed for some function.

Thanks,
Ilya