(delurking)

Ian Grant writes:

> In case it isn't obvious, what I am interested in is how easily we can know 
> the problem of infeasibly large binaries isn't an instance of this one:
>
> http://livelogic.blogspot.com/2014/08/beware-insiduous-penetrator-my-son.html

Ah, this is commonly called the Thompson hack, since Ken Thompson actually 
produced a successful demo:

http://www.win.tue.nl/~aeb/linux/hh/thompson/trust.html

The only way that the Thompson hack can survive a three-stage bootstrap is if
the compiler used for the stage 1 build has the bad code.  The comparison
between stages 2 and 3 requires an exact match, and any imperfection in the
object code injection would reveal itself.

So, you can build GCC with LLVM or Intel's compiler or Microsoft's or IBM's or 
Sun's, doing cross-compilation where necessary.  The basic idea is:

1: Build gcc with a 3-stage bootstrap, starting with a compiler that you
suspect might be infected.  Call the result A.
2: Do it again, starting with a different compiler that you think is
independent of the compiler you used in step 1.  Call the result B.
3: Compare A to B.  If they differ, you've found something that should be
investigated.  If they match, then either A and B are both clean, or A and B
both have the identical inserted object code.  Maybe they have a common
ancestor?
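
For what it's worth, the comparison in step 3 could be sketched roughly like
this in Python.  This assumes A and B are two install trees on disk; the
function names are mine, not anything that ships with GCC.  It hashes every
file with SHA-256 and reports any path that is missing from one tree or whose
contents differ.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each regular file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(a: Path, b: Path) -> list[str]:
    """List relative paths that are missing from one tree or differ in content."""
    da, db = tree_digests(a), tree_digests(b)
    return sorted(n for n in da.keys() | db.keys() if da.get(n) != db.get(n))
```

An empty list means the two trees are byte-for-byte identical.  A non-empty
list is only a starting point: embedded build paths or timestamps can also
cause mismatches, so each reported file still has to be looked at.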

Note that if you build gcc with a cross-compiler, the object code will be
different.  You have to use the cross-built compiler to build gcc one more
time to "normalize": GCC 4.9.0 built with GCC 4.9.0 on operating system X
should always be the same.

As far as I know, no one has been paranoid enough to put in the time to do the
experiment on a large scale, and it's harder because you can't build a modern
GCC (or LLVM for that matter) with an ancient compiler.  But you can create a
chain: grab an ancient gcc version off a 15-year-old CD, and build newer
versions with it until you get up to the present.  The result should be
byte-for-byte identical with what you get when building the current compiler
with a recent version.  If it is, then either the infection is at least 15
years old or it does not exist.  Try it again by building cross-compilers from
a Microsoft system.  Don't trust Apple: they used to use GCC, so maybe all
their LLVM binaries caught the bug.
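
The chain itself is just a loop where each newly built compiler becomes the
builder for the next version.  A minimal sketch, where `build` is a
hypothetical callable you would have to supply (it runs the real bootstrap
and returns the path of the compiler it produced), and the version list is
whatever sequence of releases actually builds with its predecessor:

```python
from typing import Callable, Sequence


def bootstrap_chain(
    seed: str,
    versions: Sequence[str],
    build: Callable[[str, str], str],
) -> str:
    """Build each version with the compiler produced by the previous step.

    seed     -- path to the starting (ancient) compiler
    versions -- releases to build, oldest first
    build    -- hypothetical helper: build(version, compiler) -> new compiler path
    """
    compiler = seed
    for version in versions:
        # The compiler we just built becomes the builder for the next release.
        compiler = build(version, compiler)
    return compiler
```

The compiler returned at the end is what you would compare, byte for byte,
against the same release built from a recent compiler.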


BTW, if "size" is reporting a much smaller size than the executable file
itself and that motivates this concern, most of the difference is likely to be
debug info, which has grown since gcc switched to C++.  Might want to try
"strip".
