> The abstract states:
>
> "In the DDC technique, source code is compiled twice: once with a
> second (trusted) compiler (using the source code of the compiler’s
> parent), and then the compiler source code is compiled using the
> result of the first compilation. If the result is bit-for-bit
> identical with the untrusted executable, then the source code
> accurately represents the executable."
>
> I find the above to be unclear:

Of course, it's an abstract. You can read the paper for more details.
Basically:

Take an existing untrusted compiler whose source code is `A` and whose
binary is `cA`. You check that:

    cA == cA(A)

If that's not the case (or if you don't have access to the source code
`A`), the DDC technique can't be used. If it is the case, you have just
checked that `A` is indeed the source code for `cA`.

Then take a trusted compiler whose source code is `T`. Now compile it
with `cA`:

    cT = cA(T)

and then use this new compiler binary `cT` to compile `A`:

    cA2 = cT(A)

`cA2` should behave like `cA`, but it was emitted by T's code generator,
so it need not match `cA` bit for bit. So compile `A` once more, this
time with `cA2` (this is the "result of the first compilation" step in
the abstract):

    cA3 = cA2(A)

Finally, compare `cA` and `cA3`. If they're bit-for-bit identical, then
you're good: `cA` doesn't seem to have any hidden trojan horse. If
they're not, then either `cA` is compromised, or maybe it's simply that
the compilers `A` and `T` don't agree sufficiently on the source language
in which they're both written.

> 1. What source code is compiled twice?

The source code `A` of the untrusted compiler.

> 2. Where do I get the second (trusted) compiler?

You write it yourself by hand. You also have to make sure that it
matches the semantics of `A` sufficiently to avoid false negatives.
You need to trust not only that it does what you think it does, but also
that any attacker who may have infected `cA` hasn't seen that code and
can't have guessed enough of its content to be able to properly infect
`cT`.

> 7. What if the compiler, by design, does not produce identical output for
> identical input?

Then you can't use that technique and you're left wondering whether it
may have a hidden self-perpetuating backdoor.


        Stefan
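
PS: In case it helps, here is a toy Python sketch of the checks above. Everything in it is made up for illustration: `Source`, `Binary`, `run` and `ddc` are hypothetical names, and so is the assumption that a binary can be summarized by which code generator it applies and which one emitted it. It models the idea only, not any real compiler or the paper's tooling. It includes the extra self-compilation stage mentioned in the abstract (the result of the first compilation compiles `A` again), since `cT(A)` is emitted by T's code generator and wouldn't generally match `cA` bit for bit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str      # which compiler this is the source code of
    codegen: str   # the code-generation style that source implements


@dataclass(frozen=True)
class Binary:
    implements: str       # name of the source it was built from
    codegen: str          # code generator this binary applies when run
    emitted_by: str       # code generator that produced these bits
    trojan: bool = False  # hidden self-propagating backdoor (changes the bits)


def run(compiler: Binary, src: Source) -> Binary:
    """Toy model of compiler(src)."""
    # Assumption: a trojaned compiler re-inserts its backdoor only when it
    # recognizes its own source, so it leaves an unknown compiler T alone.
    infected = compiler.trojan and src.name == compiler.implements
    return Binary(src.name, src.codegen, compiler.codegen, infected)


def ddc(cA: Binary, A: Source, T: Source) -> bool:
    """Return True if cA passes the check, False if it does not."""
    if run(cA, A) != cA:
        # cA == cA(A) fails: the technique can't be used.
        raise ValueError("cA does not regenerate itself from A")
    cT = run(cA, T)    # trusted compiler source, built by cA
    cA2 = run(cT, A)   # behaves like A, but emitted by T's code generator
    cA3 = run(cA2, A)  # A compiled by a (hopefully clean) build of A
    # Dataclass equality stands in for a bit-for-bit comparison.
    return cA3 == cA


A = Source("A", "gA")
T = Source("T", "gT")
clean_cA = Binary("A", "gA", "gA", trojan=False)
bad_cA = Binary("A", "gA", "gA", trojan=True)

print(ddc(clean_cA, A, T))
print(ddc(bad_cA, A, T))
```

Note that in this toy model the trojaned `bad_cA` still passes the first check, `cA == cA(A)`, because the backdoor reproduces itself. Only the detour through `T` exposes it.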