On Tue, 15 Aug 2023, Siddhesh Poyarekar wrote:
> > Thanks, this is nicer (see notes below). My main concern is that we
> > shouldn't pretend there's some method of verifying that arbitrary source
> > code is "safe" to pass to an unsandboxed compiler, nor should we push
> > the responsibility of doing that on users.
>
> But responsibility would be pushed to users, wouldn't it?

Making users responsible for verifying that sources are "safe" is not okay
(we cannot teach them how to do that since there's no general method).
Making users responsible for sandboxing the compiler is fine (there's a
range of sandboxing solutions, from which they can choose according to
their requirements and threat model; a rough sketch of one narrow approach
follows at the end of this message). Sorry about the ambiguity.

> So:
>
> The compiler driver processes source code, invokes other programs such as the
> assembler and linker and generates the output result, which may be assembly
> code or machine code. Compiling untrusted sources can result in arbitrary
> code execution and unconstrained resource consumption in the compiler. As a
> result, compilation of such code should be done inside a sandboxed environment
> to ensure that it does not compromise the development environment.

I'm happy with this, thanks for bearing with me.

> >> inside a sandboxed environment to ensure that it does not compromise the
> >> development environment. Note that this still does not guarantee safety of
> >> the produced output programs and that such programs should still either be
> >> analyzed thoroughly for safety or run only inside a sandbox or an isolated
> >> system to avoid compromising the execution environment.
> >
> > The last statement seems to be a new addition. It is too broad and again
> > makes a reference to analysis that appears quite theoretical. It might be
> > better to drop this (and instead talk in more specific terms about any
> > guarantees that produced binary code matches security properties intended
> > by the sources; I believe Richard Sandiford raised this previously).
>
> OK, so I actually cover this at the end of the section; Richard's point AFAICT
> was about hardening, which I added another note for to make it explicit that
> missed hardening does not constitute a CVE-worthy threat:

Thanks for the reminder. To illustrate what I was talking about, let me give
two examples (minimal C sketches of both follow at the end of this message):

1) safety w.r.t timing attacks: even if the source code is written in
a manner that looks timing-safe, it might be transformed in a way that
mounting a timing attack on the resulting machine code is possible;

2) safety w.r.t information leaks: even if the source code attempts
to discard sensitive data (such as passwords and keys) immediately after
use, (partial) copies of that data may be left on stack and in registers,
to be leaked later via a different vulnerability.

For both 1) and 2), GCC is not engineered to respect such properties during
optimization and code generation, so it's not appropriate for such tasks
(a possible solution is to isolate such sensitive functions to separate
files, compile to assembly, inspect the assembly to check that it still
has the required properties, and use the inspected asm in subsequent builds
instead of the original high-level source).

Cheers.
Alexander
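As a rough illustration of one narrow piece of a sandboxing solution, here is
a hedged C sketch that merely runs the compiler under hard resource limits.
The file names and limit values are invented for the example, and this only
addresses unconstrained resource consumption; containing arbitrary code
execution would additionally require namespaces, seccomp, a container, or a
dedicated build VM.

/* Hypothetical sketch: compile an untrusted file under hard resource limits.
 * This is not a complete sandbox; it only bounds CPU time and address space. */
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return 1;
    if (pid == 0) {
        struct rlimit cpu = { 60, 60 };                /* 60 seconds of CPU time */
        struct rlimit mem = { 1UL << 30, 1UL << 30 };  /* 1 GiB of address space */
        setrlimit(RLIMIT_CPU, &cpu);
        setrlimit(RLIMIT_AS, &mem);
        execlp("gcc", "gcc", "-c", "untrusted.c", "-o", "untrusted.o",
               (char *) NULL);
        _exit(127);                                    /* exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return 1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
}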
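To make example 1) concrete, here is a minimal sketch (the function name and
shape are invented for illustration) of a comparison written to look
timing-safe at the source level. Nothing obliges the compiler to preserve that
property: the accumulator loop and the final test may legitimately be
transformed into branchy, early-exiting code, reintroducing the very timing
channel the source tried to avoid.

#include <stddef.h>

/* Intended to touch every byte and make one data-independent decision,
 * but the generated code is not guaranteed to keep either property. */
int tag_equal(const unsigned char *a, const unsigned char *b, size_t n)
{
    unsigned char acc = 0;
    for (size_t i = 0; i < n; i++)
        acc |= a[i] ^ b[i];
    return acc == 0;
}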
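And for example 2), a sketch (the helper names are hypothetical) in which the
source tries to discard a password immediately after use. Because the buffer
is dead after the memset, the compiler is entitled to delete that store, and
partial copies of the secret may in any case survive in registers or spilled
stack slots that the memset never touches.

#include <string.h>

/* Assumed helpers, for illustration only. */
extern void read_password(char *buf, size_t len);
extern int check_password(const char *pw);

int authenticate(void)
{
    char pw[64];
    read_password(pw, sizeof pw);
    int ok = check_password(pw);
    memset(pw, 0, sizeof pw);   /* intended scrubbing; may be removed as a dead store */
    return ok;
}

This is the kind of gap that interfaces such as explicit_bzero or C11's
memset_s try to close for the dead-store part, and it is why the message
above suggests inspecting the generated assembly rather than trusting the
source-level intent.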