https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109485
Bug ID: 109485 Summary: Feature request: More efficient include path handling Product: gcc Version: 12.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: preprocessor Assignee: unassigned at gcc dot gnu.org Reporter: dani.borg at outlook dot com Target Milestone: --- When preprocessing, the method to lookup include paths is inefficient and cause more file system calls than needed. The current method simply tries each include path in order, for every unique #include directive. Basically O(n*n) in complexity. The method scales poorly when the number of include paths increase which can cause high system load and long build times. A smarter approach, which clang appears to be using, is to keep track of which include paths doesn't contain the path seen in the include directive. Then file system queries for impossible paths can be avoided. To give a concrete example that compares gcc and clang, the following can be run in bash on a Linux system. The example below only shows the relevant output from strace. #prepare - create 2 include paths, 3 headers and a source file including the headers mkdir -p a b/b && touch b/b/a.h b/b/b.h b/b/c.h && echo -e '#include "b/a.h"\n#include "b/b.h"\n#include "b/c.h"' > a.c #gcc strace -f -e open,stat gcc -Ia -Ib -nostdinc a.c -E -o/dev/null [pid 12] open("a/b/a.h", O_RDONLY|O_NOCTTY) = -1 ENOENT (No such file or directory) [pid 12] open("b/b/a.h", O_RDONLY|O_NOCTTY) = 4 [pid 12] open("a/b/b.h", O_RDONLY|O_NOCTTY) = -1 ENOENT (No such file or directory) [pid 12] open("b/b/b.h", O_RDONLY|O_NOCTTY) = 4 [pid 12] open("a/b/c.h", O_RDONLY|O_NOCTTY) = -1 ENOENT (No such file or directory) [pid 12] open("b/b/c.h", O_RDONLY|O_NOCTTY) = 4 [pid 12] +++ exited with 0 +++ #clang strace -f -e open,stat clang -Ia -Ib -nostdinc a.c -E -o/dev/null stat("a/b", 0x7ffd8d11ac20) = -1 ENOENT (No such file or directory) stat("b/b", {st_mode=S_IFDIR|0755, st_size=55, ...}) = 0 open("b/b/a.h", O_RDONLY|O_CLOEXEC) = 3 open("b/b/b.h", O_RDONLY|O_CLOEXEC) = 3 open("b/b/c.h", O_RDONLY|O_CLOEXEC) = 3 +++ exited with 0 +++ Note how clang first determines if the partial path exist, before testing the full path. This way all open("a/b/... calls can be avoided.