sammccall added subscribers: hokein, sammccall. sammccall added a comment. Herald added a project: All.
In D20401#2770059 <https://reviews.llvm.org/D20401#2770059>, @nickdesaulniers wrote: > I know this was sped up slightly in 3339c568c43e4644f02289e5edfc78c860f19c9f, > but this change makes `updateConsecutiveMacroArgTokens` the hottest function > in clang in a bottom up profile of an entire build of the Linux kernel. It > thrashes the one entry LastFileIDLookup cache, and we end up looking up the > same FileID again and again and again for each token when we expand nested > function like macros. > > Is there anything we can do to speed this up? @hokein and I spent some time looking at this (initially trying to understand behavior, now performance). Short version is: - we can *simplify* the code a lot, we think it's now just partitioning based on FileID and this can be done more clearly. This may have some speedups at the margin. - in terms of performance: I suspect when clang is built by GCC it's doing roughly 3x as much work as when it's built by clang. @nickdesaulniers can you tell me which you're measuring/deploying? To give some idea if we're likely to actually help much... --- **Behavior: partitioning by file IDs** I think we're back to the original (<2011) behavior of just partitioning by file IDs - the original patch <https://github.com/llvm/llvm-project/commit/2797df6a24e9065b37022f922da813a133cfaac2> clearly intended this to merge tokens across file IDs, and the comments still claim this - then a bugfix <https://github.com/llvm/llvm-project/commit/5e14925a35e87b46d48e6dc6b71a682956c50d55> banned merging file+macro or macro+file - then this patch banned merging macro+macro - meanwhile, there's no code disallowing file+file, but I don't think it's actually possible to achieve: you can't have an `#include` or an `eof` inside a macro arg, and I don't know how else to switch between files. **Performance (good case)** The current obfuscated version *is* faster than the pre-2011 version because we avoid getFileID() in when testing file+macro, macro+file, and *some* macro+macro cases (when the locations happen to be >50 apart). When we see a run of N consecutive macro nearby macro tokens, we do `2*(N-1)` getFileID()s. We can reduce the number of getFileID() calls by caching FileID bounds (the expensive part is looking up the SLocEntry - given that we can hit-test a SourceLocation against it with simple arithmetic). However, getFileID() has the one-element cache of SLocEntry, so this may only be a marginal improvement. // Tokens A1 A2 A3 B1 B2 isWrittenInSameFile(A1, A2); getFileID(A1); // miss getFileID(A2); isWrittenInSameFile(A2, A3); getFileID(A2); getFileID(A3); isWrittenInSameFile(A3, B1); getFileID(A3); getFileID(B1); // miss isWrittenInSameFile(B1, B2); getFileID(B1); getFileID(B2); All the getFileID() calls we could avoid are the cached ones. It's probably still a win (the cache lookup logic is kinda messy), but probably not huge. **Performance (bad case)** However, the implementation of `isWrittenInSameFile` is `getFileID(X) == getFileID(Y)`, and it's unspecified which gets evaluated first. GCC generally evaluates right-to-left (https://godbolt.org/z/M4bs74Tbj), producing substantially more misses: isWrittenInSameFile(A1, A2); getFileID(A2); // miss getFileID(A1); isWrittenInSameFile(A2, A3); getFileID(A3); getFileID(A2); isWrittenInSameFile(A3, B1); getFileID(B1); // miss getFileID(A3); // miss isWrittenInSameFile(B1, B2); getFileID(B2); // miss getFileID(B1); So I'd expect we can improve GCC-built-clang's performance by maybe 2x by caching externally. (Or by having isWrittenInSameFile try to cleverly evaluate whichever arg matches the cache first, but I have no idea whether that will work well across clang) --- @hokein or I will try to find time to take a stab at this. Repository: rL LLVM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D20401/new/ https://reviews.llvm.org/D20401 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits