https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66682
--- Comment #7 from Brendan G Bohannon <cr88192 at gmail dot com> ---
(In reply to Mikhail Maltsev from comment #6)
> Created attachment 35859 [details]
> Sampling profile of cc1
>
> It's hard to say what's wrong here (why do we perform so many lookups in
> mem_attrs hash table? collisions?) without looking further.

The code in question creates about 7k internal functions, which could be a factor?

The VM it is from, in general, creates large numbers of one-off functions via macros.

Most of this is because the interpreter structure is built around structs and calls through function-pointers (with the bytecode decoded into "Traces", AKA: "Extended Basic Blocks", which are executed via unrolled loops of calls through function-pointers, in turn driven by a top-level trampoline loop).

Functions for basic operations are expanded out in terms of combinations of parameter and data types (in turn, the functions are fairly specialized).

This structure is used as it gives fairly good performance for a reasonably portable plain-C interpreter.
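
For readers unfamiliar with this style of interpreter, here is a rough sketch of the general shape described above. It is not code from the VM in question: every name here (VMFrame, VMOp, VMTrace, BINOP, Trace_Run2, VM_Run, vm_demo) is made up for illustration, and a real implementation would presumably have far more operation/type combinations and more unrolled trace runners.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch only; none of these names come from the actual VM. */
typedef union { int i; double d; } VMValue;
typedef struct VMFrame { VMValue reg[8]; } VMFrame;

/* One decoded operation: a handler pointer plus register indices. */
typedef struct VMOp VMOp;
struct VMOp {
    void (*run)(VMFrame *frm, VMOp *op);
    int d, s, t;                /* dest, source 1, source 2 */
};

/* One trace (extended basic block): a runner plus its decoded ops. */
typedef struct VMTrace VMTrace;
struct VMTrace {
    VMTrace *(*run)(VMFrame *frm, VMTrace *tr);
    VMOp **ops;
    int n_ops;
    VMTrace *next;              /* trace to continue with, or NULL */
};

/* The macro stamps out one small specialized function per
 * (operation, data type) combination. */
#define BINOP(name, fld, opr)                                         \
    static void Op_##name(VMFrame *frm, VMOp *op)                     \
    { frm->reg[op->d].fld = frm->reg[op->s].fld opr frm->reg[op->t].fld; }

BINOP(ADD_I, i, +)
BINOP(SUB_I, i, -)
BINOP(MUL_I, i, *)
BINOP(ADD_D, d, +)
BINOP(MUL_D, d, *)

enum { OP_ADD_I, OP_SUB_I, OP_MUL_I, OP_ADD_D, OP_MUL_D };

static void (*const op_table[])(VMFrame *, VMOp *) = {
    Op_ADD_I, Op_SUB_I, Op_MUL_I, Op_ADD_D, Op_MUL_D
};

/* Unrolled runner for traces of exactly two ops. */
static VMTrace *Trace_Run2(VMFrame *frm, VMTrace *tr)
{
    VMOp **ops = tr->ops;
    assert(tr->n_ops == 2);
    ops[0]->run(frm, ops[0]);
    ops[1]->run(frm, ops[1]);
    return tr->next;
}

/* Generic fallback runner for any other length. */
static VMTrace *Trace_RunN(VMFrame *frm, VMTrace *tr)
{
    int k;
    for (k = 0; k < tr->n_ops; k++)
        tr->ops[k]->run(frm, tr->ops[k]);
    return tr->next;
}

/* Top-level trampoline: each trace returns the next trace to run. */
static void VM_Run(VMFrame *frm, VMTrace *tr)
{
    while (tr)
        tr = tr->run(frm, tr);
}

/* Computes ((r1 + r2) * r1) - r2 with r1 = 3, r2 = 4. */
int vm_demo(void)
{
    VMFrame frm;
    VMOp op0 = { op_table[OP_ADD_I], 0, 1, 2 };   /* r0 = r1 + r2 */
    VMOp op1 = { op_table[OP_MUL_I], 0, 0, 1 };   /* r0 = r0 * r1 */
    VMOp op2 = { op_table[OP_SUB_I], 0, 0, 2 };   /* r0 = r0 - r2 */
    VMOp *ops_a[] = { &op0, &op1 };
    VMOp *ops_b[] = { &op2 };
    VMTrace tr_b = { Trace_RunN, ops_b, 1, NULL };
    VMTrace tr_a = { Trace_Run2, ops_a, 2, &tr_b };

    memset(&frm, 0, sizeof frm);
    frm.reg[1].i = 3;
    frm.reg[2].i = 4;
    VM_Run(&frm, &tr_a);
    return frm.reg[0].i;
}
```

Calling vm_demo() runs two traces through the trampoline and returns the integer left in register 0. The point of the sketch is only that each BINOP line becomes a separate tiny function, so a few dozen operations times a handful of types quickly adds up to thousands of functions in one translation unit.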