[Apologies for reviving a long-dead thread, but in case it piques the interest of someone who has better knowledge of gcc internals...]
Elmar Krieger <el...@cmbi.ru.nl> writes:
> For example, I got a huge slowdown also with this compiler:
>
> gcc44 (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3)
> Copyright (C) 2010 Free Software Foundation, Inc.
>
> which spends all its time in 'variable tracking':

I found that, for a very few files in my project, 4.5-era compilers
spent an awful lot of time doing variable tracking -- so much so, that
I turned off that feature for those files.

These files tend to be large and "flat" -- configuration validators,
web page labels, things like that.

I just checked with 4.7.2, and in some cases, it still takes 2x the
time if var-tracking is enabled.

On a fairly vanilla system (up-to-date Fedora 17 x86-64), with a
custom build of 4.7.2:

    /usr/local/gcc/bin/g++ -v
    Using built-in specs.
    COLLECT_GCC=/usr/local/gcc/bin/g++
    COLLECT_LTO_WRAPPER=/usr/local/gcc-4.7.2/libexec/gcc/x86_64-unknown-linux-gnu/4.7.2/lto-wrapper
    Target: x86_64-unknown-linux-gnu
    Configured with: ../gcc-4.7.2/configure --prefix=/usr/local/gcc-4.7.2 \
      --with-local-prefix=/usr/local/gcc-4.7.2/local --enable-languages=c,c++ \
      --enable-threads --disable-multilib
    Thread model: posix
    gcc version 4.7.2 (GCC)

There are cases where I experience a 2x slowdown with var tracking.

First, without tracking, I got times from 4.40 to 4.46s:

    /bin/time /usr/local/gcc/bin/g++ -I/usr/include/libxml2 \
      -std=c++0x -Woverloaded-virtual -Wold-style-cast -pthread \
      -rdynamic -D_FILE_OFFSET_BITS=64 -funsigned-char -I .. \
      -Wconversion -Wall -Wextra -Werror -fdata-sections \
      -ffunction-sections -Wl,--gc-sections \
      -isystem /usr/local/boost/include \
      -fno-var-tracking-assignments \
      -g -Wall -O3 -c -o foo.o foo.cpp
    4.42user 0.15system 0:04.62elapsed 99%CPU (0avgtext+0avgdata 307180maxresident)k
    0inputs+14520outputs (0major+92908minor)pagefaults 0swaps

With the exact same compiler and input file, but with variable
tracking enabled, I saw times from 9.80 to 9.96s:

    /bin/time /usr/local/gcc/bin/g++ -I/usr/include/libxml2 \
      -std=c++0x -Woverloaded-virtual -Wold-style-cast -pthread \
      -rdynamic -D_FILE_OFFSET_BITS=64 -funsigned-char -I .. \
      -Wconversion -Wall -Wextra -Werror -fdata-sections \
      -ffunction-sections -Wl,--gc-sections \
      -isystem /usr/local/boost/include \
      -g -Wall -O3 -c -o foo.o foo.cpp
    9.91user 0.25system 0:10.22elapsed 99%CPU (0avgtext+0avgdata 416248maxresident)k
    0inputs+49128outputs (0major+148701minor)pagefaults 0swaps

Variable tracking definitely uses more memory (93k vs. 149k
pagefaults), but at no time was I swapping (always had >1GiB free).

Another case went from 4.10s to 5.70s when I enabled variable
tracking. Again, no swapping, and pagefaults went from 358k to 413k.

One attribute these two source files share is a relatively large
number of function-local static arrays.

The former has this structure:

    void check_condition_1(...) { ... }
    void check_condition_2(...) { ... }

    void check_scenario_1(...)
    {
        check_condition_1(...);
    }

    void check_scenario_2(...)
    {
        check_condition_1(...);
        check_condition_2(...);
    }

    void check_scenario_3(...)
    {
        check_condition_2(...);
    }

    void check(...)
    {
        typedef boost::function< void ( void ) > checker_func;

        struct scenario_checker
        {
            const char * name;
            checker_func checker;
        };

        static const scenario_checker checkers[] = {
            { "one", &check_scenario_1 },
            { "two", &check_scenario_2 },
            { "three", &check_scenario_3 },
            { 0, 0 }
        };

        for ( const scenario_checker * p = &checkers[0]; p->name; ++p )
            if ( situation == p->name )
            {
                p->checker(...);
                break;
            }
    }

While the latter has this structure:

    const Thing * * my_func()
    {
        const static Thing * things[] = {
            new Thing1(...),
            new Thing2(...),
            0
        };
        return things;
    }

I ran into this in a few other files; as mentioned above, the common
traits are long "flat" files, with a lot of repetitive boilerplate.

(I vaguely recall that, years ago, gcc had issues with
machine-generated source files that were likewise long and flat; no
human would ever write a switch statement with thousands of cases, but
preprocessors and code generators apparently did so all the time...)

Anyway. It might be that var tracking is already as efficient as it
can possibly be; in which case, I'll just continue disabling it for
the (very few) source files where it's an issue.

On the other hand, maybe something I said will trigger a gcc devel to
realize that there's an easy quadratic-to-linear algorithm fix. :)

Regardless, thanks very much for the excellent compiler.

Best regards,
Anthony Foiani
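
For anyone who wants to experiment with the first structure, a self-contained approximation might look like the sketch below. It is not the real file. The elided argument lists are simply left empty, std::function stands in for boost::function so that no Boost install is needed, and the call counters, the bool return value and the string parameter are hypothetical additions made only so the result can be checked. The real files repeat this pattern many times over, so a file this small shows the shape only and should not be expected to show the slowdown on its own.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical counters, added only so the dispatch can be observed.
static int g_calls_1 = 0;
static int g_calls_2 = 0;

// Stand-ins for the real condition checks, whose bodies are elided
// in the description above.
void check_condition_1() { ++g_calls_1; }
void check_condition_2() { ++g_calls_2; }

void check_scenario_1() { check_condition_1(); }
void check_scenario_2() { check_condition_1(); check_condition_2(); }
void check_scenario_3() { check_condition_2(); }

// Looks up `situation` in a function-local static table and runs the
// matching checker. Returns true if a matching entry was found.
bool check(const std::string& situation)
{
    // Assumption: std::function replaces boost::function here.
    typedef std::function<void()> checker_func;

    struct scenario_checker
    {
        const char* name;
        checker_func checker;
    };

    // Function-local static array, terminated by a null name.
    static const scenario_checker checkers[] = {
        { "one",   &check_scenario_1 },
        { "two",   &check_scenario_2 },
        { "three", &check_scenario_3 },
        { 0,       checker_func() }
    };

    for (const scenario_checker* p = &checkers[0]; p->name; ++p)
    {
        if (situation == p->name)
        {
            assert(p->checker);  // every named entry has a callable
            p->checker();
            return true;
        }
    }
    return false;
}
```

Saved as a .cpp file, this can be compiled with -std=c++0x -g -O3 -c, once with and once without -fno-var-tracking-assignments, in the same way as the timing commands above.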