Menezes, Evandro wrote:
> Each HPC application tends to be unlike others, making it difficult
> to optimize GCC for an elusive typical FP application. Not that
> there isn't room for improvement though.
The performance of almost any HPC application can be traced to a few key loops, and every critical loop has different optimization characteristics -- what improves performance on algorithm A may degrade performance on algorithm B. See my Acovea work for many supporting examples.

My conclusion is that composite switches like -O2 are good only for general-purpose code. Anyone explicitly interested in squeezing out the most performance needs to analyze their own application and use application-specific switches.

..Scott
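
A minimal sketch of the kind of per-loop analysis described above, in Python. It assumes gcc is on the PATH and that each critical loop has been extracted into a standalone C file with its own main(). The names CANDIDATES, time_kernel, pick_fastest and tune are hypothetical, and the flag sets are only an illustrative selection of existing GCC options. This is a brute-force comparison of a few hand-picked sets, not the evolutionary search that Acovea performs.

```python
import subprocess
import tempfile
import time
from pathlib import Path

# Illustrative candidate flag sets (assumption: chosen by hand for this sketch).
CANDIDATES = [
    ["-O2"],
    ["-O3"],
    ["-O2", "-funroll-loops"],
    ["-O3", "-ffast-math"],
]


def pick_fastest(timings):
    """Return the flag set (a tuple) with the lowest measured time."""
    return min(timings, key=timings.get)


def time_kernel(source, flags, runs=3):
    """Compile `source` with gcc and `flags`; return the best wall-clock
    time in seconds over `runs` executions of the resulting binary."""
    with tempfile.TemporaryDirectory() as tmp:
        exe = Path(tmp) / "kernel"
        subprocess.run(["gcc", *flags, source, "-o", str(exe), "-lm"], check=True)
        best = float("inf")
        for _ in range(runs):
            start = time.perf_counter()
            subprocess.run([str(exe)], check=True)
            best = min(best, time.perf_counter() - start)
        return best


def tune(source, candidates=CANDIDATES):
    """Time every candidate flag set on one kernel and report the winner."""
    timings = {tuple(flags): time_kernel(source, flags) for flags in candidates}
    return pick_fastest(timings), timings
```

Calling tune("loop_a.c") and tune("loop_b.c") on two different kernels (hypothetical file names) may well return different winners, which is the point: the best switches are a property of the loop, not of the compiler.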