Hi,

while dealing with autogenerated code I found that GCC often emits compare 
trees instead of branch tables for switch statements. As far as I can tell, 
GCC uses the density of the case labels within their spanned range as the key 
criterion: if the density is smaller than 10% (33% for size optimization), 
GCC emits compare instructions rather than a branch table. What I am wondering 
is why no real cost estimation is done here. In my case there are around 200 
to 1500 labels within a 16K range, so they are completely resolved to 
compares. The Intel compiler, in contrast, is more likely to emit branch 
tables.
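Just to make the criterion concrete, here is a small sketch of the check as I 
understand it (the function and parameter names are mine, not GCC's):

/* Sketch of the density heuristic as I understand it: a branch table
   is only used when the labels cover enough of their spanned range.  */
#include <stdbool.h>

static bool
use_branch_table (unsigned long n_labels, long min_label, long max_label,
                  bool optimize_size)
{
  unsigned long range = (unsigned long) (max_label - min_label) + 1;
  /* Labels must cover at least 10% of the range (33% when
     optimizing for size).  */
  unsigned long factor = optimize_size ? 3 : 10;
  return n_labels * factor >= range;
}

/* With my numbers: 200 to 1500 labels over a 16K range is a density of
   roughly 1.2% to 9.2%, so the check fails and compares are emitted.  */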

The question now is whether the factors 10 (3 for size optimization) were 
determined using any standard benchmark. And if this simple linear criterion 
is used to model what is really logarithmic behavior, shouldn't the factor at 
least be tuned against a standard benchmark (such as the upcoming SPEC CPU 
2006 INT)? Or should a real cost estimation be introduced?
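Even a very rough dynamic cost model would show the gap; something along these 
lines (again my own sketch, not GCC code, and the cost parameters are 
hypothetical):

/* Back-of-the-envelope dynamic cost per dispatch: a balanced compare
   tree needs about log2(n) compare+branch pairs, a branch table needs
   a bounds check plus one indirect jump regardless of n.  */
#include <math.h>

static double
compare_tree_cost (unsigned long n_labels, double cmp_branch_cost)
{
  return ceil (log2 ((double) n_labels)) * cmp_branch_cost;
}

static double
branch_table_cost (double bounds_check_cost, double indirect_jump_cost)
{
  return bounds_check_cost + indirect_jump_cost;
}

/* For 1500 labels the tree needs about 11 compare+branch pairs per
   dispatch, while the table dispatch stays constant (ignoring the
   indirect-branch misprediction penalty).  */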

For size optimization, a fairly accurate architecture-dependent size 
estimation should be possible.
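Something like the following could serve as a starting point (the parameter 
names are mine and the byte counts would have to come from the target backend; 
this is just a sketch):

/* Hypothetical per-construct size estimate.  */
static unsigned long
compare_tree_size (unsigned long n_labels, unsigned long cmp_branch_bytes)
{
  /* Roughly one compare+branch pair per label.  */
  return n_labels * cmp_branch_bytes;
}

static unsigned long
branch_table_size (unsigned long range, unsigned long entry_bytes,
                   unsigned long dispatch_bytes)
{
  /* One table entry per value in the spanned range, plus the
     bounds-check-and-dispatch sequence.  */
  return range * entry_bytes + dispatch_bytes;
}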

Opinions, comments?

Christian

PS: Please CC me.