------- Additional Comments From rth at gcc dot gnu dot org  2005-01-30 07:45 -------
That said, what are you expecting here for massage48?  On K8, the latency
of imul for a 32-bit register operand is 3 cycles.  Alternatively, we can
break this down into

	leal	(%eax,%eax,2), %eax
	sall	$4, %eax
	# Use eax unscaled

which is 2 cycles latency for the leal and one for the sall, 3 cycles in
total, so gcc rightly chooses to use the single multiply instruction, since
the alternative is no cheaper.  On pentium4, timings are different, and we
select the leal+sall sequence.

I don't see anything in this test case at all that could be made to use
more complex addressing modes.

As for the "Real World" code that uses a variable imul instead of a
constant, that would be more interesting to examine.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19680
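
For reference, the arithmetic behind the leal+sall sequence in the comment above can be sketched in C: 48 = 3 * 16, so the leal forms 3*x and the shift by 4 multiplies by 16. This is only an illustrative sketch, not the test case from the report. The function names (mul48_imul, mul48_lea_shift, check_mul48_samples) are hypothetical, and unsigned arithmetic is assumed so that wraparound is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: multiply by the constant 48 directly,
   which the compiler may emit as a single imul. */
static uint32_t mul48_imul(uint32_t x)
{
    return x * 48u;
}

/* Hypothetical name: the same product written as the two steps
   of the leal+sall sequence. */
static uint32_t mul48_lea_shift(uint32_t x)
{
    uint32_t t = x + x * 2u;   /* leal (%eax,%eax,2), %eax : t = 3 * x */
    return t << 4;             /* sall $4, %eax            : t * 16    */
}

/* Hypothetical helper: check that both forms agree on a few samples,
   including values that wrap around modulo 2^32. */
static void check_mul48_samples(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 5u, 1000u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(mul48_imul(samples[i]) == mul48_lea_shift(samples[i]));
}
```

Both functions compute the same value; which instruction sequence the compiler actually emits depends on the tuning target, as described in the comment.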