------- Additional Comments From rth at gcc dot gnu dot org  2005-01-30 07:45 -------
That said, what are you expecting here for massage48?  On K8, the latency
of imul for a 32-bit register operand is 3 cycles.  Alternatively, we can
break this down into

   leal (%eax,%eax,2), %eax
   sall $4, %eax
   # Use eax unscaled

Which is 2 cycles latency for the leal and one for the sall, 3 in total, the
same as the imul.  So gcc rightly chooses to use the single multiply
instruction, since the alternative is no cheaper.  On Pentium 4, timings are
different, and we select the leal+sall
sequence.  I don't see anything in this test case at all that could be made
to use more complex addressing modes.
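
For reference, here is a minimal C sketch of why the two forms are
interchangeable. It is not the code from the test case: the function names
are hypothetical, and unsigned 32-bit arithmetic is assumed so that
wraparound is well defined. It only shows that x + 2*x followed by a left
shift of 4 gives the same product as multiplying by 48.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the multiply by 48: a single imul. */
static uint32_t mul48_imul(uint32_t x)
{
    return x * 48u;
}

/* The same product built the way the two-instruction sequence does it:
 *   leal (%eax,%eax,2), %eax   ->  x + x*2 = 3x
 *   sall $4, %eax              ->  3x << 4 = 48x
 */
static uint32_t mul48_lea_shift(uint32_t x)
{
    uint32_t t = x + (x << 1);  /* 3x */
    return t << 4;              /* 3x * 16 = 48x */
}

/* Checks that both forms agree, including a value that wraps mod 2^32. */
static void check_mul48(void)
{
    for (uint32_t x = 0; x < 100000u; x++)
        assert(mul48_imul(x) == mul48_lea_shift(x));
    assert(mul48_imul(UINT32_MAX) == mul48_lea_shift(UINT32_MAX));
}
```

Which of the two the compiler emits for the first function is then purely
a question of the cost tables for the selected CPU, as described above.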

As for the "Real World" code that uses a variable imul instead of a constant,
that would be more interesting to examine.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19680
