On 17/08/12 16:20, Richard Earnshaw wrote:
> No, given a u16xu16->u64 operation in the code, and that the arch doesn't
> have such an opcode, I'd expect to get
>
> step1 -> (u32)u16 x (u32)u16 -> u64

Hmm, I would have thought that would be more costly than

(u64)(u16 x u16 -> u32)
You might be right, but then extends are often free, especially with unsigned types, so it's hard to say for sure.
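
For reference, here is a minimal C sketch of the two forms being compared. The function names are hypothetical, made up for illustration, and whether a compiler maps either one onto a distinct widening-multiply pattern depends on the target. Both compute the same value, since the product of two 16-bit unsigned values always fits in 32 bits, so the question is only one of cost.

```c
#include <stdint.h>

/* Form A (hypothetical name): zero-extend both operands to 32 bits first,
   then do a 32x32->64 widening multiply. */
static uint64_t mul_extend_first(uint16_t a, uint16_t b)
{
    uint32_t wa = a;
    uint32_t wb = b;
    return (uint64_t)wa * (uint64_t)wb;
}

/* Form B (hypothetical name): do a 16x16->32 widening multiply,
   then zero-extend the 32-bit product to 64 bits. */
static uint64_t mul_extend_last(uint16_t a, uint16_t b)
{
    uint32_t product = (uint32_t)a * (uint32_t)b;
    return (uint64_t)product;
}
```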
Did you reproduce one? It's a long time since I last looked at this stuff, so I could be confused.
Andrew