On Tue, Apr 12, 2011 at 11:31:20PM +0100, Peter Maydell wrote:
> On 12 April 2011 22:32, Aurelien Jarno <aurel...@aurel32.net> wrote:
> > On Mon, Apr 11, 2011 at 04:32:08PM +0100, Peter Maydell wrote:
>
> >> @@ -1524,12 +1528,12 @@ uint64_t HELPER(neon_abdl_u16)(uint32_t a,
> >> uint32_t b)
> >>  {
> >>      uint64_t tmp;
> >>      uint64_t result;
> >> -    DO_ABD(result, a, b, uint8_t);
> >> -    DO_ABD(tmp, a >> 8, b >> 8, uint8_t);
> >> +    DO_ABD(result, a, b, uint8_t, uint32_t);
> >> +    DO_ABD(tmp, a >> 8, b >> 8, uint8_t, uint32_t);
> >>      result |= tmp << 16;
> >> -    DO_ABD(tmp, a >> 16, b >> 16, uint8_t);
> >> +    DO_ABD(tmp, a >> 16, b >> 16, uint8_t, uint32_t);
> >>      result |= tmp << 32;
> >> -    DO_ABD(tmp, a >> 24, b >> 24, uint8_t);
> >> +    DO_ABD(tmp, a >> 24, b >> 24, uint8_t, uint32_t);
> >>      result |= tmp << 48;
> >>      return result;
> >>  }
> >
> > Do we really need a 32-bit type for the computation here?
>
> No, anything wider than 8 will do, but my guess was that in
> practice 32 bits would be fractionally more efficient than
> unnecessarily forcing 16 bit arithmetic. For that matter I
> guess we could just say "int" and "unsigned int" since C
> guarantees us at least 16 bits there.

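To illustrate the "anything wider than 8 will do" point, here is a
minimal sketch, not taken from the patch. The macro body is an assumed
reconstruction of a five-argument DO_ABD (inputs truncated to the
element type, then widened to the arithmetic type), and the helper
names abd_u8_via16, abd_u8_via32 and abd_widths_agree are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shape of the macro: truncate each input to intype, widen it
 * to arithtype, then take the absolute difference in arithtype. */
#define DO_ABD(dest, x, y, intype, arithtype) do {                  \
        arithtype tmp_x = (intype)(x);                              \
        arithtype tmp_y = (intype)(y);                              \
        (dest) = (tmp_x > tmp_y) ? tmp_x - tmp_y : tmp_y - tmp_x;   \
    } while (0)

/* Absolute difference of the low bytes, using 16-bit arithmetic. */
static uint64_t abd_u8_via16(uint32_t a, uint32_t b)
{
    uint64_t r;
    DO_ABD(r, a, b, uint8_t, uint16_t);
    return r;
}

/* Same operation, using 32-bit arithmetic as in the patch. */
static uint64_t abd_u8_via32(uint32_t a, uint32_t b)
{
    uint64_t r;
    DO_ABD(r, a, b, uint8_t, uint32_t);
    return r;
}

/* Returns 1 if both arithmetic widths agree for every byte pair. */
static int abd_widths_agree(void)
{
    for (uint32_t a = 0; a < 256; a++) {
        for (uint32_t b = 0; b < 256; b++) {
            if (abd_u8_via16(a, b) != abd_u8_via32(a, b)) {
                return 0;
            }
        }
    }
    return 1;
}
```

Under these assumptions, a harness can call assert(abd_widths_agree())
to confirm that the 16-bit and 32-bit variants give the same result for
8-bit elements. The choice between them is then only about the code the
compiler generates.
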
My guess was that in 2011 a compiler can optimize that itself if it is
faster, and so it should be presented with the size that is really
needed. That turns out to be the case on x86_64, but not on arm or
ia64.

I have therefore applied this patch as is.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurel...@aurel32.net                 http://www.aurel32.net