Signed-off-by: James Almer <jamr...@gmail.com> --- This makes ubsan happier, removing hundreds of runtime errors as seen in http://fatebeta.ffmpeg.org/report/x86_64-archlinux-gcc-ubsan/20160216235604 Tested on x86 and benched with no apparent speed loss, but the more people bench it the better as it's a very important speed critical module.
Could someone with a big endian system that has fast unaligned loads test if there's any difference in speed now? Since the slower codepath will not be used anymore for those with this patch. I think ppc is one. That aside, note that these runtime erros happen with every test using md5 and don't make ubsan register them as failed. In the link above something else made it fail. It makes me wonder how many other ubsan errors are hidden this way. libavutil/md5.c | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/libavutil/md5.c b/libavutil/md5.c index 876bd55..5d9c8f0 100644 --- a/libavutil/md5.c +++ b/libavutil/md5.c @@ -83,11 +83,11 @@ static const uint32_t T[64] = { // T[i]= fabs(sin(i+1)<<32) a += T[i]; \ \ if (i < 32) { \ - if (i < 16) a += (d ^ (b & (c ^ d))) + X[ i & 15]; \ - else a += ((d & b) | (~d & c)) + X[(1 + 5*i) & 15]; \ + if (i < 16) a += (d ^ (b & (c ^ d))) + AV_RL32(X+( i & 15)); \ + else a += ((d & b) | (~d & c)) + AV_RL32(X+((1 + 5*i) & 15)); \ } else { \ - if (i < 48) a += (b ^ c ^ d) + X[(5 + 3*i) & 15]; \ - else a += (c ^ (b | ~d)) + X[( 7*i) & 15]; \ + if (i < 48) a += (b ^ c ^ d) + AV_RL32(X+((5 + 3*i) & 15)); \ + else a += (c ^ (b | ~d)) + AV_RL32(X+(( 7*i) & 15)); \ } \ a = b + (a << t | a >> (32 - t)); \ } while (0) @@ -106,11 +106,6 @@ static void body(uint32_t ABCD[4], uint32_t *src, int nblocks) X = src + n * 16; -#if HAVE_BIGENDIAN - for (i = 0; i < 16; i++) - X[i] = av_bswap32(X[i]); -#endif - #if CONFIG_SMALL for (i = 0; i < 64; i++) { CORE(i, a, b, c, d); @@ -164,7 +159,7 @@ void av_md5_update(AVMD5 *ctx, const uint8_t *src, int len) } end = src + (len & ~63); - if (HAVE_BIGENDIAN || (!HAVE_FAST_UNALIGNED && ((intptr_t)src & 3))) { + if (!HAVE_FAST_UNALIGNED && ((intptr_t)src & 3)) { while (src < end) { memcpy(ctx->block, src, 64); body(ctx->ABCD, (uint32_t *) ctx->block, 1); -- 2.7.0 _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel