On 9/8/2017 6:47 PM, Kieran Kunhya wrote:
> On Fri, 8 Sep 2017 at 22:29 Michael Niedermayer <mich...@niedermayer.cc>
> wrote:
>
>> Speeds code up from 50sec to 15sec
>>
>> Fixes Timeout
>> Fixes: 3242/clusterfuzz-testcase-5811951672229888
>>
>> Found-by: continuous fuzzing process
>> https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
>> Signed-off-by: Michael Niedermayer <mich...@niedermayer.cc>
>> ---
>>  libavcodec/scpr.c | 11 ++++++++++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/libavcodec/scpr.c b/libavcodec/scpr.c
>> index 37fbe7a106..2ef63a7bf8 100644
>> --- a/libavcodec/scpr.c
>> +++ b/libavcodec/scpr.c
>> @@ -827,7 +827,16 @@ static int decode_frame(AVCodecContext *avctx, void *data, int *got_frame,
>>              return ret;
>>
>>          for (y = 0; y < avctx->height; y++) {
>> -            for (x = 0; x < avctx->width * 4; x++) {
>> +            if (!(((uintptr_t)dst) & 7)) {
>> +                uint64_t *dst64 = (uint64_t *)dst;
>> +                int w = avctx->width>>1;
>> +                for (x = 0; x < w; x++) {
>> +                    dst64[x] = (dst64[x] << 3) & 0xFCFCFCFCFCFCFCFCULL;
>> +                }
>> +                x *= 8;
>> +            } else
>> +                x = 0;
>> +            for (; x < avctx->width * 4; x++) {
>>                  dst[x] = dst[x] << 3;
>>              }
>>              dst += frame->linesize[0];
>> --
>> 2.14.1
>>
>
> This is as clear as mud.

It processes eight bytes at a time if the buffer is 8-byte aligned, then handles any remaining bytes one at a time. If the buffer is unaligned, it processes everything one byte at a time, as it did before.

See ff_h2645_extract_rbsp() and add_bytes_c() for other examples of this optimization.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
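
For anyone who finds the hunk hard to follow, here is a minimal standalone sketch of the same idea, outside of scpr.c. The function name shift_row_left3() and its len parameter are made up for illustration and are not FFmpeg API. Two things differ from the quoted patch on purpose: the word is loaded and stored with memcpy() rather than through a cast pointer, and the mask is 0xF8 per byte (the patch uses 0xFC), since 0xF8 is what a per-byte "<< 3" yields for arbitrary input.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of FFmpeg): shift every byte of one
 * row left by 3, giving the same result as the plain byte loop. */
static void shift_row_left3(uint8_t *dst, size_t len)
{
    size_t x = 0;

    /* Fast path: only taken when dst is 8-byte aligned. */
    if (!((uintptr_t)dst & 7)) {
        size_t words = len / 8;

        for (size_t i = 0; i < words; i++) {
            uint64_t v;

            /* memcpy keeps the 64-bit access well-defined under the
             * strict-aliasing rules. */
            memcpy(&v, dst + 8 * i, sizeof(v));
            /* Shift all eight bytes at once; the mask clears the three
             * low bits of each byte, which is where bits shifted out
             * of the neighbouring byte would otherwise end up. */
            v = (v << 3) & 0xF8F8F8F8F8F8F8F8ULL;
            memcpy(dst + 8 * i, &v, sizeof(v));
        }
        x = words * 8;
    }

    /* Tail of an aligned row, or the whole row if it is unaligned. */
    for (; x < len; x++)
        dst[x] = (uint8_t)(dst[x] << 3);
}
```

In the decoder this would correspond to one call per row with len = avctx->width * 4, followed by dst += frame->linesize[0]. The mask is the same in every byte, so the result does not depend on endianness.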