On 28.07.2019, at 00:31, Michael Niedermayer <mich...@niedermayer.cc> wrote:
> This merges several byte operations and avoids some shifts inside the loop
>
> Improves: Timeout (330sec -> 134sec)
> Improves: 15599/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_MSZH_fuzzer-5658127116009472
>
> Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer <mich...@niedermayer.cc>
> ---
>  libavcodec/lcldec.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/libavcodec/lcldec.c b/libavcodec/lcldec.c
> index 104defa5f5..c3787b3cbe 100644
> --- a/libavcodec/lcldec.c
> +++ b/libavcodec/lcldec.c
> @@ -391,13 +391,13 @@ static int decode_frame(AVCodecContext *avctx, void *data, int *got_frame, AVPac
>              break;
>          case IMGTYPE_YUV422:
>              for (row = 0; row < height; row++) {
> -                for (col = 0; col < width - 3; col += 4) {
> +                for (col = 0; col < (width - 2)>>1; col += 2) {
>                      memcpy(y_out + col, encoded, 4);
>                      encoded += 4;
> -                    u_out[ col >> 1     ] = *encoded++ + 128;
> -                    u_out[(col >> 1) + 1] = *encoded++ + 128;
> -                    v_out[ col >> 1     ] = *encoded++ + 128;
> -                    v_out[(col >> 1) + 1] = *encoded++ + 128;
> +                    AV_WN16(u_out + col, AV_RN16(encoded) ^ 0x8080);
> +                    encoded += 2;
> +                    AV_WN16(v_out + col, AV_RN16(encoded) ^ 0x8080);
> +                    encoded += 2;

Huh? Surely the pixel stride used for y_out still needs to be double
the u/v one? With col now advancing in half-width steps, y_out + col
no longer moves 4 luma pixels per iteration even though 4 bytes are
copied into it.

I suspect doing only the AV_RN16/xor optimization might be best; the
one shift saved seems not worth the risk/complexity...

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".