Michael Niedermayer:
> Signed-off-by: Michael Niedermayer <mich...@niedermayer.cc>
> ---
>  libavcodec/get_bits.h | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/libavcodec/get_bits.h b/libavcodec/get_bits.h
> index cfcf97c021c..86cea00494a 100644
> --- a/libavcodec/get_bits.h
> +++ b/libavcodec/get_bits.h
> @@ -581,8 +581,12 @@ static inline const uint8_t *align_get_bits(GetBitContext *s)
>              n     = table[index].len;                           \
>                                                                  \
>              if (max_depth > 1 && n < 0) {                       \
> -                LAST_SKIP_BITS(name, gb, bits);                 \
> -                UPDATE_CACHE(name, gb);                         \
> +                if (av_builtin_constant_p(bits <= MIN_CACHE_BITS/2) && bits <= MIN_CACHE_BITS/2) { \
> +                    SKIP_BITS(name, gb, bits);                  \
> +                } else {                                        \
> +                    LAST_SKIP_BITS(name, gb, bits);             \
> +                    UPDATE_CACHE(name, gb);                     \
> +                }                                               \
>                                                                  \
>                  nb_bits = -n;                                   \
>                                                                  \

This is problematic: The GET_VLC macro does not presume that
MIN_CACHE_BITS bits are available in the cache, and there is code that
uses GET_VLC directly instead of get_vlc2().

I had the same idea when I made my VLC patchset, but I wanted to apply
that patchset first (which I forgot to do). While investigating the
above issue, I found that every user of GET_VLC calls UPDATE_CACHE
immediately before GET_VLC, so UPDATE_CACHE should be moved into
GET_VLC; furthermore, no user of GET_VLC relies on the reloads inside
GET_VLC. The patches for this are here:
https://github.com/mkver/FFmpeg/commits/vlc
Shall I send them?

Note that making GET_VLC more standalone enables improvements over the
current approach, but it will not lead to optimal code. E.g. the VLCs in
decode_alpha_block() in speedhqdec.c are so short that one could read
both VLCs with only one UPDATE_CACHE(). Another example is mjpegdec.c,
which currently does this:

    GET_VLC(code, re, &s->gb, s->vlcs[1][ac_index].table, 9, 2);

    i += ((unsigned)code) >> 4;
    code &= 0xf;
    if (code) {
        if (code > MIN_CACHE_BITS - 16)
            UPDATE_CACHE(re, &s->gb);

        {
            int cache = GET_CACHE(re, &s->gb);
            int sign  = (~cache) >> 31;
            level     = (NEG_USR32(sign ^ cache,code) ^ sign) - sign;
        }

        LAST_SKIP_BITS(re, &s->gb, code);

Because of the reloads in GET_VLC, there will always be at least
MIN_CACHE_BITS - 9 (= 16) bits available after GET_VLC, so one can read
code (<= 15) bits without updating the cache at all (the 16 in
MIN_CACHE_BITS - 16 is the maximum length of a VLC code used here).
This will no longer be possible with this optimization.

Btw: I am surprised that there is a branch before UPDATE_CACHE instead
of an unconditional UPDATE_CACHE. I also do not really see why this code
uses these macros directly and not the functions.
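
To make the bit-budget argument concrete, here is a toy model of it.
This is not FFmpeg code: min_bits_left() and its parameters are
hypothetical names, and MIN_CACHE_BITS = 25 is assumed because that is
what "MIN_CACHE_BITS - 9 (= 16)" implies. It only counts the worst-case
number of valid cache bits left after a two-level VLC read that starts
right after a reload.

```c
#include <assert.h>

/* Assumed value, implied by "MIN_CACHE_BITS - 9 (= 16)" above. */
#define MIN_CACHE_BITS 25

/*
 * Hypothetical helper, for illustration only.
 * Worst-case number of valid cache bits left after a two-level VLC read
 * that begins with exactly MIN_CACHE_BITS valid bits in the cache.
 *
 *   bits           - index width of the first-level table
 *   max_len        - longest code in the VLC (at most 2 * bits here)
 *   reload_between - nonzero if the cache is refilled between the levels
 */
static int min_bits_left(int bits, int max_len, int reload_between)
{
    int left_short, after_first, left_long;

    assert(bits > 0 && max_len >= bits && max_len <= 2 * bits);

    /* Code resolved in the first level: at most `bits` bits consumed. */
    left_short = MIN_CACHE_BITS - bits;

    /* Long code: `bits` consumed by the first level, then up to
     * max_len - bits by the second level. A reload in between restores
     * the cache to at least MIN_CACHE_BITS valid bits. */
    after_first = reload_between ? MIN_CACHE_BITS : MIN_CACHE_BITS - bits;
    left_long   = after_first - (max_len - bits);

    return left_short < left_long ? left_short : left_long;
}
```

With the mjpegdec parameters (bits = 9, max_len = 16), the model gives
16 bits left when the reload between the levels is kept and only 9 when
it is skipped, which is fewer than the up to 15 bits read afterwards.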

Given my objection to your patch #1, magicyuv will not benefit from
this. A different approach (see
https://github.com/mkver/FFmpeg/commit/9b5a977957968c0718dea55a5b15f060ef6201dc)
is to add a get_vlc() that takes as parameters the number of bits used
to create the VLC and a compile-time upper bound for the maximum length
of a VLC code, instead of the maximum depth of the VLC.

Reading VLCs with the cached bitstream reader can btw also be improved:
https://github.com/mkver/FFmpeg/commit/fba57506a9cf6be2f4aa5eeee7b10d54729fd92a

- Andreas