Hi,

On Fri, Jul 7, 2017 at 2:48 PM, Paul B Mahol <one...@gmail.com> wrote:
> typedef struct GetBitContext {
>      const uint8_t *buffer, *buffer_end;
> +#ifdef CACHED_BITSTREAM_READER
> +    uint64_t cache;
> +    unsigned bits_left;
> +#endif

Can you post some stats (from relevant systems, ideally, e.g. 32-bit binary on x86, or 32-bit arm) on how a 32-bit cache performs compared to a 64-bit cache on systems with HAVE_FAST_64BIT=0?

> +static inline void refill_32(GetBitContext *s)
> [..]
> +#ifdef BITSTREAM_READER_LE
> +    s->cache = (uint64_t)AV_RL32(s->buffer + (s->index >> 3)) <<
> s->cache | s->cache;

As said on IRC: the middle s->cache (the shift amount) should be s->bits_left.

Overall very nice improvement. In particular, I would not be surprised if this is generally faster for almost all users, except those using the lower-level macros (things like SHOW_BITS() etc.) in the old interface. If that's true, it may be worth enabling this by default and disabling it only for those using the low-level interface. (I'm assuming the low-level interface no longer works with the cached reader, so can we prevent users from accessing these macros unless cached=1?)

Ronald
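
For reference, the shift fix mentioned above can be illustrated with a minimal standalone sketch of a little-endian cached reader. This is not the patch itself and not FFmpeg's API: MiniBitReader, read_le32() (a stand-in for AV_RL32) and the mini_*() functions are hypothetical names used only for illustration, and the end-of-buffer handling is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for GetBitContext (illustration only). */
typedef struct MiniBitReader {
    const uint8_t *buffer, *buffer_end;
    size_t   index;     /* bit offset of the next bytes to load into the cache */
    uint64_t cache;     /* unread bits, next bit to return is the LSB */
    unsigned bits_left; /* number of valid bits currently in the cache */
} MiniBitReader;

/* Portable little-endian 32-bit load (stand-in for AV_RL32). */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]       | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static void mini_init(MiniBitReader *s, const uint8_t *buf, size_t size)
{
    s->buffer     = buf;
    s->buffer_end = buf + size;
    s->index      = 0;
    s->cache      = 0;
    s->bits_left  = 0;
}

/* Append 32 fresh bits above the bits already cached. The shift amount is
 * bits_left (the number of bits still in the cache), not the cache value.
 * Callers must ensure bits_left <= 32 so that nothing is shifted out. */
static void mini_refill_32(MiniBitReader *s)
{
    size_t byte_pos = s->index >> 3;

    /* Assumption: simply skip the refill when fewer than 4 bytes remain. */
    if (byte_pos + 4 > (size_t)(s->buffer_end - s->buffer))
        return;

    s->cache     |= (uint64_t)read_le32(s->buffer + byte_pos) << s->bits_left;
    s->index     += 32;
    s->bits_left += 32;
}

/* Read n bits (0..32), least significant bit first. */
static unsigned mini_get_bits(MiniBitReader *s, unsigned n)
{
    unsigned ret;

    if (n > s->bits_left)
        mini_refill_32(s);      /* here bits_left < n <= 32, so the shift is safe */
    if (n > s->bits_left)
        n = s->bits_left;       /* past the end: return what is left (may be 0 bits) */

    ret = (unsigned)(s->cache & ((UINT64_C(1) << n) - 1));
    s->cache    >>= n;
    s->bits_left -= n;
    return ret;
}
```

With the input bytes A5 3C FF 00 12 34 56 78, reading 4 bits and then 32 bits forces a refill while 28 bits are still cached, which is exactly the case where shifting by the wrong amount would corrupt the result.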