On Fri, May 17, 2019 at 08:13:51PM +0200, Reimar Döffinger wrote:
> On Fri, May 17, 2019 at 08:09:45PM +1000, Peter Ross wrote:
> > ah, i see what you did there! it works perfectly, just missing
> > UPDATE_CACHE at the start and in the loop.
> >
> > test results on i7 decoding 3 minute long 4k video with vp4.
>
> Looks fairly close to noise to me, though for me
> it seemed a bit more obvious how the encoding
> works from it (which was the primary reason to suggest it).
> If one really wanted to optimize it for performance the
> arrangement of the conditions can probably be improved, e.g.
> the 0x1ff check is now the very first one even though
> it is the least likely one (but avoids duplicating code
> or needing crazy goto or loop constructs and thus
> is more readable), and depending on probabilities
> doing the range checks in a more tree-like structure
> might also be better.
> But as said, optimizing this has probably at most
> curiosity value :)

i like your while loop, it makes it more obvious.

i don't have enough source sequences (or interest) to generate those
probabilities.

another run, on original raspberry pi with 720p sequence.

vp4 patch v3:

bench: utime=110.393s stime=5.232s rtime=115.835s
bench: utime=112.869s stime=4.981s rtime=118.235s
bench: utime=111.737s stime=5.220s rtime=117.168s
bench: utime=113.265s stime=5.250s rtime=118.730s
bench: utime=112.638s stime=5.120s rtime=117.938s
bench: utime=110.732s stime=5.190s rtime=116.133s
bench: utime=111.218s stime=5.013s rtime=116.444s
bench: utime=111.096s stime=4.768s rtime=116.076s
bench: utime=111.318s stime=5.073s rtime=116.603s

your patch + UPDATE_CACHE:

bench: utime=111.583s stime=5.421s rtime=117.251s
bench: utime=112.145s stime=4.799s rtime=117.155s
bench: utime=111.235s stime=5.552s rtime=116.967s
bench: utime=112.169s stime=5.248s rtime=117.628s
bench: utime=112.424s stime=5.178s rtime=117.813s
bench: utime=112.721s stime=5.182s rtime=118.115s
bench: utime=112.753s stime=5.162s rtime=118.125s
bench: utime=111.587s stime=5.267s rtime=117.065s
bench: utime=112.641s stime=4.952s rtime=117.805s

averaging the results, your patch is only marginally slower, but
basically no difference.

converting your patch to use ordinary get_bits/show_bits however does
make it slower. average approx 0.4 seconds slower:

bench: utime=114.315s stime=5.292s rtime=119.820s
bench: utime=111.625s stime=5.481s rtime=117.317s
bench: utime=112.706s stime=5.063s rtime=117.982s
bench: utime=111.707s stime=5.221s rtime=117.137s
bench: utime=113.653s stime=5.451s rtime=119.318s
bench: utime=113.559s stime=5.332s rtime=119.104s
bench: utime=113.149s stime=5.261s rtime=118.621s
bench: utime=112.254s stime=5.401s rtime=117.868s
bench: utime=112.954s stime=5.532s rtime=118.698s

-- Peter
(A907 E02F A6E5 0CD2 34CD 20D2 6760 79C5 AC40 DD6B)
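
for anyone wanting to redo the averaging on the bench lines above, here is a minimal sketch in python. it assumes the lines were saved as plain text in the "bench: utime=...s stime=...s rtime=...s" form shown; `BENCH_RE` and `average_bench` are hypothetical names made up for this example, not part of ffmpeg.

```python
import re
from statistics import mean

# Matches the three timings on one bench line (hypothetical helper regex).
BENCH_RE = re.compile(r"utime=([\d.]+)s\s+stime=([\d.]+)s\s+rtime=([\d.]+)s")


def average_bench(log: str) -> dict:
    """Return the run count and mean utime/stime/rtime for a block of bench lines."""
    rows = [tuple(float(v) for v in m.groups()) for m in BENCH_RE.finditer(log)]
    if not rows:
        raise ValueError("no bench lines found")
    utime, stime, rtime = zip(*rows)
    return {
        "runs": len(rows),
        "utime": mean(utime),
        "stime": mean(stime),
        "rtime": mean(rtime),
    }


if __name__ == "__main__":
    # First three runs of "vp4 patch v3", copied from the message above.
    sample = """
    bench: utime=110.393s stime=5.232s rtime=115.835s
    bench: utime=112.869s stime=4.981s rtime=118.235s
    bench: utime=111.737s stime=5.220s rtime=117.168s
    """
    result = average_bench(sample)
    print("runs={runs} utime={utime:.3f}s stime={stime:.3f}s rtime={rtime:.3f}s".format(**result))
```

running it once per configuration and comparing the means is enough for a rough check. with only nine runs each and a spread of a couple of seconds between runs, small differences are hard to tell apart from noise.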
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".