On Fri, Jul 11, 2014 at 12:14:28AM +0100, Ben Avison wrote:
> The previous implementation targeted DTS Coherent Acoustics, which only
> requires mdct_bits == 6. This relatively small size lent itself to
> unrolling the loops a small number of times, and encoding offsets
> calculated at assembly time within the load/store instructions of each
> iteration.
>
> In the more general case (codecs such as AAC and AC3) much larger arrays
> are used - mdct_bits == [8, 9, 11]. The old method does not scale for
> these cases, so more integer registers are used with non-unrolled versions
> of the loops (and with some stack spillage). The postrotation filter loop
> is still unrolled by a factor of 2 to permit the double-buffering of some
> VFP registers to facilitate overlap of neighbouring iterations.
>
> I benchmarked the result by measuring the number of gperftools samples
> that hit anywhere in the AAC decoder (starting from aac_decode_frame())
> or specifically in ff_imdct_half_c / ff_imdct_half_vfp, for the same
> example AAC stream:
>
>                    Before          After
>                    Mean    StdDev  Mean    StdDev  Confidence  Change
> aac_decode_frame   2368.1  35.8    2117.2  35.3    100.0%      +11.8%
> ff_imdct_half_*    457.5   22.4    251.2   16.2    100.0%      +82.1%
> ---
>  libavcodec/arm/mdct_vfp.S |  146 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 144 insertions(+), 2 deletions(-)
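
For readers checking the quoted table: the Change column appears to be the speedup ratio of the mean sample counts, i.e. (before / after - 1). That formula is an assumption, since the patch description does not state it, and the helper name `speedup_percent` below is purely illustrative. A minimal sketch that recomputes the column from the published means (small differences in the last digit are expected, because the means are themselves rounded):

```python
def speedup_percent(before_mean: float, after_mean: float) -> float:
    """Percentage speedup implied by two mean sample counts.

    Assumes Change = before / after - 1; this formula is inferred
    from the table, not stated in the patch description.
    """
    return (before_mean / after_mean - 1.0) * 100.0


# Mean gperftools sample counts copied from the quoted table.
samples = {
    "aac_decode_frame": (2368.1, 2117.2),
    "ff_imdct_half_*": (457.5, 251.2),
}

for name, (before, after) in samples.items():
    print(f"{name:18s} {speedup_percent(before, after):+.2f}%")
```

Under this reading, fewer samples after the patch means less time spent in the function, so a positive percentage is an improvement.
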
applied thanks

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The real ebay dictionary, page 3
"Rare item" - "Common item with rare defect or maybe just a lie"
"Professional" - "'Toy' made in china, not functional except as doorstop"
"Experts will know" - "The seller hopes you are not an expert"
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel