On Wed, Mar 15, 2017 at 01:14:42PM +0100, Matthieu Bouron wrote:
> On Mon, Mar 06, 2017 at 03:48:57PM +0100, Matthieu Bouron wrote:
> > On Thu, Feb 23, 2017 at 04:59:16PM +0100, Matthieu Bouron wrote:
> > > Hello,
> > >
> > > The following patchset adds the ff_simple_idct NEON functions for the
> > > aarch64 platform. It is ported from the armv7 simple_idct_neon with some
> > > improvements:
> > >   * the source idct blocks are now loaded once and kept in v24-v31
> > >   * the source idct blocks are no longer overridden in idct_col4_top
> > >   * the destination is now written in one pass at the end of
> > >     ff_simple_idct{,_put,_add}_neon
> > >
> > > It is bitexact with the armv7 neon implementation.
> > >
> > > Here are some results (reported by {START,STOP}_TIMER) on an Odroid-C2
> > > (Cortex A53):
> > >
> > > Functions             IDCT: simple    IDCT: simpleneon
> > > ff_simple_idct_put    9795 units      3170 units
> > > ff_simple_idct_add    10227 units     3302 units
> > >
> >
> > Ping.
>
> I'd like to push the patch tomorrow if there is no objection.
>
> If that helps, here is the output of mjpegdec with the simple and
> simpleneon idct methods.
>
> Original: http://0x5c.me/idct/original.jpg
> Simple: http://0x5c.me/idct/simplec.png
> Simpleneon: http://0x5c.me/idct/simpleneon.png
>
> The diff between simple and simpleneon shows some off-by-1
> differences: http://0x5c.me/idct/diff.png (simpleneon aarch64 is bitexact
> with its armv7 counterpart though).

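For reference, the off-by-1 observation above could be checked with a small comparison like the sketch below. It is not part of the patchset: max_abs_diff and count_diff are hypothetical helper names, and the two buffers are assumed to be 8-bit sample planes of equal length, one decoded with each idct method.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: largest absolute per-sample difference between two
 * 8-bit buffers of the same length (e.g. one decoded plane per idct). */
static int max_abs_diff(const uint8_t *a, const uint8_t *b, size_t n)
{
    int max = 0;
    for (size_t i = 0; i < n; i++) {
        int d = abs((int)a[i] - (int)b[i]);
        if (d > max)
            max = d;
    }
    return max;
}

/* Hypothetical helper: number of samples that differ at all. */
static size_t count_diff(const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (a[i] != b[i]);
    return count;
}
```

A max_abs_diff result of 1 between the simple and simpleneon outputs would be consistent with the off-by-1 differences mentioned above, while 0 between the aarch64 and armv7 outputs would correspond to the bitexact claim.
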
Patchset pushed.

Matthieu