Re: [FFmpeg-devel] [aarch64] improve performance of ff_yuv2planeX_8_neon

2020-01-04 Thread Michael Niedermayer
On Sat, Jan 04, 2020 at 05:53:34PM +0100, Clément Bœsch wrote:
> On Tue, Dec 10, 2019 at 04:38:25PM -0600, Sebastian Pop wrote:
> > Hi,
> >
> > This patch rewrites the innermost loop of ff_yuv2planeX_8_neon to avoid
> > zips and horizontal adds by using fused multiply adds. The patch also use

Re: [FFmpeg-devel] [aarch64] improve performance of ff_yuv2planeX_8_neon

2020-01-04 Thread Clément Bœsch
On Tue, Dec 10, 2019 at 04:38:25PM -0600, Sebastian Pop wrote:
> Hi,
>
> This patch rewrites the innermost loop of ff_yuv2planeX_8_neon to avoid zips
> and horizontal adds by using fused multiply adds. The patch also uses ld1r to load
> one element and replicate it across all lanes of the vecto
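
For readers who have not seen the patch, the idea quoted above can be modelled in portable C. This is only a rough sketch, not the FFmpeg code or the NEON assembly from the patch: the function names, the 8-lane width, the omission of dithering, and the final shift by 19 are all assumptions made for illustration. The first function computes one dot product per output pixel, which in a vector implementation ends in a horizontal add. The second broadcasts each filter coefficient (the role ld1r plays in the patch description) and accumulates lane-wise across a group of pixels, so no horizontal add is needed.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8 /* hypothetical vector width: 8 output pixels per step */

/* Clamp to the 8-bit output range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Strategy A: one dot product over the filter taps per output pixel.
 * A vectorised version of this shape must sum across lanes at the end. */
static void plane_x_dot(const int16_t *filter, int filter_size,
                        const int16_t *const *src, uint8_t *dest, int dst_w)
{
    assert(filter_size > 0 && dst_w >= 0);
    for (int i = 0; i < dst_w; i++) {
        int val = 0;
        for (int j = 0; j < filter_size; j++)
            val += src[j][i] * filter[j];
        dest[i] = clip_u8(val >> 19); /* shift amount is an assumption */
    }
}

/* Strategy B: broadcast one coefficient, multiply-accumulate lane-wise.
 * Each lane holds the running sum for one pixel, so the result is
 * already per-pixel and no cross-lane addition is required. */
static void plane_x_broadcast(const int16_t *filter, int filter_size,
                              const int16_t *const *src, uint8_t *dest,
                              int dst_w)
{
    assert(filter_size > 0 && dst_w >= 0);
    for (int i = 0; i < dst_w; i += LANES) {
        int acc[LANES] = {0};
        int n = dst_w - i < LANES ? dst_w - i : LANES; /* tail handling */
        for (int j = 0; j < filter_size; j++) {
            int coeff = filter[j]; /* models a load-and-replicate */
            for (int l = 0; l < n; l++)
                acc[l] += src[j][i + l] * coeff; /* models a lane-wise MLA */
        }
        for (int l = 0; l < n; l++)
            dest[i + l] = clip_u8(acc[l] >> 19);
    }
}
```

Both functions perform the same arithmetic per pixel, so they produce identical output; the difference is only in how the work maps onto vector lanes.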

Re: [FFmpeg-devel] [aarch64] improve performance of ff_yuv2planeX_8_neon

2019-12-25 Thread Sebastian Pop
On Mon, Dec 16, 2019 at 3:56 PM Jean-Baptiste Kempf wrote:
>
> On Tue, Dec 10, 2019, at 23:38, Sebastian Pop wrote:
>> Please let me know how I can improve the patch.
>
> No remarks from me.
>
Clément, any further feedback to improve the patch? Ok to commit?

Thanks,
Sebastian

Re: [FFmpeg-devel] [aarch64] improve performance of ff_yuv2planeX_8_neon

2019-12-16 Thread Jean-Baptiste Kempf
On Tue, Dec 10, 2019, at 23:38, Sebastian Pop wrote:
> Please let me know how I can improve the patch.

No remarks from me.

--
Jean-Baptiste Kempf - President