Re: [FFmpeg-devel] [PATCH] 8-bit hevc decoding optimization on aarch64 with neon

Rafal Dabrowa Tue, 21 Nov 2017 06:24:48 -0800

On 11/21/2017 11:51 AM, Shengbin Meng wrote:

On 19 Nov 2017, at 01:35, Rafal Dabrowa <fatwild...@gmail.com> wrote:
This is a proposal of performance optimizations for 8-bit
hevc video decoding on the aarch64 platform with the neon (simd) extension.
Nice to see the work for aarch64!
We are also in the process of doing NEON optimization for HEVC decoding (http://ffmpeg.org/pipermail/ffmpeg-devel/2017-October/218233.html).
Now we are just about to finish the arm 32-bit work and are ready to send some patches out. Looks like for aarch64 we can join forces :) What do you think?

Why not. I started to work on aarch64 because my device, although it has a VPU, does not support hevc in that VPU. Hence the h264 format, even full HD, plays smoothly, but playback of hevc looks poor. I was curious how much hevc decoding might be optimized. I optimized one function, then another one...


Currently I'm focused on patch size reduction. But I'm open to cooperation.

The patch contains optimizations for the most heavily used qpel, epel, sao and idct
functions.  Among the functions provided for optimization there are two
intensively used, but not optimized in this patch: hevc_v_loop_filter_luma_8
and hevc_h_loop_filter_luma_8. I have no idea how they could be optimized,
hence I left them without optimizations.
I see that optimization for the loop filter already exists in the arm 32-bit code. Why not use that algorithm?

Maybe... Although optimization for aarch64 is a different story. I have noticed that gcc with the -O3 option on aarch64 produces really good code. I was surprised how much the code execution time is reduced in some cases. Sometimes it is hard to optimize code better than the compiler does.



Rafal Dabrowa
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
