The code looks good to me. I think the wrapper is fine, because that part of
the code is not suitable for NEON assembly.
But you can remove the use of `sizeof(uint8_t)`, as suggested by Carl.
Shengbin Meng
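To illustrate why a C wrapper can make sense here, below is a minimal sketch for the 8-bit band case. It is not the code from the patch: `sao_band_wrapper` and `sao_band_kernel` are hypothetical names, the kernel is a plain C stand-in for the NEON routine, and the 32-band layout with four signalled offsets is assumed from the HEVC SAO band filter. The wrapper does the small, branchy table setup in C and leaves only the per-pixel loop to the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit pixel range. */
static uint8_t clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical stand-in for the assembly kernel: a pure per-pixel loop
 * that looks up the offset for each pixel's band (pixel >> 3 for 8-bit). */
static void sao_band_kernel(uint8_t *dst, const uint8_t *src,
                            ptrdiff_t stride_dst, ptrdiff_t stride_src,
                            int width, int height,
                            const int8_t *offset_table)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst[x] = clip_uint8(src[x] + offset_table[src[x] >> 3]);
        dst += stride_dst;
        src += stride_src;
    }
}

/* Hypothetical wrapper: builds the 32-entry offset table in C, then
 * hands the regular, data-parallel part to the kernel. */
static void sao_band_wrapper(uint8_t *dst, const uint8_t *src,
                             ptrdiff_t stride_dst, ptrdiff_t stride_src,
                             const int16_t *sao_offset_val,
                             int sao_left_class, int width, int height)
{
    int8_t offset_table[32] = { 0 };

    /* Only four consecutive bands, starting at sao_left_class, get an offset. */
    for (int k = 0; k < 4; k++)
        offset_table[(k + sao_left_class) & 31] = (int8_t)sao_offset_val[k + 1];

    sao_band_kernel(dst, src, stride_dst, stride_src,
                    width, height, offset_table);
}
```

The table setup runs once per call and touches only four entries, so there is little to gain from writing it in assembly; the per-pixel loop is where SIMD would pay off.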
> On 19 Mar 2018, at 12:41, Yingming Fan wrote:
>
> Hi, is there any review about this pat
Hi,
With the checkasm benchmark, I can see a speedup of ~3x for band mode and ~6x
for edge mode on my device (the device has an aarch64 CPU, but I configured
ffmpeg with `--arch=arm`). FATE passed as well.
Results of a checkasm run:
$ ./tests/checkasm/checkasm --test=hevc_sao --bench
$ sudo ./tests/c
Hi, is there any review of this patch? What is your opinion about the wrapper
we used in this patch?
Yingming Fan
> On 11 Mar 2018, at 8:59 PM, Yingming Fan wrote:
>
>
>> On 11 Mar 2018, at 8:54 PM, Carl Eugen Hoyos wrote:
>>
>> 2018-03-08 8:03 GMT+01:00 Yingming Fan :
>>> From: Meng Wang
>>
> On 11 Mar 2018, at 8:54 PM, Carl Eugen Hoyos wrote:
>
> 2018-03-08 8:03 GMT+01:00 Yingming Fan :
>> From: Meng Wang
>
>> +stride_dst /= sizeof(uint8_t);
>> +stride_src /= sizeof(uint8_t);
>
> FFmpeg requires sizeof(uint8_t) to be 1, please simplify
> your patch accordingly.
>
> Why
2018-03-08 8:03 GMT+01:00 Yingming Fan :
> From: Meng Wang
> +stride_dst /= sizeof(uint8_t);
> +stride_src /= sizeof(uint8_t);
FFmpeg requires sizeof(uint8_t) to be 1, please simplify
your patch accordingly.
Why is the wrapper function needed?
Carl Eugen
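As a small illustration of the point about `sizeof(uint8_t)`: the sketch below compares the division from the patch with simply using the stride unchanged. The function names are hypothetical and only exist for this example; the compile-time check states the assumption that `uint8_t` occupies exactly one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compile-time statement of the assumption: uint8_t is one byte. */
static_assert(sizeof(uint8_t) == 1, "uint8_t is expected to be one byte");

/* Hypothetical helper mirroring the lines from the patch. */
static ptrdiff_t stride_as_written(ptrdiff_t stride)
{
    stride /= sizeof(uint8_t);  /* divides by 1, so the value is unchanged */
    return stride;
}

/* Hypothetical helper showing the simplified form: no division at all. */
static ptrdiff_t stride_simplified(ptrdiff_t stride)
{
    return stride;
}
```

Since both helpers return the same value for any stride, the two `/=` lines can be dropped without changing behavior.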
Hi, there.
I have already pushed a patch that adds hevc_sao to checkasm, and that patch
was accepted.
You can verify this optimization by running checkasm on an ARM device:
`checkasm --test=hevc_sao --bench`.
hevc_sao_band is sped up by ~2x and hevc_sao_edge by ~4x. FATE also passes on
the ARM platform.
Yingm
From: Meng Wang
Signed-off-by: Meng Wang
---
As the FFmpeg HEVC decoder has no SAO NEON optimization, we add NEON code for
sao_band and sao_edge in this patch.
I already submitted a patch called 'checkasm/hevc_sao : add hevc_sao for
checkasm' several days ago.
Results below were printed by hevc_sao