aarch64: Add rgb24 to yuv implementation

Zhao Zhili Mon, 03 Jun 2024 06:11:37 -0700


> On Jun 3, 2024, at 16:07, Martin Storsjö <mar...@martin.st> wrote:
> 
> On Mon, 3 Jun 2024, Zhao Zhili wrote:
> 
>> diff --git a/libswscale/aarch64/input.S b/libswscale/aarch64/input.S
>> new file mode 100644
>> index 0000000000..0a46475723
>> --- /dev/null
>> +++ b/libswscale/aarch64/input.S
>> @@ -0,0 +1,229 @@
>> +/*
>> + * Copyright (c) 2024 Zhao Zhili <quinkbl...@foxmail.com>
>> + *
>> + * This file is part of FFmpeg.
>> + *
>> + * FFmpeg is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License as published by the Free Software Foundation; either
>> + * version 2.1 of the License, or (at your option) any later version.
>> + *
>> + * FFmpeg is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * Lesser General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU Lesser General Public
>> + * License along with FFmpeg; if not, write to the Free Software
>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>> + */
>> +
>> +#include "libavutil/aarch64/asm.S"
>> +
>> +.macro rgb24_to_yuv_load_rgb, src
>> +        ld3.16b         { v16, v17, v18 }, [\src]
>> +        ushll.8h        v19, v16, #0         // v19: r
>> +        ushll.8h        v20, v17, #0         // v20: g
>> +        ushll.8h        v21, v18, #0         // v21: b
>> +        ushll2.8h       v22, v16, #0         // v22: r
>> +        ushll2.8h       v23, v17, #0         // v23: g
>> +        ushll2.8h       v24, v18, #0         // v24: b
> 
> Don't use this nonstandard, Apple specific aarch64 syntax. This was used by 
> Apple tools at the start, when the proper standardized aarch64 syntax wasn't 
> quite settled yet, and it is still accepted. (And apparently this is still 
> the preferred form to disassemble things in, for apple platforms.)
> 
> With this syntax, the assembly is rejected by GNU binutils and MSVC.
> 
>> +function ff_rgb24ToY_neon, export=1
>> +        cmp             w4, #0                  // check width > 0
>> +        b.le            4f
>> +
>> +        ldp             w10, w11, [x5], #8       // w10: ry, w11: gy
>> +        dup             v0.8H, w10
>> +        dup             v1.8H, w11
>> +        ldr             w12, [x5]               // w12: by
>> +        dup             v2.8H, w12
> 
> Don't use uppercase .8H for field layout configurations, we prefer to stick 
> to all lowercase here - see 184103b3105f02f1189fa0047af4269e027dfbd6. The 
> same goes for a number of places in this patch.
> 
>> +        add             w9, w9, #1              // i++
>> +        add             x3, x3, #6              // src += 6
>> +3:
>> +        cmp w9, w5
>> +        b.lt 2b
>> +4:
> 
> Incorrect indentation for the cmp/b.lt instructions here.
> 
> 
> I have set up a bunch of github actions for testing aarch64 assembly - see 
> https://github.com/mstorsjo/ffmpeg/commits/gha-aarch64. If you have a github 
> account, grab a copy of this branch into your repo, add your own commits on 
> top, and push to your fork (and if necessary, activate running the actions), 
> then you should get a wide testing of your patches.
> 
> See https://github.com/mstorsjo/FFmpeg/actions/runs/9346228714 for one 
> example run of these actions with your patches.


Wow, this is very helpful. Here is the Actions result for the updated patch:

https://github.com/quink-black/FFmpeg/actions/runs/9350348848

https://ffmpeg.org/pipermail/ffmpeg-devel/2024-June/328786.html

The test still fails on x86, but succeeds on all arm64 platforms and on
LoongArch. I tried calling rgb24ToY_c and ff_rgb24ToY_avx directly and
comparing the results, and they don't match. I’m confused.
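
For reference, the kind of direct comparison meant here can be sketched
roughly as below. This is only an illustration under assumptions, not the
swscale code: the function-pointer signature, SKETCH_SHIFT, the rounding, and
the names sketch_rgb24_to_y_ref, first_mismatch and compare_converters are all
hypothetical stand-ins. The real rgb24ToY_c and ff_rgb24ToY_avx take different
arguments and may scale their output differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SHIFT 15  /* hypothetical fixed-point precision of the coefficients */

/* Hypothetical common signature for the two converters being compared. */
typedef void (*rgb24_to_y_fn)(int16_t *dst, const uint8_t *src, int width,
                              const int32_t coeffs[3]);

/* Scalar stand-in for a reference converter (an assumption, not FFmpeg's
 * formula): weighted sum of R, G, B with round-to-nearest. */
static void sketch_rgb24_to_y_ref(int16_t *dst, const uint8_t *src, int width,
                                  const int32_t coeffs[3])
{
    for (int i = 0; i < width; i++) {
        int32_t r = src[3 * i + 0];
        int32_t g = src[3 * i + 1];
        int32_t b = src[3 * i + 2];
        int32_t sum = coeffs[0] * r + coeffs[1] * g + coeffs[2] * b;
        dst[i] = (int16_t)((sum + (1 << (SKETCH_SHIFT - 1))) >> SKETCH_SHIFT);
    }
}

/* Return the index of the first differing sample, or -1 if all match. */
static int first_mismatch(const int16_t *a, const int16_t *b, int width)
{
    assert(width >= 0);
    for (int i = 0; i < width; i++)
        if (a[i] != b[i])
            return i;
    return -1;
}

/* Run both converters on the same input and report the first mismatch. */
static int compare_converters(rgb24_to_y_fn ref, rgb24_to_y_fn test,
                              const uint8_t *src, int width,
                              const int32_t coeffs[3],
                              int16_t *out_ref, int16_t *out_test)
{
    ref(out_ref, src, width, coeffs);
    test(out_test, src, width, coeffs);
    return first_mismatch(out_ref, out_test, width);
}
```

Knowing the index of the first mismatching pixel, together with the two
values at that index, should at least show whether the difference is a
rounding offset or something larger.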

> 
> // Martin
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH 2/2] swscale/aarch64: Add rgb24 to yuv implementation