yuv2rgb: macro-ify

Benoit Fouet Thu, 31 Mar 2016 01:48:53 -0700

Hi,

(sorry for the first mail, fuzzy fingers...)


On 28/03/2016 21:19, Matthieu Bouron wrote:

---
  libswscale/arm/yuv2rgb_neon.S | 137 ++++++++++++++++++------------------------
  1 file changed, 60 insertions(+), 77 deletions(-)

diff --git a/libswscale/arm/yuv2rgb_neon.S b/libswscale/arm/yuv2rgb_neon.S
index ef7b0a6..e1b68c1 100644
--- a/libswscale/arm/yuv2rgb_neon.S
+++ b/libswscale/arm/yuv2rgb_neon.S
@@ -64,7 +64,7 @@
      vmov.u8             \a2, #255
  .endm

-.macro compute_16px dst y0 y1 ofmt
+.macro compute dst y0 y1 ofmt
      vmovl.u8            q14, \y0                                       @ 8px of y
      vmovl.u8            q15, \y1                                       @ 8px of y

@@ -99,23 +99,23 @@
  .endm
-.macro process_1l_16px ofmt
+.macro process_1l ofmt
      compute_premult     d28, d29, d30, d31
      vld1.8              {q7}, [r4]!
-    compute_16px        r2, d14, d15, \ofmt
+    compute             r2, d14, d15, \ofmt
  .endm

-.macro process_2l_16px ofmt
+.macro process_2l ofmt
      compute_premult     d28, d29, d30, d31
      vld1.8              {q7}, [r4]!                                    @ first line of luma
-    compute_16px        r2, d14, d15, \ofmt
+    compute             r2, d14, d15, \ofmt
      vld1.8              {q7}, [r12]!                                   @ second line of luma
-    compute_16px        r11, d14, d15, \ofmt
+    compute             r11, d14, d15, \ofmt
  .endm


This renaming could be split into its own patch.

[...]

@@ -232,68 +204,79 @@ function ff_\ifmt\()_to_\ofmt\()_neon, export=1
      vld1.8              d3, [r10]!                                     @ d3: chroma blue line
      vsubl.u8            q14, d2, d10                                   @ q14 = U - 128
      vsubl.u8            q15, d3, d10                                   @ q15 = V - 128
+.endm

-    process_2l_16px     \ofmt
-.endif
-
-.ifc \ifmt,yuv422p
+.macro load_chroma_yuv422p
      pld [r10, #64*3]
      vld1.8              d2, [r6]!                                      @ d2: chroma red line
      vld1.8              d3, [r10]!                                     @ d3: chroma blue line
      vsubl.u8            q14, d2, d10                                   @ q14 = U - 128
      vsubl.u8            q15, d3, d10                                   @ q15 = V - 128
+.endm

-    process_1l_16px     \ofmt
-.endif
-
-    subs                r8, r8, #16                                    @ width -= 16
-    bgt                 2b
-
-    add                 r2, r2, r3                                     @ dst   += padding
-    add                 r4, r4, r5                                     @ srcY  += paddingY
-
-.ifc \ifmt,nv12
+.macro increment_nv12


How about increment_and_test_nv12? Same for the other ones.

(I'm not happy with the name I found, but I'm trying to come up with a solution that gives more explicit naming.)


--
Ben


Re: [FFmpeg-devel] [PATCH v2 6/9] swscale/arm/yuv2rgb: macro-ify
