me_cmp.c

rongyan Mon, 10 Nov 2014 22:16:32 -0800

Hi,
Thanks for your patience and review.
Please find the new patch in the attachment. This is the first of two.
The FATE test results are here:
 
 Regards,
 Rong Yan
  
 
-------------------------------------------------------------------------------------------
 Subject: [FFmpeg-devel] [patch 2/4] Fix bug for POWERLE: libavcodec/ppc/me_cmp.c
------------------------


From: rongyan <[email protected]>
Date: 2014-11-07 17:37 GMT+08:00
To: FFmpeg development discussions <[email protected]>


Hi,
There are 4 patches to fix bugs on POWER8 little endian. I will send them in
4 separate emails. This is the second.

It fixes the functions hadamard8_diff8x8_altivec(),
hadamard8_diff16x8_altivec(), sad16_x2_altivec(), sad16_y2_altivec(),
sad16_xy2_altivec(), sad16_altivec(), sad8_altivec(), sse16_altivec() and
sse8_altivec().

The FATE test results on POWER BE and POWER LE after merging these 4 patches
are attached here to facilitate the review:

 The number of passing test cases changes from 1679/2182 to 2010/2236.



 Rong Yan


  ------------------
  The world has enough for everyone's need, but not enough for everyone's greed.
_______________________________________________
ffmpeg-devel mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


----------
From: Michael Niedermayer <[email protected]>
Date: 2014-11-07 19:44 GMT+08:00
To: FFmpeg development discussions and patches <[email protected]>


On Fri, Nov 07, 2014 at 05:37:43PM +0800, rongyan wrote:
[...]

> @@ -88,11 +49,9 @@ static int sad16_x2_altivec(MpegEncContext *v, uint8_t *pix1, uint8_t *pix2,
>          /* Read unaligned pixels into our vectors. The vectors are as follows:
>           * pix1v: pix1[0] - pix1[15]
>           * pix2v: pix2[0] - pix2[15]      pix2iv: pix2[1] - pix2[16] */
> -        vector unsigned char pix1v  = vec_ld(0,  pix1);
> -        vector unsigned char pix2l  = vec_ld(0,  pix2);
> -        vector unsigned char pix2r  = vec_ld(16, pix2);
> -        vector unsigned char pix2v  = vec_perm(pix2l, pix2r, perm1);
> -        vector unsigned char pix2iv = vec_perm(pix2l, pix2r, perm2);
> +        vector unsigned char pix1v  = VEC_LD(0,  pix1);
> +        vector unsigned char pix2v  = VEC_LD(0,  pix2);
> +        vector unsigned char pix2iv = VEC_LD(1,  pix2);
>
>          /* Calculate the average vector. */
>          vector unsigned char avgv = vec_avg(pix2v, pix2iv);

this would add vec_perm, vec_ld and vec_lvsl to
vector unsigned char pix1v  = vec_ld(0,  pix1);
on big endian, which would slow it down
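
To illustrate the cost described above, here is a minimal, portable C sketch
of the classic AltiVec unaligned-load sequence that the comment lists
(vec_ld, vec_ld, vec_lvsl, vec_perm). It assumes VEC_LD expands to this
sequence on big endian; the macro's definition is not shown in this thread.
The emu_* functions and the vec_u8 type are hypothetical stand-ins that run
on any machine, not real AltiVec intrinsics, and addresses are modelled as
offsets into a buffer whose start is treated as 16-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for "vector unsigned char" (16 bytes). */
typedef struct { uint8_t b[16]; } vec_u8;

/* Emulates vec_ld: the low 4 address bits are ignored, so the load always
 * comes from the 16-byte-aligned block that contains addr. */
static vec_u8 emu_vec_ld(const uint8_t *mem, size_t size, size_t addr)
{
    vec_u8 v;
    size_t block = addr & ~(size_t)15;

    assert(block + 16 <= size);   /* stay inside the emulated memory */
    memcpy(v.b, mem + block, 16);
    return v;
}

/* Emulates vec_lvsl: permute control {sh, sh+1, ..., sh+15}, sh = addr & 15. */
static vec_u8 emu_vec_lvsl(size_t addr)
{
    vec_u8 v;

    for (int i = 0; i < 16; i++)
        v.b[i] = (uint8_t)((addr & 15) + i);
    return v;
}

/* Emulates vec_perm with big endian element numbering: each control byte
 * selects one byte out of the 32-byte concatenation a:b. */
static vec_u8 emu_vec_perm(vec_u8 a, vec_u8 b, vec_u8 ctl)
{
    vec_u8 r;

    for (int i = 0; i < 16; i++) {
        unsigned idx = ctl.b[i] & 31;
        r.b[i] = idx < 16 ? a.b[idx] : b.b[idx - 16];
    }
    return r;
}

/* ASSUMED big endian expansion of one unaligned load:
 * two aligned loads, one lvsl and one perm. */
static vec_u8 emu_unaligned_ld(const uint8_t *mem, size_t size, size_t addr)
{
    vec_u8 lo  = emu_vec_ld(mem, size, addr);
    vec_u8 hi  = emu_vec_ld(mem, size, addr + 16);
    vec_u8 ctl = emu_vec_lvsl(addr);

    return emu_vec_perm(lo, hi, ctl);
}
```

When addr is already a multiple of 16, emu_vec_ld alone returns the wanted
bytes, so the second load, the lvsl and the perm in emu_unaligned_ld are pure
overhead in that case. Whether pix1 is guaranteed to be aligned is not stated
in this thread.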

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Breaking DRM is a little like attempting to break through a door even
though the window is wide open and the only thing in the house is a bunch
of things you dont want and which you would get tomorrow for free anyway



----------
From: Michael Niedermayer <[email protected]>
Date: 2014-11-10 22:53 GMT+08:00
To: FFmpeg development discussions and patches <[email protected]>
Cc: Tony Lin <[email protected]>


On Mon, Nov 10, 2014 at 05:01:56PM +0800, rongyan wrote:
> Hi,
>  New patch please find in the attachment. There are two patches to re-submit, 
> this is the first.
>  The fate test result is here:
>
>  Rong Yan
[...]

> -        vector unsigned char pix2l  = vec_ld(0,  pix2);
> -        vector unsigned char pix2r  = vec_ld(16, pix2);
> -        vector unsigned char pix2v  = vec_perm(pix2l, pix2r, perm1);
> -        vector unsigned char pix2iv = vec_perm(pix2l, pix2r, perm2);
> +        vector unsigned char pix2v  = VEC_LD(0,  pix2);
> +        vector unsigned char pix2iv = VEC_LD(1,  pix2);

this doubles the number of vec_ld() on big endian
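
The load counts behind this remark can be checked with a small portable
emulation. This is a sketch under assumptions: VEC_LD is taken to expand, on
big endian, to two vec_ld plus one vec_perm per call (its definition is not
in this thread), and the removed code is taken to use perm1 = vec_lvsl(0, pix2)
and perm2 = perm1 + 1, which the quoted hunk does not show. The names ld16,
perm16, unaligned_ld, old_path and new_path are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static int load_count;   /* emulated vec_ld calls since the last reset */

/* Emulated vec_ld: 16 bytes from the aligned block that contains addr. */
static void ld16(const uint8_t *mem, size_t addr, uint8_t out[16])
{
    memcpy(out, mem + (addr & ~(size_t)15), 16);
    load_count++;
}

/* Emulated vec_perm with an lvsl-style control: out[i] = (lo:hi)[shift + i]. */
static void perm16(const uint8_t lo[16], const uint8_t hi[16],
                   unsigned shift, uint8_t out[16])
{
    assert(shift <= 16);   /* shift + 15 must stay inside the 32-byte pair */
    for (unsigned i = 0; i < 16; i++) {
        unsigned idx = shift + i;
        out[i] = idx < 16 ? lo[idx] : hi[idx - 16];
    }
}

/* Removed code: pix2l and pix2r are loaded once and shared by both perms. */
static int old_path(const uint8_t *mem, size_t addr,
                    uint8_t v[16], uint8_t iv[16])
{
    uint8_t l[16], r[16];
    unsigned sh = (unsigned)(addr & 15);

    load_count = 0;
    ld16(mem, addr, l);
    ld16(mem, addr + 16, r);
    perm16(l, r, sh, v);        /* pix2[0]..pix2[15] */
    perm16(l, r, sh + 1, iv);   /* pix2[1]..pix2[16] */
    return load_count;
}

/* ASSUMED expansion of one VEC_LD call on big endian. */
static void unaligned_ld(const uint8_t *mem, size_t addr, uint8_t out[16])
{
    uint8_t l[16], r[16];

    ld16(mem, addr, l);
    ld16(mem, addr + 16, r);
    perm16(l, r, (unsigned)(addr & 15), out);
}

/* Added code: two independent unaligned loads, nothing shared. */
static int new_path(const uint8_t *mem, size_t addr,
                    uint8_t v[16], uint8_t iv[16])
{
    load_count = 0;
    unaligned_ld(mem, addr, v);
    unaligned_ld(mem, addr + 1, iv);
    return load_count;
}
```

Under these assumptions both paths produce the same two vectors, but
old_path issues two aligned loads while new_path issues four.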


[...]
> @@ -356,11 +168,8 @@ static int sad16_xy2_altivec(MpegEncContext *v, uint8_t *pix1, uint8_t *pix2,
>           * pix1v: pix1[0] - pix1[15]
>           * pix3v: pix3[0] - pix3[15]      pix3iv: pix3[1] - pix3[16] */
>          pix1v  = vec_ld(0, pix1);
> -
> -        pix2l  = vec_ld(0, pix3);
> -        pix2r  = vec_ld(16, pix3);
> -        pix3v  = vec_perm(pix2l, pix2r, perm1);
> -        pix3iv = vec_perm(pix2l, pix2r, perm2);
> +        pix3v  = VEC_LD(0, pix3);
> +        pix3iv = VEC_LD(1, pix3);
>
>          /* Note that AltiVec does have vec_avg, but this works on vector pairs
>           * and rounds up. We could do avg(avg(a, b), avg(c, d)), but the

this also doubles the number of vec_ld() on big endian

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Its not that you shouldnt use gotos but rather that you should write
readable code and code with gotos often but not always is less readable

0001-libavcodec-ppc-me_cmp.c-fix-hadamard8_diff8x8_altive.patch
Description: Binary data


[FFmpeg-devel] Resubmit patch01 - RE: [patch 2/4] Fix bug for POWERLE: libavcodec/ppc/me_cmp.c
