h264dsp.c

Michael Niedermayer Thu, 27 Nov 2014 03:29:08 -0800

On Thu, Nov 27, 2014 at 02:35:24PM +0800, rongyan wrote:
> Hi,
> We present 5 patches to fix h264 bugs for POWER8 little endian, which are 
> sent in 5 separate emails.
> This is the second; it fixes the following functions and macros:
>   h264_idct8_add_altivec()
>   h264_idct_dc_add_internal()
>   h264_loop_filter_luma_altivec()
>   write16x4()
>   VEC_1D_DCT()
>   weight_h264_W_altivec()
>   biweight_h264_W_altivec()
>   VEC_LOAD_U8_ADD_S16_STORE_U8()
>   ALTIVEC_STORE_SUM_CLIP()
>  
> It also adds the macros GET_2PERM(), dstv_load(), vdst_load(), and
> dest_unligned_store().
> 
> The FATE test results after merging these 5 patches can be found on the
> website by searching for "ibmcrl"; they are also attached below to
> facilitate the review. The number of passing test cases changes from
> 2017/2243 to 2209/2245.
>  
> Thanks.
>  Rong Yan
>   
>   ------------------
>   The world has enough for everyone's need, but not enough for everyone's 
> greed.



>  h264dsp.c |  374 +++++++++++++++++++++++++++++++++-----------------------------
>  1 file changed, 205 insertions(+), 169 deletions(-)
> dcaccec4338f960704148c933e1ec454dd4dc6a2  0002-libavcodec-ppc-h264dsp.c-fix-h264_idct8_add_altivec.patch
> From 130b20e650a2d83a4c66cd23c10fe943742339f8 Mon Sep 17 00:00:00 2001
> From: Rong Yan <rongyan...@gmail.com>
> Date: Thu, 27 Nov 2014 05:49:53 +0000
> Subject: [PATCH 2/5] libavcodec/ppc/h264dsp.c : fix h264_idct8_add_altivec()
>  h264_idct_dc_add_internal() h264_loop_filter_luma_altivec() write16x4()
>  VEC_1D_DCT() weight_h264_W_altivec() biweight_h264_W_altivec()
>  VEC_LOAD_U8_ADD_S16_STORE_U8() ALTIVEC_STORE_SUM_CLIP() add macros 
>  GET_2PERM() dstv_load() vdst_load() dest_unligned_store() for POWER LE
> 
> ---
>  libavcodec/ppc/h264dsp.c | 374 ++++++++++++++++++++++++++---------------------
>  1 file changed, 205 insertions(+), 169 deletions(-)
> 
> diff --git a/libavcodec/ppc/h264dsp.c b/libavcodec/ppc/h264dsp.c
> index 7fc7e0b..cfce32d 100644
> --- a/libavcodec/ppc/h264dsp.c
> +++ b/libavcodec/ppc/h264dsp.c
> @@ -34,7 +34,7 @@
>   * IDCT transform:
>   ****************************************************************************/
>  
> -#define VEC_1D_DCT(vb0,vb1,vb2,vb3,va0,va1,va2,va3)               \
> +#define VEC_1D_DCT(vb0,vb1,vb2,vb3,va0,va1,va2,va3) {\
>      /* 1st stage */                                               \
>      vz0 = vec_add(vb0,vb2);       /* temp[0] = Y[0] + Y[2] */     \
>      vz1 = vec_sub(vb0,vb2);       /* temp[1] = Y[0] - Y[2] */     \
> @@ -46,7 +46,8 @@
>      va0 = vec_add(vz0,vz3);       /* x[0] = temp[0] + temp[3] */  \
>      va1 = vec_add(vz1,vz2);       /* x[1] = temp[1] + temp[2] */  \
>      va2 = vec_sub(vz1,vz2);       /* x[2] = temp[1] - temp[2] */  \
> -    va3 = vec_sub(vz0,vz3)        /* x[3] = temp[0] - temp[3] */
> +    va3 = vec_sub(vz0,vz3);        /* x[3] = temp[0] - temp[3] */\
> +}
>  
>  #define VEC_TRANSPOSE_4(a0,a1,a2,a3,b0,b1,b2,b3) \
>      b0 = vec_mergeh( a0, a0 ); \
> @@ -62,14 +63,23 @@
>      b2 = vec_mergeh( a1, a3 ); \
>      b3 = vec_mergel( a1, a3 )
>  
> -#define VEC_LOAD_U8_ADD_S16_STORE_U8(va)                      \
> -    vdst_orig = vec_ld(0, dst);                               \
> -    vdst = vec_perm(vdst_orig, zero_u8v, vdst_mask);          \
> -    vdst_ss = (vec_s16) vec_mergeh(zero_u8v, vdst);         \
> -    va = vec_add(va, vdst_ss);                                \
> -    va_u8 = vec_packsu(va, zero_s16v);                        \
> -    va_u32 = vec_splat((vec_u32)va_u8, 0);                  \
> -    vec_ste(va_u32, element, (uint32_t*)dst);
> +#if HAVE_BIGENDIAN
> +#define vdst_load(d)\
> +    vdst_orig = vec_ld(0, dst); \
> +    vdst = vec_perm(vdst_orig, zero_u8v, vdst_mask)
> +#else
> +#define vdst_load(d)\
> +    vdst = vec_vsx_ld(0, dst)
> +#endif
> +
> +#define VEC_LOAD_U8_ADD_S16_STORE_U8(va) {\
> +    vdst_load();\
> +    vdst_ss = (vec_s16) VEC_MERGEH(zero_u8v, vdst);\
> +    va = vec_add(va, vdst_ss);\
> +    va_u8 = vec_packsu(va, zero_s16v);\
> +    va_u32 = vec_splat((vec_u32)va_u8, 0);\
> +    vec_ste(va_u32, element, (uint32_t*)dst);\
> +}

Please don't mix whitespace changes with functional changes.
This makes the patch and the commit unreadable.
It can also cause problems for other developers, as rebasing their
work becomes harder if the code has changed a lot.
Please leave the whitespace in place.

git show HEAD^^^ -w --stat
 libavcodec/ppc/h264dsp.c |  106 +++++++++++++++++++++++++++++++---------------
 1 file changed, 71 insertions(+), 35 deletions(-)


git show HEAD^^^  --stat
 libavcodec/ppc/h264dsp.c |  374 +++++++++++++++++++++++++---------------------
 1 file changed, 205 insertions(+), 169 deletions(-)

 [...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Dictatorship naturally arises out of democracy, and the most aggravated
form of tyranny and slavery out of the most extreme liberty. -- Plato


_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [Patch 2/5]Fix h264 on POWER LE: libavcodec/ppc/h264dsp.c
