On Tue, 2015-03-03 at 12:42 +0000, Nedeljko Babic wrote:
> >Removing these removes the dependency of this code on mips32r2 which would
> >allow it to be used on processors which have FPU instructions, but not r2
> >instructions (like the mips64el debian port for instance).
> >
>
> I would be more comfortable if there were two instances of this code: one for
> mips32r2 and one for mips32 so advantages of using mips32r2 instructions
> (however small here) are left intact.
>
> On the other hand, since this doesn't change much number of instructions used
> (adding at maximum around 100 instructions overall if I am not mistaking) I
> am ok with this.

Well I can't see how 'ext' can ever be faster than 'and' (it does more work)
so most of these should be no slower anyway. For VMUL4S my version has 2 extra
instructions in it so it could be a bit slower.

Does this #if seem ok?

--- a/libavcodec/mips/aacdec_mips.h
+++ b/libavcodec/mips/aacdec_mips.h
@@ -198,9 +198,18 @@ static inline float *VMUL4S_mips(float *dst, const float *v, unsigned idx,
         "lwxc1 %[temp12], %[temp3](%[v]) \n\t"
         "lwxc1 %[temp13], %[temp4](%[v]) \n\t"
         "and %[temp1], %[sign], %[mask] \n\t"
+#if defined(__mips_isa_rev) && __mips_isa_rev >= 2
         "ext %[temp2], %[idx], 12, 1 \n\t"
         "ext %[temp3], %[idx], 13, 1 \n\t"
         "ext %[temp4], %[idx], 14, 1 \n\t"
+#else
+        "srl %[temp2], %[idx], 12 \n\t"
+        "srl %[temp3], %[idx], 13 \n\t"
+        "srl %[temp4], %[idx], 14 \n\t"
+        "andi %[temp2], %[temp2], 1 \n\t"
+        "andi %[temp3], %[temp3], 1 \n\t"
+        "andi %[temp4], %[temp4], 1 \n\t"
+#endif
         "sllv %[sign], %[sign], %[temp2] \n\t"
         "xor %[temp1], %[temp0], %[temp1] \n\t"
         "and %[temp2], %[sign], %[mask] \n\t"

Thanks,
James
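
P.S. For anyone wanting to convince themselves that the two branches of the
#if produce the same value, here is a rough portable C model of both
sequences. It is only a sketch: the names model_ext and model_srl_andi are
made up for illustration and are not part of FFmpeg, and the code models the
arithmetic only, not the inline asm or its timing.

```c
#include <stdint.h>

/* Hypothetical model of the MIPS32r2 instruction `ext rt, rs, pos, size`:
 * take `size` bits of `rs` starting at bit `pos`, zero-extended.
 * Assumes pos < 32, size >= 1 and pos + size <= 32. */
static inline uint32_t model_ext(uint32_t rs, unsigned pos, unsigned size)
{
    /* Build the mask without shifting by 32, which C leaves undefined. */
    uint32_t mask = (size >= 32) ? 0xFFFFFFFFu : ((1u << size) - 1u);
    return (rs >> pos) & mask;
}

/* Hypothetical model of the pre-r2 fallback in the patch:
 * `srl rt, rs, pos` followed by `andi rt, rt, 1`. Assumes pos < 32. */
static inline uint32_t model_srl_andi(uint32_t rs, unsigned pos)
{
    uint32_t t = rs >> pos;   /* srl:  logical shift right        */
    return t & 1u;            /* andi: keep only the lowest bit   */
}
```

Calling model_ext(idx, 12, 1) and model_srl_andi(idx, 12) with the same idx
should give the same single bit, and likewise for positions 13 and 14, which
are the three extractions VMUL4S does.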