On Sun, Jan 31, 2016 at 06:18:53PM -0300, James Almer wrote:
> On 1/31/2016 4:48 PM, Timothy Gu wrote:
> > ---
> >  libavcodec/x86/vc1dsp.asm    | 104 ++++++++++++++++++++++
> >  libavcodec/x86/vc1dsp_init.c |  13 +++
> >  libavcodec/x86/vc1dsp_mmx.c  | 207 -------------------------------------------
> >  3 files changed, 117 insertions(+), 207 deletions(-)
> >
> > diff --git a/libavcodec/x86/vc1dsp.asm b/libavcodec/x86/vc1dsp.asm
> > index 6415a83..f922927 100644
> > --- a/libavcodec/x86/vc1dsp.asm
> > +++ b/libavcodec/x86/vc1dsp.asm
> > @@ -395,3 +395,107 @@ cglobal vc1_put_ver_16b_shift2, 4,7,0, dst, src, stride
> >      jnz .loop
> >      REP_RET
> >  %endif ; HAVE_MMX_INLINE
> > +
> > +%macro INV_TRANS_INIT 0
> > +    movsxdifnidn linesizeq, linesized
>
> Maybe change the prototype so linesize is ptrdiff_t?
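
For context, here is a minimal C sketch of what the `movsxdifnidn linesizeq, linesized` line compensates for. With an `int linesize` prototype, a negative stride has to be sign-extended to pointer width before it is added to an address on x86_64; with `ptrdiff_t` it already arrives at full width. All helper names below are hypothetical and are not taken from libavcodec.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a 32-bit stride widened WITHOUT sign extension,
 * i.e. the upper 32 bits of the 64-bit register are simply zero. */
static int64_t widen_zero_extended(int linesize)
{
    return (int64_t)(uint64_t)(uint32_t)linesize;
}

/* Hypothetical helper: the widening that a movsxd performs. */
static int64_t widen_sign_extended(int linesize)
{
    return (int64_t)linesize;
}

/* With ptrdiff_t in the prototype the stride is already pointer-sized,
 * so no extra extension step is needed before the address arithmetic. */
static const uint8_t *next_row(const uint8_t *row, ptrdiff_t linesize)
{
    return row + linesize;
}
```

For a stride of -16, the zero-extended form turns into a large positive offset, while the sign-extended form stays -16, which is why the extension cannot be skipped as long as the prototype says `int`.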

I wanted to do that at first, but then I realized that to change this I'd need
to change simple_idct and a bunch of other decoders. I do want to come back to
this, but that just seems too much work for just four functions =P

[...]

> > +; ff_vc1_inv_trans_?x?_dc_mmxext(uint8_t *dest, int linesize, int16_t *block)
> > +INIT_MMX mmxext
> > +cglobal vc1_inv_trans_4x4_dc, 3,4,0, dest, linesize, block
> > +    movsx r3d, WORD [blockq]
>
> Can this value be negative?

I'm not 100% certain but I believe it can be.

> Because you're using it as an argument
> for lea using native size after movsx sign extended the value to 32
> bits, which means that on x86_64 the upper bits of the register will
> be zeroed.
>
> If it can you'll have to use blockq/r3q everywhere, and if it can't
> then use movzx and shr.

Changed locally to blockq/r3. I was emulating GCC's code generation but seems
like there isn't much difference.

Timothy
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
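
The concern about `movsx r3d, WORD [blockq]` followed by native-size arithmetic can be modelled in C. This is only a sketch: the function names are hypothetical, the scaling expression is an illustrative stand-in rather than the actual instruction sequence from the patch, and the right shift of a negative value is assumed to be arithmetic (as it is with GCC and Clang on x86).

```c
#include <stdint.h>

/* Models `movsx r3d, WORD [blockq]` on x86_64: the word is sign-extended
 * to 32 bits, and writing a 32-bit register clears bits 63..32. */
static int64_t load_movsx_32(const int16_t *block)
{
    int32_t r3d = block[0];
    return (int64_t)(uint64_t)(uint32_t)r3d;
}

/* Models a movsx straight into the 64-bit register: the word is
 * sign-extended across all 64 bits. */
static int64_t load_movsx_64(const int16_t *block)
{
    return (int64_t)block[0];
}

/* Illustrative stand-in for arithmetic done at native (64-bit) width.
 * Assumes >> on a negative int64_t is an arithmetic shift. */
static int64_t scale_native(int64_t r3)
{
    return (17 * r3 + 4) >> 3;
}
```

With a DC coefficient of -3, the 64-bit load yields -6 after scaling, whereas the 32-bit load followed by 64-bit arithmetic yields a large positive number. For non-negative coefficients the two paths agree, which matches the point that the choice only matters if the value can be negative.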