float_dsp: add ff_vector_dmul_{sse2, avx}

James Almer Fri, 14 Sep 2018 06:27:01 -0700

On 9/14/2018 9:57 AM, Henrik Gramner wrote:
> On Thu, Sep 13, 2018 at 3:08 PM, James Almer <jamr...@gmail.com> wrote:
>> +    lea       lenq, [lend*8 - mmsize*4]
> 
> Is len guaranteed to be a multiple of mmsize/8? Otherwise this would
> cause misalignment. It will also break if len < mmsize/2.


len must be a multiple of 16 as per the doxygen, so yes, it is.
The only way for len to be < mmsize/2 would be if we added an AVX-512 version.
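
As a rough check of both conditions, here is a minimal C sketch of the
arithmetic behind the lea. The function names are hypothetical (they are
not part of the patch), and it assumes the buffers themselves are
mmsize-aligned and that mmsize is 16 for SSE2, 32 for AVX, and 64 for an
as yet nonexistent AVX-512 version.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset computed by "lea lenq, [lend*8 - mmsize*4]":
 * len doubles of 8 bytes each, minus mmsize*4 bytes.
 * Hypothetical helper, for illustration only. */
int64_t dmul_start_offset(int len, int mmsize)
{
    assert(len >= 0 && mmsize > 0);
    return (int64_t)len * 8 - (int64_t)mmsize * 4;
}

/* The offset is a multiple of mmsize exactly when len is a multiple
 * of mmsize/8, since mmsize*4 is always a multiple of mmsize. */
int dmul_offset_is_aligned(int len, int mmsize)
{
    return dmul_start_offset(len, mmsize) % mmsize == 0;
}

/* The offset is non-negative exactly when len >= mmsize/2. */
int dmul_len_is_sufficient(int len, int mmsize)
{
    return dmul_start_offset(len, mmsize) >= 0;
}
```

With the smallest non-zero documented length, len = 16, the offset is 64
for mmsize = 16 and 0 for mmsize = 32, both aligned. Only mmsize = 64
would give a negative offset (-128).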

> 
> Also if you want a 32-bit result from lea it should be written as "lea
> lend, [lenq*8 - mmsize*4]" which is equivalent but has a shorter
> opcode (e.g. always use native sizes within brackets).

len is an int, so I assume this is only possible here because it's an
argument passed in a register and not on the stack? Otherwise, the upper
32 bits would probably make a mess of the multiplication. See for example
vector_fmul_add, where len is the fifth argument.
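
To make the upper-bits question concrete, here is a small C model of the
arithmetic, not of the actual asm. The function names and the garbage
value are made up for illustration. It assumes only that the register
holds len in its low 32 bits, and relies on the x86-64 rule that writing
a 32-bit register zeroes the upper half of the 64-bit one.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "lea lend, [lenq*8 - mmsize*4]": the address is computed
 * in 64 bits, then only the low 32 bits are kept and zero-extended.
 * Hypothetical helper, for illustration only. */
uint64_t lea_dst32(uint64_t lenq, int mmsize)
{
    assert(mmsize > 0);
    uint64_t full = lenq * 8 - (uint64_t)mmsize * 4;
    return (uint32_t)full; /* truncate, then zero-extend */
}

/* Model of a full 64-bit "lea lenq, [lenq*8 - mmsize*4]", where
 * whatever sits in the upper 32 bits takes part in the result. */
uint64_t lea_dst64(uint64_t lenq, int mmsize)
{
    assert(mmsize > 0);
    return lenq * 8 - (uint64_t)mmsize * 4;
}
```

In this model the low 32 bits of the result depend only on the low 32
bits of the source, so lea_dst32() returns 64 for len = 16 and mmsize = 16
whether the upper half is clean or holds garbage such as 0xDEADBEEF. The
64-bit form, lea_dst64(), does not. Whether the upper half can actually
be dirty for a given argument is a separate ABI question that this sketch
does not answer.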
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 2/2] avutil/float_dsp: add ff_vector_dmul_{sse2, avx}
