On 9/14/2018 9:57 AM, Henrik Gramner wrote:
> On Thu, Sep 13, 2018 at 3:08 PM, James Almer <jamr...@gmail.com> wrote:
>> +    lea    lenq, [lend*8 - mmsize*4]
>
> Is len guaranteed to be a multiple of mmsize/8? Otherwise this would
> cause misalignment. It will also break if len < mmsize/2.

len must be a multiple of 16 as per the doxy, so yes. The only way for
len to be < mmsize/2 is if we add an AVX-512 version.

>
> Also if you want a 32-bit result from lea it should be written as "lea
> lend, [lenq*8 - mmsize*4]" which is equivalent but has a shorter
> opcode (e.g. always use native sizes within brackets).

len is an int, so I assume this is only possible here because it's an
argument passed in a register and not on the stack? Otherwise, the upper
32 bits would probably make a mess with the multiplication. See for
example vector_fmul_add, where len is the fifth argument.
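
As a rough illustration of the arithmetic discussed in this thread, here is
a minimal C sketch. It is not FFmpeg code: start_offset and lea32 are
hypothetical helper names, and lea32 only models the integer arithmetic of a
lea with a 32-bit destination, assuming the usual x86-64 behaviour of
computing the address in 64 bits and keeping the low 32 bits.

```c
#include <stdint.h>

/* Hypothetical helper: the byte offset that
 * "lea lenq, [lend*8 - mmsize*4]" would compute for a given
 * element count and vector register size in bytes. */
static int64_t start_offset(int32_t len, int mmsize)
{
    return (int64_t)len * 8 - (int64_t)mmsize * 4;
}

/* Hypothetical helper modelling "lea lend, [lenq*8 - mmsize*4]":
 * the expression is evaluated modulo 2^64 and only the low 32 bits
 * are kept. The low 32 bits of the result depend only on the low
 * 32 bits of lenq, so whatever sits in the upper half is discarded. */
static uint32_t lea32(uint64_t lenq, uint32_t mmsize)
{
    return (uint32_t)(lenq * 8 - (uint64_t)mmsize * 4);
}
```

With len = 16, start_offset gives 0 for mmsize = 32 and -128 for mmsize = 64,
which matches the remark that len < mmsize/2 would only become possible with
a 64-byte register size. lea32 returns the same value for 1024 and for
0xDEADBEEF00000000 | 1024, which suggests that, in this model at least,
garbage in the upper 32 bits of the source register would not affect the
32-bit result.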