On Sat, Aug 5, 2017 at 12:58 AM, Ivan Kalvachev <ikalvac...@gmail.com> wrote:
> 8 packed, 8 scalar.
>
> Unless I miss something (and as I've said before,
> I'm not confident enough to mess with that code.)
>
> (AVX does extend to 32 variants, but they are not
> SSE compatible, so no need to emulate them.)

Oh, right. I quickly glanced at the docs and saw 32 pseudo-ops for each
instruction, for a total of 128 when adding pd, ps, sd, ss, but since only
the first 8 are relevant here that reduces it to 32, which is a lot more
manageable.

> movaps m1, [WRT_PIC_BASE + const_2 + r2 ]
>
> Looks better. (Also not tested. Will do, later.)

I intentionally used the WRT define at the end because that's most
similar to the built-in wrt syntax used when accessing symbols through
the PLT or GOT, e.g.
mov eax, [external_symbol wrt ..got]

> Yeh $$ is the start of the current section, and that's is going to be
> ".text" not "rodata".

Obviously, yes. You need a reference that results in a compile-time
constant PC-offset (which .rodata isn't) to create PC-relative relocation
records to external symbols.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
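
For reference, the pseudo-op counting above can be checked with a rough sketch like the one below. It is written in Python purely to enumerate the names and is not part of any patch. The predicate names and their immediate values follow the usual SSE encoding of the cmpps/cmppd/cmpss/cmpsd comparison immediate; the helper name `sse_cmp_pseudo_ops` and the constants are made up for illustration.

```python
# The eight SSE comparison predicates, listed in imm8 order (0..7).
SSE_PREDICATES = ["eq", "lt", "le", "unord", "neq", "nlt", "nle", "ord"]

# Operand-type suffixes: packed/scalar, single/double.
SUFFIXES = ["ps", "pd", "ss", "sd"]

# AVX defines 32 predicates per suffix; only the first 8 exist in SSE.
AVX_PREDICATE_COUNT = 32


def sse_cmp_pseudo_ops():
    """Map each SSE pseudo-op mnemonic to (base instruction, imm8).

    Hypothetical helper, only meant to illustrate the counting.
    """
    return {
        f"cmp{pred}{suffix}": (f"cmp{suffix}", imm)
        for suffix in SUFFIXES
        for imm, pred in enumerate(SSE_PREDICATES)
    }


ops = sse_cmp_pseudo_ops()
print(len(ops))                             # 8 predicates * 4 suffixes = 32
print(AVX_PREDICATE_COUNT * len(SUFFIXES))  # 32 predicates * 4 suffixes = 128
print(ops["cmpneqps"])                      # ('cmpps', 4)
```

So an SSE-only emulation needs 32 names in total, each of which expands to the base instruction plus a fixed immediate.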