On 09/26/2011 11:15 AM, Laurent Desnogues wrote:
> On Mon, Sep 26, 2011 at 10:01 AM, Mulyadi Santosa
> <mulyadi.sant...@gmail.com> wrote:
>> Hi...
>>
>> On Mon, Sep 26, 2011 at 14:46, Jan Kiszka <jan.kis...@siemens.com> wrote:
>>> This increases the overhead of frequently executed helpers.
>>>
>>> Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
>>
>> IMHO, the stack protector setup adds extra work to the epilogue, but
>> it is quite likely negligible unless it causes too many L1 cache misses.
>> So I think this micro-tuning is somewhat unnecessary, but still okay.
>
> The impact of stack protection is very high, for instance when running
> FFmpeg ARM with NEON optimizations: a few months ago I
> measured that removing stack protection improved the run time
> by more than 10%. Of course it's extreme, since the proportion
> of NEON instructions (and hence of helper calls) is very high.

I saw a lot of helper calls for SSE in ordinary x86_64 code, likely for
memcpy/cmp and friends. Native TCG ops for common vector instructions
would probably be quite a speedup.
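
For anyone who wants to get a rough feel for the per-call cost being
discussed, here is a minimal sketch, not taken from the QEMU tree. The
function helper_add_sat_u8x8 is a hypothetical stand-in for a small vector
helper, and time_helper_calls is an illustrative timing loop; both names
are assumptions. Building the file once with -fstack-protector-all and once
with -fno-stack-protector and comparing the timings should show whether the
canary setup and check matter on a given host. The result will depend on
the compiler, the optimization level, and the CPU.

```c
/*
 * Sketch only: a hypothetical helper plus a timing loop.
 * Suggested builds (GCC or Clang):
 *   cc -O2 -fstack-protector-all -c helper_bench.c
 *   cc -O2 -fno-stack-protector  -c helper_bench.c
 */
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Saturating add on eight unsigned 8-bit lanes packed in a 64-bit value.
 * The local arrays make this function a candidate for stack protector
 * instrumentation; noinline keeps the call overhead visible. */
__attribute__((noinline))
uint64_t helper_add_sat_u8x8(uint64_t a, uint64_t b)
{
    uint8_t la[8], lb[8];
    uint64_t r;

    memcpy(la, &a, sizeof(la));
    memcpy(lb, &b, sizeof(lb));
    for (int i = 0; i < 8; i++) {
        unsigned sum = (unsigned)la[i] + lb[i];
        la[i] = sum > 255 ? 255 : (uint8_t)sum;
    }
    memcpy(&r, la, sizeof(r));
    return r;
}

/* Calls the helper n times and returns the elapsed CPU time in seconds.
 * The accumulated value is written to *out so the loop is not removed. */
double time_helper_calls(unsigned long n, uint64_t *out)
{
    uint64_t acc = 0x0102030405060708ULL;
    clock_t start = clock();

    for (unsigned long i = 0; i < n; i++) {
        acc = helper_add_sat_u8x8(acc, (uint64_t)i);
    }
    *out = acc;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_helper_calls with a large count (say, a few hundred million)
from a small main() in each build gives two numbers to compare. A helper
this small is close to the worst case for stack protection, much like the
NEON-heavy workload mentioned above.
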
--
error compiling committee.c: too many arguments to function