On Fri, Jan 5, 2018 at 3:26 AM, Paolo Bonzini <pbonz...@redhat.com> wrote:
> On 05/01/2018 11:28, Paul Turner wrote:
>>
>> The "pause; jmp" sequence proved minutely faster than "lfence; jmp", which
>> is why it was chosen.
>>
>>   "pause; jmp" 33.231 cycles/call 9.517 ns/call
>>   "lfence; jmp" 33.354 cycles/call 9.552 ns/call
>
> Do you have timings for a non-retpolined indirect branch with the
> predictor suppressed via IBRS=1?  So at least we can compute the break
> even point.

The data I collected here previously showed the run-time cost as a wash.
On Skylake, an indirect branch with IBRS=1 and a retpolined indirect
branch cost within a few cycles of each other.

The costs to consider when making a choice here are:

- The transition overheads.  That is, how frequently you will be
switching in and out of protected code (IBRS needs to be enabled
and disabled at these boundaries).
- The frequency at which you will be executing protected code on one
sibling and unprotected code on another (enabling IBRS may affect
sibling execution, depending on SKU).
- The implementation cost (retpoline requires auditing/rebuilding your
target, while IBRS can be used out of the box).
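
As a rough illustration of how the first item feeds a break-even
estimate, here is a minimal sketch in Python. It is a simplified model,
not something measured here: it assumes a fixed per-branch cost for each
scheme plus a fixed cost per IBRS enable/disable round trip, and it
ignores the sibling and implementation costs listed above. The function
names and every number in the usage lines are hypothetical placeholders.

```python
def ibrs_cost(branches, transitions, ibrs_branch_cycles, transition_cycles):
    """Total cycles with IBRS: per-branch cost plus the cost of toggling
    IBRS at each entry/exit of protected code (simplified model)."""
    return branches * ibrs_branch_cycles + transitions * transition_cycles


def retpoline_cost(branches, retpoline_branch_cycles):
    """Total cycles with retpoline: per-branch cost only, since there is
    no mode switch at the protection boundary."""
    return branches * retpoline_branch_cycles


def break_even_branches_per_transition(retpoline_branch_cycles,
                                       ibrs_branch_cycles,
                                       transition_cycles):
    """Indirect branches per transition at which both schemes cost the same.

    Returns None when IBRS saves nothing per branch: in that case the
    transition overhead is never paid back and there is no break-even.
    """
    saving = retpoline_branch_cycles - ibrs_branch_cycles
    if saving <= 0:
        return None
    return transition_cycles / saving


# Hypothetical placeholder inputs, NOT measurements.
n = break_even_branches_per_transition(33.0, 31.0, 200.0)
wash = break_even_branches_per_transition(33.0, 33.0, 200.0)
```

If the per-branch costs really are a wash, the per-branch saving is
close to zero and the comparison is decided almost entirely by how often
the transition cost is paid.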


>
> Paolo
