On Wed, 2018-01-10 at 15:22 -0800, David Lang wrote:
> I somewhat hate to ask this, but for those of us following at home, what does
> this add to the overhead?
>
> I am remembering an estimate from mid last week that put retpoline at
> replacing a 3 clock 'ret' with 30 clocks of eye-bleed code

Retpoline doesn't replace 'ret'. It replaces indirect branches (jmp *%rax), of which there aren't quite as many in the kernel.

The eye-bleed retpoline thunk does actually stop speculation and cause a pipeline stall. For the RSB stuffing that's not the case; there are no barriers here.

The actual performance numbers depend on the precise CPU being used, and I'm not sure anyone has done the microbenchmarks of each *specific* part of the mitigations separately.

For this *particular* patch... well, we strive to avoid vmexits anyway, and Intel has spent the last decade adding more and more tricks to the CPU to help us *avoid* vmexits. So a little extra overhead on the vmexit is something we can probably tolerate.

FWIW the IBRS microcode also requires the RSB-stuffing, so it's kind of orthogonal to the "retpoline is much faster than IBRS" observation.
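
For anyone following at home, here is a minimal sketch of the thunk shape being discussed, written as GCC/Clang top-level asm inside a C file. The instruction sequence is the commonly published retpoline pattern. The names demo_retpoline_rax, demo_call_indirect and demo_target are made up for illustration and are not the kernel's symbols. It assumes x86-64 Linux and AT&T syntax, and falls back to a plain indirect call on anything else.

```c
#include <assert.h>

typedef long (*target_fn)(void);

/* Hypothetical branch target used only for this demonstration. */
long demo_target(void) { return 42; }

#if defined(__x86_64__) && defined(__linux__)
__asm__(
    ".pushsection .text\n"
    /* Thunk standing in for 'jmp *%rax'; the target address is in %rax. */
    ".globl demo_retpoline_rax\n"
    ".type demo_retpoline_rax, @function\n"
    "demo_retpoline_rax:\n"
    "    call 2f\n"            /* pushes the address of label 1 */
    "1:  pause\n"              /* speculation following the return */
    "    lfence\n"             /* predictor is captured in this loop */
    "    jmp 1b\n"
    "2:  mov %rax, (%rsp)\n"   /* overwrite return address with target */
    "    ret\n"                /* architecturally jumps to the target */
    /* Helper: move the first argument into %rax and enter the thunk. */
    ".globl demo_call_indirect\n"
    ".type demo_call_indirect, @function\n"
    "demo_call_indirect:\n"
    "    mov %rdi, %rax\n"
    "    jmp demo_retpoline_rax\n"
    ".popsection\n"
);
long demo_call_indirect(target_fn fn);
#else
/* Fallback for other platforms: an ordinary indirect call, no thunk. */
long demo_call_indirect(target_fn fn) { return fn(); }
#endif

/* Sanity check: the thunk must still reach the intended target. */
void demo_self_check(void)
{
    assert(demo_call_indirect(demo_target) == 42);
}
```

Note that the only 'ret' involved is the one inside the thunk; ordinary function returns elsewhere are left alone, which is the point made above about retpoline replacing indirect branches rather than 'ret'.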