On Tue, Mar 26, 2019 at 9:31 PM Andy Lutomirski <l...@kernel.org> wrote:
>
> On Tue, Mar 26, 2019 at 3:35 AM Reshetova, Elena
> <elena.reshet...@intel.com> wrote:
> >
> > > On Mon, Mar 18, 2019 at 1:16 PM Andy Lutomirski <l...@kernel.org> wrote:
> > > > On Mon, Mar 18, 2019 at 2:41 AM Elena Reshetova
> > > > <elena.reshet...@intel.com> wrote:
> > > > > Performance:
> > > > >
> > > > > 1) lmbench: ./lat_syscall -N 1000000 null
> > > > >     base:                   Simple syscall: 0.1774 microseconds
> > > > >     random_offset (rdtsc):  Simple syscall: 0.1803 microseconds
> > > > >     random_offset (rdrand): Simple syscall: 0.3702 microseconds
> > > > >
> > > > > 2) Andy's tests, misc-tests: ./timing_test_64 10M sys_enosys
> > > > >     base:                   10000000 loops in 1.62224s = 162.22 nsec / loop
> > > > >     random_offset (rdtsc):  10000000 loops in 1.64660s = 164.66 nsec / loop
> > > > >     random_offset (rdrand): 10000000 loops in 3.51315s = 351.32 nsec / loop
> > > > >
> > > >
> > > > Egads! RDTSC is nice and fast but probably fairly easy to defeat.
> > > > RDRAND is awful. I had hoped for better.
> > >
> > > RDRAND can also fail.
> > >
> > > > So perhaps we need a little percpu buffer that collects 64 bits of
> > > > randomness at a time, shifts out the needed bits, and refills the
> > > > buffer when we run out.
> > >
> > > I'd like to avoid saving the _exact_ details of where the next offset
> > > will be, but if nothing else works, this should be okay. We can use 8
> > > bits at a time and call prandom_u32() every 4th call. Something like
> > > prandom_bytes(), but where it doesn't throw away the unused bytes.
> >
> > Actually I think this would make the end result even worse security-wise
> > than simply using rdtsc() on every syscall. Saving the randomness in percpu
> > buffer, which is probably easily accessible and can be probed if needed,
> > would supply attacker with much more knowledge about the next 3-4
> > random offsets that what he would get if we use "weak" rdtsc. Given
> > that for a successful exploit, an attacker would need to have stack aligned
> > once only, having a knowledge of 3-4 next offsets sounds like a present to
> > an exploit writer... Additionally it creates complexity around the code that I
> > have issues justifying with "security" argument because of above...

That certainly solidifies my concern against saving randomness. :)

> > I have the patch now with alloca() and rdtsc() working, I can post it
> > (albeit it is very simple), but I am really hesitating on adding the percpu
> > buffer randomness storage to it...
> >
>
> Hmm. I guess it depends on what types of attack you care about. I
> bet that, if you do a bunch of iterations of mfence;rdtsc;syscall,
> you'll discover that the offset between the user rdtsc and the
> syscall's rdtsc has several values that occur with high probability.

How about rdtsc xor with the middle word of the stack canary? (to
avoid the 0-byte) Something like:

    rdtsc
    xorl [%gs:...canary....], %rax
    andq $__MAX_STACK_RANDOM_OFFSET, %rax

I need to look at the right way to reference the canary during that
code. Andy might know off the top of his head. :)

-Kees

-- 
Kees Cook
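
For anyone who wants to poke at the rdtsc-xor-canary idea outside of entry
code, here is a rough userspace C sketch of the same mixing step. It is only
an illustration, not the patch: the function name, the 0xFF0 mask value, and
the choice of bits 16..47 as the "middle word" are all assumptions made for
the example (the thread names __MAX_STACK_RANDOM_OFFSET but gives no value),
and it assumes the canary's zero byte is the lowest one.

```c
#include <stdint.h>

/* Hypothetical bound for illustration; the thread does not give the
 * real value of __MAX_STACK_RANDOM_OFFSET. */
#define SKETCH_MAX_STACK_RANDOM_OFFSET 0xFF0u

/*
 * Mix a timestamp with the middle 32 bits of a canary, then mask.
 * Taking bits 16..47 skips the canary's low byte, which is assumed
 * here to be the zero byte mentioned in the thread.
 */
static inline uint64_t sketch_mix_stack_offset(uint64_t tsc, uint64_t canary)
{
    uint32_t middle = (uint32_t)(canary >> 16);

    return (tsc ^ middle) & SKETCH_MAX_STACK_RANDOM_OFFSET;
}
```

In this sketch the timestamp and the canary are passed in as plain values, so
the mixing can be checked without rdtsc or %gs access. The result can never
exceed the mask, and with the assumed 0xFF0 mask it is always a multiple of 16.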