> On Thu, Mar 28, 2019 at 9:29 AM Andy Lutomirski <l...@amacapital.net> wrote:
> > Doesn’t this just leak some of the canary to user code through side
> > channels?
>
> Erf, yes, good point. Let's just use prandom and be done with it.

And here I have some numbers on this. Actually prandom turned out to be
pretty fast, even when called on every syscall. See the numbers below:

1) lmbench: ./lat_syscall -N 1000000 null

base:
Simple syscall: 0.1774 microseconds
random_offset (prandom_u32() every syscall):
Simple syscall: 0.1822 microseconds
random_offset (prandom_u32() every 4th syscall):
Simple syscall: 0.1844 microseconds

2) Andy's tests, misc-tests: ./timing_test_64 10M sys_enosys

base:
10000000 loops in 1.62224s = 162.22 nsec / loop
random_offset (prandom_u32() every syscall):
10000000 loops in 1.64660s = 166.26 nsec / loop
random_offset (prandom_u32() every 4th syscall):
10000000 loops in 3.51315s = 169.30 nsec / loop

The second random_offset case is when prandom is called only once in 4
syscalls and the unused random bits are preserved in a per-cpu buffer.
As you can see, it is actually slower (modulo my maybe not so optimized
code in prandom, see below) than calling it every time, so I would vote
for calling it every time, which saves the hassle and avoids additional
code in prandom.

Below is what I was calling instead of prandom_u32() to preserve random
bits (net_rand_state_buffer is a new per-cpu buffer I added to save
random bits). I didn't include the check for bytes >= sizeof(u32), since
this was just a PoC to test the base speed, but for the generic case it
would be needed:

+void prandom_bytes_preserve(void *buf, size_t bytes)
+{
+	u32 *buffer = &get_cpu_var(net_rand_state_buffer);
+	u8 *ptr = buf;
+
+	if (!(*buffer)) {
+		struct rnd_state *state = &get_cpu_var(net_rand_state);
+		if (bytes > 0) {
+			*buffer = prandom_u32_state(state);
+			do {
+				*ptr++ = (u8) *buffer;
+				bytes--;
+				*buffer >>= BITS_PER_BYTE;
+			} while (bytes > 0);
+		}
+		put_cpu_var(net_rand_state);
+		put_cpu_var(net_rand_state_buffer);
+	} else {
+		if (bytes > 0) {
+			do {
+				*ptr++ = (u8) *buffer;
+				bytes--;
+				*buffer >>= BITS_PER_BYTE;
+			} while (bytes > 0);
+		}
+		put_cpu_var(net_rand_state_buffer);
+	}
+}

I will send the first version of the patch (calling prandom_u32() every
time) shortly, in case anyone wants to double-check the performance
implications.

Best Regards,
Elena.
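
For the generic case mentioned above, here is a minimal user-space sketch
of a byte-preserving buffer that refills itself when it runs out. It is
not kernel code and not part of the patch: struct byte_pool,
pool_next_u32() and pool_get_bytes() are hypothetical names, a plain
xorshift32 stands in for prandom_u32_state(), and there is no per-cpu
handling. Tracking the number of unused bytes explicitly, rather than
testing the buffer for zero, lets a request span more than one 32-bit
draw.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-cpu state plus the saved-bits buffer. */
struct byte_pool {
	uint32_t state;      /* xorshift32 state, must be non-zero */
	uint32_t buffer;     /* unused random bits from the last draw */
	unsigned int avail;  /* number of unused bytes left in buffer */
};

/* xorshift32: an assumption for illustration, not the kernel's generator. */
static uint32_t pool_next_u32(struct byte_pool *p)
{
	uint32_t x = p->state;

	assert(x != 0); /* xorshift32 is stuck at zero forever */
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	p->state = x;
	return x;
}

/* Hand out bytes from the buffer, drawing a new word only when it is empty. */
static void pool_get_bytes(struct byte_pool *p, void *buf, size_t bytes)
{
	uint8_t *ptr = buf;

	while (bytes > 0) {
		if (p->avail == 0) {
			p->buffer = pool_next_u32(p);
			p->avail = sizeof(p->buffer);
		}
		*ptr++ = (uint8_t)p->buffer; /* lowest byte first */
		p->buffer >>= 8;
		p->avail--;
		bytes--;
	}
}
```

With a freshly seeded pool (avail set to 0), asking for 6 bytes draws two
words and leaves 2 bytes buffered, so a following 2-byte request is
served without touching the generator.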