From: Reshetova, Elena
> Sent: 30 April 2019 18:51
...
> I guess this is true, so I did a quick implementation now to estimate the
> performance hits.
> Here are the preliminary numbers (proper ones will take a bit more time):
>
> base: Simple syscall: 0.1761 microseconds
> get_random_bytes (4096 bytes per-cpu buffer): 0.1793 microseconds
> get_random_bytes (64 bytes per-cpu buffer): 0.1866 microseconds
>
> It does not make sense to go less than 64 bytes since this seems to be
> Chacha20 block size, so if we go lower, we will trash useful bits.
> You can go even higher than 4096 bytes, but even this looks like
> okish performance to me.
>
> Below is a snip of what I quickly did (relevant parts) to get these numbers.
> I do initial population of per-cpu buffers in late_initcall, but
> practice shows that rng might not always be in good state by then.
> So, we might not have really good randomness then, but I am not sure
> if this is a practical problem since it only applies to system boot and by
> the time it booted, it already issued enough syscalls that buffer gets
> refilled with really good numbers.
> Alternatively we can also do it on the first syscall that each cpu gets, but I
> am not sure if that is always guaranteed to have a good randomness.
...
> +unsigned char random_get_byte(void)
> +{
> +	struct rnd_buffer *buffer = &get_cpu_var(stack_rand_offset);
> +	unsigned char res;
> +
> +	if (buffer->byte_counter >= RANDOM_BUFFER_SIZE) {
> +		get_random_bytes(&(buffer->buffer), sizeof(buffer->buffer));
> +		buffer->byte_counter = 0;
> +	}
> +
> +	res = buffer->buffer[buffer->byte_counter];
> +	buffer->buffer[buffer->byte_counter] = 0;
> +	buffer->byte_counter++;
> +	put_cpu_var(stack_rand_offset);
> +	return res;
> +}
> +EXPORT_SYMBOL(random_get_byte);

You'll almost certainly get better code if you copy buffer->byte_counter
to a local 'unsigned long' variable.

	David
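
For illustration, here is a minimal user-space sketch of the local-copy
suggestion. It is not the actual patch. The layout of struct rnd_buffer
is assumed, since the quoted snippet does not show it. fill_bytes() and
refill_count are hypothetical stand-ins for get_random_bytes() that write
a fixed, non-random pattern. A single static buffer replaces the per-cpu
variable, so get_cpu_var()/put_cpu_var() are omitted. One plausible reason
the local copy helps: the stores into buffer->buffer[] go through an
unsigned char lvalue, which C allows to alias any object, so the compiler
may have to reload buffer->byte_counter after each of them.

```c
#include <stddef.h>

#define RANDOM_BUFFER_SIZE 64

/* Assumed layout; the quoted patch does not show the struct definition. */
struct rnd_buffer {
	unsigned char buffer[RANDOM_BUFFER_SIZE];
	unsigned long byte_counter;
};

/* Hypothetical helper state: counts refills so the sketch can be checked. */
static unsigned long refill_count;

/*
 * Hypothetical stand-in for get_random_bytes(). It writes the pattern
 * 1, 2, 3, ... and is NOT random; it only lets the sketch run outside
 * the kernel.
 */
static void fill_bytes(void *buf, size_t len)
{
	unsigned char *p = buf;
	size_t i;

	for (i = 0; i < len; i++)
		p[i] = (unsigned char)(i + 1);
	refill_count++;
}

/*
 * One global buffer instead of a per-cpu variable. It starts out
 * "exhausted" so that the first call triggers a refill.
 */
static struct rnd_buffer stack_rand_offset = {
	.byte_counter = RANDOM_BUFFER_SIZE,
};

unsigned char random_get_byte(void)
{
	struct rnd_buffer *buffer = &stack_rand_offset;
	unsigned long idx = buffer->byte_counter;	/* read the counter once */
	unsigned char res;

	if (idx >= RANDOM_BUFFER_SIZE) {
		fill_bytes(buffer->buffer, sizeof(buffer->buffer));
		idx = 0;
	}

	res = buffer->buffer[idx];
	buffer->buffer[idx] = 0;		/* wipe the byte just handed out */
	buffer->byte_counter = idx + 1;		/* write the counter once */
	return res;
}
```

With the index held in a local, the counter is loaded once and stored
once per call, and an 'unsigned long' index needs no widening before it
is used for addressing on 64-bit targets. Whether this shows up in the
syscall numbers above would have to be measured.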