> On Apr 26, 2019, at 7:01 AM, Theodore Ts'o <ty...@mit.edu> wrote:
>
>> On Fri, Apr 26, 2019 at 11:33:09AM +0000, Reshetova, Elena wrote:
>> Adding Eric and Herbert to continue discussion for the chacha part.
>> So, as a short summary I am trying to find out a fast (fast enough to be
>> used per syscall
>> invocation) source of random bits with good enough security properties.
>> I started to look into chacha kernel implementation and while it seems that
>> it is designed to
>> work with any number of rounds, it does not expose less than 12 rounds
>> primitive.
>> I guess this is done for security sake, since 12 is probably the lowest
>> bound we want people
>> to use for the purpose of encryption/decryption, but if we are to build an
>> efficient RNG,
>> chacha8 probably is a good tradeoff between security and speed.
>>
>> What are people's opinions/perceptions on this? Has it been considered
>> before to create a
>> kernel RNG based on chacha?
>
> Well, sure. The get_random_bytes() kernel interface and the
> getrandom(2) system call uses a CRNG based on chacha20. See
> extract_crng() and crng_reseed() in drivers/char/random.c.
>
> It *is* possible to use an arbitrary number of rounds if you use the
> low level interface exposed as chacha_block(), which is an
> EXPORT_SYMBOL interface so even modules can use it. "Does not expose
> less than 12 rounds" applies only if you are using the high-level
> crypto interface.
>
> We have used cut down crypto algorithms for performance critical
> applications before; at one point, we were using a cut down MD4(!) for
> initial TCP sequence number generation. But that was getting rekeyed
> every five minutes, and the goal was to make it just hard enough that
> there were other easier ways of DOS attacking a server.
>
> I'm not a cryptographer, so I'd really us to hear from multiple
> experts about the security level of, say, ChaCha8 so we understand
> exactly kind of security we'd offering. And I'd want that interface
> to be named so that it's clear it's only intended for a very specific
> use case, since it will be tempting for other kernel developers to use
> it in other contexts, with undue consideration.
>
>
I don’t understand why we’re even considering weaker primitives. It seems to me
that we should be using the “fast-erasure” construction for all
get_random_bytes() invocations. Specifically, we should have a per cpu buffer
that stores some random bytes and a count of how many random bytes there are.
get_random_bytes() should take bytes from that buffer and *immediately* zero
those bytes in memory. When the buffer is empty, it gets refilled with the full
strength CRNG.
The obvious objection is “oh no, a side channel could leak the buffer,” to
which I say so what? A side channel could just as easily leak the entire CRNG
state.
For Elena’s specific use case, we would probably want a
try_get_random_bytes_notrace() that *only* tries the percpu buffer, since this
code runs so early in the syscall path that we can’t run real C code. Or it
could be moved a bit later, I suppose — the really early part is not really an
interesting attack surface.