[bcc tech-kern, tech-security, tech-crypto; followups to tech-userlevel to keep discussion in one place]
Many of you have no doubt noticed that a lot more things hang waiting
for entropy than used to on machines without hardware random number
generators (even as we've added a bunch of new drivers for HWRNGs) --
e.g., python, firefox.

This is related to the adoption of the getrandom system call from
Linux, which we adopted with the semantics that getrandom(p,n,0) will
block until the kernel is certain there is enough entropy for
security.

In retrospect, based on experience with the change, such as the
following threads and bugs (as well as many private discussions on
IRC), I think adopting getrandom with this semantics was a mistake:

https://gnats.NetBSD.org/55641
https://gnats.netbsd.org/55847
https://mail-index.NetBSD.org/current-users/2020/09/02/msg039470.html
https://mail-index.NetBSD.org/current-users/2020/11/21/msg039931.html
https://mail-index.netbsd.org/current-users/2020/12/05/msg040019.html

It's certainly a problem when keys are generated with too little
entropy -- e.g., https://factorable.net -- but it's increasingly clear
that _the middle of an application trying to get something else done_
is not a good place for hanging until someone plugs in a USB HWRNG.
Such an application, like a Python program in the middle of just doing
`import multiprocessing', is not in a position to remedy the situation
or even usefully alert an operator to the problem.

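To make the behavior under discussion concrete, here is a minimal
sketch in C of how an application could ask for key material without
ever sleeping for entropy, using the existing GRND_NONBLOCK flag. The
helper name try_keygen is hypothetical and is not part of any proposal
in this message. The 256-byte limit reflects an assumption that small
requests are returned in full. Check getrandom(2) on your system
before relying on it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Hypothetical helper (illustration only): fill buf[0..n) with random
 * bytes, but never sleep waiting for entropy.
 *
 * Returns 0 on success, -1 on failure with errno set.  errno == EAGAIN
 * means a plain getrandom(buf, n, 0) would have blocked at this point.
 */
static int
try_keygen(void *buf, size_t n)
{
	ssize_t r;

	/* Assumption: requests this small are not truncated. */
	assert(n <= 256);

	do {
		r = getrandom(buf, n, GRND_NONBLOCK);
	} while (r == -1 && errno == EINTR);	/* retry if a signal hit */

	if (r == -1)
		return -1;		/* EAGAIN: would have blocked */
	if ((size_t)r != n) {
		errno = EIO;		/* unexpected short read */
		return -1;
	}
	return 0;
}
```

A caller that gets EAGAIN still has no good option of its own: it can
fail, or go back to the blocking call. That is the sense in which the
application is poorly placed to deal with the problem.
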
To better address the system integration, I added hooks in /etc/rc and
/etc/security for alerting the operator to the problem with entropy:

- setting `entropy=check' in /etc/rc.conf will abort multiuser boot
  and enter single-user mode if there's not enough entropy before
  starting any network services

  (or setting `entropy=wait' will make multiuser boot hang -- caveat:
  possibly indefinitely)

- the daily /etc/security script will check for entropy and send an
  alert citing the new entropy(7) man page in the security report

We might also do something similar with the motd -- add a single line,
citing entropy(7) for more details, if there's not enough entropy.

With these in mind, I propose that we change getrandom(p,n,0) so that
it does not block -- under the premise that dealing with low entropy
is a system integration problem, not a problem that it is helpful to
ask an application to resolve in the heat of the sampling moment.

Programs can still poll /dev/random (or getrandom(p,n,GRND_RANDOM)),
if testing for entropy is actually their goal, but the default
recommended choice for all applications to generate keys, which is
getrandom(p,n,0), will not.

I also propose we introduce never-blocking getentropy like nia@
briefly did last year, as an alias for getrandom(p,n,0) soon to be in
POSIX (https://www.austingroupbugs.net/view.php?id=1134), under the
premise that the never-block semantics (from the original in OpenBSD)
is justified again by treating low entropy as a system integration
problem.

Thoughts?


P.S.  Previous discussions about getrandom, getentropy, blocking, and
changes to the kernel entropy subsystem for NetBSD 10:

https://gnats.NetBSD.org/55659
https://mail-index.NetBSD.org/tech-userlevel/2020/05/02/msg012333.html
https://mail-index.NetBSD.org/current-users/2020/05/01/msg038495.html
https://mail-index.NetBSD.org/tech-kern/2019/12/21/msg025876.html
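
For reference, the remark above that programs can still poll
/dev/random when testing for entropy is their actual goal could look
roughly like the following sketch in C. The helper name entropy_ready
is hypothetical, and the sketch assumes that /dev/random only becomes
readable once the kernel considers itself seeded. That matches the
blocking behavior described in this message, but it is worth
confirming against rnd(4) or random(4) on the target system.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): report whether /dev/random
 * becomes readable within timeout_ms milliseconds.
 *
 * Returns 1 if readable, 0 if the timeout expired first, -1 on error.
 */
static int
entropy_ready(int timeout_ms)
{
	struct pollfd pfd;
	int fd, n;

	assert(timeout_ms >= 0);

	fd = open("/dev/random", O_RDONLY);
	if (fd == -1)
		return -1;

	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	/* Wait for readability without consuming any bytes. */
	n = poll(&pfd, 1, timeout_ms);
	(void)close(fd);

	if (n == -1)
		return -1;
	return (n > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

A tool whose job is to test for entropy, such as a boot-time or
monitoring check, might call entropy_ready(0) and report the result to
the operator. Ordinary key generation would keep using
getrandom(p,n,0).
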