Thanks for the bug report. The bug appears to be due to a weakness in ISAAC that was reported in 2006 by Jean-Philippe Aumasson of the University of Applied Sciences Northwestern Switzerland. Although Aumasson wrote a paper about it, nobody seems to have connected the paper with coreutils.

I installed the attached patch, which fixed things for me, and am marking the bug as fixed.
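
For readers who want to see the effect in isolation, here is a minimal sketch of the seeding change. It is not the gnulib code. It assumes a 256-word, 64-bit seed array that starts out zeroed (as the comment in the patch describes), and fake_nonce, zero_bytes_after_seed and STATE_WORDS are hypothetical names invented for this illustration. fake_nonce stands in for get_nonce and merely writes nonzero bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Assumption for this sketch: 256 words of 64 bits, as in ISAAC64.  */
enum { STATE_WORDS = 256 };

/* Hypothetical stand-in for get_nonce: write N nonzero bytes into BUF.  */
static void
fake_nonce (void *buf, size_t n)
{
  memset (buf, 0xA5, n);
}

/* Return how many bytes of the seed array are still zero after seeding.
   If FILL_WHOLE_STATE is false, seed at most BYTES_BOUND bytes (the old
   behavior); otherwise seed the whole array (the patched behavior).  */
size_t
zero_bytes_after_seed (size_t bytes_bound, bool fill_whole_state)
{
  uint64_t m[STATE_WORDS] = {0};   /* assumed to start out all zero */
  size_t seed_len = fill_whole_state ? sizeof m : MIN (sizeof m, bytes_bound);
  assert (seed_len <= sizeof m);
  fake_nonce (m, seed_len);

  /* Count the bytes that the seeding step never touched.  */
  unsigned char const *p = (unsigned char const *) m;
  size_t zeros = 0;
  for (size_t i = 0; i < sizeof m; i++)
    zeros += p[i] == 0;
  return zeros;
}
```

Under these assumptions, zero_bytes_after_seed (8, false) reports 2040 of the 2048 seed bytes still zero, whereas zero_bytes_after_seed (8, true) reports none. That mostly-zero initial state is the situation the patch avoids.
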
From bfbb3ec7f798b179d7fa7b42673e068b18048899 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Sat, 3 Aug 2024 22:31:20 -0700
Subject: [PATCH] shuf: fix randomness bug

Problem reported by Daniel Carpenter <https://bugs.gnu.org/72445>.
* gl/lib/randread.c (randread_new): Fill the ISAAC buffer
instead of storing at most BYTES_BOUND bytes into it.
---
 NEWS              |  3 +++
 THANKS.in         |  1 +
 gl/lib/randread.c | 12 +++++++++++-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 6251a2f68..2da258c9d 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,9 @@ GNU coreutils NEWS                                    -*- outline -*-
   have exited with a "Function not implemented" error.
   [bug introduced in coreutils-8.28]
 
+  'shuf' generates more-random output when the output is small.
+  [bug introduced in coreutils-8.6]
+
   'tail -c 4096 /dev/zero' no longer loops forever.
   [This bug was present in "the beginning".]
 
diff --git a/THANKS.in b/THANKS.in
index 17f9d9c69..57ace387e 100644
--- a/THANKS.in
+++ b/THANKS.in
@@ -140,6 +140,7 @@ Dameon G. Rogers                    dg...@uark.edu
 Dan Hagerty                         h...@gnu.ai.it.edu
 Dan Pascu                           d...@services.iiruc.ro
 Daniel Bergstrom                    n...@melody.se
+Daniel Carpenter                    danseb...@gmail.com
 Daniel Mach                         dm...@redhat.com
 Daniel P. Berrangé                  berra...@redhat.com
 Daniel Stavrovski                   d...@stavrovski.net
diff --git a/gl/lib/randread.c b/gl/lib/randread.c
index cbee224bb..43c0cf09f 100644
--- a/gl/lib/randread.c
+++ b/gl/lib/randread.c
@@ -189,9 +189,19 @@ randread_new (char const *name, size_t bytes_bound)
         setvbuf (source, s->buf.c, _IOFBF, MIN (sizeof s->buf.c, bytes_bound));
       else
         {
+          /* Fill the ISAAC buffer.  Although it is tempting to read at
+             most BYTES_BOUND bytes, this is incorrect for two reasons.
+             First, BYTES_BOUND is just an estimate.
+             Second, even if the estimate is correct
+             ISAAC64 poorly randomizes when BYTES_BOUND is small
+             and just the first few bytes of s->buf.isaac.state.m
+             are random while the other bytes are all zero.  See:
+             Aumasson J-P. On the pseudo-random generator ISAAC.
+             Cryptology ePrint Archive. 2006;438.
+             <https://eprint.iacr.org/2006/438>.  */
           s->buf.isaac.buffered = 0;
           if (! get_nonce (s->buf.isaac.state.m,
-                           MIN (sizeof s->buf.isaac.state.m, bytes_bound)))
+                           sizeof s->buf.isaac.state.m))
             {
               int e = errno;
               randread_free_body (s);
-- 
2.43.0
