Within my UML guest kernel, I'm finding that between 5% and 
10% of reads from /dev/hwrng block, seemingly indefinitely. 
(I've certainly seen reads block for more than 12 hours.) 
All other reads complete in significantly less than 1 

I've tried a variety of 2.6.27, .28 and .29 guest UML 
kernels.  (I've had problems, almost certainly unrelated, 
getting 2.6.30 working reliably, so haven't tested this 
there.)  I have CONFIG_UML_RANDOM=y and /dev/hwrng has the 
correct permissions (0644) and device numbers (10 183).  I'm 
using a 32-bit x86 architecture.

Under strace(1) I see it blocking in a read(2) syscall:

   guest:~# strace dd if=/dev/hwrng of=/dev/zero bs=32 count=4
   open("/dev/hwrng", O_RDONLY|O_LARGEFILE) = 3
   dup2(3, 0)                              = 0
   close(3)                                = 0
   _llseek(0, 0, [0], SEEK_CUR)            = 0

If I recompile the guest kernel with CONFIG_DEBUG_KERNEL=y 
and CONFIG_FRAME_POINTER=y and attach gdb to the main 
running UML process, I see the following backtrace during a 
blocked read:

   (gdb) bt
   #0  0xb7ff0e9e in read () from /lib/libc.so.6
   #1  0x0806a2a1 in os_read_file ()
   #2  0x08069684 in rng_dev_read ()
   #3  0x080b0c7a in vfs_read ()
   #4  0x080b0dac in sys_read ()
   #5  0x08061b5f in handle_syscall ()
   #6  0x0806e7a8 in userspace ()
   #7  0x0805f929 in fork_handler ()
   #8  0x00000000 in ?? ()

I've struggled to get gdb to access the function parameters 
in the kernel (yes, debugging symbols are there, or so 
readelf thinks), but the stack layout in glibc's read 
function (i.e. frame #0) is straightforward:

   (gdb) x/8x $esp
   0xba74e70:      0x0ba74000      0x0806a2a1      0x00000007      0x0ba74ebc
   0xba74e80:      0x00000004      0x0ba74ecc      0x08069684      0x00000007

i.e. we've just called read(7, (void*)0xba74ebc, 4).  The 
host's /proc/$PID/fd/7 confirms that fd 7 is /dev/random.
The code in rng_dev_read starts:

   u32 data;
   int n, ret = 0, have_data;

   while (size) {
     n = os_read_file(random_fd, &data, sizeof(data));

... so the second argument to read is plausible (it's on the 
stack in the right function), and sizeof(data)==4 so the 
third argument is correct.  In other words, we don't seem to 
have passed bogus parameters to the VDS host's read.

When the kernel blocks, the instruction pointer, %eip, is 
pointing to the instruction after the

   0xb7ff0e9c <read+28>:   int    $0x80

in glibc's read.  I.e. we appear to be blocking in the host 
kernel.  My obvious first thought is that the host kernel's 
entropy is depleted, although it seems hard to believe that 
this could explain the guest blocking for 12+ hours.
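
One way to test the depleted-entropy theory directly on the host would be to repeat the same 4-byte read that rng_dev_read issues, but guarded by poll(2) with a timeout so that the probe itself cannot hang.  The following is only a sketch under my own assumptions: the helper name probe_read, the 64-byte buffer limit and the use of O_NONBLOCK are illustrative choices, not anything taken from the UML source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of UML): wait up to timeout_ms for `path`
 * to become readable, then read up to nbytes (at most 64) from it.
 * Returns the number of bytes read, 0 if the wait timed out (or on EOF),
 * or -1 on any error.
 */
long probe_read(const char *path, size_t nbytes, int timeout_ms)
{
    unsigned char buf[64];
    struct pollfd pfd;
    long n = -1;
    int ready;

    assert(nbytes <= sizeof(buf));

    /* Non-blocking, so that even the read after poll() cannot hang. */
    pfd.fd = open(path, O_RDONLY | O_NONBLOCK);
    if (pfd.fd < 0)
        return -1;
    pfd.events = POLLIN;
    pfd.revents = 0;

    ready = poll(&pfd, 1, timeout_ms);
    if (ready == 0)
        n = 0;                          /* nothing readable in time */
    else if (ready > 0 && (pfd.revents & POLLIN))
        n = (long)read(pfd.fd, buf, nbytes);

    close(pfd.fd);
    return n;
}
```

Called on the host as probe_read("/dev/random", 4, 1000), a result of 4 on every call would point away from entropy depletion as the cause, whereas repeated 0 results would support it.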

The host kernel is  I haven't yet tried a 
host kernel, though I plan to.  (That said, there seem to 
be no relevant changes in drivers/char/random.c between 
these two releases.)

The host is running rngd which seems to be effective at 
preventing the UML guests from depleting the host kernel's 
entropy pool.  If I monitor the available host entropy, I see:

   host:~# while true; do tr '\n' ' ' \
     < /proc/sys/kernel/random/entropy_avail; sleep 1;
     done; echo
   3840 3968 3968 3968 3968 3968 3968 2944 3968 3968 3968 3968
   3968 3968 3968 3968 3968 2944 3968 3968 3968 3968 3776 3520
   3520 3537 3968 3968 3968 3968 3968 3968 3968 2944 3968 3968
   3968 3968 3968 3968 3968 3968 3968 2955 3968 3968 3968 3968
   3968 3968 3968 3968 3968 3968 3968 3968 3968 3968 3968 3968

I'm not sure why it should peak at 3968 rather than 4096 
(the poolsize) or 3686 (rngd is running with 
--fill-watermark=90%), but that doesn't seem cause for 
concern.  In any case, that seems like plenty of entropy.
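
A once-per-second loop could in principle miss short dips in the pool between samples.  A minimal sketch of a finer-grained check follows; it samples the same /proc file at a chosen interval and reports the lowest value seen.  The function names read_entropy_avail and min_entropy_avail, and the sampling parameters, are hypothetical and chosen only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Read a single integer from `path`; returns -1 if it cannot be read. */
long read_entropy_avail(const char *path)
{
    FILE *f = fopen(path, "r");
    long bits = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &bits) != 1)
        bits = -1;
    fclose(f);
    return bits;
}

/*
 * Sample `path` `samples` times, interval_ms apart, and return the lowest
 * value seen.  Returns -1 if any read fails or if samples <= 0.
 */
long min_entropy_avail(const char *path, int samples, long interval_ms)
{
    struct timespec delay;
    long lowest = -1;
    int i;

    delay.tv_sec = interval_ms / 1000;
    delay.tv_nsec = (interval_ms % 1000) * 1000000L;

    for (i = 0; i < samples; i++) {
        long bits = read_entropy_avail(path);

        if (bits < 0)
            return -1;
        if (lowest < 0 || bits < lowest)
            lowest = bits;
        if (i + 1 < samples)
            nanosleep(&delay, NULL);    /* no sleep after the last sample */
    }
    return lowest;
}
```

For example, min_entropy_avail("/proc/sys/kernel/random/entropy_avail", 1000, 10) would cover roughly ten seconds at 10 ms resolution; a minimum that stays in the thousands would make a depleted pool an unlikely explanation.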

I'd be very grateful for any thoughts or suggestions.  The 
lack of a working /dev/hwrng is preventing me from using 
rngd on the UML guests, which in turn is resulting in them 
running out of entropy, which most often manifests as ssh 
dying or hanging.

Thanks in advance for any advice,

