https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219216
Bug ID: 219216
Summary: sched_bind() blocks if the entropy pool is starved
Product: Base System
Version: 11.0-STABLE
Hardware: amd64
OS: Any
Status: New
Severity: Affects Many People
Priority: ---
Component: kern
Assignee: freebsd-bugs@FreeBSD.org
Reporter: k...@freebsd.org

I recently updated my 11-stable system:

FreeBSD AprilRyan.norad 11.0-STABLE FreeBSD 11.0-STABLE #3 r318143: Wed May 10 17:56:12 CEST 2017 root@AprilRyan.norad:/usr/obj/S403/amd64/usr/src/sys/S403 amd64

I immediately noticed that rand_harvestq now runs permanently and consumes a small but significant amount of CPU time. To investigate, I started:

    dd bs=1m < /dev/random > /dev/null

Coincidentally, I was running a release candidate of powerd++ in foreground mode with temperature throttling at the same time:
https://github.com/lonkamikaze/powerdxx/releases/tag/0.3.0-rc1

The following happened when I started the `dd`:
- Two cores were fully consumed, one by dd, one by rand_harvestq
- powerd++ started to stutter and then froze completely

After I killed the `dd` process the following happened:
- rand_harvestq continued to consume an entire core for a long time
- powerd++ remained frozen

By erratically swiping my fingers over the touch screen I got powerd++ to resume operation in a stuttering fashion. It took several minutes before the system behaved normally again.

The two surprising conclusions so far:
- /dev/random blocks
- powerd++ consumes randomness

So I investigated and found that it is access to the following sysctls that blocks:

    dev.cpu.0.temperature
    dev.cpu.1.temperature
    dev.cpu.2.temperature
    dev.cpu.3.temperature

Unloading the coretemp module in the blocked state resulted in a kernel panic showing that coretemp was stuck in coretemp_get_val_sysctl(). With an unhealthy dose of uprintf() calls I figured out that the block happens in coretemp_get_thermal_msr() (see /usr/src/sys/dev/coretemp/coretemp.c:306). The problem is the following code:

311         thread_lock(curthread);
312         sched_bind(curthread, cpu);
313         thread_unlock(curthread);

The call to sched_bind() blocks when the entropy pool is starved (I suspect only if the thread is not already running on the target core). Because I cannot fiddle with and replace sched_ule at runtime, I have decided this is as far as I'm digging.

That the scheduler depends on entropy is very worrying, not to say a bug, especially when randomness is a scarce resource.

I got the system to panic many times during this investigation, mostly because locks were held too long. E.g.:

spin lock 0xffffffff81c8e380 (sched lock 3) held by 0xfffff80028b19560 (tid 100196) too long
spin lock 0xffffffff81c8e380 (sched lock 3) held by 0xfffff80028b19560 (tid 100196) too long
panic: spin lock held too long
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe045154f850
vpanic() at vpanic+0x186/frame 0xfffffe045154f8d0
panic() at panic+0x43/frame 0xfffffe045154f930
_mtx_lock_spin_cookie() at _mtx_lock_spin_cookie+0x311/frame 0xfffffe045154f9a0
sched_idletd() at sched_idletd+0x3aa/frame 0xfffffe045154fa70
fork_exit() at fork_exit+0x85/frame 0xfffffe045154fab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe045154fab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
Uptime: 4m31s

I also find it questionable that entropy harvesting continues after the RNG has been initially seeded, which makes /dev/random susceptible to entropy poisoning by a malicious process feeding bad entropy into /dev/random.
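
For anyone trying to reproduce the observation, below is a minimal userland sketch (not part of the original report) that polls the same dev.cpu.N.temperature sysctl that powerd++ reads. The one-second polling loop and the fixed CPU index are illustrative assumptions; while the entropy pool is starved, the sysctlbyname() call blocks inside the coretemp sysctl handler as described above.

#include <sys/types.h>
#include <sys/sysctl.h>

#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	int temp;
	size_t len;

	for (;;) {
		len = sizeof(temp);
		/*
		 * Each read enters coretemp_get_val_sysctl() ->
		 * coretemp_get_thermal_msr() in the kernel; this is the
		 * call that was observed to block while the entropy pool
		 * was starved.
		 */
		if (sysctlbyname("dev.cpu.0.temperature", &temp, &len,
		    NULL, 0) == -1) {
			perror("sysctlbyname");
			return (1);
		}
		/* coretemp exports the value in IK format (tenths of a kelvin). */
		printf("dev.cpu.0.temperature: %.1f C\n",
		    temp / 10.0 - 273.15);
		sleep(1);
	}
}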