https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399
--- Comment #70 from Don Lewis <truck...@freebsd.org> ---

I now think that AGESA 1006 didn't actually fix anything for me. I must have gotten lucky with that first poudriere run after the BIOS upgrade. The next time I ran poudriere, I got a silent reboot after ~3 hours.

The times to failure looked too consistent to me, so I looked at the poudriere build logs to see what was being built at the time of the crash. One of the ports was openjdk7. One of the ports that got built when I restarted poudriere to build the remaining ports that failed after the BIOS upgrade was openoffice, which uses java, so things started making sense. If I try building openjdk7, I can pretty much consistently trigger a system reboot, even with SMT off, only two cores enabled in the BIOS, the CPU clock speed lowered to 3 GHz, and the RAM clock cranked down from 2400 MHz to 1866 MHz.

After I marked openjdk7 BROKEN so that poudriere doesn't build it and skips the ports that depend on it, the system stayed up and poudriere ran for almost 9 hours, though two ports failed with the jemalloc assertion failure that I previously mentioned.

I also now think that the Dragonfly patch isn't needed on FreeBSD and could potentially be harmful. It is meant to work around what looks like a Ryzen SMT bug. The problem appears to be triggered by executing code close to the top of user address space. On Dragonfly, the signal trampoline code is located just above the stack and very close to the top of user address space. By adding space to the end of sigtramp.S, the trampoline code is moved to a lower starting address.

On FreeBSD, the signal trampoline code was moved to a separate memory page so that the stack could be marked non-executable. This page is located at the very top of user address space. I haven't looked at what all is in this page, but if the contents are loaded starting at the bottom of the page, then the start of the signal trampoline is likely to be at a lower address than on Dragonfly.
If other code is loaded in this page after the signal trampoline, then adding space at the end could move that code closer to the danger zone. In any case, I had been doing much of my testing with SMT disabled, so I removed this patch from my kernel.

After backing out the Dragonfly patch, marking bootstrap-openjdk as BROKEN to eliminate any vestige of java, and setting the RAM and CPU clocks back to auto, I ran poudriere again. The run was mostly successful, though I did see a lang/go build failure due to a runaway build problem.

I then enabled SMT and core performance boost and ran poudriere again. I observed build failures of lang/go, gdb, and cairo. I didn't see any obvious problems with the latter two; it looked like something in each just returned the wrong exit status. A restarted poudriere run successfully built the latter two, but go failed again. The go failures appeared to be caused by some sort of corruption of its malloc state. Note: go is multi-threaded.

Just for grins, I decided to try building ports in an i386 jail. I got no unexpected failures. The results were the same when I re-enabled the java ports. It successfully built 1594 ports in 8 hours 33 minutes. I was even able to build lang/ghc on i386. That one always had segfaults in the bootstrap compiler for me on amd64. I have no idea if it uses threads, though.

At least on my hardware, there are one or more problems with amd64 code. It might just be multi-threaded processes. The java problem could also be caused by the hotspot compiler, whose output may look like self-modifying code. In any case, it can cause system hangs or reboots and may also corrupt the state of other processes.

I finally received the hardware to set up a serial console yesterday, but I haven't had time to install it yet. The reboots that I've seen don't seem to leave any trace in the logs, don't seem to trigger ddb, and don't leave crash dumps.
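
To make the address-space argument about the signal trampoline padding concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the value of USER_TOP, the page size, the byte lengths, and both layout models (end_anchored and page_anchored) are hypothetical and are not taken from the Dragonfly or FreeBSD sources.

```python
# Hypothetical numbers for illustration only; not real kernel constants.
PAGE_SIZE = 4096
USER_TOP = 1 << 47  # assumed exclusive top of user address space


def headroom(region):
    """Bytes between the end of a [start, end) region and USER_TOP."""
    return USER_TOP - region[1]


def end_anchored(tramp_len, pad):
    """Assumed Dragonfly-like model: trampoline plus trailing padding
    ends at USER_TOP, so padding pushes the trampoline code lower."""
    start = USER_TOP - (tramp_len + pad)
    return (start, start + tramp_len)


def page_anchored(tramp_len, pad, other_len):
    """Assumed FreeBSD-like model: a page at the very top is filled from
    the bottom; trampoline first, then padding, then other code."""
    page_base = USER_TOP - PAGE_SIZE
    tramp = (page_base, page_base + tramp_len)
    other_start = tramp[1] + pad
    other = (other_start, other_start + other_len)
    if other[1] > USER_TOP:
        raise ValueError("contents do not fit in the page")
    return tramp, other


if __name__ == "__main__":
    for pad in (0, 256):
        tramp = end_anchored(64, pad)
        print("end-anchored   pad=%3d trampoline headroom=%d"
              % (pad, headroom(tramp)))
        tramp, other = page_anchored(64, pad, 32)
        print("page-anchored  pad=%3d trampoline headroom=%d "
              "other-code headroom=%d"
              % (pad, headroom(tramp), headroom(other)))
```

In the end-anchored model the padding increases the trampoline's distance from the top of user address space. In the page-anchored model the trampoline does not move at all, and any code placed after it ends up closer to the top, which is the concern about the patch on FreeBSD.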