On 10/23/2015 9:34 AM, Russell L. Carter wrote:
>
> Greetings,
>
> Recently my nightly cron poudriere builds have been occasionally
> hanging.  For instance, here's last night's, with apparently no
> progress for over 10 hours:
>
> root@terpsichore> poudriere status
> SET PORTS   JAIL            BUILD                STATUS         QUEUE BUILT FAIL SKIP IGNORE REMAIN TIME     LOGS
> -   default 10-stable-amd64 2015-10-22_22h30m08s parallel_build   488    34    0    0      0    454 10:45:56 /ssd1/poudriere/data/logs/bulk/10-stable-amd64-default/2015-10-22_22h30m08s
> root@terpsichore>
>

Also check 'poudriere status -b' to see per-builder status. Something
may actually be doing something. Poudriere will time out builds after a
long time. I forget the default but it may be up to 24 hours.

> htop now shows no significant activity for the specified 3 builders:
>
> root@terpsichore> ps xa | grep poud
> 72482  -  Is     0:00.01 /bin/sh /root/poudriere/run-poudriere-bulk
> 73202  -  S      0:04.24 sh -e /usr/local/share/poudriere/bulk.sh -f /root/poudriere/ports -j 10-stable-amd64
> 73347  -  S      1:55.38 sh -e /usr/local/share/poudriere/bulk.sh -f /root/poudriere/ports -j 10-stable-amd64
> 73352  -  I      0:00.08 sh -e /usr/local/share/poudriere/bulk.sh -f /root/poudriere/ports -j 10-stable-amd64
>  6119  1  S+     0:00.00 grep poud
> root@terpsichore>
>
> If I reboot, so that the tmp zfs filesystems are unmounted, and
> manually rerun the exact same script as the previous cron'd, hung
> instance, poudriere has (so far) run to completion.

Please record the output of 'procstat -kka' before rebooting in case
this is some kind of deadlock.

>
> I'm not sure how to debug this, but in the interim, I'm very curious
> how I can stop the hung bulk run, and either restart it, or clean up
> the various mounted zfs filesystems and manually restart from the
> beginning w/o rebooting.  Studying the man page, it's not clear at all
> the Right Way to do this, so any pointers here would be appreciated.

Send SIGTERM (kill -TERM) to the main poudriere process. It will clean
up its children. Beyond that you can run
'poudriere jail -j NAME -p TREE -z SET -k' to clean up any mounts left
over from a previous build.

Adding a 'poudriere kill' command is on the todo list.

>
> I'm leaving the system untouched for now so that I can try out any
> suggestions for cleanup and restart.

--
Regards,
Bryan Drewery
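
The steps suggested above could be strung together roughly as follows.
This is only a sketch: the function names, the DRY_RUN switch, the
60-second wait and the dump file path are assumptions made for
illustration, and the PID of the "main poudriere process" has to be
picked by hand from the ps listing. Only the poudriere, procstat and
kill invocations themselves come from the reply.

```shell
#!/bin/sh
# Sketch only: recover from a hung poudriere bulk run without rebooting.
# run(), recover_hung_bulk(), DRY_RUN and the dump path are illustrative
# assumptions, not part of poudriere.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands (default), 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: recover_hung_bulk MAIN_PID JAIL TREE [DUMPFILE]
recover_hung_bulk() {
    if [ "$#" -lt 3 ]; then
        echo "usage: recover_hung_bulk MAIN_PID JAIL TREE [DUMPFILE]" >&2
        return 1
    fi
    main_pid="$1"
    jail="$2"
    tree="$3"
    dump="${4:-/var/tmp/procstat-kka.txt}"

    # 1. Per-builder status: a builder may still be working.
    run poudriere status -b

    # 2. Save kernel stacks of all threads in case this is a deadlock.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ procstat -kka > $dump"
    else
        procstat -kka > "$dump"
    fi

    # 3. Ask the main poudriere process to stop; it cleans up its children.
    run kill -TERM "$main_pid"
    if [ "$DRY_RUN" != "1" ]; then
        # Give it up to 60 s (arbitrary) to exit before touching mounts.
        i=0
        while kill -0 "$main_pid" 2>/dev/null && [ "$i" -lt 60 ]; do
            sleep 1
            i=$((i + 1))
        done
    fi

    # 4. Clean up mounts left over from the build.
    #    Add "-z SET" here if the build used a set.
    run poudriere jail -j "$jail" -p "$tree" -k
}
```

With the default DRY_RUN=1, a call such as
'recover_hung_bulk 73202 10-stable-amd64 default' only prints the
commands it would run, so the sequence can be reviewed first; set
DRY_RUN=0 to actually execute them. The PID 73202 is just one of the
bulk.sh processes from the listing above and is used here as a
placeholder.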