On 12.11.17 21:59, Nick Holland wrote:
On 11/12/17 14:13, Otto Moerbeek wrote:
On Sun, Nov 12, 2017 at 01:28:39PM -0500, Nick Holland wrote:
Help.
I was upgrading a few very similar machines to -current today.
ONE of the three decided to be unpleasant. The thing has a
serial console, but it is about 370km from me. :-/
Upgrade from Sep 9 current to today's current via bsd.rd, just
like the other two.
Upon reboot, it does this (from /boot) :
booting hd0a:/bsd: 8484712+2429968+244048+0+667648 [636809heap full
(0x9d304+65536)
And then reboots the system, as if from power-down/power-up.
(already something I haven't seen before)
Reboot from "bsd.rd" and "bsd.sp", same results. reboot from "obsd"
(Sept 9), same results. Not a kernel problem, it seems. About this
point, I'm starting to think how the serial console has let me down.
I remember how to bring up a DRAC remote CD image via ssh tunnels
to the drac and how to run java in a windows browser, and
reboot off the remote CD image, do another upgrade, all goes fine
(again), but upon reboot, same results... "heap full" and reboot.
Boot from remote CD, at the boot> prompt, enter "boot hd0a:/bsd",
and it boots Just Fine from the local hard disk (only boot pulled
from the remote CD). Boot loader! Reinstalled boot:
# installboot -v sd0
Using / as root
installing bootstrap on /dev/rsd0c
using first-stage /usr/mdec/biosboot, second-stage /usr/mdec/boot
copying /usr/mdec/boot to /boot
/boot is 3 blocks x 32768 bytes
fs block shift 3; part offset 64; inode block 24, offset 2088
master boot record (MBR) at sector 0
partition 3: type 0xA6 offset 64 size 2000397671
/usr/mdec/biosboot will be written at sector 64
good, right?
Reboot off local hard disk, boom. problem is still there. maybe
not the boot loader. :-/
Verified /boot on trouble system and good system are the same.
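
A minimal sketch of one way to make that comparison, assuming both copies of /boot have already been copied to the same machine. The helper names and the file names in the usage note are hypothetical, not something taken from this thread:

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(path_a, path_b):
    """True if the two files have identical contents (by SHA-256)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

With the two copies saved as, say, boot.good and boot.bad (hypothetical names), same_file("boot.good", "boot.bad") returning True would match the observation above that the loaders are identical.
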
I'm not going to cry "bug", since there are two nearly identical
systems working just fine. But I can't think of what I did wrong
or what to do to fix it.
Suggestions?
You are hitting -DHEAP_LIMIT=0xA0000 in /boot. The code is in libsa/alloc.c.
No idea why. But something in that system is different.
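
For what it's worth, the numbers in the "heap full (0x9d304+65536)" message can be checked against that limit by hand. The sketch below assumes the check is a plain "top + size > limit" comparison and that the two numbers in the message are the current heap top and the requested size; the function name is made up for illustration:

```python
HEAP_LIMIT = 0xA0000  # the -DHEAP_LIMIT value quoted in this thread


def exceeds_heap_limit(top, size, limit=HEAP_LIMIT):
    """Assumed form of the check: does an allocation of `size` bytes
    starting at `top` run past `limit`?"""
    return top + size > limit


top, size = 0x9D304, 65536   # numbers from the "heap full" message
headroom = HEAP_LIMIT - top  # bytes left below the limit

print(hex(top + size))                # 0xad304
print(headroom)                       # 11516
print(exceeds_heap_limit(top, size))  # True
```

Under those assumptions, only 11516 bytes were left below 0xA0000 when /boot asked for another 65536, which is consistent with the message. It does not say why the heap was already that full on this one machine.
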
You do have one weird line in your disklabel output: a filesystem
mounted on swap?
That's an mfs. This application has one directory that benefits HUGELY
from being an MFS for tmp files. Though the reboot happens long before
the mfs is created.
scsibus1 at ahci0: 32 targets
-sd0 at scsibus1 targ 2 lun 0: <ATA, Samsung SSD 850, EXM0> SCSI3 0/direct fixed naa.50025388400562d4
+sd0 at scsibus1 targ 0 lun 0: <ATA, Samsung SSD 850, EXM0> SCSI3 0/direct fixed naa.50025388400563fe
sd0: 976762MB, 512 bytes/sector, 2000409264 sectors, thin
-sd1 at scsibus1 targ 3 lun 0: <ATA, Samsung SSD 850, EXM0> SCSI3 0/direct fixed naa.5002538c70007b02
-sd1: 1953514MB, 512 bytes/sector, 4000797360 sectors, thin
+cd0 at scsibus1 targ 1 lun 0: <PLDS, DVD+-RW DS-8A8SH, KD51> ATAPI 5/cdrom removable
ichiic0 at pci0 dev 31 function 3 "Intel 6 Series SMBus" rev 0x04: apic 0 int 19
iic0 at ichiic0
My suspicion goes to the SSDs. One of them has somehow gone bad.
Nick.