Hi everybody,

Firstly, thanks to everyone contributing to the OpenBSD project, and to the people who help others on this list.
I'm fairly new to OpenBSD. A few days ago I installed the 3.7 release on an old laptop, a Dell Latitude Xpi p133st.

Hard disk: IBM OEM model DMCA-21440, size: 1440MB

This was my first installation of OpenBSD. I read the FAQ carefully and managed to install it from floppy (floppyC37.fs, for laptops), downloading the tgz files from FTP. As it was just a test install, I gave OpenBSD only 500MB.

A few days later, as the system was running fine, I got a new hard disk and tried to install OpenBSD 3.7 on it.

New hard disk: IBM Travelstar model DBCA-206480, size: 6.49GB

I did the same as for the first install, and the whole installation procedure went fine (partitioning, download and extraction of the tgz files). At the first boot, while the kernel was loading, it panicked: the kernel was not able to reach '/' on wd0a.

That is my problem in a nutshell. I will now give you more information.

First OpenBSD install on the 1440MB drive (ok):
I installed OpenBSD in a 500MB partition at the beginning of the disk.
-----------------------------------------
Result of 'fdisk wd0':

Disk: wd0       geometry: 699/64/63 [2818368 Sectors]
Offset: 0       Signature: 0xAA55
         Starting       Ending       LBA Info:
 #: id    C   H  S -    C   H  S [       start:      size   ]
------------------------------------------------------------------------
*0: A6    0   1  1 -  253  63 63 [          63:     1024065 ] OpenBSD
 1: 05  254   0  1 -  277  63 63 [     1024128:       96768 ] Extended DOS
 2: 83  278   0  1 -  698  63 63 [     1120896:     1697472 ] Linux files*
 3: 00    0   0  0 -    0   0  0 [           0:           0 ] unused
Offset: 1024128 Signature: 0xAA55
         Starting       Ending       LBA Info:
 #: id    C   H  S -    C   H  S [       start:      size   ]
------------------------------------------------------------------------
 0: 82  254   1  1 -  277  63 63 [     1024191:       96705 ] Linux swap
 1: 00    0   0  0 -    0   0  0 [           0:           0 ] unused
 2: 00    0   0  0 -    0   0  0 [           0:           0 ] unused
 3: 00    0   0  0 -    0   0  0 [           0:           0 ] unused

-----------------------------------------
Result of 'disklabel wd0':

# /dev/rwd0c:
type: ESDI
disk: ESDI/IDE disk
label: IBM-DMCA-21440
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 2796
total bytes: 1376.2M
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#        size   offset    fstype   [fsize bsize   cpg]
  a:    99.9M     0.0M    4.2BSD     2048 16384   202   # Cyl     0*-   202
  b:    32.0M    99.9M      swap                        # Cyl   203 -   267
  c:  1376.2M     0.0M    unused        0     0         # Cyl     0 -  2795
  d:    50.2M   131.9M    4.2BSD     2048 16384   102   # Cyl   268 -   369
  e:    50.2M   182.1M    4.2BSD     2048 16384   102   # Cyl   370 -   471
  f:   267.8M   232.3M    4.2BSD     2048 16384   328   # Cyl   472 -  1015
  i:   828.8M   547.3M    ext2fs                        # Cyl  1112 -  2795
  j:    47.2M   500.1M   unknown                        # Cyl  1016*-  1111

------
(mount points:)
a: /
b: swap
d: /tmp
e: /var
f: /usr

-----------------------------
Result of 'dmesg':

OpenBSD 3.7 (GENERIC) #50: Sun Mar 20 00:01:57 MST 2005
    [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel Pentium (P54C) ("GenuineIntel" 586-class) 134 MHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,MCE,CX8
cpu0: F00F bug workaround installed
real mem  = 24752128 (24172K)
avail mem = 14585856 (14244K)
using 327 buffers containing 1339392 bytes (1308K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(00) BIOS, date 07/22/97, BIOS32 rev. 0 @ 0xffe90
apm0 at bios0: Power Management spec V1.1
apm0: battery life expectancy 88%
apm0: AC on, battery charge high, charging, estimated 3:45 hours
pcibios0 at bios0: rev 2.1 @ 0xf0000/0x10000
pcibios0: PCI BIOS has 0 Interrupt Routing table entries
pcibios0: no compatible PCI ICU found
pcibios0: Warning, unable to fix up PCI interrupt routing
pcibios0: PCI bus #0 is the last bus
bios0: ROM list: 0xc0000/0xc000!
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Picopower PT86C521" rev 0x04
pcib0 at pci0 dev 6 function 0 "Picopower PT86C523_2" rev 0x00
vga1 at pci0 dev 7 function 0 "Neomagic Magicgraph NM2070" rev 0x01
wsdisplay0 at vga1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
pciide0 at pci0 dev 8 function 0 "CMD Technology PCI0643" rev 0x00: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: <IBM-DMCA-21440>
wd0: 16-sector PIO, LBA, 1376MB, 2818368 sectors
wd0(pciide0:0:0): using PIO mode 4, DMA mode 2
pciide0: channel 1 ignored (disabled)
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0 (mux 1 ignored for console): console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
sysbeep0 at pcppi0
npx0 at isa0 port 0xf0/16: using exception 16
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
pcic0 at isa0 port 0x3e0/2 iomem 0xd0000/65536
pcic0 controller 0: <Vadem VG469> has sockets A and B
pcmcia0 at pcic0 controller 0 socket 0
pcmcia0: CIS checksum failed
ne3 at pcmcia0 function 0 "IBM Corp., Ethernet, 0934214" port 0x340/32, irq 3
ne3: address 00:04:ac:d5:79:47
pcmcia1 at pcic0 controller 0 socket 1
pcic0: irq 4, polling enabled
biomask ffe5 netmask ffed ttymask ffff
pctr: 586-class performance counters and user-level cycle counter enabled
dkcsum: wd0 matched BIOS disk 80
root on wd0a
rootdev=0x0 rrootdev=0x300 rawdev=0x302

That was the system that works properly. Now:

------------------------------------------------------
Second OpenBSD install on the 6.49GB drive (failed):

As the first install was ok, I decided to put OpenBSD on the whole wd0 drive. As I said, when rebooting after the install I got a kernel panic saying that the kernel was not able to reach '/' on wd0a.

I rebooted using the install floppy, to open a shell and see what had happened. When booting from the floppy, the kernel issued some error messages while trying to access the disk.

---------------------------------------------
The messages are in the 'dmesg' below:

OpenBSD 3.7 (RAMDISKC) #565: Sun Mar 20 00:46:18 MST 2005
    [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/RAMDISKC
cpu0: Intel Pentium (P54C) ("GenuineIntel" 586-class) 134 MHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,MCE,CX8
cpu0: F00F bug workaround installed
real mem  = 24752128 (24172K)
avail mem = 17055744 (16656K)
using 327 buffers containing 1339392 bytes (1308K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(00) BIOS, date 07/22/97, BIOS32 rev. 0 @ 0xffe90
apm0 at bios0: Power Management spec V1.1
pcibios0 at bios0: rev 2.1 @ 0xf0000/0x10000
pcibios0: PCI BIOS has 0 Interrupt Routing table entries
pcibios0: no compatible PCI ICU found
pcibios0: Warning, unable to fix up PCI interrupt routing
pcibios0: PCI bus #0 is the last bus
bios0: ROM list: 0xc0000/0xc000!
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Picopower PT86C521" rev 0x04
pcib0 at pci0 dev 6 function 0 "Picopower PT86C523_2" rev 0x00
vga1 at pci0 dev 7 function 0 "Neomagic Magicgraph NM2070" rev 0x01
wsdisplay0 at vga1: console (80x25, vt100 emulation)
pciide0 at pci0 dev 8 function 0 "CMD Technology PCI0643" rev 0x00: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: <IBM-DBCA-206480>
wd0: 16-sector PIO, LBA, 6194MB, 12685680 sectors
wd0(pciide0:0:0): using PIO mode 4, DMA mode 2
pciide0: channel 1 ignored (disabled)
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0 (mux 1 ignored for console): console keyboard, using wsdisplay0
npx0 at isa0 port 0xf0/16: using exception 16
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
pcic0 at isa0 port 0x3e0/2 iomem 0xd0000/16384
pcic0 controller 0: <Vadem VG469> has sockets A and B
pcmcia0 at pcic0 controller 0 socket 0
pcmcia0: CIS checksum failed
ne0 at pcmcia0 function 0 "IBM Corp., Ethernet, 0934214" port 0x340/32, irq 3
ne0: address 00:04:ac:d5:79:47
pcmcia1 at pcic0 controller 0 socket 1
pcic0: irq 4, polling enabled
biomask ffe5 netmask ffed ttymask ffff
rd0: fixed, 3800 blocks
wd0c: id not found reading fsbn 1754660928 (wd0 bn 1754660928; cn 1856784 tn 0 sn 48), retrying
wd0: transfer error, downgrading to PIO mode 4
wd0(pciide0:0:0): using PIO mode 4
wd0c: id not found reading fsbn 1754660928 (wd0 bn 1754660928; cn 1856784 tn 0 sn 48), retrying
wd0c: id not found reading fsbn 1754660928 (wd0 bn 1754660928; cn 1856784 tn 0 sn 48), retrying
wd0c: id not found reading fsbn 1754660928 (wd0 bn 1754660928; cn 1856784 tn 0 sn 48), retrying
wd0c: id not found reading fsbn 1754660928 (wd0 bn 1754660928; cn 1856784 tn 0 sn 48), retrying
wd0c: id not found reading fsbn 1754660928 (wd0 bn 1754660928; cn 1856784 tn 0 sn 48)
wd0: disk label I/O error
root on rd0a
rootdev=0x1100 rrootdev=0x2f00 rawdev=0x2f02

-----
To access the disklabel information, the kernel seems to try to read a block (1754660928) that is far beyond the capacity of the drive, which is 12685680 sectors (seen in the dmesg above). So the kernel is not able to retrieve the disklabel information correctly. That's what I think.

--------------------------
Now the result of 'fdisk wd0':

Disk: wd0       geometry: 789/255/63 [12675285 Sectors]
Offset: 0       Signature: 0xAA55
         Starting       Ending       LBA Info:
 #: id    C   H  S -    C   H  S [       start:      size   ]
------------------------------------------------------------------------
 0: 00    0   0  0 -    0   0  0 [           0:           0 ] unused
 1: 00    0   0  0 -    0   0  0 [           0:           0 ] unused
 2: 00    0   0  0 -    0   0  0 [           0:           0 ] unused
*3: A6    0   1  1 -  788 254 63 [          63:    12675222 ] OpenBSD

The output of fdisk seems ok.

------------------------------
Now the output of 'disklabel wd0':

# using MBR partition 3: type A6 off 63 (0x3f) size 12675222 (0xc16896)
# /dev/rwd0c:
type: ESDI
disk: ESDI/IDE disk
label: IBM-DBCA-206480
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 15
sectors/cylinder: 945
cylinders: 13424
total sectors: 12685680
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#        size      offset    fstype   [fsize bsize   cpg]
  a: 12648448  1754660927    unused        0     0         # Cyl 1856784*-1870168*
  c: 12685680           0    unused        0     0         # Cyl     0 - 13423
  i: 12675222          63   unknown                        # Cyl     0*- 13412
disklabel: partition a: offset past end of unit
disklabel: partition a: partition extends past end of unit

------------------------
and the (partial) output of 'disklabel -p m wd0' (for convenience):

16 partitions:
#        size      offset    fstype   [fsize bsize   cpg]
  a:  6176.0M   856768.0M    unused        0     0         # Cyl 1856784*-1870168*
  c:  6194.2M        0.0M    unused        0     0         # Cyl     0 - 13423
  i:  6189.1M        0.0M   unknown                        # Cyl     0*- 13412

The output does not correspond at all to the partitions I had defined during the install procedure (/ swap /var
/var/www /usr /tmp /home). The starting offset of the a partition (1754660927) is one sector before the failing sector in the 'dmesg' above (1754660928). This output seems to be a consequence of the 'disk label I/O error' in the 'dmesg'.

I read disklabel(8) and disklabel(5). The kernel keeps an in-core copy of the disklabel and, if it cannot retrieve the label correctly during boot, it tries to build the in-core copy as best it can. So what I think is that at boot, as the kernel was not able to retrieve the disklabel, it tried to build one, but the built version is wrong.

The '-r' option of 'disklabel' reads the disklabel directly from the disk rather than from the in-core copy.

--------------------------------------------
And when I execute 'disklabel -r wd0':

# /dev/rwd0c:
type: ESDI
disk: ESDI/IDE disk
label: IBM-DBCA-206480
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 15
sectors/cylinder: 945
cylinders: 13424
total sectors: 12685680
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#        size   offset    fstype   [fsize bsize   cpg]
  a:   205002       63    4.2BSD     2048 16384   224   # Cyl     0*-   216
  b:   819315   205065      swap                        # Cyl   217 -  1083
  c: 12685680        0    unused        0     0         # Cyl     0 - 13423
  d:   512190  1024380    4.2BSD     2048 16384   320   # Cyl  1084 -  1625
  e:  2096955  1536570    4.2BSD     2048 16384   320   # Cyl  1626 -  3844
  f:  2457945  3633525    4.2BSD     2048 16384   320   # Cyl  3845 -  6445
  g:   205065  6091470    4.2BSD     2048 16384   224   # Cyl  6446 -  6662
  h:  6378750  6296535    4.2BSD     2048 16384   320   # Cyl  6663 - 13412

------------------------
and the (partial) output of 'disklabel -rp m wd0' (for convenience):

16 partitions:
#        size   offset    fstype   [fsize bsize   cpg]
  a:   100.1M     0.0M    4.2BSD     2048 16384   224   # Cyl     0*-   216
  b:   400.1M   100.1M      swap                        # Cyl   217 -  1083
  c:  6194.2M     0.0M    unused        0     0         # Cyl     0 - 13423
  d:   250.1M   500.2M    4.2BSD     2048 16384   320   # Cyl  1084 -  1625
  e:  1023.9M   750.3M    4.2BSD     2048 16384   320   # Cyl  1626 -  3844
  f:  1200.2M  1774.2M    4.2BSD     2048 16384   320   # Cyl  3845 -  6445
  g:   100.1M  2974.4M    4.2BSD     2048 16384   224   # Cyl  6446 -  6662
  h:  3114.6M  3074.5M    4.2BSD     2048 16384   320   # Cyl  6663 - 13412

And those are exactly the partitions I had configured during the install! The kernel seems to have trouble reading the disklabel at boot, but retrieves it correctly with the '-r' option. I don't understand it at all.

Then I rebooted, again from the install diskette, without doing any write operation to the disk. Strangely, the 'dmesg' did not show the errors seen before (trying to access the disklabel outside the disk) but simply:

wd0: no disk label

---------------------------
Output of 'fdisk wd0': correct, exactly the same as before:

Disk: wd0       geometry: 789/255/63 [12675285 Sectors]
Offset: 0       Signature: 0xAA55
         Starting       Ending       LBA Info:
 #: id    C   H  S -    C   H  S [       start:      size   ]
------------------------------------------------------------------------
 0: 00    0   0  0 -    0   0  0 [           0:           0 ] unused
 1: 00    0   0  0 -    0   0  0 [           0:           0 ] unused
 2: 00    0   0  0 -    0   0  0 [           0:           0 ] unused
*3: A6    0   1  1 -  788 254 63 [          63:    12675222 ] OpenBSD

----------------------------
Output of 'disklabel wd0':

# /dev/rwd0c:
type: ESDI
disk: ESDI/IDE disk
label: IBM-DBCA-206480
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 15
sectors/cylinder: 945
cylinders: 13424
total sectors: 12685680
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#        size      offset    fstype   [fsize bsize   cpg]
  c: 12685680           0    unused        0     0         # Cyl     0 - 13423
  i: 12648641  1754660927   unknown                        # Cyl 1856784*-1870168*
disklabel: partition i: offset past end of unit
disklabel: partition i: partition extends past end of unit

-------
The a partition has disappeared, and an i partition now begins at the former a offset.
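
The 'offset past end of unit' complaints can be reproduced by hand from the numbers above. The following is only a rough Python sketch, not something run during the session; the function and variable names are made up for illustration, and the values are copied from the 'disklabel wd0' (in-core, second boot) and 'disklabel -r wd0' (on-disk) outputs.

```python
TOTAL_SECTORS = 12685680  # "total sectors" from the disklabel output

# (name, size, offset) in sectors, copied from the outputs above
in_core = [
    ("c", 12685680, 0),
    ("i", 12648641, 1754660927),
]
on_disk = [
    ("a", 205002, 63),
    ("b", 819315, 205065),
    ("c", 12685680, 0),
    ("d", 512190, 1024380),
    ("e", 2096955, 1536570),
    ("f", 2457945, 3633525),
    ("g", 205065, 6091470),
    ("h", 6378750, 6296535),
]


def out_of_range(partitions, total):
    """Return the names of partitions that start or end beyond the disk."""
    return [
        name
        for name, size, offset in partitions
        if offset >= total or offset + size > total
    ]


print("in-core label:", out_of_range(in_core, TOTAL_SECTORS))
print("on-disk label:", out_of_range(on_disk, TOTAL_SECTORS))
```

Under these assumptions only the in-core i partition falls outside the drive, while every partition of the on-disk label fits within the 12685680 sectors.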
---------------------
Output of 'disklabel -r wd0':

# using MBR partition 3: type A6 off 63 (0x3f) size 12675222 (0xc16896)
wd0(pciide0:0:0) timeout
        type: ata
        c_bcount: 8192
        c_skip: 0
pciide0:0:0: bus-master DMA error: missing interrupt, status 0x20
wd0c: device timeout reading fsbn 63 of 63-78 (wd0 bn 63; cn 0 tn 1 sn 0), retrying
wd0: soft error (corrected)
disklabel: disk label corrupted

I don't think the disk is defective, because I did a full DOS format with no problem, and during the OpenBSD install the files were copied successfully.

-------------------------------------
And when I run 'disklabel -r wd0' again:

# using MBR partition 3: type A6 off 63 (0x3f) size 12675222 (0xc16896)
wd0(pciide0:0:0) timeout
        type: ata
        c_bcount: 8192
        c_skip: 0
pciide0:0:0: bus-master DMA error: missing interrupt, status 0x20
wd0c: device timeout reading fsbn 63 of 63-78 (wd0 bn 63; cn 0 tn 1 sn 0), retrying
wd0: soft error (corrected)
# /dev/rwd0c:
type: ESDI
disk: ESDI/IDE disk
label: IBM-DBCA-206480
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 15
sectors/cylinder: 945
cylinders: 13424
total sectors: 12685680
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#        size   offset    fstype   [fsize bsize   cpg]
  a:   205002       63    4.2BSD     2048 16384   224   # Cyl     0*-   216
  b:   819315   205065      swap                        # Cyl   217 -  1083
  c: 12685680        0    unused        0     0         # Cyl     0 - 13423
  d:   512190  1024380    4.2BSD     2048 16384   320   # Cyl  1084 -  1625
  e:  2096955  1536570    4.2BSD     2048 16384   320   # Cyl  1626 -  3844
  f:  2457945  3633525    4.2BSD     2048 16384   320   # Cyl  3845 -  6445
  g:   205065  6091470    4.2BSD     2048 16384   224   # Cyl  6446 -  6662
  h:  6378750  6296535    4.2BSD     2048 16384   320   # Cyl  6663 - 13412

And my partitions are there, despite the error at the beginning!

After that, each time I re-run 'disklabel -r wd0', I get my partitions with no error message at the beginning.
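
As a sanity check on the numbers in the timeout message, here is a small Python sketch (not part of the original session; the function name is made up for illustration). It assumes 512-byte sectors, as reported by disklabel, and shows that a read of c_bcount 8192 bytes starting at block 63 covers exactly the range '63-78' printed by the kernel, that is, the start of the OpenBSD MBR partition and well inside the drive.

```python
BYTES_PER_SECTOR = 512  # "bytes/sector" in the disklabel output


def sector_range(first_block, byte_count):
    """Return (first, last) sector touched by a read of byte_count bytes."""
    sectors = byte_count // BYTES_PER_SECTOR
    return first_block, first_block + sectors - 1


# c_bcount: 8192, starting at the MBR partition offset 63
print(sector_range(63, 8192))
```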
---------
Now, each time I reboot from the diskette, it does exactly the same as before:

at boot: 'wd0: no disk label'
'disklabel wd0': only one i partition, outside the disk
'disklabel -r wd0' (1st time): kernel error followed by 'disk label corrupted'
'disklabel -r wd0' (2nd time): kernel error followed by my partitions
'disklabel -r wd0' (3rd time and so on): no kernel error, simply my partitions

----------------------
But I found a workaround to give the kernel a good in-core copy of the disklabel: I run 'fdisk -e wd0', do a 'reinit', then write and quit.

After the 'reinit', when I run 'disklabel wd0', I see my partitions, so the kernel's in-core copy is good. I think doing a 'reinit' tells the kernel to update its in-core disklabel copy. (The FAQ also says that it installs an OpenBSD partition on the whole disk, and a boot block.)

----------------
I can then mount the partitions without any problem in a directory under /mnt (mount /dev/wd0d /mnt/test). I can navigate the filesystem with no problem, and the files extracted during the install are there.

------------------------------
And when I reboot once again with the install floppy, I get the now familiar error in 'dmesg':

wd0c: id not found reading fsbn 1754660928 (wd0 bn 1754660928; cn 1856784 tn 0 sn 48), retrying
wd0: transfer error, downgrading to PIO mode 4
wd0(pciide0:0:0): using PIO mode 4
wd0c: id not found reading fsbn 1754660928 (wd0 bn 1754660928; cn 1856784 tn 0 sn 48), retrying
wd0c: id not found reading fsbn 1754660928 (wd0 bn 1754660928; cn 1856784 tn 0 sn 48), retrying
wd0c: id not found reading fsbn 1754660928 (wd0 bn 1754660928; cn 1856784 tn 0 sn 48), retrying
wd0c: id not found reading fsbn 1754660928 (wd0 bn 1754660928; cn 1856784 tn 0 sn 48), retrying
wd0c: id not found reading fsbn 1754660928 (wd0 bn 1754660928; cn 1856784 tn 0 sn 48)
wd0: disk label I/O error

This is exactly the same as at the beginning, at the first boot with the diskette.
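
The cn/tn/sn values in that error can be cross-checked against the geometry reported by disklabel (63 sectors/track, 15 tracks/cylinder). This is only a rough Python sketch with a made-up function name; it assumes the kernel message counts sn from zero, an assumption that happens to match both the 'id not found' and the 'device timeout' messages quoted in this mail.

```python
SECTORS_PER_TRACK = 63    # "sectors/track" from disklabel
TRACKS_PER_CYLINDER = 15  # "tracks/cylinder" from disklabel
TOTAL_SECTORS = 12685680  # "total sectors" from disklabel


def block_number(cn, tn, sn):
    """Convert cylinder/track/sector (sn assumed zero-based) to a block number."""
    return (cn * TRACKS_PER_CYLINDER + tn) * SECTORS_PER_TRACK + sn


bad = block_number(1856784, 0, 48)  # from the 'id not found' message
good = block_number(0, 1, 0)        # from the 'device timeout' message

print(bad, "beyond end of disk:", bad >= TOTAL_SECTORS)
print(good, "beyond end of disk:", good >= TOTAL_SECTORS)
```

With these inputs the first conversion gives 1754660928, matching the 'bn' in the error and lying far beyond the last sector of the drive, while the second gives 63, the start of the OpenBSD MBR partition.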
I can repeat the same steps as above, and the result is the same every time.

------
I have spent hours on this problem, searching the Web, reading the FAQ, the man pages... I also upgraded the BIOS, but it did not solve the problem.

So where is the problem? fdisk? disklabel? The BIOS? The kernel? The disk geometry? Or the disk itself? I really don't know.

I'm still looking for information on the Net. Any help, advice or information is welcome.

Thanks in advance to everyone who takes the time to read my (quite long) mail, and sorry for my English.

Thanks again,
Marc