I'm in the process of setting up a Sarge server to be used as a sort of
backup server. The main PATA discs boot the OS from software RAID1, with
the rest of the disc space used as JBOD for not-so-important backups.
However, I'm having problems getting the new disc array up and running.

We've put a SATA controller in the box, a cheap-as-chips PCI Adaptec 1210SA
which, according to lspci, uses the Silicon Image SI3112 chipset to provide
two SATA channels. Connected to this are two 320GB drives which I want to
turn into a RAID1 array. When the system first booted, I used mdadm to
create the RAID1 array md2 (mdadm --create /dev/md2 --level=1
--raid-disks=2 /dev/sda1 /dev/sdb1), checked /proc/mdstat to wait for the
array to finish syncing, and then formatted it ext3 and mounted it.
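
For reference, a minimal sketch of that sequence as a script. The helper
names md_busy and build_md2 are made up for illustration, the mount point is
assumed to be the one from the commented-out md2 line in the fstab below, and
the mkfs.ext3 and mount invocations are the usual ones rather than a record
of exactly what was typed. Nothing runs until build_md2 is called (as root).

```shell
#!/bin/sh
# Sketch only: md_busy and build_md2 are illustrative names, not mdadm features.

# Succeeds while the given mdstat file (default /proc/mdstat) reports a
# resync or recovery, i.e. while the mirror is still being built.
md_busy() {
    grep -Eq 'resync|recovery' "${1:-/proc/mdstat}"
}

# Create the mirror, wait for the initial sync, then format and mount it.
build_md2() {
    mdadm --create /dev/md2 --level=1 --raid-disks=2 /dev/sda1 /dev/sdb1 || return 1
    while md_busy; do
        sleep 60
    done
    mkfs.ext3 /dev/md2 || return 1
    # Assumed mount point, taken from the commented-out fstab entry.
    mount /dev/md2 /mnt/dcj-archive
}
```
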
Everything seemed to work fine until I rebooted, whereupon the mount failed
with a complaint that there was no valid ext[2|3] superblock; fsck confirmed
this, and on further inspection it seemed that it wasn't a RAID device any
more either.

I thought this might have been due to the kernel trying to mount the drives
before the needed modules (as far as I can tell, libata, scsi_mod and
sata_sil) had been loaded, as I'm using the stock Debian 2.6.8-k7-smp
kernel image. So I tried making a custom initrd with the needed modules in
it, namely:
[EMAIL PROTECTED]:~$ cat /etc/mkinitrd/modules
# /etc/mkinitrd/modules: Kernel modules to load for initrd.
#
# This file should contain the names of kernel modules and their arguments
# (if any) that are needed to mount the root file system, one per line.
# Comments begin with a `#', and everything on the line after them are ignored.
#
# You must run mkinitrd(8) to effect this change.
#
# Examples:
#
# ext2
# wd io=0x300
#First the modules needed to init the discs
ide_core
ide_generic
amd74xx
scsi_mod
libata
sr_mod
sd_mod
dm_mod
sata_sil
md
raid1
#Filesystems
ext2
ext3
#Other stuff I'm not sure if we need
shpchp
pciehp
pci_hotplug
===============================================
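A quick way to double-check a list like the one above is to ask which of the
modules you care about are not actually named in the file. This is only a
sketch: missing_modules is a made-up helper, and the set of modules passed to
it is my guess at what the SATA RAID1 array needs, not an authoritative list.

```shell
#!/bin/sh
# Sketch only: missing_modules is an illustrative helper.

# Print every named module that has no uncommented line of its own in the
# given modules file (one module per line, optional arguments after it).
missing_modules() {
    file=$1
    shift
    for mod in "$@"; do
        grep -Eq "^[[:space:]]*${mod}([[:space:]]|\$)" "$file" || echo "$mod"
    done
}

# Example call (prints nothing if every module is listed):
#   missing_modules /etc/mkinitrd/modules scsi_mod libata sata_sil sd_mod md raid1
```

The file only takes effect once the image is regenerated, e.g. with something
like mkinitrd -o /boot/initrd.img-custom "$(uname -r)", where the output
filename is just a placeholder.
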
...and booted with that instead after editing GRUB's menu.lst. The exact
same error occurred, and I'm now at a bit of a loss to explain what's
happening. If I mount the discs on their own (i.e. mount /dev/sdX
/mnt/somedir) they work just fine, so the hardware is OK, and I'm almost
certain the problem lies in initialising the RAID arrays at boot. At the
moment I'm rebuilding the array to see what happens when I mount it only
after the OS has finished booting rather than at boot, but of course
that will only be a temporary workaround. If it's any help, here are my
fstab and mdadm.conf:
[EMAIL PROTECTED]:~$ cat /etc/fstab
# /etc/fstab: static file system information.
#
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0
/dev/md1 / ext3 defaults,errors=remount-ro 0 1
/dev/md0 /boot ext2 defaults 0 2
/dev/hdb9 /home ext3 defaults 0 2
/dev/hdb4 /mnt/avj-backup ext3 defaults 0 2
/dev/hda7 /mnt/dcj-backup ext3 defaults 0 2
/dev/hdb8 /tmp ext3 defaults 0 2
/dev/md4 /usr ext3 defaults 0 2
/dev/md3 /var ext3 defaults 0 2
/dev/hdb7 none swap sw 0 0
/dev/hdc /media/cdrom0 iso9660 ro,user,noauto 0 0
#/dev/md2 /mnt/dcj-archive ext3 defaults 0 2
# Dirs from the main server (zaphod) over X-over cable
zaphodxover:/home/share/avj /mnt/zaphod/avj nfs ro,hard,intr,bg,rsize=8192,wsize=8192 0 0
zaphodxover:/home/share/dcj /mnt/zaphod/dcj nfs ro,hard,intr,bg,rsize=8192,wsize=8192 0 0
===============================================
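With the md2 line commented out as above, the temporary workaround (bringing
the array up by hand once the OS has finished booting) would look roughly
like this. It is a sketch under assumptions: run and late_mount_md2 are
made-up helper names, and DRY_RUN=1 merely prints the commands instead of
executing them.

```shell
#!/bin/sh
# Sketch only: run and late_mount_md2 are illustrative helpers.

# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Assemble the mirror from its two members, then mount it where the
# commented-out fstab entry would have put it.
late_mount_md2() {
    run mdadm --assemble /dev/md2 /dev/sda1 /dev/sdb1 || return 1
    run mount -t ext3 /dev/md2 /mnt/dcj-archive
}
```

If the assemble step itself fails after a reboot, the problem is with the md
superblocks rather than with the timing of the mount.
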
[EMAIL PROTECTED]:~$ cat /etc/mdadm/mdadm.conf
DEVICE partitions
ARRAY /dev/md4 level=raid1 num-devices=2
   UUID=b8093124:a6d6f876:a29eecb7:e1b332f3
   devices=/dev/hda6,/dev/hdb6
ARRAY /dev/md3 level=raid1 num-devices=2
   UUID=1973b0c3:e38869d2:ffef0cde:92048042
   devices=/dev/hda5,/dev/hdb5
ARRAY /dev/md2 level=raid1 num-devices=2
   UUID=78a3be5a:f0838fe2:4d4ce7ed:3a969954
   devices=/dev/sda1,/dev/sdb1
ARRAY /dev/md1 level=raid1 num-devices=2
   UUID=51d55d28:3e653dce:631dd682:8dd52a37
   devices=/dev/hda2,/dev/hdb2
ARRAY /dev/md0 level=raid1 num-devices=2
   UUID=56e09876:a751356e:b86535d0:95091b5b
   devices=/dev/hda1,/dev/hdb1
===============================================
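One check that follows from the mdadm.conf above is whether the UUID recorded
for md2 still matches what is on the discs after a reboot. The sketch below
only pulls the UUID out of the config file; conf_uuid is a made-up helper
name, and the default path is the one shown in the listing.

```shell
#!/bin/sh
# Sketch only: conf_uuid is an illustrative helper.

# Print the UUID recorded for one array in an mdadm.conf-style file.
# Usage: conf_uuid /dev/md2 [conf-file]
conf_uuid() {
    awk -v dev="$1" '
        # Remember which ARRAY entry we are inside; continuation lines
        # (UUID=..., devices=...) belong to the most recent ARRAY line.
        $1 == "ARRAY" { current = $2 }
        current == dev {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^UUID=/) { sub(/^UUID=/, "", $i); print $i; exit }
        }
    ' "${2:-/etc/mdadm/mdadm.conf}"
}
```

The value printed by conf_uuid /dev/md2 can then be compared by eye with the
UUID line from mdadm --examine /dev/sda1 (and /dev/sdb1). If --examine finds
no md superblock at all, that would fit the "not a RAID device any more"
symptom.
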
As you can see, most of the important directories live on software RAID1
across the two PATA discs, with the unimportant stuff on JBOD, although of
course this shouldn't make any difference. The usual dmesg output etc.
doesn't tell me anything I don't already know. If anyone has experienced
this before or has any pointers as to how I can troubleshoot it, I'd be
much obliged!
Stephen Tait
P.S. Before all you hardware types tell me that the SI3112 sucks: yes, I
know, but it was the only SATA controller my company could get hold of. We
already have a 3ware; we just can't afford another one!