On Wed, May 13, 2009 at 01:11:43PM +0800, Uwe Dippel wrote:
> Beautiful, as it looks like!
>
> I tried here on two 300 GB U320 drives, and the setup went through
> without any warnings (?? most users encounter some?).
> What I did was: (my system disk is sd0)
>
> fdisk -iy sd1
> fdisk -iy sd2
>
> printf "a\n\n\n\nRAID\nw\nq\n\n" | disklabel -E sd1
> printf "a\n\n\n\nRAID\nw\nq\n\n" | disklabel -E sd2
>
> bioctl -c 1 -l /dev/sd1a,/dev/sd2a softraid0
>
> dd if=/dev/zero of=/dev/rsd3c bs=1m count=1
> disklabel sd3 (creating my partitions/slices)
>
> newfs /dev/rsd3a
> newfs /dev/rsd3b
>
> mount /dev/sd3b /mnt/
> cd /mnt/
> [pull one hot-swap out]
> echo Nonsense > testo

Yay, your data survived on the remaining disk!!
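(A quick way to see that state: at any point the volume and its chunks can
be listed with bioctl; a minimal check, with the exact output format
depending on the release:

    # list the softraid volume and each chunk; on a healthy mirror
    # everything reports Online, after pulling a disk the volume goes
    # Degraded and the missing chunk is no longer Online
    bioctl softraid0

This is the output asked for further down.)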
> [push the disk back in]

Stale metadata; the disk will remain unused from now on.

> [pull the other disk]

You lose.  All data is gone (for all intents and purposes).

> # ls -l
> total 4
> -rw-r--r--  1 root  wheel  9 May 13 12:00 testo
> [everything okay until here]

Nope, this comes out of the cache.

> # rm testo
>
> rm: testo: Input/output error
> [I still guess this may happen]

Shall happen.

> But now my question: All posts say all info is in 'man softraid' and
> 'man bioctl'. There is nothing about *warnings* in there. I also tried
> bioctl -a/-q, but none would indicate that anything was wrong when one
> of the drives was pulled.

Show me the bioctl softraid0 output.

> This will be a production server, but it can take downtime, in case.
> However:
> 1. I *need to know* when a disk goes offline

Provided in sensors and bioctl; see the sketch at the end of this mail.

> 2. I need to know, in real life(!), if I can simply use the broken
> mirror to save my data; how I can mount it in another machine. Alas,
> softraid and bioctl are silent about these two.

You must have done something wrong.  Also, you completely destroyed your
volume by pulling the second and last disk out of the system.

> Another reason for asking:
> Next I issued 'reboot'; and could play hangman :(

I really need to see a trace.  It can be something induced, but not the
fault of softraid.  You failed the whole volume, and bad things happen
when that happens.

> After the reboot, I got:
> ...
> softraid0 at root
> softraid0: sd3 was not shutdown properly
> scsibus3 at softraid0: 1 targets, initiator 1
> sd3 at scsibus3 targ 0 lun 0: <OPENBSD, SR RAID 1, 003> SCSI2 0/direct fixed
> sd3: 286094MB, 36471 cyl, 255 head, 63 sec, 512 bytes/sec, 585922538 sec total

To be expected, because the second disk you pulled contained valid metadata.

> Now I wonder what to do. Will a traditional fsck do, or do I have to
> recreate the softraid?

I am actively working on rebuilds, but some other pieces have to be
modified before that can make it in.

You can fsck + dump/restore this volume onto a new one; a sketch follows
below.

You need to realize that double failures are considered the end of your
data on RAID volumes.  In softraid you can create a 10 disk RAID 1, but
once the last disk (considered the double failure) fails, you lose.

> Can anyone please help me further?
>
> Uwe
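For point 1, a rough sketch of the sensor side (the sensor name is an
example; "sysctl hw.sensors" shows what your box actually exports):

    # softraid exports the volume state as a drive sensor, something
    # like: hw.sensors.softraid0.drive0=online (sd3), OK
    sysctl hw.sensors

Run sensorsd(8) and it will report state changes of that sensor through
syslog (and can run a command of your choice, see sensorsd.conf(5)), so a
disk going offline does not go unnoticed.  bioctl softraid0, as above,
shows the same per-chunk detail on demand.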
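And for the recovery path, a rough sketch of the fsck + dump/restore route
(device names are examples: sd3 is the old degraded volume as in your
dmesg, sd4 is assumed to be a freshly created and newfs'd replacement
volume):

    # make sure the filesystem on the old volume is clean
    fsck -f /dev/rsd3a

    # copy the data over, filesystem by filesystem
    mount /dev/sd4a /mnt
    cd /mnt && dump -0af - /dev/rsd3a | restore -rf -

    # repeat for the other partitions (sd3b -> sd4b, and so on)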