On Friday, 14 March 2003 at 10:05:28 +0200, Vallo Kallaste wrote:
> On Fri, Mar 14, 2003 at 01:16:02PM +1030, Greg 'groggy' Lehey
> <[EMAIL PROTECTED]> wrote:
>
>>> So I did. Loaned two SCSI disks and a 50-pin cable. Things haven't
>>> improved a bit, I'm very sorry to say.
>>
>> Sorry for the slow reply to this. I thought it would make sense to
>> try things out here, and so I kept trying to find time, but I have
>> to admit I just don't have any yet for a while. I haven't forgotten,
>> and I hope that in a few weeks' time I can spend some time chasing
>> down a whole lot of Vinum issues. This is definitely the worst I
>> have seen, and I'm really puzzled why it always happens to you.
>>
>>> # simulate disk crash by forcing one arbitrary subdisk down
>>> # seems that vinum doesn't return values for command completion
>>> # status checking?
>>> echo "Stopping subdisk.. degraded mode"
>>> vinum stop -f r5.p0.s3 # assume it was successful
>>
>> I wonder if there's something relating to stop -f that doesn't
>> happen during a normal failure. But this was exactly the way I
>> tested it in the first place.
>
> Thank you, Greg. I really appreciate your ongoing effort to make
> vinum a stable, trusted volume manager.
>
> I have to add some facts to the mix. RAIDframe on the same hardware
> does not have any problems. The later tests I conducted were done
> under -stable, because I couldn't get RAIDframe to work under
> -current; the system panicked every time at the end of parity
> initialisation (raidctl -iv raid?). So I used the RAIDframe patch
> for -stable at
> http://people.freebsd.org/~scottl/rf/2001-08-28-RAIDframe-stable.diff.gz
> I had to do some patching by hand, but otherwise it works well.
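On the status-checking comment in your script: since vinum's exit
status apparently can't be relied on here, one workaround is to verify
the subdisk state explicitly after the stop. This is an untested
sketch on my part; the exact "vinum ls" output varies between
versions, so treat the grep pattern as an assumption to be checked
against what your version prints:

    #!/bin/sh
    # Force the subdisk down, then confirm it really left the "up"
    # state instead of trusting the exit status of "vinum stop".
    vinum stop -f r5.p0.s3
    if vinum ls r5.p0.s3 | grep -q 'State: up'; then
        echo "r5.p0.s3 is still up; stop apparently failed" >&2
        exit 1
    fi
    echo "r5.p0.s3 is down; the volume should now be degraded"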
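For the benefit of the list archives, the parity initialisation
mentioned above is the last step of a sequence roughly like the one
below. This is only a sketch: I'm assuming the -stable patch keeps
NetBSD's raidctl(8) configuration format, and the device names, serial
number and file name are made up.

    # /root/raid0.conf -- minimal three-disk RAID 5 set
    START array
    # numRow numCol numSpare
    1 3 0

    START disks
    /dev/da1s1e
    /dev/da2s1e
    /dev/da3s1e

    START layout
    # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
    32 1 1 5

    START queue
    fifo 100

Then:

    raidctl -C /root/raid0.conf raid0   # force initial configuration
    raidctl -I 20030314 raid0           # label components with a serial
    raidctl -iv raid0                   # parity init: the step that panicked -current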
I don't think the problems with RAIDframe are related to these
problems with Vinum. I seem to remember a commit to the head branch
recently (in the last 12 months) relating to the problem you've seen.
I forget exactly where it went (it wasn't from me), and in a cursory
search I couldn't find it. It's possible that it hasn't been MFC'd,
which would explain your problem. If you have a 5.0 machine, it would
be interesting to see if you can reproduce the problem there.

> Will it suffice to switch off power to one disk to simulate a "more"
> real-world disk failure? Are there any hidden pitfalls in failing
> and restoring operation of non-hotswap disks?

I don't think so. It was more thinking aloud than anything else. As I
said above, this is the way I tested things in the first place.

Greg

--
See complete headers for address and phone numbers