Thanks so much. I have also been using RAID1 ever since I first met Debian, many years ago, although in the poor way I described. I'll do what you suggest as soon as time permits, although the cables to the HDs in the old server are difficult to reach. In the meantime I would be down to a single server, which is as insecure as a bad RAID1.
The failure I described when adding grub to the other HD happened in a single trial, and the HDs are now different ones, taken from a decommissioned four-socket dual-core AMD server.

cheers
francesco

On Fri, Oct 4, 2013 at 10:13 PM, Bob Proulx <b...@proulx.com> wrote:
> Francesco Pietra wrote:
> > Bob Proulx wrote:
> > > After installing simply run the grub install script against both
> > > disks manually and then you will be assured that it has been
> > > installed on both disks.
> >
> > I had problems with that methodology and was unable to detect my error.
> > From a thread on debian dated Mar 2, 2013:
> > ...
> > > grub-install /dev/sdb
> > > was reported by complete installation. No error, no warning.
> > > On rebooting, GRUB was no more found. Then entering in
> > > grub rescue >
> > > prefix/root/ were now wrong.
>
> If the command does not work on the command line then it won't work
> from the installer either. The installer is doing the same things
> that you can do from the command line. Therefore asking if it is in
> the installer won't help. Because if it doesn't work then it doesn't
> work either place. If it does work then it will work either place.
> That is my conjecture at least. And since I have been using this
> feature I believe it does work. Works for me anyway.
>
> I have been using RAID1 for a long time and have not encountered the
> problem you describe. That doesn't mean that such an error doesn't
> occur. Just that I can't recreate it. Or rather after much use have
> never recreated it. This applies to both the good grub version 1 as
> well as the newer and IMNHO buggier grub version 2 rewrite. They are
> completely different from each other. Statements made about one do
> not apply to the other because it was a complete rewrite. But it is
> certainly possible that in your configuration you have a case
> that does not work.
>
> I have a workbench with a variety of hardware. When I want to test
> something like this I construct a victim system in which to try the
> action. If you could do the same I think it would help to get to the
> root cause of the problem. I would create a victim machine with two
> drives for installation testing. Then test the installation. After
> install and reboot then shutdown, unplug one disk, test boot. Do not
> boot all of the way to the system. Simply boot to the grub menu and
> stop there. Then power off, switch disks, and test boot again. Do
> not boot all of the way to the system. Simply boot to the grub menu
> and again stop there. If you can get to the grub menu from either
> disk then grub has been installed on both disks. If not then plug
> both disks in and boot the system and test the grub-install script on
> the non-booting disk and then repeat the single disk boot.
>
> The reason to only boot to the grub menu is of course so that the
> RAID1 doesn't get split. If you boot the system with one disk and then
> with only the other disk it will get a split brain of course. No real
> problem on a victim machine. But it is faster to keep them in sync.
> So I only boot to the grub menu when testing the grub boot code.
> Avoiding booting the system avoids splitting the raid unnecessarily
> and speeds up the debugging.
>
> By testing this way you can verify that you can boot either disk in
> isolation after the other disk has failed. By using a victim machine
> you can experiment. Then if you find a bug you will have a recipe to
> recreate it and can file a bug report on it. Being able to recreate
> the problem is the most valuable part.
>
> And here is the challenge. I think if you do this you will find that
> it does actually work. But feel free to write back here and tell me
> that I am wrong and that there is a problem with it. :-) As the great
> Mark Twain wrote "There is nothing so annoying as a good example." If
> you can get to a repeatable test case that fails that would be awesome.
>
> > Now I am in the same situation, two servers with mirroring raid, grub on
> > /dev/sda only. Identical data on both servers to cope with grub on one
> > disk only. Not smart from my side.
>
> Two servers so that you can switch your services from one server to
> the other in case one of the servers cannot boot?
>
> If you have two servers and one is the hot spare for the other then
> perhaps after doing your own victim machine testing then you can
> perform the fix on the spare and test there. Then apply the fix to
> the running server. I think that should be a safe way to "sneak up"
> on the solution.
>
> Bob
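
For reference, the "run grub-install against both disks" step quoted above could be sketched roughly as follows. This is only a sketch under assumptions: the device names /dev/sda and /dev/sdb and the helper name install_grub_on are illustrative and not taken from any tested setup, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: run grub-install once per RAID1 member disk.
# install_grub_on is a hypothetical helper name, and the device names
# used below are assumptions; substitute the disks of your own mirror.
# DRY_RUN=1 (the default here) only prints the commands. Set DRY_RUN=0
# and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

install_grub_on() {
    for disk in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            # Show what would be run without touching any disk.
            echo "grub-install $disk"
        else
            # Stop at the first failure so it does not go unnoticed.
            grub-install "$disk" || return 1
        fi
    done
}

# Example for a two-disk mirror (hypothetical device names):
install_grub_on /dev/sda /dev/sdb
```

Whether both disks really boot still has to be checked by the single-disk test described in the quoted message; the script only makes sure the install command is issued for every disk listed.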