Hello Dimitri,

For your case, an active/passive cluster with shared storage replicated by DRBD, I think you need only one Bacula FD config, the same on both nodes.

In your cluster the File Daemon runs on only one node at a time (A or B), not on both. This is the usual cluster approach: the service is reachable through one (or more) common virtual IP address(es), the same for both nodes. In your case the virtual IP address is 1.2.3.1.

You should store the Bacula FD config on a disk that is replicated by DRBD. That way the FD configuration on the remote host is kept up to date automatically as well.

Now let's look at the use cases, starting from this state: A is the active node, B is the passive node.

For a manual failover, the cluster software should:
1) umount your disks on node A and switch the DRBD disks to passive mode,
2) mount your disks on node B and switch the DRBD disks to active mode,
3) stop your FD on node A and start the FD on node B,
4) bring your virtual IP address down on node A and up on node B.

The cluster software should have handlers/triggers to do points 1), 2), 3) and 4).

For an automatic failover (for example a server crash or power off):
1) mount your disks on node B and switch the DRBD disks to active mode,
2) start the FD on node B,
3) bring your virtual IP address up on node B.

For a failback (the crashed node A is fixed and healthy again):
1) umount your disks on node B and switch the DRBD disks to passive mode,
2) mount your disks on node A and switch the DRBD disks to active mode,
3) stop your FD on node B and start the FD on node A,
4) bring your virtual IP address down on node B and up on node A.

The failback steps mirror the manual failover steps, with the roles of A and B swapped.

With one FD in the cluster environment you need only the following configs (1.2.3.1 is the virtual IP address):

Client {
  name = cluster-fd
  address = 1.2.3.1
  ...
}
Job {
  name = nodea-etc
  client = cluster-fd
  fileset = etc
}
FileDaemon {
  name = cluster-fd
  ...
}

You have to be careful when you switch active/passive after a failover, because for a few milliseconds your cluster can be split (this is the split-brain state in DRBD nomenclature).
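
To make the ordering concrete, here is a small Python model of the manual failover steps listed above. It is only a sketch: the Node class, its fields and manual_failover() are hypothetical names made up for illustration, and nothing in it talks to DRBD, the cluster software or Bacula. It just tracks state and checks after every step that the two nodes are never DRBD primary at the same time.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical in-memory model of one cluster node."""
    name: str
    drbd_primary: bool = False
    mounted: bool = False
    fd_running: bool = False
    has_vip: bool = False


def manual_failover(old: Node, new: Node) -> list[str]:
    """Move disks, FD and virtual IP from `old` to `new`, in the order
    of steps 1) to 4) of the manual failover described in this message."""
    steps = []

    def record(text: str) -> None:
        # Both nodes primary at once would be the split-brain state.
        assert not (old.drbd_primary and new.drbd_primary), "split brain"
        steps.append(text)

    # 1) umount on the old node and switch its DRBD disks to passive.
    old.mounted = False
    old.drbd_primary = False
    record(f"{old.name}: umount disks, DRBD passive")

    # 2) switch DRBD to active on the new node and mount the disks.
    new.drbd_primary = True
    new.mounted = True
    record(f"{new.name}: DRBD active, mount disks")

    # 3) stop the FD on the old node, start it on the new node.
    old.fd_running = False
    new.fd_running = True
    record(f"{old.name}: stop FD; {new.name}: start FD")

    # 4) move the virtual IP address.
    old.has_vip = False
    new.has_vip = True
    record(f"{old.name}: virtual IP down; {new.name}: virtual IP up")

    return steps


# Starting state from this message: A is active, B is passive.
node_a = Node("A", drbd_primary=True, mounted=True, fd_running=True, has_vip=True)
node_b = Node("B")

for line in manual_failover(node_a, node_b):
    print(line)
```

Calling manual_failover(node_b, node_a) afterwards gives the failback sequence. In a real cluster these steps would be carried out by the cluster software's own handlers, not by a script like this.
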
Split brain is the moment when users are able to write to both nodes at the same time. For DRBD there are some scenarios for recovering from this situation.

I hope this helps. Please let us know on the mailing list how it went once you have finished preparing your cluster.

Good luck.

Best regards,
Marcin Haba (gani)

On 26 October 2016 at 21:46, Dimitri Maziuk <dmaz...@bmrb.wisc.edu> wrote:
> On 10/24/2016 04:15 PM, Josh Fisher wrote:
>
> ... snipped ...
>
> Yes, this is more or less what I've been doing up until now. The good
> news is, it seems I don't have to anymore. Here's what I have working now:
>
> corosync/pacemaker cluster with node A @ 1.2.3.4, node B @ 1.2.3.5, and
> cluster ip @ 1.2.3.1, shared storage mounted at /raid on the active node.
>
> node A bacula-fd.conf:
> FileDaemon {
>   name = nodea-fd
>   ...
> }
>
> node B bacula-fd.conf:
> FileDaemon {
>   name = nodeb-fd
>   ...
> }
>
> bacula-dir config:
>
> Client {
>   name = nodea-fd
>   address = 1.2.3.4
>   ...
> }
> Client {
>   name = nodeb-fd
>   address = 1.2.3.5
>   ...
> }
> Client {
>   name = cluster-fd
>   address = 1.2.3.1
>   ...
> }
> Job {
>   name = nodea-etc
>   client = nodea-fd
>   fileset = etc
> }
> Job {
>   name = nodeb-etc
>   client = nodeb-fd
>   fileset = etc
> }
> Job {
>   name = cluster-raid
>   client = cluster-fd
>   fileset = raid
> }
>
> -- and it's happily spooling the 21GB /raid right now.
>
> What seems to be happening is bacula is connecting to the cluster
> address (checked with lsof -i), completely ignoring FD name "cluster-fd"
> and is backing up "fileset = raid" from "nodea-fd".
>
> Which is great, if not checking FD name is a bug, *please* don't fix it. :)
>
> So all you need is start FDs at boot listening on * and the director
> will automagically get the shared filesystem off of the node that
> happens to have it mounted.
>
> (Of course the backup will fail if the cluster fails over or the
> connection is otherwise disrupted.)
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users

--
"Greater love hath no man than this, that a man lay down his life for his friends." Jesus Christ