On 4/15/2019 10:50 PM, Gary R. Schmidt wrote:
On 16/04/2019 12:24, Marcio Demetrio Bacci wrote:
...

2. Is there any restriction on running the Bacula VM on a DRBD volume in the hypervisor? In this VM I will implement ZFS deduplication for my backups.

I /think/ you are asking if Bacula can handle fail-over during backup?
Short answer is "no".  (I am willing to accept corrections on this, but AFAICS Bacula doesn't checkpoint its output.)


I would say "mostly", rather than "no".  On fail-over, running jobs will eventually fail, since the FD clients will not reconnect and their open TCP sockets will time out. AFAIK, there is no transparent TCP connection migration: the virtual IP is migrated, but not the currently open TCP connections. However, this is not necessarily a bad thing. Bacula can be configured to automatically re-run failed jobs, and there was some reason for the fail-over in the first place. Do we really trust the first part of a backup that was made while running on the failing cluster node?
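As a sketch of that automatic re-run behavior: a Job resource in bacula-dir.conf can use the Reschedule directives. The job name and intervals below are just examples, not anything from an actual setup:

```
# bacula-dir.conf (fragment) -- job name and values are examples only
Job {
  Name = "nightly-backup"
  JobDefs = "DefaultJob"
  # Re-run a job that fails, e.g. because fail-over killed its TCP connection
  Reschedule On Error = yes
  Reschedule Interval = 30 minutes   # wait before retrying
  Reschedule Times = 2               # give up after two retries
}
```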

I have experimented with running Dir and SD in a KVM VM on a two-node Corosync/Pacemaker cluster using active/passive DRBD storage for the VM's OS. Backup storage was via iSCSI provided by an existing NAS. I found no major issues.

On fail-over, Pacemaker attempts to initiate a shutdown on the failing node. If the shutdown succeeds, then the bacula-dir and bacula-sd services are stopped in the normal way, and the effect on running jobs is exactly the same as if those services were stopped manually. If the shutdown fails, then the node is STONITH'd, with the same effect as pulling the power on the node. In that case, the clients' actively running jobs are orphaned and their TCP sockets eventually time out. In either case, the running jobs are not successful, but as I stated earlier, I don't see that as a major reason not to run Bacula in a VM.
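For reference, a Pacemaker setup along these lines might define the VM as a VirtualDomain resource plus an IPMI fence device. A rough sketch using pcs, where the node name, addresses, credentials, and domain config path are all placeholders:

```
# Fence device so a failed node can be STONITH'd (address/credentials are placeholders)
pcs stonith create fence-node1 fence_ipmilan \
    ip=10.0.0.11 username=admin password=secret \
    pcmk_host_list=node1

# Run the Bacula VM as a cluster resource; Pacemaker tries a clean
# shutdown of the VM on the failing node before resorting to fencing
pcs resource create bacula-vm ocf:heartbeat:VirtualDomain \
    hypervisor="qemu:///system" \
    config="/etc/libvirt/qemu/bacula.xml" \
    op monitor interval=30s
```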

Currently, I run Dir and SD in a VM on a four-node cluster, but without fail-over for the Bacula VM. The VM and its DRBD storage have to be brought up manually, but this allows a single catalog, a single IP, and a single set of config, log, and spool files that can be brought up in seconds on any one of the nodes. The reason is that I do not have a FC-attached tape library or other HA-capable backup device. To bring it up on another node I have to move my backup devices (USB-attached RDS and portable hard drives, plus SATA drives in removable drive bays using vchanger). Based on how it worked with iSCSI NAS storage, I believe it would also work well with HA-capable tape hardware on FC (or iSCSI?).

It should also work (i.e. jobs running at fail-over will fail, but can be automatically re-run) when storing backups on DRBD attached locally to the VM. Of course, mirroring the volume files on DRBD doubles the storage space they need, which can be considerable. Since the running jobs will fail at fail-over anyway, I see no reason for volume files to be HA. They just need to be accessible to the VM, for example on iSCSI-attached NAS.
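Attaching that volume storage from an iSCSI NAS to the VM is straightforward with open-iscsi. A minimal sketch, where the portal address, target IQN, device node, and mount point are placeholders:

```
# Discover targets exported by the NAS (portal address is a placeholder)
iscsiadm -m discovery -t sendtargets -p 192.168.1.50

# Log in to the discovered target (IQN is a placeholder)
iscsiadm -m node -T iqn.2019-04.example.nas:bacula-volumes \
    -p 192.168.1.50 --login

# The LUN appears as a block device (e.g. /dev/sdX); format and mount it
# for use as the Archive Device directory of a Bacula SD Device resource
mkfs.xfs /dev/sdX
mount /dev/sdX /srv/bacula-volumes
```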




_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
