Hi, I recently made several changes to my network, and ever since, my Bacula backups to the affected server have errored out after exactly 15 minutes. I don't know which change is the culprit, but I suspect it is somehow IPSec-related, and I would appreciate some help troubleshooting the problem.
I have servers in two locations: my main office, which hosts the Bacula director, the SD, and several clients; and a data center 2000 miles away, which hosts two remote servers. The remote servers run only bacula-fd.

The director used to connect to both remote servers through SSH tunnels, and this worked reliably. Recently, I set up IPSec to one of the two remote servers. Director, remote FD and SD can now connect directly to each other, without any other tunnel (logically, IPSec is transparent). At the same time, I also switched my Internet connection from cable modem to DSL. The second server still uses the SSH tunnel for its backups.

Unfortunately, over IPSec the backups fail after 15 minutes; the connection from FD to SD seems to get severed. The backups to the second server, over SSH, work without a problem.

The director produces this log output:

26-Sep 19:31 my-dir JobId 10686: Start Backup JobId 10686, Job=remoteserver.2011-09-26_19.05.00_06
26-Sep 19:31 my-dir JobId 10686: Created new Volume "remoteserver_20110926193150_Differential.bacula" in catalog.
26-Sep 19:31 my-dir JobId 10686: Using Device "SATADisk1"
26-Sep 19:31 Disk1 JobId 10686: Labeled new Volume "remoteserver_20110926193150_Differential.bacula" on device "SATADisk1" (/misc/BACKUP1).
26-Sep 19:31 Disk1 JobId 10686: Wrote label to prelabeled Volume "remoteserver_20110926193150_Differential.bacula" on device "SATADisk1" (/misc/BACKUP1)
26-Sep 19:31 my-dir JobId 10686: Max Volume jobs=1 exceeded. Marking Volume "remoteserver_20110926193150_Differential.bacula" as Used.
26-Sep 19:46 my-dir JobId 10686: Fatal error: Network error with FD during Backup: ERR=Connection reset by peer
26-Sep 19:46 Disk1 JobId 10686: JobId=10686 Job="remoteserver.2011-09-26_19.05.00_06" marked to be canceled.
26-Sep 19:46 Disk1 JobId 10686: Error: bsock.c:548 Read expected 65536 got 1392 from client:xxx.xxx.xxx.xxx:366432

The FD produces this error message:

Sep 27 02:46:58 remoteserver bacula-fd: bsock.c:393 Write error sending 270 bytes to client::::36387: ERR=Connection reset by peer

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users