Robert Nelson wrote:
> This is due to the algorithm used by Bacula to do connect timeouts. It
> isn't really a timeout, it is really a retry count. If you take the connect
> timeout in seconds and divide it by 10 you get the number of retries. It
> doesn't account for the time spent in the connect call. If the connect took
> zero amount of time to fail, the two would be the same thing. To make
> matters worse, the connect call takes a different amount of time to fail
> depending on whether or not a switch is involved.
>

Hi Robert,

thank you very much for your help. Your explanation made things clearer for me.
Do you know whether the algorithm you described was changed in 1.38? I think I first observed this effect on 1.38, but I don't know for sure. It doesn't matter now.
I will shorten the "timeout" value to 30 seconds. That should decrease my timeout to approx. 10 minutes, and I can live with that.

Thank you once again and greetings

Marc

> So in your case, 5 minutes is equal to 300 seconds, divided by 10 equals 30.
> So you will get 30 retries.
>
> Now, on the same subnet, it takes 6 minutes and 36 seconds to do 30 retries.
> So it takes 1 minute and 36 seconds for 30 calls to connect to fail or
> roughly 3 seconds per try.
>
> On different subnets, it takes 1 hour 39 minutes and 31 seconds, or 189
> seconds (roughly 3 minutes) per try.
>
> The reason for the differences is probably caching on the switch. I suspect
> that in the same subnet case the ARP is failing (so the IP address can't be
> converted to an Ethernet address), in the other case the switch is
> responding to the ARP and a higher level (and longer timeout) is coming into
> play, probably the TCP connect timer.
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Marc
> Brückner
> Sent: Thursday, October 26, 2006 5:45 AM
> To: bacula-users@lists.sourceforge.net
> Cc: Knischka; Holger Luedecke
> Subject: [Bacula-users] Different timeouts in different subnets
>
> Hi @ all Bacula users,
>
> I have been using Bacula for several years now and I am really satisfied with it.
> But now I have a strange problem. I am not sure, but I think it first
> occurred after I updated from version 1.36 to 1.38. Now I am running 1.38.11.
> My Bacula has to back up several WinXP clients overnight.
> When the client is running, there is no problem and the backup is done properly.
> But if the users switch off their clients (which happens often,
> unfortunately), the duration of the timeout depends on the IP subnet the
> client is in.
> I have the following timeout settings in bacula-dir.conf:
>
> FD Connect Timeout = 5 minutes
> SD Connect Timeout = 5 minutes
>
> If the client is in the same IP subnet as the Bacula director, the
> director reports:
>
> 24-Oct 08:43 Bacula-dir: Start Backup JobId 3040,
> Job=StudentA2190_A.2006-10-23_19.40.53
> 24-Oct 08:44 Bacula-dir: StudentA2190_A.2006-10-23_19.40.53 Warning:
> bnet.c:853 Could not connect to File daemon on 192.168.10.67:9102. ERR=No
> route to host
> Retrying ...
> 24-Oct 08:50 Bacula-dir: StudentA2190_A.2006-10-23_19.40.53 Fatal error:
> bnet.c:859 Unable to connect to File daemon on 192.168.10.67:9102. ERR=No
> route to host
> 24-Oct 08:50 Bacula-dir: StudentA2190_A.2006-10-23_19.40.53 Error: Bacula
> 1.38.11 (28Jun06): 24-Oct-2006 08:50:29
>
> ...
>
> Scheduled time: 23-Oct-2006 19:40:52
> Start time: 24-Oct-2006 08:43:53
> End time: 24-Oct-2006 08:50:29
> Elapsed time: 6 mins 36 secs
>
> Timeout after 6 and a half minutes, ERR=No route to host; that's OK.
> But if the client resides in a different IP subnet, it says:
>
> 25-Oct 02:39 Bacula-dir: Start Backup JobId 3070,
> Job=StudentA1080_A.2006-10-24_18.00.11
> 25-Oct 02:45 Bacula-dir: StudentA1080_A.2006-10-24_18.00.11 Warning:
> bnet.c:853 Could not connect to File daemon on 192.168.30.33:9102.
> ERR=Connection timed out
> Retrying ...
> 25-Oct 04:18 Bacula-dir: StudentA1080_A.2006-10-24_18.00.11 Fatal error:
> bnet.c:859 Unable to connect to File daemon on 192.168.30.33:9102.
> ERR=Connection timed out
> 25-Oct 04:18 Bacula-dir: StudentA1080_A.2006-10-24_18.00.11 Error: Bacula
> 1.38.11 (28Jun06): 25-Oct-2006 04:18:47
>
> ...
>
> Scheduled time: 24-Oct-2006 18:00:10
> Start time: 25-Oct-2006 02:39:16
> End time: 25-Oct-2006 04:18:47
> Elapsed time: 1 hour 39 mins 31 secs
>
> Timeout after 1 hour and 40 minutes, ERR=Connection timed out; that's a
> little bit long.
>
> I have looked at many log entries and it's always the same: same subnet =>
> 6 min, different subnet => 1:40 h.
> There is no packet filtering between the subnets.
>
> Has anyone experienced behavior like that? Does anyone have a hint for me
> on how to shorten this 1:40 h timeout?
>
> Thank you for your help.
>
> Marc
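
The retry arithmetic Robert describes can be sketched as below. This is a minimal model, not Bacula's actual code: the function names are hypothetical, and the 10-second pause between attempts is an assumption inferred from his figures (30 retries accounting for 300 of the 396 seconds observed on the same subnet).

```python
# Rough model of the behaviour described in this thread; NOT Bacula source code.
# Assumption: each retry waits a fixed 10 s, and the time connect() needs to
# fail is added on top of that wait instead of being counted against the timeout.

RETRY_INTERVAL_S = 10  # assumed pause between connection attempts


def retry_count(connect_timeout_s: int, retry_interval_s: int = RETRY_INTERVAL_S) -> int:
    """Number of attempts: the configured timeout divided by the retry interval."""
    return connect_timeout_s // retry_interval_s


def estimated_elapsed(connect_timeout_s: int, connect_fail_s: int,
                      retry_interval_s: int = RETRY_INTERVAL_S) -> int:
    """Estimated wall-clock seconds until the job gives up.

    connect_fail_s is how long a single connect() call takes to fail.
    """
    retries = retry_count(connect_timeout_s, retry_interval_s)
    return retries * (retry_interval_s + connect_fail_s)


if __name__ == "__main__":
    # 5-minute setting, connect() fails after about 3 s (same subnet)
    print(estimated_elapsed(300, 3))
    # 5-minute setting, connect() fails after about 189 s (different subnet)
    print(estimated_elapsed(300, 189))
    # 30-second setting, different subnet
    print(estimated_elapsed(30, 189))
```

With these inputs the model gives 390 s, 5970 s and 597 s, which is close to the observed 6 min 36 s and 1 h 39 min 31 s, and to the estimate of about 10 minutes for a 30-second setting.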