The last two scheduled runs of the job again had the connection errors (and again the backup was still taken fine). Yesterday I even ran the longer-running job a few hours ahead to see whether this was the reason the connection error disappeared the other night, but that was not it. Also, commenting out the Reconnect clause didn’t make a difference.
The only two things I am aware of that I can do now to investigate further:
(1) use a connection schedule for the FD
(2) downgrade the FD from 13 to 11 (can this really be the cause?)

> On 20. Jul 2022, at 22:35, Justin Case <jus7inc...@gmail.com> wrote:
>
> Hey Bill, thanks for spending time on this!
>
>> On 20. Jul 2022, at 21:46, Bill Arlofski via Bacula-users
>> <bacula-users@lists.sourceforge.net> wrote:
>>
>> Justin,
>>
>> I know what you told us, but I think we have a situation that I (and Martin)
>> described:
>
> I understand your experiment, but it is not like that here.
>
>> - FD cannot connect to Director due to firewall
>
> It can.
>
>> - Director CAN connect to FD (I know, I know... :)
>
> It cannot.
>
>> - Job starts, Director connects to FD and receives all the queued "Cannot
>> connect" messages
>> - Job runs fine
>>
>> Here is how I tested:
>>
>> - In my FD config I set in the Director{} block:
>>
>> - ConnectToDirector = yes
>> - A BOGUS IP address for the `Address =` setting
>>
>> I killed and restarted the FD in foreground and debug mode, and I see that
>> it goes on to try to connect to an IP address that
>> is not taken on my network....
>> ----8<----
>> speedy-fd: events.c:48-0 Events: code=FD0001 daemon=speedy-fd ref=0x238e
>> type=daemon source=*Daemon* text=Filed startup
>> 13.0.0 (04Jul22)
>> speedy-fd: filed.c:296-0 filed: listening on port 9102
>> speedy-fd: bnet_server.c:90-0 Addresses 0.0.0.0:9102
>> speedy-fd: bsockcore.c:354-0 Current 10.1.1.99:9101 All 10.1.1.99:9101
>> speedy-fd: bsockcore.c:443-0 Could not connect to server Director daemon
>> 10.1.1.99:9101. ERR=No route to host
>> speedy-fd: bsockcore.c:253-0 Unable to connect to Director daemon on
>> 10.1.1.99:9101. ERR=No route to host
>> speedy-fd: bsockcore.c:354-0 Current 10.1.1.99:9101 All 10.1.1.99:9101
>> speedy-fd: bsockcore.c:443-0 Could not connect to server Director daemon
>> 10.1.1.99:9101. ERR=No route to host
>> speedy-fd: bsockcore.c:253-0 Unable to connect to Director daemon on
>> 10.1.1.99:9101. ERR=No route to host
>> ----8<----
>>
>> Meanwhile, from the Director, I do a `status client=xxxx` and BAMM..
>> Director connects to Client and I get the FD's status -
>> so a Job would also work in this manner.
>>
>> From your Director, can you try:
>
> Good thinking. This was the first thing I checked when I saw the errors,
> though. I usually try everything I can think of before I turn to the mailing
> list, but of course you cannot know what I tried, as I did not mention it.
>
>> # telnet <IP of Client> 9102
>
> Connection refused
>
>> And from the Client:
>>
>> # telnet <IP of Director> 9101
>
> No telnet there; using netcat instead, the connection gets established. I can
> write stuff; after some “invalid keywords” the connection is closed by the
> Director.
>
> To be sure, I tried again with other port numbers that have no daemons
> running. netcat returns immediately (due to the port being closed).
>
>> And show us the results?
>
> See above.
>
> In the meantime the schedule ran again.
>
> Today: no connection error messages.
>
> OK OK, but why? What was different? I did some experiments earlier, so the
> job did run twice before.
>
> Also I ran another longer-running job on the other tier, but actually the
> problematic job did not queue up but ran through immediately (so both jobs
> ran simultaneously) and no errors were thrown.
>
> When the schedule started a few minutes ago, the longer-running job was also
> started as “incremental” and had no files to be backed up (because it ran a
> few hours before and no changes had been made in the fileset).
>
> Finally, I had commented out the Reconnect clause.
>
> Hard to say what the reason was.
>
> I will observe whether tomorrow it will or will not throw connection errors
> and will report back in both cases. And I will not make any experiments on
> Bacula before the schedule runs tomorrow.
>
>>
>> Thank you!
>> Bill
>>
>> --
>> Bill Arlofski
>> w...@protonmail.com
>> _______________________________________________
>> Bacula-users mailing list
>> Bacula-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/bacula-users
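
The telnet/netcat checks discussed in the quoted exchange can also be scripted, which makes it easier to repeat them in both directions before and after a scheduled run. The following is a minimal Python sketch, not part of Bacula: it attempts a single TCP connection and reports whether the port is open, refused, timed out, or otherwise unreachable (for example "No route to host"). The function name `probe_tcp_port` and the host names in the usage section are hypothetical placeholders; 9101 and 9102 are the Director and FD ports that appear in the thread.

```python
import socket


def probe_tcp_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Try one TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    This only tests TCP reachability; it does not speak the Bacula protocol.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening (or a firewall rejects).
        return "refused"
    except socket.timeout:
        # No answer at all, typical for a firewall that silently drops packets.
        return "timeout"
    except OSError as exc:
        # Covers "No route to host", name resolution failures, and similar.
        return f"error: {exc}"


if __name__ == "__main__":
    # Hypothetical addresses: replace with the real Director and Client hosts.
    # Run the first check on the Client, the second on the Director.
    print("Client -> Director:", probe_tcp_port("director.example.org", 9101))
    print("Director -> Client:", probe_tcp_port("client.example.org", 9102))
```

A result of "open" from the Client towards port 9101 together with "refused" from the Director towards port 9102 would match the situation described in the replies above.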