From: DAHLBOKUM Markus (FPT INDUSTRIAL) [mailto:markus.dahlbo...@fptindustrial.com]
Sent: Thursday, 4 October 2012 10:06
To: bacula-users@lists.sourceforge.net
Subject: Re: [Bacula-users] Network error with FD during Backup: ERR=Connection reset by peer

Hi Tom,

Thank you for your answer.

> The heartbeats are only set up when a job with a client is initiated.
> So, there should be no activity when no job is running. When you
> initiate a job with the client, the director sets up a connection with
> the client telling the client what storage daemon to use. The client
> then initiates a connection back to that storage daemon. If you have
> the heartbeat settings in place as you do then you should see heartbeat
> packets sent from the client back to the director in order to keep that
> connection alive while the data is being sent back to the storage
> daemon. In addition, you may see heartbeat packets sent from the
> storage daemon to the client. I'd have to look at the code again but I
> believe this is used in the scenario where the storage daemon is waiting
> for a volume to write the data to (i.e. operator intervention). If the
> heartbeat setting is on then the storage daemon will send heartbeats
> back to the client in order to keep the connection alive while it waits.

Yesterday I waited until the job had finished the first tape and was waiting for me to insert the next one. I opened Wireshark to see whether there was a heartbeat during the wait, and there was none. During the job the heartbeat was active. From what you wrote, the heartbeat should be active while waiting for a tape. Could you try to confirm that (have a look at the code)?

As one side of the backup is a VMware server, I had a closer look at the configuration of this environment. As far as I know, the environment of Michael (who started this thread) also includes VMware, so this might be interesting for him. My job is cancelled exactly 15 min after entering the wait mode for a new tape. In the VMware settings there is an idle timeout set to 900 sec (i.e. 15 min). The timeout doesn't exactly fit that kind of connection, but you never know. I have now disabled this timeout and restarted my backup. In 7 hours I will see the result.
But even if this setting caused the trouble, I would have thought the heartbeat would prevent it (idle connection timeout). Again, it would be good to know whether the heartbeat should be active while waiting for a tape.

Thank you again.
Markus

Hi Markus,

I searched for an appropriate idle setting but didn't find one. Can you give me a hint where to look?

By the way, all the jobs that are failing have "Run Before" and "Run After" scripts assigned (to create and delete a system state file, or to stop and start a SQL Server).

Regards
Michael

NovaNet GmbH
Kupferstr. 65
44532 Lünen
Phone: 02306/202100
Fax: 02306/202109
Web: www.novanetgmbh.de
Registered office: Lünen
Amtsgericht Dortmund HRB 17273
USt-ID DE 124793480, St.-Nr. 316/5759/0318
Managing Director: Dipl. Informatikerin (FH) Desiree Wunsche
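
The timing question in this thread (a 900 sec idle timeout versus a heartbeat that may or may not be sent while waiting for a tape) can be sketched as a small model. This is only an illustration under stated assumptions, not Bacula or VMware code: the function name `survives_wait`, the 60 sec heartbeat value, and the one-hour wait are hypothetical, and the model assumes the idle timer restarts on every packet and the connection is dropped once the silent gap reaches the timeout.

```python
from typing import Optional


def survives_wait(wait_s: float,
                  idle_timeout_s: float,
                  heartbeat_interval_s: Optional[float] = None) -> bool:
    """Return True if the connection outlives a wait of wait_s seconds.

    Assumed model (hypothetical, not taken from Bacula or VMware):
    - no payload data flows during the wait (e.g. waiting for a tape);
    - every heartbeat packet restarts the idle timer;
    - the connection is dropped once the silent gap reaches the timeout.
    """
    if heartbeat_interval_s is None or heartbeat_interval_s <= 0:
        # No heartbeat at all: the whole wait is one silent gap.
        longest_gap = wait_s
    else:
        # With heartbeats, the silence never lasts longer than one interval.
        longest_gap = min(wait_s, heartbeat_interval_s)
    return longest_gap < idle_timeout_s


# 900 sec idle timeout as reported in the thread; other values are made up.
IDLE_TIMEOUT_S = 900
ONE_HOUR_S = 3600

# No heartbeat while waiting for the tape: dropped after 15 min.
no_heartbeat = survives_wait(ONE_HOUR_S, IDLE_TIMEOUT_S)

# A heartbeat every 60 sec keeps the silent gap well below the timeout.
with_heartbeat = survives_wait(ONE_HOUR_S, IDLE_TIMEOUT_S, 60)

print(no_heartbeat, with_heartbeat)
```

Under these assumptions, a job that is cancelled exactly 15 min into the wait is what one would expect if no heartbeat reaches the component enforcing the idle timeout, which matches the Wireshark observation above. Whether that component is really the VMware setting is still open.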
_______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users