First, my request:  Is there anything in Bacula I can do to keep the socket 
between the director and the storage daemon alive?

Now, my explanation why I need this.  As I'm trying to narrow down why my 
lengthy backups to an offsite storage daemon don't work, I sat and watched the 
debug output for the director, the storage daemon, and the file daemon.  Almost 
exactly 10 minutes in, the director's debug output said this:

msgchan.c:333 === End msg_thread. use=2

After a lot of research, it appears what's going on is this:

1) The director starts the job.
2) A socket is opened between the director and the storage daemon.
3) A socket is opened between the file daemon and the storage daemon.
4) The file data transfers just fine over the file daemon/storage daemon 
socket.  
5) At almost exactly 10 minutes in, I get the above debug message which means 
the socket between the director and storage daemon has been closed.  netstat 
confirmed this.
6) The file data continues to transfer just fine to the storage daemon.
7) When the file daemon is done, it tells the director that it is finished.
8) The storage daemon tries to tell the director it received the data 
perfectly, but cannot, because it cannot communicate with the director anymore 
(which makes sense, because the socket died).
9) *I think* the director waits a bit for the storage daemon, or it just 
knows it can't receive info from the storage daemon.  In any event, the 
director quickly marks the job as having an error because it never heard from 
the storage daemon as to its final result.

So, what can I do in Bacula to keep the socket between the director and the 
storage daemon alive?

I've already set a heartbeat of 30 seconds, but according to the manual, the 
heartbeats help the file daemon talk to the director, the file daemon talk to 
the storage daemon, and the storage daemon talk to the file daemon.  But in my 
situation, I'm losing a socket between the director and the storage daemon, and 
the heartbeat doesn't help out with that.

I'm also starting to think a Linksys router may be the reason why it loses the 
inactive socket after almost exactly 10 minutes, as other Linksys users have 
found this happens to them.  I'll be able to test this out tonight by bypassing 
the router completely. 
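
To illustrate the idle-timeout theory, here is a minimal sketch of how a program can ask the operating system to send TCP keepalive probes on an otherwise idle socket, so that a router sees traffic before its idle timer expires. This is not Bacula code, and it does not assume Bacula exposes these settings. The helper name enable_keepalive and the timing values (60, 30, 5) are placeholders chosen for the example. The three tuning options are platform-specific, so each one is applied only where Python exposes it.

```python
import socket


def enable_keepalive(sock, idle=60, interval=30, count=5):
    """Enable TCP keepalive probes on an existing socket (illustrative helper)."""
    # Ask the kernel to probe the peer when the connection sits idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Optional tuning; these constants are not available on every platform.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between successive probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes allowed before the connection is dropped.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock


sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

With an idle time well under 10 minutes, probes like these would keep generating traffic on a connection that carries no application data, which is the kind of activity the director/storage daemon socket seems to lack.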

Anyways, in the meantime, anybody know how I can keep that socket alive?

Brad Peterson
[EMAIL PROTECTED]







 