Kudos - you probably learned a lot while debugging this. Thanks for
sharing!
Michael
On Nov 6, 2007 6:06 AM, Ron Cormier <[EMAIL PROTECTED]> wrote:
>
> Hi,
> I just wanted to share an experience I had setting up a backup this
> weekend.
>
> Quick overview:
> The file daemon on my Windows server was overloading the storage daemon on
> my Ubuntu machine. This happened when trying to back up many small image
> files with compression turned on. Turning off compression fixed it.
>
> Details:
> I have a website on my Windows web server which contains many small image
> files (several thousand) in addition to several text files. Originally I
> set up the fileset to back up the root of the website, but the job would
> hang and eventually die with errors like the following:
>
> Network error with FD during Backup: ERR=Connection timed out
> Network send error to SD. ERR=Input/output error
>
> I tried different values for heartbeat interval, maximum network buffer
> size, and kernel buffer sizes, to no avail.
>
> Debugging with Wireshark showed that the storage daemon communicating with
> the file daemon on the Windows server was advertising a TCP receive window
> size of zero (TCP ZeroWindow). The following commands yielded more
> information:
>
> netstat -p -c -n -t > stat.txt
> grep bacula-sd stat.txt > sd.txt
>
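For anyone repeating this diagnosis, the queue columns of the netstat output are the part worth watching. Below is a minimal sketch that filters a capture such as sd.txt down to the bacula-sd connections whose Recv-Q or Send-Q is non-empty. It assumes the usual Linux net-tools column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program name); `backed_up_queues` is a hypothetical helper name, not something that ships with Bacula.

```shell
#!/bin/sh
# Print bacula-sd connections whose Recv-Q or Send-Q is non-empty.
# Input: output of `netstat -p -c -n -t`, read from a file or from stdin.
# Assumed column layout (net-tools netstat on Linux):
#   Proto Recv-Q Send-Q Local-Address Foreign-Address State PID/Program
backed_up_queues() {
    awk '/bacula-sd/ && ($2 + 0 > 0 || $3 + 0 > 0) {
        # $2 = Recv-Q, $3 = Send-Q, $4 = local address, $5 = foreign address
        printf "recv-q=%s send-q=%s %s -> %s\n", $2, $3, $4, $5
    }' "$@"
}
```

Running `backed_up_queues sd.txt` against the capture should make it easier to see which of the two connections backs up first.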
> The ZeroWindow showed up right about when the storage daemon was trying
> to back up the ~50th thumbnail. Netstat showed two connections held open
> by the storage daemon: one from the LAN IP address to the WAN IP address
> and another from the LAN IP address to the remote server. The send-q
> between the LAN and the WAN (i.e. the connection between the storage
> daemon and itself) kept growing until it was full. Once it was full, the
> recv-q between the LAN and the remote server would fill up in turn and
> the storage daemon would advertise the TCP ZeroWindow. So the bottleneck
> seemed to be the storage daemon itself: it couldn't keep up when it got
> these small files.
>
> Finally I tried splitting the fileset to use two different include
> resources within the fileset, one for the thumbnails, the other for the
> rest of the website. I turned OFF compression for the include resource
> that held the thumbnails. I'm happy to say the backup has run
> successfully since.
>
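A split like the one described might look roughly like the fragment below. This is only a sketch: the FileSet name and every path are hypothetical (the original layout is not given), and the two Include resources list non-overlapping directories so that no file matches both. Compression is left off for the thumbnails simply by omitting the compression option.

```
FileSet {
  Name = "WebsiteSet"            # hypothetical name

  # Thumbnails: no compression option, so the data is sent uncompressed.
  Include {
    Options {
      signature = MD5
    }
    File = "C:/website/thumbnails"   # hypothetical path
  }

  # Rest of the site: compressed as before.
  Include {
    Options {
      signature = MD5
      compression = GZIP
    }
    File = "C:/website/pages"        # hypothetical path
    File = "C:/website/scripts"      # hypothetical path
  }
}
```

Already-compressed image formats gain little from GZIP, so dropping compression for them usually costs almost no extra volume space.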
> I haven't debugged many network problems, so it was a lot of trial and
> error for me. I suspect that the root cause of the problem comes down to
> slow/insufficient resources on my Ubuntu machine, since it is virtually
> hosted.
>
> Hope this helps the next person. Thanks to any developers/contributors
> for the great software.
> Ron
>
> Ron Cormier
> Communicate Solutions
>
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems? Stop.
> Now Search log events and configuration files using AJAX and a browser.
> Download your FREE copy of Splunk now >> http://get.splunk.com/
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
>
--
Michael Lewinger
MBR Computers
http://mbrcomp.co.il