-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sorry I'm so late in replying to this thread. I've had a number of
long-running full backups running on my Bacula 2.0.3 server that I
didn't want to touch until they completed. Now that they are done, I
have some time to continue investigating this issue.

Arno Lehmann wrote:
> Hi,
> 
> On 4/3/2007 12:04 AM, Michael Proto wrote:
...
> I built and packaged all the Bacula Linux clients myself (so they all
> pull from the same set of config files for quick installation), and I
> used the following compile-time flags when building them:
> 
> --with-openssl --enable-client-only --enable-static-fd --enable-smartalloc
> 
> I'm using the static-bacula-fd binary (instead of the bacula-fd binary)
> 
>> Have you checked that the binary is really completely static? Last time I 
>> tried I could not create a static binary of the FD, the one I could come 
>> up with was created under a really old linux/gcc/glibc combination, and 
>> lacked ACL support.
> 
>> If your binary pulls in shared objects, version differences there might 
>> be related to the problem.
> 
>> Of course, trying the heartbeat interval setting will be the most useful 
>> first step. I can't prove that, but I have the impression that Bacula 
>> reacts more sensitively to network problems recently. Might be because 
>> it's more efficient, and so leaves the TCP connections idling longer, or 
>> something...

Regarding static client binaries, when building the client with the
"--enable-client-only --enable-static-fd" flags, 2 binaries were
produced, bacula-fd and static-bacula-fd. Running file and ldd on the
static binary do seem to indicate that it is compiled statically:

[EMAIL PROTECTED] ~]# file /sbin/static-bacula-fd
/sbin/static-bacula-fd: ELF 32-bit LSB executable, Intel 80386, version
1 (SYSV), for GNU/Linux 2.2.0, statically linked, stripped

[EMAIL PROTECTED] ~]# ldd /sbin/static-bacula-fd
        not a dynamic executable
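
For checking the same thing across several clients, here is a minimal
sketch of a helper. The function name reports_static is my own
invention, and it only assumes that the tools print the same phrases
shown in the output above:

```shell
#!/bin/sh
# reports_static: read the output of `file` or `ldd` on stdin and
# succeed if it contains one of the phrases those tools print for a
# statically linked binary (the phrases seen in the output above).
reports_static() {
    grep -Eq 'statically linked|not a dynamic executable'
}
```

Possible usage: file /sbin/static-bacula-fd | reports_static && echo static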

Even stranger, this error also occurs intermittently on the bacula-fd
program running on my Bacula server, which was not compiled statically.


On the network/heartbeat issue:
The strange thing is, when I initiate a backup job against an affected
client, it fails immediately with the "Fatal error: Socket error on
Storage command: ERR=No data available" error, before sending much of
anything to the SD. Could that really be related to heartbeat?

In any case, I've added the following to the bacula-sd.conf:

  Heartbeat Interval = 60

And I'll try adding the same interval to a few affected clients today
and see if that helps.
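
To keep track of which hosts already have the setting, something like
the following could work. It is only a sketch: the function name
has_heartbeat is hypothetical, and it relies on Bacula ignoring case
and spaces in directive keywords:

```shell
#!/bin/sh
# has_heartbeat FILE: succeed if FILE contains an active (uncommented)
# "Heartbeat Interval" directive. Case and spaces inside the keyword
# are tolerated, since Bacula ignores both in directive names.
has_heartbeat() {
    grep -Eiq '^[[:space:]]*heartbeat[[:space:]]*interval[[:space:]]*=' "$1"
}
```

Possible usage: has_heartbeat /etc/bacula/bacula-fd.conf || echo "not set"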

I'll also see if I can get a valid tcpdump of the client and server
communications to see exactly what sorts of packets are being sent when
this failure occurs. That's somewhat difficult, as the failure is
intermittent at best, but it's happening on enough hosts that I might be
able to get some valid data.
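
A rough sketch of the capture I have in mind follows. The function name
bacula_capture_cmd, the interface, the address, and the output path are
all placeholders of mine; 9103 is the default SD port, so adjust it if
yours differs. The function only prints the command so it can be
reviewed before running it as root:

```shell
#!/bin/sh
# bacula_capture_cmd IFACE PEER PORT OUTFILE
# Print a tcpdump command line that writes full packets (-s 0) between
# this host and PEER on PORT to OUTFILE, without name resolution (-n).
bacula_capture_cmd() {
    printf 'tcpdump -n -i %s -s 0 -w %s host %s and port %s\n' \
        "$1" "$4" "$2" "$3"
}
```

Possible usage: bacula_capture_cmd eth0 192.0.2.10 9103 /tmp/bacula-sd.pcap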

> 
> for maximum portability. They were built on a Debian Sarge host and then
> packaged into appropriate distribution packages.
> 
> On one of the often-affected hosts I now have the client started with
> the following flags (out of /etc/inittab):
> 
> /sbin/static-bacula-fd -fvc -d100 /etc/bacula/bacula-fd.conf >> /tmp/bacula-fd.out
>
> When the client fails, I see the modification timestamp update on the
> resultant /tmp/bacula-fd.out file, but it's currently empty. Do I need to
> redirect stderr to this file instead of stdout?
> 
>> It wouldn't do any harm redirecting *both* :-)
> 
> Anyone have any ideas what might be causing these errors or how I can go
> about debugging this unusual (and while not critical, still very
> annoying) problem?
> 
>> Check your system logs... I found some cases where Bacula's components 
>> were killed because they used more memory than was healthy... the DIR 
>> mainly, but I have seen cases where I suspect the SD... no details 
>> available, unfortunately.
> 
>> Arno
> 

I haven't seen anything in the standard system logs on the client or
server when the failure occurs, but I'll try the stderr redirect on a
few clients and see if anything shows up there.
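
For the record, this is roughly how I plan to capture both streams. It
is a sketch only; run_logged is a name I made up, and the inline
equivalent is simply appending ">> /tmp/bacula-fd.out 2>&1" to the
command:

```shell
#!/bin/sh
# run_logged LOGFILE CMD [ARGS...]
# Run CMD and append both its stdout and stderr to LOGFILE.
# The order matters: stdout is pointed at the file first, then
# stderr is duplicated onto stdout with 2>&1.
run_logged() {
    log=$1
    shift
    "$@" >> "$log" 2>&1
}
```

Possible usage:
run_logged /tmp/bacula-fd.out /sbin/static-bacula-fd -fvc -d100 /etc/bacula/bacula-fd.conf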

Thanks for the tips guys!


- -Proto
- --
Michael Proto            | SecureWorks
Unix Administrator       |
PGP ID: 5D575BBE         | [EMAIL PROTECTED]
*******************************************************
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (FreeBSD)

iD8DBQFGGqCtOLq/wl1XW74RAhXXAJ9PfwwKxWY6UiQEm3yccOE5CgLSpQCeMedy
WrikI6eqOZt6gD9PDLvVgN0=
=GIqr
-----END PGP SIGNATURE-----

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users