I ran memtest on our Bacula server last night. After 14 hours and 8 passes,
it didn't find any problems. I'm at the end of my rope here. I'm now trying a
new virtual server to see if that fixes the issue.
_____________________
Corey Shaw
Technology Specialist
O. 801.491.0705 (x. 157)
F. 801.491.8774
Winner of the 2009 Utah Work/Life Award
----- Original Message -----
From: "Corey Shaw" <cs...@q90.com>
To: bacula-users@lists.sourceforge.net
Sent: Tuesday, August 18, 2009 2:38:48 PM GMT -07:00 US/Canada Mountain
Subject: Re: [Bacula-users] Client backups crash director until full backup is
run -- UPDATE
This is a physical machine. If absolutely necessary, I could run memtest on the
box. I would like to exhaust other options first, if possible. The machine is
in our datacenter and I'd like to save myself a couple of trips up there. What
can I say? I'm lazy. :)
----- Original Message -----
From: "Jean Gobin" <jgo...@strozfriedberg.com>
To: "Corey Shaw" <cs...@q90.com>, bacula-users@lists.sourceforge.net
Sent: Tuesday, August 18, 2009 2:35:16 PM GMT -07:00 US/Canada Mountain
Subject: RE: [Bacula-users] Client backups crash director until full backup is
run -- UPDATE
Hello,
Virtual or physical machine?
Is running Memtest on this for a couple of hours an option?
J.
Jean F. Gobin, CCENT, CCNA
Network Engineer
Tel: 212.542.3175
Mobile: 917.213.2532
Fax: 212.981.6545
32 Avenue of the Americas, 4th Floor, New York, NY 10013
jgo...@strozfriedberg.com
www.strozfriedberg.com
From: Corey Shaw [mailto:cs...@q90.com]
Sent: Tuesday, August 18, 2009 4:30 PM
To: bacula-users@lists.sourceforge.net
Subject: [Bacula-users] Client backups crash director until full backup is run
-- UPDATE
Version: 3.0.2
OS: Gentoo
My Bacula director recently started crashing randomly after running backups. It
mostly happens when backups run from the schedule, but sometimes also when I
run them manually. This started suddenly on August 11. Up to that point I had
been running 3.0.1 for about a month without problems. I upgraded to 3.0.2 to
see if it would fix the problem, but it didn't.
I have tried rebuilding the MySQL database as well as recompiling in case I
missed something, but neither of those worked.
Any light people can shed on this would be very helpful. It looks like
libbac.so.1 is causing some sort of issue. Using gdb, I got the following
output:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f889599c950 (LWP 5933)]
0x00007f889ad68622 in sm_realloc_pool_memory () from /usr/lib/libbac.so.1
Thread 14 (Thread 0x7f889699e950 (LWP 5935)):
#0 0x00007f8898894b92 in select () from /lib/libc.so.6
#1 0x00007f889ad71f09 in tls_bsock_readn () from /usr/lib/libbac.so.1
#2 0x00007f889ad56b25 in BSOCK::recv () from /usr/lib/libbac.so.1
#3 0x000000000041d820 in ?? ()
#4 0x0000000000429248 in ?? ()
#5 0x00007f8899aa5007 in start_thread () from /lib/libpthread.so.0
#6 0x00007f889889b48d in clone () from /lib/libc.so.6
#7 0x0000000000000000 in ?? ()
Thread 13 (Thread 0x7f889599c950 (LWP 5933)):
#0 0x00007f889ad68622 in sm_realloc_pool_memory () from /usr/lib/libbac.so.1
#1 0x00007f889ad68e72 in pm_strcat () from /usr/lib/libbac.so.1
#2 0x00007f889b3a9757 in db_get_int_handler () from /usr/lib/libbacsql.so.1
#3 0x00007f889b3a30d9 in db_sql_query () from /usr/lib/libbacsql.so.1
#4 0x00007f889b3a98ba in db_accurate_get_jobids () from /usr/lib/libbacsql.so.1
#5 0x000000000041204a in ?? ()
#6 0x00000000004125b6 in ?? ()
#7 0x0000000000421cb5 in ?? ()
#8 0x0000000000423e58 in ?? ()
#9 0x00007f8899aa5007 in start_thread () from /lib/libpthread.so.0
#10 0x00007f889889b48d in clone () from /lib/libc.so.6
#11 0x0000000000000000 in ?? ()
Thread 12 (Thread 0x7f889619d950 (LWP 5932)):
#0 0x00007f8899aac181 in nanosleep () from /lib/libpthread.so.0
#1 0x00007f889ad51fb6 in bmicrosleep () from /usr/lib/libbac.so.1
#2 0x00000000004244b1 in ?? ()
#3 0x00007f8899aa5007 in start_thread () from /lib/libpthread.so.0
#4 0x00007f889889b48d in clone () from /lib/libc.so.6
#5 0x0000000000000000 in ?? ()
Thread 5 (Thread 0x7f889719f950 (LWP 5897)):
#0 0x00007f8898894b92 in select () from /lib/libc.so.6
#1 0x00007f889ad71f09 in tls_bsock_readn () from /usr/lib/libbac.so.1
#2 0x00007f889ad56b25 in BSOCK::recv () from /usr/lib/libbac.so.1
#3 0x0000000000446bcd in ?? ()
#4 0x00007f889ad7a642 in workq_server () from /usr/lib/libbac.so.1
#5 0x00007f8899aa5007 in start_thread () from /lib/libpthread.so.0
#6 0x00007f889889b48d in clone () from /lib/libc.so.6
#7 0x0000000000000000 in ?? ()
Thread 4 (Thread 0x7f88979a0950 (LWP 5894)):
#0 0x00007f8899aa903d in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1 0x00007f889ad7a047 in watchdog_thread () from /usr/lib/libbac.so.1
#2 0x00007f8899aa5007 in start_thread () from /lib/libpthread.so.0
#3 0x00007f889889b48d in clone () from /lib/libc.so.6
#4 0x0000000000000000 in ?? ()
Thread 3 (Thread 0x7f88987ca950 (LWP 5893)):
#0 0x00007f8898894b92 in select () from /lib/libc.so.6
#1 0x00007f889ad54a0e in bnet_thread_server () from /usr/lib/libbac.so.1
#2 0x0000000000446b3c in ?? ()
#3 0x00007f8899aa5007 in start_thread () from /lib/libpthread.so.0
#4 0x00007f889889b48d in clone () from /lib/libc.so.6
#5 0x0000000000000000 in ?? ()
Thread 1 (Thread 0x7f889b9d1700 (LWP 5889)):
#0 0x00007f8899aac181 in nanosleep () from /lib/libpthread.so.0
#1 0x00007f889ad51fb6 in bmicrosleep () from /usr/lib/libbac.so.1
#2 0x000000000042fb37 in ?? ()
#3 0x000000000040ee2c in ?? ()
#4 0x00007f88987e95c6 in __libc_start_main () from /lib/libc.so.6
#5 0x000000000040cba9 in ?? ()
#6 0x00007fffa39e3918 in ?? ()
#7 0x000000000000001c in ?? ()
#8 0x0000000000000005 in ?? ()
#9 0x00007fffa39e4191 in ?? ()
#10 0x00007fffa39e41a6 in ?? ()
#11 0x00007fffa39e41a9 in ?? ()
#12 0x00007fffa39e41ac in ?? ()
#13 0x00007fffa39e41af in ?? ()
#14 0x0000000000000000 in ?? ()
#0 0x00007f889ad68622 in sm_realloc_pool_memory () from /usr/lib/libbac.so.1
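For anyone reading a dump like this one, here is a minimal sketch (in Python) of how the faulting thread could be picked out of the gdb output automatically. It matches the LWP from the "[Switching to Thread ...]" line against the per-thread headers and collects that thread's frames. The function name parse_backtrace and the regular expressions are my own assumptions for illustration, not part of Bacula or gdb, and the sample is just a shortened excerpt of the trace above.

```python
import re

# Patterns for the gdb output format seen in the trace above (assumed, not exhaustive).
SWITCH_RE = re.compile(r"^\[Switching to Thread (0x[0-9a-f]+) \(LWP (\d+)\)\]")
THREAD_RE = re.compile(r"^Thread (\d+) \(Thread (0x[0-9a-f]+) \(LWP (\d+)\)\):")
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9a-f]+) in (\S+) \(\)(?: from (\S+))?")


def parse_backtrace(text):
    """Return (crashing_lwp, {lwp: [(frame_no, function, library), ...]})."""
    crashing_lwp = None
    threads = {}
    current = None  # LWP of the thread block currently being read
    for raw in text.splitlines():
        line = raw.strip()
        m = SWITCH_RE.match(line)
        if m:
            crashing_lwp = int(m.group(2))
            continue
        m = THREAD_RE.match(line)
        if m:
            current = int(m.group(3))
            threads[current] = []
            continue
        m = FRAME_RE.match(line)
        if m and current is not None:
            frame_no = int(m.group(1))
            # Frames in one thread are numbered 0, 1, 2, ...; a frame that
            # breaks the sequence (such as the trailing "#0" summary line)
            # does not belong to the current thread block.
            if frame_no != len(threads[current]):
                current = None
                continue
            threads[current].append((frame_no, m.group(3), m.group(4)))
    return crashing_lwp, threads


SAMPLE = """\
[Switching to Thread 0x7f889599c950 (LWP 5933)]
Thread 13 (Thread 0x7f889599c950 (LWP 5933)):
#0 0x00007f889ad68622 in sm_realloc_pool_memory () from /usr/lib/libbac.so.1
#1 0x00007f889ad68e72 in pm_strcat () from /usr/lib/libbac.so.1
#2 0x00007f889b3a9757 in db_get_int_handler () from /usr/lib/libbacsql.so.1
Thread 12 (Thread 0x7f889619d950 (LWP 5932)):
#0 0x00007f8899aac181 in nanosleep () from /lib/libpthread.so.0
#0 0x00007f889ad68622 in sm_realloc_pool_memory () from /usr/lib/libbac.so.1
"""

lwp, threads = parse_backtrace(SAMPLE)
for frame_no, function, library in threads[lwp]:
    print(f"LWP {lwp} frame #{frame_no}: {function} ({library})")
```

Run against the full trace, this would single out Thread 13 (LWP 5933), whose stack goes from db_accurate_get_jobids down into sm_realloc_pool_memory. Frames shown as "??" come back with no library, since gdb could not resolve them.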