----- "Martin Simmons" <mar...@lispworks.com> wrote:

> >>>>> On Mon, 11 Jul 2011 11:42:35 +0200 (CEST), Pierre Bourgin said:
> >
> > Hello,
> >
> > I have installed bacula 5.0.3 on a CentOS 5.4 x86_64 system (RPM x86_64
> > rebuilt from source) and it has been working great for a year.
> >
> > After a mistake I made, I need to restore my catalog.
> > So I tried to use bextract in order to restore a 51 MB file from a
> > volume-disk file of 20GB.
> > bextract hangs: 100% CPU used, no I/O wait at all.
> > After several minutes of run, I stopped it without any success: the
> > restored file was created, but empty.
> >
> > Since I really need this file, I've tried the 32bit version of
> > bextract on the same system: worked fine !
> >
> > I've tried to debug it with strace, but I'm not clever enough to
> > find anything useful in the output.
> > (please find the strace files attached to this email)
> >
> > So I don't know if it's a bug from the packaging or a bextract bug
> > related to the 64bit platform ?
> >
> > If someone has a clue ...
>
> To find out where it is looping, attach gdb to the process when it is
> hanging (use gdb -p $pidofbextract) and then issue the gdb commands
>
> thread apply all bt
> detach
> quit
>
> Do this a few times to get an idea of how it changes.
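The sampling procedure described in the quoted advice could be scripted roughly as follows. This is only a sketch: the command file name is the one used in the loop further down in this message, while the function name sample_backtraces, its arguments and the use of gdb's -batch option are assumptions made for illustration, not something taken from the thread.

```shell
#!/bin/sh
# Command file name as used in the loop shown in the reply.
CMDS=gdb.show-backtrace.commands

# Write the three gdb commands suggested in the quoted message.
cat > "$CMDS" <<'EOF'
thread apply all bt
detach
quit
EOF

# sample_backtraces PID COUNT INTERVAL   (hypothetical helper)
# Attach gdb to PID COUNT times and print all thread backtraces,
# sleeping INTERVAL seconds between two samples.
sample_backtraces() {
    pid=$1
    count=$2
    interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # -batch makes gdb exit once the command file has been processed
        gdb -batch -p "$pid" -x "$CMDS" 2>&1
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example call (not executed here):
#   sample_backtraces "$(pgrep bextract)" 5 60
```

Comparing the frames of successive samples then shows which functions the process keeps returning to.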
Hello,

Thanks for your help.

Once bextract had started, I ran gdb in batch mode once per minute with the gdb commands you provided.
gdb always shows output similar to the sample below:
- addresses of the Thread 1 stack are always the same
- addresses of the Thread 2 stack: only #0 and #1 are different (inflate_table() and inflate())
- Thread 1: sometimes the call to inflate_table() does not appear

# while [ 1 ]; do gdb -p `pgrep bextract` -x gdb.show-backtrace.commands ; sleep 4; done

============== gdb sample output ==========================================
This GDB was configured as "x86_64-redhat-linux-gnu".
Attaching to process 10007
Reading symbols from /usr/sbin/bextract...(no debugging symbols found)...done.
Reading symbols from /lib64/libacl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libacl.so.1
....
Reading symbols from /lib64/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread 0x2b7ee0469850 (LWP 10007)]
[New Thread 0x40af4940 (LWP 10008)]
Loaded symbols for /lib64/libpthread.so.0
....
Loaded symbols for /lib64/libselinux.so.1
Reading symbols from /lib64/libsepol.so.1...done.
Loaded symbols for /lib64/libsepol.so.1
0x00002b7ee025a106 in inflate_table () from /usr/lib64/libz.so.1

Thread 2 (Thread 0x40af4940 (LWP 10008)):
#0  0x0000003a7620dfe1 in nanosleep () from /lib64/libpthread.so.0
#1  0x0000003e32a1425b in bmicrosleep (sec=30, usec=0) at bsys.c:63
#2  0x0000003e32a40efb in check_deadlock () at lockmgr.c:571
#3  0x0000003a76206617 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003a75ad3c2d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x2b7ee0469850 (LWP 10007)):
#0  0x00002b7ee025a106 in inflate_table () from /usr/lib64/libz.so.1
#1  0x00002b7ee0257537 in inflate () from /usr/lib64/libz.so.1
#2  0x00002b7ee0252396 in uncompress () from /usr/lib64/libz.so.1
#3  0x0000000000406b6f in record_cb ()
#4  0x0000000000425298 in read_records ()
#5  0x0000000000406438 in main ()
============== gdb sample output ==========================================

So is the trouble related to zlib and its use by bextract ?

# rpm -qf /usr/lib64/libz.so.1
zlib-1.2.3-3

My RPM build system uses exactly the same version for the -devel package (unchanged since the build of bacula):
zlib-devel-1.2.3-3

On the bacula server, I've updated my zlib package to the most recent one: zlib-1.2.3-4.
No difference, the same problem arises with bextract.

I've checked my bacula backups and they are fine: I restored the bacula.sql file (BackupCatalog job) with bconsole and its "restore" command in seconds (for the same tape file).

Another thing: bextract and bconsole do not have the same entry point for libz.so, and only for that one; does it mean they do not use libz the same way ?

# ldd /usr/sbin/bextract /usr/sbin/bconsole
        libacl.so.1 => /lib64/libacl.so.1 (0x0000003a7ba00000)
        libbacfind-5.0.3.so => /usr/lib64/libbacfind-5.0.3.so (0x0000003e33200000)
        libbaccfg-5.0.3.so => /usr/lib64/libbaccfg-5.0.3.so (0x0000003e32e00000)
        libbac-5.0.3.so => /usr/lib64/libbac-5.0.3.so (0x0000003e32a00000)
        libz.so.1 => /usr/lib64/libz.so.1 (0x00002b0f663b7000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003a76200000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003a75e00000)
        libssl.so.6 => /lib64/libssl.so.6 (0x0000003a79200000)
        libcrypto.so.6 => /lib64/libcrypto.so.6 (0x0000003a78200000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003a78600000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003a76600000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003a78a00000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003a75a00000)
        libattr.so.1 => /lib64/libattr.so.1 (0x0000003a7aa00000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003a75600000)
        libgssapi_krb5.so.2 => /usr/lib64/libgssapi_krb5.so.2 (0x0000003a78e00000)
        libkrb5.so.3 => /usr/lib64/libkrb5.so.3 (0x0000003a7b200000)
        libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000003a79600000)
        libk5crypto.so.3 => /usr/lib64/libk5crypto.so.3 (0x0000003a7ae00000)
        libkrb5support.so.0 => /usr/lib64/libkrb5support.so.0 (0x0000003a7b600000)
        libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003a79a00000)
        libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003a79e00000)
        libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003a76a00000)
        libsepol.so.1 => /lib64/libsepol.so.1 (0x0000003a76e00000)
/usr/sbin/bconsole:
        libreadline.so.5 => /usr/lib64/libreadline.so.5 (0x000000311fc00000)
        libncurses.so.5 => /usr/lib64/libncurses.so.5 (0x0000003a7a600000)
        libbaccfg-5.0.3.so => /usr/lib64/libbaccfg-5.0.3.so (0x0000003e32e00000)
        libbac-5.0.3.so => /usr/lib64/libbac-5.0.3.so (0x0000003e32a00000)
        libz.so.1 => /usr/lib64/libz.so.1 (0x00002b688e3de000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003a76200000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003a75e00000)
        libssl.so.6 => /lib64/libssl.so.6 (0x0000003a79200000)
        libcrypto.so.6 => /lib64/libcrypto.so.6 (0x0000003a78200000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003a78600000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003a76600000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003a78a00000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003a75a00000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003a75600000)
        libgssapi_krb5.so.2 => /usr/lib64/libgssapi_krb5.so.2 (0x0000003a78e00000)
        libkrb5.so.3 => /usr/lib64/libkrb5.so.3 (0x0000003a7b200000)
        libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000003a79600000)
        libk5crypto.so.3 => /usr/lib64/libk5crypto.so.3 (0x0000003a7ae00000)
        libkrb5support.so.0 => /usr/lib64/libkrb5support.so.0 (0x0000003a7b600000)
        libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003a79a00000)
        libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003a79e00000)
        libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003a76a00000)
        libsepol.so.1 => /lib64/libsepol.so.1 (0x0000003a76e00000)

> If you have debuginfo packages for bacula, then install them first.

The definitions for these packages are not provided by the bacula.spec from the bacula-5.0.3 sources.
Would you have a .spec file that generates them ?

Regards,
Pierre Bourgin

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
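
A note on the libz.so.1 question raised with the ldd output: the hexadecimal value that ldd prints after each library is the address at which the library is loaded for that run, not an entry point, so it can differ between two binaries (or two runs) even when both resolve to the same file. A minimal sketch of comparing the resolved path while ignoring that address follows; the helper names strip_load_addr and resolved_path are hypothetical, and the two sample lines are copied from the ldd output in this message.

```shell
#!/bin/sh
# strip_load_addr (hypothetical helper): remove the trailing "(0x...)"
# load address and any leading whitespace from ldd-style lines on stdin.
strip_load_addr() {
    sed -e 's/[[:space:]]*(0x[0-9a-fA-F]*)[[:space:]]*$//' \
        -e 's/^[[:space:]]*//'
}

# resolved_path NAME (hypothetical helper): print the path that the
# library NAME resolves to in ldd-style lines read from stdin.
resolved_path() {
    strip_load_addr | awk -v lib="$1" '$1 == lib && $2 == "=>" { print $3 }'
}

# Sample lines copied from the ldd output above.
bextract_line='libz.so.1 => /usr/lib64/libz.so.1 (0x00002b0f663b7000)'
bconsole_line='libz.so.1 => /usr/lib64/libz.so.1 (0x00002b688e3de000)'

a=$(printf '%s\n' "$bextract_line" | resolved_path libz.so.1)
b=$(printf '%s\n' "$bconsole_line" | resolved_path libz.so.1)

if [ "$a" = "$b" ]; then
    echo "same file: $a"
else
    echo "different files: $a vs $b"
fi
```

On a live system the same helpers could read from "ldd /usr/sbin/bextract" and "ldd /usr/sbin/bconsole" directly. Both sample lines resolve to /usr/lib64/libz.so.1, which suggests the two programs load the same library file; whether they call it in the same way is a separate question that the ldd output cannot answer.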