I have tried to run several restore jobs. Every job that stays below 516 MB of memory consumption succeeds; every job above 516 MB crashes with the same error message. MySQL is installed the same way as when I was running Bacula 1.36.

Bacula did crash earlier when it consumed more than the physical memory, but the box was upgraded to 2 GB before 1.36, and both 1.36 and the versions before it worked fine.
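
A fixed threshold of roughly 516 MB on a box with 2 GB of RAM is the kind of pattern a per-process data-size limit could produce. That is only an assumption; nothing in this thread confirms it. A minimal Python sketch for checking the limit is given here, with illustrative helper names. It reports the limits of the Python process itself, so it would need to be run from the same user and environment that starts bacula-dir to say anything useful:

```python
import resource


def describe_limit(value):
    """Render an rlimit value as megabytes, or 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / (1024 * 1024):.0f} MB"


def data_segment_limits():
    """Return the (soft, hard) data-segment limits of this process as text."""
    soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
    return describe_limit(soft), describe_limit(hard)


if __name__ == "__main__":
    soft, hard = data_segment_limits()
    print(f"data segment limit: soft={soft}, hard={hard}")
```

If the soft limit reported there is close to the size at which bacula-dir dies, the limit would be worth raising before tuning anything else; if it is unlimited, this explanation can be ruled out.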


Kern Sibbald wrote:
Hello,

I repeat what I said the last time: this looks like a MySQL problem. Given the new information you have presented, and the fact that you are having problems when working with almost 9 million files backed up (a lot), I would suspect that the problem is with your MySQL configuration. If you used the standard MySQL installation, it is probably not configured for such large databases.

See below for more ...

On Monday 12 December 2005 10:52, Roger Kvam wrote:

I tried to upgrade MySQL to 5 and recompile Bacula (yes, I did a make
distclean), but got the same error. I have now installed MySQL 4.1.15 again,
recompiled Bacula again, and deleted the database and the tapes so I
could start fresh.

When trying to restore from a large job, bacula crashes,


From the message you show below, it looks to me like MySQL crashed or at least disconnected.


when trying to restore from a small job everything works fine. Does Bacula have
shortcomings when it comes to enterprise systems?


Bacula is not what I would call an enterprise system, so one could say it has that kind of shortcoming. In your case, you are dealing with a quite large backup set, and it looks like you have not adapted MySQL to handle it appropriately.

Perhaps some of the other users on the list can help you. If I am not mistaken, the Bacula documentation also mentions some steps you might take for large databases, but your best bet is to look at the MySQL documentation.


Now I have run several backups of several servers and everything works
fine, until I wanted to do a restore from the biggest job,

+-------+-------+-----------+-----------------+---------------------+------------+-----------+
| JobId | Level | JobFiles  | JobBytes        | StartTime           | VolumeName | StartFile |
+-------+-------+-----------+-----------------+---------------------+------------+-----------+
|     2 | F     | 8,759,302 | 821,505,476,167 | 2005-12-09 16:27:02 | 000009L2   |         0 |
+-------+-------+-----------+-----------------+---------------------+------------+-----------+
You have selected the following JobId: 2

Building directory tree for JobId 2 ...  Query failed: SELECT MediaType FROM JobMedia,Media WHERE JobMedia.JobId=2 AND JobMedia.MediaId=Media.MediaId: ERR=Lost connection to MySQL server during query

There were no files inserted into the tree, so file selection
is not possible.Most likely your retention policy pruned the files

Do you want to restore all the files? (yes|no):
¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

When trying to restore from a smaller job, everything works fine:

+-------+-------+----------+----------------+---------------------+------------+-----------+
| JobId | Level | JobFiles | JobBytes       | StartTime           | VolumeName | StartFile |
+-------+-------+----------+----------------+---------------------+------------+-----------+
|     3 | F     |  833,832 | 52,358,007,603 | 2005-12-09 16:29:06 | 000009L2   |        29 |
|     4 | I     |    1,434 |    322,179,540 | 2005-12-10 00:05:01 | 000001L2   |         0 |
+-------+-------+----------+----------------+---------------------+------------+-----------+
You have selected the following JobIds: 3,4

Building directory tree for JobId 3 ...
+++++++++++++++++++++++++++++++++++++++++++++++++
Building directory tree for JobId 4 ...
2 Jobs, 828,824 files inserted into the tree.

Have I hit a limitation in Bacula?
The system status was like this the moment before the crash:

last pid: 34607;  load averages:  1.24,  0.58,  0.25  up 3+19:19:56  08:57:34
59 processes:  6 running, 53 sleeping

Mem: 572M Active, 1156M Inact, 212M Wired, 64M Cache, 112M Buf, 3008K Free
Swap: 4096M Total, 500K Used, 4095M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
20584 mysql         9  20    0 57640K 33344K kserel 0 256:57 86.77% mysqld
31511 root          6  20    0  5608K  2304K kserel 0 144:10  0.00% bacula-sd
31532 root          6  20    0   518M   513M RUN    0  53:32  0.00% bacula-dir

Kern Sibbald wrote:

This looks to me like a problem with MySQL.  Did you by any chance load
pre-built binaries rather than build on your system?  My best guess at
the moment is that Bacula is built with version X of MySQL and you are
running with version Y on your machine. That means that some of the
packet fields have moved or are aligned differently.  Rebuilding Bacula
from source directly on your machine should resolve this, if I am right
...

To answer your question: yes, of course, you can do as many backups and
restores at the same time as you want without Bacula crashing (provided
it is working properly).

On Wednesday 07 December 2005 22:57, rkvam wrote:

Actually, the restore job fails even when no backup is running. Does
anyone have a clue? Did I mess up the database by trying to restore
while a backup was running? I get the same crash every time I try to
restore the job I was restoring when the segfault occurred.

Automatically selected FileSet: Store
+-------+-------+-----------+-----------------+---------------------+------------+-----------+
| JobId | Level | JobFiles  | JobBytes        | StartTime           | VolumeName | StartFile |
+-------+-------+-----------+-----------------+---------------------+------------+-----------+
|    13 | F     | 8,743,518 | 819,499,095,510 | 2005-12-05 21:30:55 | 000010L2   |       117 |
|    18 | I     |   143,886 |  27,865,423,512 | 2005-12-07 00:15:07 | 000001L2   |         9 |
|    30 | I     |    36,820 |  21,111,393,490 | 2005-12-07 16:24:15 | 000001L2   |        42 |
+-------+-------+-----------+-----------------+---------------------+------------+-----------+
You have selected the following JobIds: 13,18,30

Building directory tree for JobId 13 ...  Query failed: SELECT MediaType FROM JobMedia,Media WHERE JobMedia.JobId=13 AND JobMedia.MediaId=Media.MediaId: ERR=Lost connection to MySQL server during query

Building directory tree for JobId 18 ...
Building directory tree for JobId 30 ...
3 Jobs, 150,713 files inserted into the tree.

You are now entering file selection mode where you add (mark) and
remove (unmark) files to be restored. No files are initially added,
unless you used the "all" keyword on the command line.
Enter "done" to leave this mode.

cwd is: /
$

rkvam wrote:

While trying to restore (building the directory structure) while a
backup is running, the director gets a segmentation fault. Is it not
supposed to be possible to restore while doing a backup? (For the record: I
have a changer with 2 streamers, and am able to run two backup jobs
simultaneously):

07-Dec 19:39 alexandria-dir: Fatal Error because: Bacula interrupted by
signal 11: Segmentation violation

Trace:
warning: Unable to get location for thread creation breakpoint: generic
error
[New Thread 0x8119800 (sleeping)]
[New Thread 0x8119c00 (runnable)]
[New Thread 0x8119200 (runnable)]
[New Thread 0x80fb800 (runnable)]
[New Thread 0x80ef200 (runnable)]
[New Thread 0x80ef000 (sleeping)]
[New Thread 0x80d2e00 (runnable)]
[New Thread 0x80d2c00 (LWP 100119)]
[New Thread 0x80cc000 (sleeping)]
[New LWP 100203]
[Switching to LWP 100203]
0x2815e277 in pthread_testcancel () from /usr/lib/libpthread.so.2
$1 = "alexandria-dir", '\0' <repeats 15 times>
$2 = 0x80cf018 "bacula-dir"
$3 = 0x80cf058 "/usr/local/bin/bacula-dir"
$4 = "MySQL"
$5 = 0x80afd90 "1.38.2 (20 November 2005)"
$6 = 0x80a7925 "i386-unknown-freebsd6.0"
$7 = 0x80a791d "freebsd"
$8 = 0x80a7911 "6.0-RELEASE"
#0  0x2815e277 in pthread_testcancel () from /usr/lib/libpthread.so.2
#1  0x281576f3 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#2  0x080fb800 in ?? ()

Thread 10 (LWP 100203):
#0  0x2815e277 in pthread_testcancel () from /usr/lib/libpthread.so.2
#1  0x281576f3 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#2  0x080fb800 in ?? ()

Thread 9 (Thread 0x80cc000 (sleeping)):
#0  0x28156e7f in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#1  0x2815098c in _nanosleep () from /usr/lib/libpthread.so.2
#2  0x28150afa in nanosleep () from /usr/lib/libpthread.so.2
#3  0x08081f64 in bmicrosleep (sec=60, usec=0) at bsys.c:54
#4  0x08060ec6 in wait_for_next_job (one_shot_job_to_run=0x0) at scheduler.c:96
#5  0x0804ca02 in main (argc=0, argv=0x1) at dird.c:248

Thread 8 (Thread 0x80d2c00 (LWP 100119)):
#0  0x2815e277 in pthread_testcancel () from /usr/lib/libpthread.so.2
#1  0x28156dac in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#2  0x00000000 in ?? ()

Thread 7 (Thread 0x80d2e00 (runnable)):
#0  0x282910b3 in select () from /lib/libc.so.6
#1  0x28147639 in select () from /usr/lib/libpthread.so.2
#2  0x080850d5 in bnet_thread_server (addrs=0x80cf258, max_clients=10,
  client_wq=0x80c1b60,
  handle_client_request=0x8071c84 <handle_UA_client_request>)
  at bnet_server.c:148
#3  0x08071a86 in connect_thread (arg=0x80cf258) at ua_server.c:73
#4  0x28149ab1 in pthread_create () from /usr/lib/libpthread.so.2
#5  0x282ec45f in _ctx_start () from /lib/libc.so.6

Thread 6 (Thread 0x80ef000 (sleeping)):
#0  0x28156e7f in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#1  0x28157013 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#2  0x2815bdd9 in _pthread_cond_timedwait () from /usr/lib/libpthread.so.2
#3  0x2815c342 in pthread_cond_timedwait () from /usr/lib/libpthread.so.2
#4  0x080991d2 in watchdog_thread (arg=0x0) at watchdog.c:296
#5  0x28149ab1 in pthread_create () from /usr/lib/libpthread.so.2
#6  0x282ec45f in _ctx_start () from /lib/libc.so.6

Thread 5 (Thread 0x80ef200 (runnable)):
#0  0x28291833 in read () from /lib/libc.so.6
#1  0x28147b32 in read () from /usr/lib/libpthread.so.2
#2  0x08082a0c in read_nbytes (bsock=0x80d7718,
  ptr=0xbf6fbf2c "0ô\016\b0ô\016\b\030ô\016\bh¿o¿î\034\a\b\030w\r\bÿÿÿÿ", nbytes=4) at bnet.c:73
#3  0x0808378b in bnet_recv (bsock=0x80d7718) at bnet.c:194
#4  0x08071cee in handle_UA_client_request (arg=0x80ef430) at ua_server.c:127
#5  0x08099be5 in workq_server (arg=0x80c1b60) at workq.c:347
#6  0x28149ab1 in pthread_create () from /usr/lib/libpthread.so.2
#7  0x282ec45f in _ctx_start () from /lib/libc.so.6

Thread 4 (Thread 0x80fb800 (runnable)):
#0  0x28156e7f in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#1  0x2815f089 in __error () from /usr/lib/libpthread.so.2
#2  0x2814c012 in sigaction () from /usr/lib/libpthread.so.2
#3  0x0809334b in signal_handler (sig=11) at signal.c:159
#4  0x2814c252 in sigaction () from /usr/lib/libpthread.so.2
#5  0x2814d6ed in sigaction () from /usr/lib/libpthread.so.2
#6  0x28155c15 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#7  0x28155c83 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#8  0x282ec45f in _ctx_start () from /lib/libc.so.6
#9  0x00000000 in ?? ()
#10 0xbf4f9860 in ?? ()
#11 0xbf4f95a0 in ?? ()
#12 0x00000000 in ?? ()
#13 0x28155c40 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
#14 0x280f1a6b in alloc_root () from /usr/local/lib/mysql/libmysqlclient_r.so.14
#15 0x28104b4b in cli_read_rows () from /usr/local/lib/mysql/libmysqlclient_r.so.14
#16 0x281068d6 in mysql_store_result () from /usr/local/lib/mysql/libmysqlclient_r.so.14
#17 0x080784e5 in db_sql_query (mdb=0x80f4818,
  query=0x812e428 "SELECT Path.Path,Filename.Name,FileIndex,JobId,LStat FROM File,Filename,Path WHERE File.JobId=13 AND Filename.FilenameId=File.FilenameId AND Path.PathId=File.PathId",
  result_handler=0x8073654 <insert_tree_handler(void*, int, char**)>,
  ctx=0xbf4f9a60) at mysql.c:331
#18 0x0806cb1e in build_directory_tree (ua=0x8119e18, rx=0xbf4f9df0) at ua_restore.c:864
#19 0x0806d6ef in restore_cmd (ua=0x8119e18, cmd=0x80fb628 "2") at ua_restore.c:128
#20 0x08061407 in do_a_command (ua=0x8119e18, cmd=0x80fb628 "2") at ua_cmds.c:158
#21 0x08071d34 in handle_UA_client_request (arg=0x8119e30) at ua_server.c:134
#22 0x08099be5 in workq_server (arg=0x80c1b60) at workq.c:347
#23 0x28149ab1 in pthread_create () from /usr/lib/libpthread.so.2
#24 0x282ec45f in _ctx_start () from /lib/libc.so.6

Thread 3 (Thread 0x8119200 (runnable)):
#0  0x28291833 in read () from /lib/libc.so.6
#1  0x28147b02 in read () from /usr/lib/libpthread.so.2
#2  0x08119360 in ?? ()
#3  0x080ae6f4 in base64_digits ()
#4  0x00000004 in ?? ()
#5  0xbf1f6a8c in ?? ()
#6  0x08129018 in ?? ()
#7  0x08129018 in ?? ()
#8  0x00000000 in ?? ()
#9  0xbf1f6a98 in ?? ()
#10 0x0808378b in bnet_recv (bsock=0x8082a0c) at bnet.c:194
/usr/local/etc/bacula/btraceback.gdb:10: Error in sourced command file:
Previous frame inner to this frame (corrupt stack?)
#0  0x2815e277 in pthread_testcancel () from /usr/lib/libpthread.so.2



_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
