On Wednesday 14 December 2005 15:59, Roger Kvam wrote:
> Now I'm really embarrassed. Actually, FreeBSD protects itself against
> runaway processes by limiting the maximum memory size of a single
> process to 512MB. I had to increase the maximum to 2GB, which is my
> physical amount of memory, by adding kern.maxdsiz=2147483648 to
> /boot/loader.conf. The large job maxed that out too, so I tried another
> job that only used 870MB of RAM. Maybe there should be a warning in the
> documentation about this problem? I think my job would have required at
> least 2.3GB of memory :D I will divide it now ;)

Well, there is nothing to be embarrassed about. I had no idea that FreeBSD 
limited the maximum memory.  I agree this should be in the doc, and I will 
add it.
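
For anyone who wants to check the limit before starting a large restore, here is a minimal sketch in Python. It assumes that the per-process data segment limit (RLIMIT_DATA) is what kern.maxdsiz caps on FreeBSD; the helper names format_limit and fits_in_data_limit are made up for illustration and are not part of Bacula. The 2147483648 figure is simply the value quoted above.

```python
import resource


def format_limit(value):
    """Render an rlimit value in MiB, or 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / (1024 * 1024):.0f} MiB"


def fits_in_data_limit(needed_bytes, soft_limit):
    """Return True if a process needing needed_bytes fits under soft_limit."""
    return soft_limit == resource.RLIM_INFINITY or needed_bytes <= soft_limit


if __name__ == "__main__":
    # Limits that apply to this process (and to children it starts).
    soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
    print("data segment soft limit:", format_limit(soft))
    print("data segment hard limit:", format_limit(hard))

    # Hypothetical requirement: the kern.maxdsiz value from this thread.
    needed = 2147483648
    print("a 2 GiB process would fit:", fits_in_data_limit(needed, soft))
```

Run as the same user that starts the director, this should show whether the limit is still at the default or has picked up the new loader.conf value after a reboot.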

>
> But the error messages from Bacula were far out :p
> "Query failed: SELECT MediaType
> FROM JobMedia,Media WHERE JobMedia.JobId=2 AND
> JobMedia.MediaId=Media.MediaId: ERR=Lost connection to MySQL server
> during query"
>
> Roger
>
> Roger Kvam wrote:
> > I have tried to run several restore jobs. Every job that stays below
> > 516 MB of memory consumption succeeds; every job above 516 MB crashes
> > with the same error message. MySQL is installed the same way as when
> > running Bacula 1.36.
> >
> > Bacula did crash earlier when consuming more than the physical memory,
> > but the box was upgraded to 2GB prior to 1.36, and 1.36 and the
> > versions before it worked fine.
> >
> > Kern Sibbald wrote:
> >> Hello,
> >>
> >> I repeat what I said the last time: this looks like a MySQL problem.
> >> Given the new information you have presented, and the fact that you
> >> are having problems when working with almost 9 million files backed up
> >> (a lot), I would suspect that the problem is with your MySQL
> >> configuration.  If you used the standard MySQL installation, it is
> >> probably not configured for such large databases.
> >>
> >> See below for more ...
> >>
> >> On Monday 12 December 2005 10:52, Roger Kvam wrote:
> >>> I tried upgrading MySQL to 5 and recompiling Bacula (yes, I did a
> >>> make distclean), but got the same error. I have now installed MySQL
> >>> 4.1.15 again, recompiled Bacula again, and deleted the database and
> >>> the tapes so I could start fresh.
> >>>
> >>> When trying to restore from a large job, bacula crashes,
> >>
> >> From the message you show below, it looks to me like MySQL crashed or
> >> at least disconnected.
> >>
> >>> when trying to restore from a small job everything works fine. Does
> >>> Bacula have shortcomings when it comes to enterprise systems?
> >>
> >> Bacula is not what I would call an enterprise system, so one could say
> >> it has that kind of shortcoming.   In your case, you are dealing with
> >> a quite large backup set, and it looks like you have not adapted MySQL
> >> to handle it appropriately.
> >>
> >> Perhaps some of the other users on the list can help you.  If I am not
> >> mistaken, the Bacula document also mentions some steps you might take
> >> for large databases, but your best bet is to look at the MySQL
> >> documentation.
> >>
> >>> Now I have run several backups of several servers and everything works
> >>> fine, until I wanted to do a restore from the biggest job,
> >>>
> >>> +-------+-------+-----------+-----------------+---------------------+------------+-----------+
> >>> | JobId | Level | JobFiles  | JobBytes        | StartTime           | VolumeName | StartFile |
> >>> +-------+-------+-----------+-----------------+---------------------+------------+-----------+
> >>> |     2 | F     | 8,759,302 | 821,505,476,167 | 2005-12-09 16:27:02 | 000009L2   |         0 |
> >>> +-------+-------+-----------+-----------------+---------------------+------------+-----------+
> >>> You have selected the following JobId: 2
> >>>
> >>> Building directory tree for JobId 2 ...  Query failed: SELECT MediaType
> >>> FROM JobMedia,Media WHERE JobMedia.JobId=2 AND
> >>> JobMedia.MediaId=Media.MediaId: ERR=Lost connection to MySQL server
> >>> during query
> >>>
> >>> There were no files inserted into the tree, so file selection
> >>> is not possible. Most likely your retention policy pruned the files
> >>>
> >>> Do you want to restore all the files? (yes|no):
> >>> ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
> >>>
> >>> When trying to restore from a smaller job, everything works fine:
> >>>
> >>> +-------+-------+----------+----------------+---------------------+------------+-----------+
> >>> | JobId | Level | JobFiles | JobBytes       | StartTime           | VolumeName | StartFile |
> >>> +-------+-------+----------+----------------+---------------------+------------+-----------+
> >>> |     3 | F     |  833,832 | 52,358,007,603 | 2005-12-09 16:29:06 | 000009L2   |        29 |
> >>> |     4 | I     |    1,434 |    322,179,540 | 2005-12-10 00:05:01 | 000001L2   |         0 |
> >>> +-------+-------+----------+----------------+---------------------+------------+-----------+
> >>> You have selected the following JobIds: 3,4
> >>>
> >>> Building directory tree for JobId 3 ...
> >>> +++++++++++++++++++++++++++++++++++++++++++++++++
> >>> Building directory tree for JobId 4 ...
> >>> 2 Jobs, 828,824 files inserted into the tree.
> >>>
> >>> Have I met any limitations in bacula?
> >>> The system status was like this the moment before the crash:
> >>>
> >>> last pid: 34607;  load averages:  1.24,  0.58,  0.25  up 3+19:19:56  08:57:34
> >>> 59 processes:  6 running, 53 sleeping
> >>>
> >>> Mem: 572M Active, 1156M Inact, 212M Wired, 64M Cache, 112M Buf, 3008K Free
> >>> Swap: 4096M Total, 500K Used, 4095M Free
> >>>
> >>>   PID USERNAME    THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
> >>> 20584 mysql         9  20    0 57640K 33344K kserel 0 256:57 86.77% mysqld
> >>> 31511 root          6  20    0  5608K  2304K kserel 0 144:10  0.00% bacula-sd
> >>> 31532 root          6  20    0   518M   513M RUN    0  53:32  0.00% bacula-dir
> >>>
> >>> Kern Sibbald wrote:
> >>>> This looks to me like a problem with MySQL.  Did you by any chance
> >>>> load pre-built binaries rather than build on your system?  My best
> >>>> guess at the moment is that Bacula is built with version X of MySQL
> >>>> and you are running with version Y on your machine. That means that
> >>>> some of the packet fields have moved or are aligned differently.
> >>>> Rebuilding Bacula from source directly on your machine should resolve
> >>>> this, if I am right ...
> >>>>
> >>>> To answer your question.  Yes, of course, you can do as many backup
> >>>> and restores at the same time as you want without Bacula crashing
> >>>> (providing it is working properly).
> >>>>
> >>>> On Wednesday 07 December 2005 22:57, rkvam wrote:
> >>>>> Actually, even when there is no backup running, the restore job
> >>>>> fails. Does anyone have a clue? Did I fuck up the database by trying
> >>>>> to restore while backing up? I get the same crash every time I try to
> >>>>> restore the same job I was trying to run when the segfault occurred.
> >>>>>
> >>>>> Automatically selected FileSet: Store
> >>>>> +-------+-------+-----------+-----------------+---------------------+------------+-----------+
> >>>>> | JobId | Level | JobFiles  | JobBytes        | StartTime           | VolumeName | StartFile |
> >>>>> +-------+-------+-----------+-----------------+---------------------+------------+-----------+
> >>>>> |    13 | F     | 8,743,518 | 819,499,095,510 | 2005-12-05 21:30:55 | 000010L2   |       117 |
> >>>>> |    18 | I     |   143,886 |  27,865,423,512 | 2005-12-07 00:15:07 | 000001L2   |         9 |
> >>>>> |    30 | I     |    36,820 |  21,111,393,490 | 2005-12-07 16:24:15 | 000001L2   |        42 |
> >>>>> +-------+-------+-----------+-----------------+---------------------+------------+-----------+
> >>>>> You have selected the following JobIds: 13,18,30
> >>>>>
> >>>>> Building directory tree for JobId 13 ...  Query failed: SELECT
> >>>>> MediaType FROM JobMedia,Media WHERE JobMedia.JobId=13 AND
> >>>>> JobMedia.MediaId=Media.MediaId: ERR=Lost connection to MySQL server
> >>>>> during query
> >>>>>
> >>>>> Building directory tree for JobId 18 ...
> >>>>> Building directory tree for JobId 30 ...
> >>>>> 3 Jobs, 150,713 files inserted into the tree.
> >>>>>
> >>>>> You are now entering file selection mode where you add (mark) and
> >>>>> remove (unmark) files to be restored. No files are initially added,
> >>>>> unless you used the "all" keyword on the command line.
> >>>>> Enter "done" to leave this mode.
> >>>>>
> >>>>> cwd is: /
> >>>>> $
> >>>>>
> >>>>> rkvam wrote:
> >>>>>> While trying to restore (building the directory structure) while a
> >>>>>> backup is running, the director gets a segmentation fault. Is it not
> >>>>>> supposed to be possible to restore while doing a backup? (For the
> >>>>>> record: I have a changer with 2 streamers, and am able to run two
> >>>>>> backup jobs simultaneously):
> >>>>>>
> >>>>>> 07-Dec 19:39 alexandria-dir: Fatal Error because: Bacula interrupted by
> >>>>>> signal 11: Segmentation violation
> >>>>>>
> >>>>>> Trace:
> >>>>>> warning: Unable to get location for thread creation breakpoint: generic error
> >>>>>> [New Thread 0x8119800 (sleeping)]
> >>>>>> [New Thread 0x8119c00 (runnable)]
> >>>>>> [New Thread 0x8119200 (runnable)]
> >>>>>> [New Thread 0x80fb800 (runnable)]
> >>>>>> [New Thread 0x80ef200 (runnable)]
> >>>>>> [New Thread 0x80ef000 (sleeping)]
> >>>>>> [New Thread 0x80d2e00 (runnable)]
> >>>>>> [New Thread 0x80d2c00 (LWP 100119)]
> >>>>>> [New Thread 0x80cc000 (sleeping)]
> >>>>>> [New LWP 100203]
> >>>>>> [Switching to LWP 100203]
> >>>>>> 0x2815e277 in pthread_testcancel () from /usr/lib/libpthread.so.2
> >>>>>> $1 = "alexandria-dir", '\0' <repeats 15 times>
> >>>>>> $2 = 0x80cf018 "bacula-dir"
> >>>>>> $3 = 0x80cf058 "/usr/local/bin/bacula-dir"
> >>>>>> $4 = "MySQL"
> >>>>>> $5 = 0x80afd90 "1.38.2 (20 November 2005)"
> >>>>>> $6 = 0x80a7925 "i386-unknown-freebsd6.0"
> >>>>>> $7 = 0x80a791d "freebsd"
> >>>>>> $8 = 0x80a7911 "6.0-RELEASE"
> >>>>>> #0  0x2815e277 in pthread_testcancel () from /usr/lib/libpthread.so.2
> >>>>>> #1  0x281576f3 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
> >>>>>> #2  0x080fb800 in ?? ()
> >>>>>>
> >>>>>> Thread 10 (LWP 100203):
> >>>>>> #0  0x2815e277 in pthread_testcancel () from /usr/lib/libpthread.so.2
> >>>>>> #1  0x281576f3 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
> >>>>>> #2  0x080fb800 in ?? ()
> >>>>>>
> >>>>>> Thread 9 (Thread 0x80cc000 (sleeping)):
> >>>>>> #0  0x28156e7f in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
> >>>>>> #1  0x2815098c in _nanosleep () from /usr/lib/libpthread.so.2
> >>>>>> #2  0x28150afa in nanosleep () from /usr/lib/libpthread.so.2
> >>>>>> #3  0x08081f64 in bmicrosleep (sec=60, usec=0) at bsys.c:54
> >>>>>> #4  0x08060ec6 in wait_for_next_job (one_shot_job_to_run=0x0) at scheduler.c:96
> >>>>>> #5  0x0804ca02 in main (argc=0, argv=0x1) at dird.c:248
> >>>>>>
> >>>>>> Thread 8 (Thread 0x80d2c00 (LWP 100119)):
> >>>>>> #0  0x2815e277 in pthread_testcancel () from /usr/lib/libpthread.so.2
> >>>>>> #1  0x28156dac in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
> >>>>>> #2  0x00000000 in ?? ()
> >>>>>>
> >>>>>> Thread 7 (Thread 0x80d2e00 (runnable)):
> >>>>>> #0  0x282910b3 in select () from /lib/libc.so.6
> >>>>>> #1  0x28147639 in select () from /usr/lib/libpthread.so.2
> >>>>>> #2  0x080850d5 in bnet_thread_server (addrs=0x80cf258,
> >>>>>> max_clients=10,
> >>>>>>   client_wq=0x80c1b60,
> >>>>>>   handle_client_request=0x8071c84 <handle_UA_client_request>)
> >>>>>>   at bnet_server.c:148
> >>>>>> #3  0x08071a86 in connect_thread (arg=0x80cf258) at ua_server.c:73
> >>>>>> #4  0x28149ab1 in pthread_create () from /usr/lib/libpthread.so.2
> >>>>>> #5  0x282ec45f in _ctx_start () from /lib/libc.so.6
> >>>>>>
> >>>>>> Thread 6 (Thread 0x80ef000 (sleeping)):
> >>>>>> #0  0x28156e7f in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
> >>>>>> #1  0x28157013 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
> >>>>>> #2  0x2815bdd9 in _pthread_cond_timedwait () from /usr/lib/libpthread.so.2
> >>>>>> #3  0x2815c342 in pthread_cond_timedwait () from /usr/lib/libpthread.so.2
> >>>>>> #4  0x080991d2 in watchdog_thread (arg=0x0) at watchdog.c:296
> >>>>>> #5  0x28149ab1 in pthread_create () from /usr/lib/libpthread.so.2
> >>>>>> #6  0x282ec45f in _ctx_start () from /lib/libc.so.6
> >>>>>>
> >>>>>> Thread 5 (Thread 0x80ef200 (runnable)):
> >>>>>> #0  0x28291833 in read () from /lib/libc.so.6
> >>>>>> #1  0x28147b32 in read () from /usr/lib/libpthread.so.2
> >>>>>> #2  0x08082a0c in read_nbytes (bsock=0x80d7718,
> >>>>>>   ptr=0xbf6fbf2c
> >>>>>> "0ô\016\b0ô\016\b\030ô\016\bh¿o¿î\034\a\b\030w\r\bÿÿÿÿ", nbytes=4)
> >>>>>> at bnet.c:73
> >>>>>> #3  0x0808378b in bnet_recv (bsock=0x80d7718) at bnet.c:194
> >>>>>> #4  0x08071cee in handle_UA_client_request (arg=0x80ef430) at
> >>>>>> ua_server.c:127
> >>>>>> #5  0x08099be5 in workq_server (arg=0x80c1b60) at workq.c:347
> >>>>>> #6  0x28149ab1 in pthread_create () from /usr/lib/libpthread.so.2
> >>>>>> #7  0x282ec45f in _ctx_start () from /lib/libc.so.6
> >>>>>>
> >>>>>> Thread 4 (Thread 0x80fb800 (runnable)):
> >>>>>> #0  0x28156e7f in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
> >>>>>> #1  0x2815f089 in __error () from /usr/lib/libpthread.so.2
> >>>>>> #2  0x2814c012 in sigaction () from /usr/lib/libpthread.so.2
> >>>>>> #3  0x0809334b in signal_handler (sig=11) at signal.c:159
> >>>>>> #4  0x2814c252 in sigaction () from /usr/lib/libpthread.so.2
> >>>>>> #5  0x2814d6ed in sigaction () from /usr/lib/libpthread.so.2
> >>>>>> #6  0x28155c15 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
> >>>>>> #7  0x28155c83 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
> >>>>>> #8  0x282ec45f in _ctx_start () from /lib/libc.so.6
> >>>>>> #9  0x00000000 in ?? ()
> >>>>>> #10 0xbf4f9860 in ?? ()
> >>>>>> #11 0xbf4f95a0 in ?? ()
> >>>>>> #12 0x00000000 in ?? ()
> >>>>>> #13 0x28155c40 in pthread_mutexattr_init () from /usr/lib/libpthread.so.2
> >>>>>> #14 0x280f1a6b in alloc_root ()
> >>>>>>  from /usr/local/lib/mysql/libmysqlclient_r.so.14
> >>>>>> #15 0x28104b4b in cli_read_rows ()
> >>>>>>  from /usr/local/lib/mysql/libmysqlclient_r.so.14
> >>>>>> #16 0x281068d6 in mysql_store_result ()
> >>>>>>  from /usr/local/lib/mysql/libmysqlclient_r.so.14
> >>>>>> #17 0x080784e5 in db_sql_query (mdb=0x80f4818,
> >>>>>>   query=0x812e428 "SELECT
> >>>>>> Path.Path,Filename.Name,FileIndex,JobId,LStat FROM
> >>>>>> File,Filename,Path WHERE File.JobId=13 AND
> >>>>>> Filename.FilenameId=File.FilenameId AND Path.PathId=File.PathId",
> >>>>>>   result_handler=0x8073654 <insert_tree_handler(void*, int,
> >>>>>> char**)>, ctx=0xbf4f9a60) at mysql.c:331
> >>>>>> #18 0x0806cb1e in build_directory_tree (ua=0x8119e18, rx=0xbf4f9df0)
> >>>>>>   at ua_restore.c:864
> >>>>>> #19 0x0806d6ef in restore_cmd (ua=0x8119e18, cmd=0x80fb628 "2")
> >>>>>>   at ua_restore.c:128
> >>>>>> #20 0x08061407 in do_a_command (ua=0x8119e18, cmd=0x80fb628 "2")
> >>>>>>   at ua_cmds.c:158
> >>>>>> #21 0x08071d34 in handle_UA_client_request (arg=0x8119e30) at
> >>>>>> ua_server.c:134
> >>>>>> #22 0x08099be5 in workq_server (arg=0x80c1b60) at workq.c:347
> >>>>>> #23 0x28149ab1 in pthread_create () from /usr/lib/libpthread.so.2
> >>>>>> #24 0x282ec45f in _ctx_start () from /lib/libc.so.6
> >>>>>>
> >>>>>> Thread 3 (Thread 0x8119200 (runnable)):
> >>>>>> #0  0x28291833 in read () from /lib/libc.so.6
> >>>>>> #1  0x28147b02 in read () from /usr/lib/libpthread.so.2
> >>>>>> #2  0x08119360 in ?? ()
> >>>>>> #3  0x080ae6f4 in base64_digits ()
> >>>>>> #4  0x00000004 in ?? ()
> >>>>>> #5  0xbf1f6a8c in ?? ()
> >>>>>> #6  0x08129018 in ?? ()
> >>>>>> #7  0x08129018 in ?? ()
> >>>>>> #8  0x00000000 in ?? ()
> >>>>>> #9  0xbf1f6a98 in ?? ()
> >>>>>> #10 0x0808378b in bnet_recv (bsock=0x8082a0c) at bnet.c:194
> >>>>>> /usr/local/etc/bacula/btraceback.gdb:10: Error in sourced command file:
> >>>>>> Previous frame inner to this frame (corrupt stack?)
> >>>>>> #0  0x2815e277 in pthread_testcancel () from /usr/lib/libpthread.so.2
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> -------------------------------------------------------
> >>>>>> This SF.net email is sponsored by: Splunk Inc. Do you grep through
> >>>>>> log
> >>>>>> files
> >>>>>> for problems?  Stop!  Download the new AJAX search engine that makes
> >>>>>> searching your log files as easy as surfing the  web.  DOWNLOAD
> >>>>>> SPLUNK!
> >>>>>> http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click
> >>>>>
> >>>>> _______________________________________________
> >>>>> Bacula-users mailing list
> >>>>> Bacula-users@lists.sourceforge.net
> >>>>> https://lists.sourceforge.net/lists/listinfo/bacula-users
> >>>
> >
>

-- 
Best regards,

Kern

  (">
  /\
  V_V

