Hi Arno,
I don't know whether you know the Bacula source code, so I am posting some 
parameters and details of my configuration that I think may cause 
this problem, or that may be misconfigured because I did not fully 
understand the manual:

Arno Lehmann wrote:
> Hi,
>
> 04.07.2007 17:40, Alfredo Marchini wrote:
>   
>> Hi,
>> The system and db logs don't tell me anything about this problem; it looks as if 
>> all the director processes or threads are locked at the same time.
>> If I restart only bacula-dir, without restarting bacula-sd and the 16 
>> bacula-fd, the system starts working fine again.
>>     
>
> It might be possible that the DIR is busy working on the catalog (like 
> pruning data) and just needs more time. You can check this using 
> 'mysqladmin processlist', for example.
>   

    OK, when it happens again I'll run this test too. But if Bacula prunes jobs 
and files only when all volumes are used and there are no more 
appendable volumes, that should not be my problem, because I have used only 
10 volumes of 50 GB and have another 8 volumes available that are not yet 
created.
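
To make the volume arithmetic above explicit, here is a minimal sketch using only the figures quoted in this thread (18 volumes of 50 GB, 10 of them used, a 1 TB partition). Treating 1 TB as 1000 GB is an assumption, and the function name is purely illustrative; this is not how Bacula itself accounts for space.

```python
# Figures quoted in this thread; 1 TB is taken as 1000 GB (an assumption).
VOLUME_SIZE_GB = 50
TOTAL_VOLUMES = 18
USED_VOLUMES = 10
PARTITION_GB = 1000


def pool_summary(total_volumes, used_volumes, volume_size_gb, partition_gb):
    """Return space figures for a pool of fixed-size disk volumes."""
    free_volumes = total_volumes - used_volumes
    max_pool_gb = total_volumes * volume_size_gb
    return {
        "free_volumes": free_volumes,               # volumes not yet created
        "used_gb": used_volumes * volume_size_gb,   # upper bound on space in use
        "max_pool_gb": max_pool_gb,                 # space if every volume fills up
        "fits_on_partition": max_pool_gb <= partition_gb,
    }


summary = pool_summary(TOTAL_VOLUMES, USED_VOLUMES, VOLUME_SIZE_GB, PARTITION_GB)
print(summary)
```

With these numbers the pool still has 8 volumes to create, so it should not yet need to recycle anything.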
>   
>> I have already restarted bacula-dir and everything works fine now (I back up 16 
>> servers and cannot take it offline, or someone will kill me this 
>> evening), so I won't be able to reproduce the error for about 10-15 days.
>> The last time I had this problem I used top and didn't find 
>> anything strange.
>>     
>
> Ok, so let's assume the hardware, OS and relevant applications are 
> running ok.
>
>   
    Yes, I think that is the right assumption.
>> But the test with the time command will be the first thing I try when it happens again.
>> I don't think the problem is with the database: when I connect to 
>> the bacula database with the mysql command line, it works fine and quickly.
>>     
>
> Bacula uses its own, internal locking, so you won't necessarily notice 
> anything from outside of Bacula.
>
>   
Ok
>> One thing:
>> I've set file and job retention to 14 days for most of my fd.
>> One or two fd are set to 7 days for both file and job retention.
>> The volume retention period is always set to 14 days.
>>
>> I've got sufficient disk space
>>     
>
> Sounds interesting... I never have sufficient disk space :-)
>
>   
>> to use only one pool for 7 and 14 days 
>> retention client backup.
>>
>> Another thing is the maximum concurrent jobs :
>> On director = 30
>> On storage side director configuration file = 60
>> On storage = 60
>>     
>
> Quite a lot, I think. Running up to 30 jobs in parallel might load 
> your backup server beyond its reasonable working maximum, but that 
> depends on your hardware, software, and requirements.
>
>   

I've set these values because:
director = 30, because I have 16 fd that could connect concurrently (in practice 
they never all do), plus
one job per fd to ask for its status (16 x 2 = 32, rounded to 30).
storage = 60, because when 16 fd connect concurrently to the storage 
I also have 16 connections from the director to the storage (when jobs 
start).
I thought that the not-responding problem was caused by these parameters, so 
I set high values, because I don't know how (at the development level) Bacula 
handles TCP connections (I thought the problem was caused by 
too few concurrent threads).
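
The reasoning above can be written out as a small sketch. The doubling rules are only my own estimate as described in this message, not a formula from the Bacula manual, and the function names are made up for illustration.

```python
# Estimate of concurrent jobs/connections, following the reasoning in this
# message. These rules are the author's assumption, not documented Bacula
# behaviour.

def director_job_estimate(num_clients, jobs_per_client=2):
    """One backup job plus one status request per File Daemon."""
    return num_clients * jobs_per_client


def storage_connection_estimate(num_clients):
    """One FD-to-SD connection plus one DIR-to-SD connection per running job."""
    return num_clients * 2


clients = 16
dir_estimate = director_job_estimate(clients)
sd_estimate = storage_connection_estimate(clients)
print(dir_estimate, sd_estimate)  # 32 32
```

By this estimate the configured director value of 30 is slightly below 32, while the storage value of 60 is well above it.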

Another thing:
On all fd I've set the messages resource to point to the director's messages.
Example:
on the director named bacula-dir I've created a messages resource named 
bacula-dir-messages
on all fd I've set a messages resource named bacula-dir-message that points to 
director bacula-dir

One last thing, and then I have nothing more:

If I go to the working directory of bacula-dir when it is not responding, I 
find mail files waiting to be sent to the operators that are 
2-3 days old, as if bacula-dir is blocked and cannot send the e-mail 
(when it is working fine the mail is correctly sent to all the operators).
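
A check like the one described above could be scripted roughly as follows. This is only a sketch: the helper name is hypothetical, the directory must be passed in (it is whatever the director's working directory is on your system), and it simply lists every regular file older than a given age without knowing which ones are Bacula mail files.

```python
import sys
import time
from pathlib import Path


def stale_files(directory, min_age_days, now=None):
    """Return sorted names of regular files last modified at least min_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # 86400 seconds per day
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.stat().st_mtime <= cutoff
    )


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python stale_files.py <working-directory> <days>
    for name in stale_files(sys.argv[1], float(sys.argv[2])):
        print(name)
```

If this lists files that are 2 or more days old while jobs should have been running, the director has probably been stuck for at least that long.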

I use a postfix SMTP server configured for local mail, and bsmtp to send email 
to an SMTP server
installed in my LAN on another Linux server.

Thank you
Alfredo
>> Can these parameters cause this problem?
>>     
>
> Unlikely.
>
>   
>> These are the only parameters that I'm not sure I understood when 
>> I read the manual.
>> The others, I think, are correctly configured.
>>     
>
> If you can reproduce the problem, issue a 'setdebug level=400 trace=1 
> dir' shortly before you expect the problem and look at the resulting 
> trace file... it will have detailed information about what the DIR is 
> doing.
>
> Arno
>
>   

>> Thank you again
>> bye
>>
>>
>> Arno Lehmann wrote:
>>     
>>> Hi,
>>>
>>> 04.07.2007 16:51, Alfredo Marchini wrote:
>>>   
>>>       
>>>> Hi all,
>>>> I've installed, with rpm, on Linux Fedora Core 6, the bacula-dir and 
>>>> bacula-sd daemons.
>>>> On this server there is also a MySQL 5.0.x server that talks correctly 
>>>> with the bacula daemons.
>>>> There is also a RAID-5 partition of 1 TB where I save my backups.
>>>> The server backs up the 16 bacula-fd that I have in my LAN.
>>>> I've configured 1 pool with 18 volumes of 50 GB, with a retention period 
>>>> of 14 days and autoprune and recycle set to yes.
>>>> All works fine for some days.
>>>> Today (after some days, and it is not the first time) I noticed that 
>>>> bacula doesn't run the scheduled backup jobs.
>>>> So I used bconsole to ask for the status of the director, but the director is locked: 
>>>> it gives me no answer and no error.
>>>> I need to press CTRL+C to quit bconsole. When I retry, asking for the status of 
>>>> the storage, it gives me no answer and no error either.
>>>> I get the same behaviour if I ask for the status of any of the 16 file daemons that I've 
>>>> configured in my director.
>>>> I don't know why.
>>>> I don't know why.
>>>>     
>>>>         
>>> We'll try to find that out...
>>>
>>> It might be a locked-up database, for example. In that case, try a 
>>> command that doesn't require catalog access, like time.
>>>
>>> If that doesn't reply, I'd recommend attaching strace to the DIR 
>>> processes to see what they're doing (unless you're more comfortable 
>>> with gdb...)
>>>
>>> Also, use df and free to verify the necessary system resources are 
>>> available (memory and disk space), and check with ps or top and vmstat 
>>> if any process is using extraordinary amounts of CPU, ram, or I/O 
>>> capacity.
>>>
>>> If 'time' gets you a reply, but anything requiring catalog access 
>>> doesn't, check for database problems in the database logs or the 
>>> system logs.
>>>
>>> The system logs might tell you about problems anyway, so I recommend 
>>> having a look at them anyway.
>>>
>>>   
>>>       
>>>> If you need the director, storage or file-daemon configuration I need to 
>>>> prepare them, but that is not a problem.
>>>>     
>>>>         
>>> Not yet...
>>>
>>> Arno
>>>
>>>   
>>>       
>>     
>
>   


-- 
Alfredo Marchini
Consulente IT
P.IVA: 05649240487
CF: MRCLRD81R07D612B
Via Imbriani, 66
50019 Sesto Fiorentino (FI)
Tel. +39 393 9566375
E-Mail: [EMAIL PROTECTED]



_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
