Hello Karl!

On Wednesday 16 March 2005 01:10, Karl Cunningham wrote:
> If you still have problems after you do the other things, I would bump the
> maximum concurrent jobs up to at least 25 or so.  I'm not ready to rule it
> out yet.

If you really think this could be the problem, I will increase the maximum
concurrent jobs to 25. I see no disadvantage in doing so.
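
For reference, this is roughly what the change would look like in
bacula-dir.conf. It is only a sketch: the resource names are taken from the
console output quoted below, all other directives are left out, and the same
setting may also need raising in the Job and Client resources and in the
storage daemon's own configuration.

```
# bacula-dir.conf (fragment, other directives omitted)
Director {
  Name = SCL01M01-dir
  Maximum Concurrent Jobs = 25   # currently 20
}

Storage {
  Name = EZ17
  Maximum Concurrent Jobs = 25   # currently 20
}
```

The director has to be restarted (or its configuration reloaded) before the
new values take effect.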

Best Regards,
Tim

> --On Wednesday, March 16, 2005 12:38 AM +0100 Tim Oberfoell
> <[EMAIL PROTECTED]> wrote:
> > Hi Karl!
> >
> > On Tuesday 15 March 2005 20:30, you wrote:
> >> Tim --
> >>
> >> An intermittent problem like this can be tough to find.  I don't think
> >> there should be a problem with lots of jobs starting at the same time (I
> >> do that here and it's no problem), but are you sure you have the
> >> 'maximum concurrent jobs' setting high enough?  How about setting it to
> >> something considerably higher than what's needed, say 30 if you're
> >> running 19 jobs at once.  I think the number of concurrent jobs in the
> >> director resource has to be higher than you might expect because any
> >> consoles occupy connections too.  I would set all of them higher though
> >> as a test.
> >
> > Yes, I'm sure the maximum concurrent jobs settings are correct, because
> > when the director does not hang, all jobs run fine at the same time. The
> > values are set to 20, and if that were the problem, a "status Director"
> > would show it (waiting jobs, for example).
> >
> >> Another thing to look at is if all the jobs you are starting
> >> simultaneously have the same priority.  If not, consider starting them a
> >> minute apart. It's possible that a job is blocked by one with a
> >> different priority, and if they're all started at the same time you
> >> really don't have control over what the priority is of the job that wins
> >> the first-come-first-served race.
> >
> > All the jobs that run at the same time (19, or 5 in my new configuration)
> > have the same priority. In addition, they all use the same storage, which
> > is also defined to handle concurrent jobs, so there should be no blocking
> > problem.
> >
> > I hope that resolving the mysql problem mentioned in my mail a few
> > minutes ago will keep the director from hanging.
> >
> > Regards,
> > Tim
> >
> >> --On Tuesday, March 15, 2005 6:49 PM +0100 Tim Oberfoell
> >> <[EMAIL PROTECTED]> wrote:
> >> > Hello Karl!
> >> >
> >> > On Tuesday 15 March 2005 17:47, you wrote:
> >> >> Tim --
> >> >>
> >> >> I haven't seen this complaint in recent times from other users so I
> >> >> don't think it's a very common problem.  What I conclude from that is
> >> >> there is something uncommon about your situation that is causing it
> >> >> to hang. Unless someone else has seen the same problem the rest of us
> >> >> with working systems have a hard time figuring out what might be
> >> >> wrong with yours.
> >> >
> >> > Yes, I agree with that. So I suppose it is not really a bacula
> >> > problem.
> >> >
> >> >> I assume your system did work at one time.  One approach is to try to
> >> >> backtrack to where it does work and see what broke it.
> >> >
> >> > The problem is that it is not really reproducible. Sometimes it works
> >> > for two or three days with 28 jobs per night, and then it hangs after
> >> > I start a single job manually.
> >> >
> >> > It worked fine for three weeks with mysql 3.23 and bacula 1.36.1, but
> >> > then suddenly stopped working. There were no updates or other changes
> >> > during this time.
> >> >
> >> >> Otherwise, I would suggest the old "divide and conquer" approach to
> >> >> troubleshooting.  Start reducing the size of the backup or try to back
> >> >> up a different client, or something.  Try to back up a small test
> >> >> directory from the server itself, as suggested in the manual.  Find
> >> >> something that DOES work.  If you find something that does work, then
> >> >> try to close the gap between what does work and what doesn't: Try
> >> >> something halfway between and see if that works.  Keep dividing the
> >> >> gap between what works and what doesn't.  Try to get to a point where
> >> >> there is only a single configuration difference between what works
> >> >> and what doesn't. When you've narrowed it down like that you will
> >> >> probably know enough about the problem to fix it, or at least have a
> >> >> good start at it.
> >> >
> >> > Yes, I'm currently doing what you described above. Last weekend I
> >> > converted all fileset definitions from the old to the new notation
> >> > and checked all entries of the configuration file, but that did not
> >> > fix the problem.
> >> >
> >> > Today I had another idea. Every night there are two runs, each with
> >> > 19 jobs starting at the same time (first run at 1:00, second run at
> >> > 4:00), and the problem occurs whenever one or more jobs try to start.
> >> > Because of data spooling I don't think this is really a problem for
> >> > bacula, but maybe the mysql database cannot handle requests for 19
> >> > jobs at the same time? So I have now scheduled four runs of 5 jobs
> >> > each, offset by 20 minutes (two times per night). We'll see what
> >> > happens tonight.
> >> >
> >> >> Hope this helps.
> >> >
> >> > Yes, thanks a lot for your answer.
> >> >
> >> > Best Regards,
> >> > Tim
> >> >
> >> >> --On Tuesday, March 15, 2005 2:28 AM +0100 Tim Oberfoell
> >> >> <[EMAIL PROTECTED]> wrote:
> >> >> > Hello!
> >> >> >
> >> >> > It's me again and I still have the same problem. After getting the
> >> >> > attached error messages I suspected a mysql problem, updated from
> >> >> > version 3.23 to 4.1, deleted the complete bacula database, and set
> >> >> > it up again. But the problem remains.
> >> >> >
> >> >> > I really need help, because the backup hangs nearly every night.
> >> >> >
> >> >> > Regards,
> >> >> > Tim
> >> >> >
> >> >> > On Sunday 06 March 2005 17:38, Tim Oberfoell wrote:
> >> >> >> Hello!
> >> >> >>
> >> >> >> I've a little problem with the director. The director has not
> >> >> >> executed our nightly full backup and I'm wondering why. The dir
> >> >> >> seems to run (a pid is given) but is not reachable by the console
> >> >> >> and is not doing anything.
> >> >> >>
> >> >> >> After restarting the dir I tried to start the missed jobs by
> >> >> >> myself, but the "run" command is not executed completely, because
> >> >> >> the dir hangs again.
> >> >> >>
> >> >> >> Here is what I've done in the console:
> >> >> >> -----------------------------------------------------------
> >> >> >> SCL01M01:/etc/bacula # bconsole
> >> >> >> Connecting to Director SCL01M01:9101
> >> >> >> 1000 OK: SCL01M01-dir Version: 1.36.2 (28 February 2005)
> >> >> >> Enter a period to cancel a command.
> >> >> >> *run
> >> >> >> Using default Catalog name=MyCatalog DB=bacula
> >> >> >> A job name must be specified.
> >> >> >> The defined Job resources are:
> >> >> >>      1: EjectTapeAfterJob
> >> >> >>      2: SED_SFILE-TAPE
> >> >> >>      3: SEDSFILE-HD
> >> >> >>      4: SCL01M01-HD
> >> >> >>      5: SCL01V11-HD
> >> >> >>      6: SCL01V11-TAPE
> >> >> >>      7: SCL01M01-TAPE
> >> >> >>      8: SCL01N01-HD
> >> >> >>      9: SCL01N01-TAPE
> >> >> >>     10: SCL01N02-HD
> >> >> >>     11: SCL01N02-TAPE
> >> >> >>     12: SCL01V02-HD
> >> >> >>     13: SCL01V02-TAPE
> >> >> >>     14: SRAS01-HD
> >> >> >>     15: SRAS01-TAPE
> >> >> >>     16: SCL01V09-HD
> >> >> >>     17: SCL01V09-TAPE
> >> >> >>     18: SNOTES01-HD
> >> >> >>     19: SNOTES01-TAPE
> >> >> >>     20: SRAS02-HD
> >> >> >>     21: SRAS02-TAPE
> >> >> >>     22: BASTION01-HD
> >> >> >>     23: BASTION01-TAPE
> >> >> >>     24: SCL01V08-HD
> >> >> >>     25: SCL01V08-TAPE
> >> >> >>     26: SCL01V10-HD
> >> >> >>     27: SCL01V10-TAPE
> >> >> >>     28: SCL01V12-HD
> >> >> >>     29: SCL01V12-TAPE
> >> >> >>     30: SFAX01-HD
> >> >> >>     31: SFAX01-TAPE
> >> >> >>     32: SCL01V03-HD
> >> >> >>     33: SCL01V03-TAPE
> >> >> >>     34: SCL01V05-HD
> >> >> >>     35: SCL01V05-TAPE
> >> >> >>     36: SCL01V13-HD
> >> >> >>     37: SCL01V13-TAPE
> >> >> >>     38: SCL01V14-HD
> >> >> >>     39: SCL01V14-TAPE
> >> >> >>     40: BackupCatalog
> >> >> >>     41: BackupCatalog-TAPE
> >> >> >>     42: RestoreFiles
> >> >> >> Select Job resource (1-42): 7
> >> >> >> Run Backup job
> >> >> >> JobName:  SCL01M01-TAPE
> >> >> >> FileSet:  Full Set
> >> >> >> Level:    Incremental
> >> >> >> Client:   SCL01M01-fd
> >> >> >> Storage:  EZ17
> >> >> >> Pool:     TapeDailyDiffPool
> >> >> >> When:     2005-03-06 15:46:12
> >> >> >> Priority: 10
> >> >> >> OK to run? (yes/mod/no): m
> >> >> >> Parameters to modify:
> >> >> >>      1: Level
> >> >> >>      2: Storage
> >> >> >>      3: Job
> >> >> >>      4: FileSet
> >> >> >>      5: Client
> >> >> >>      6: When
> >> >> >>      7: Priority
> >> >> >>      8: Pool
> >> >> >> Select parameter to modify (1-8): 8
> >> >> >> The defined Pool resources are:
> >> >> >>      1: Default
> >> >> >>      2: DiskIncPool
> >> >> >>      3: DiskFullPool
> >> >> >>      4: TapeDailyDiffPool
> >> >> >>      5: TapeWeeklyFullPool
> >> >> >>      6: TapeMonthlyFullPool
> >> >> >> Select Pool resource (1-6): 6
> >> >> >> Run Backup job
> >> >> >> JobName:  SCL01M01-TAPE
> >> >> >> FileSet:  Full Set
> >> >> >> Level:    Incremental
> >> >> >> Client:   SCL01M01-fd
> >> >> >> Storage:  EZ17
> >> >> >> Pool:     TapeMonthlyFullPool
> >> >> >> When:     2005-03-06 15:46:12
> >> >> >> Priority: 10
> >> >> >> OK to run? (yes/mod/no): m
> >> >> >> Parameters to modify:
> >> >> >>      1: Level
> >> >> >>      2: Storage
> >> >> >>      3: Job
> >> >> >>      4: FileSet
> >> >> >>      5: Client
> >> >> >>      6: When
> >> >> >>      7: Priority
> >> >> >>      8: Pool
> >> >> >> Select parameter to modify (1-8): 1
> >> >> >> Levels:
> >> >> >>      1: Base
> >> >> >>      2: Full
> >> >> >>      3: Incremental
> >> >> >>      4: Differential
> >> >> >>      5: Since
> >> >> >> Select level (1-5): 2
> >> >> >> Run Backup job
> >> >> >> JobName:  SCL01M01-TAPE
> >> >> >> FileSet:  Full Set
> >> >> >> Level:    Full
> >> >> >> Client:   SCL01M01-fd
> >> >> >> Storage:  EZ17
> >> >> >> Pool:     TapeMonthlyFullPool
> >> >> >> When:     2005-03-06 15:46:12
> >> >> >> Priority: 10
> >> >> >> OK to run? (yes/mod/no): yes
> >> >> >> !!!!!!!!!!!!!!!!!!!!(Here it hangs, directly after pressing
> >> >> >> enter)!!!!!!!!!!!!!!!!!!!!
> >> >> >> -----------------------------------------------------------
> >> >> >>
> >> >> >>
> >> >> >> Here is an excerpt of the output of "bacula-dir -f -c bacula-dir.conf
> >> >> >> -d1000":
> >> >> >> -----------------------------------------------------------
> >> >> >> SCL01M01-dir: scan.c:138 Next arg=run
> >> >> >> SCL01M01-dir: scan.c:167 End arg=run next=
> >> >> >> SCL01M01-dir: scan.c:138 Next arg=
> >> >> >> SCL01M01-dir: scan.c:167 End arg= next=
> >> >> >> SCL01M01-dir: ua_cmds.c:150 Command: run
> >> >> >> SCL01M01-dir: ua_cmds.c:2004 Open database
> >> >> >> SCL01M01-dir: mysql.c:81 db_open first time
> >> >> >> SCL01M01-dir: mem_pool.c:111 sm_get_pool_memory reuse 80cdf58 to mysql.c:97
> >> >> >> SCL01M01-dir: mem_pool.c:111 sm_get_pool_memory reuse 80c0fb0 to mysql.c:99
> >> >> >> SCL01M01-dir: mem_pool.c:127 sm_get_pool_memory give 80d5130 to mysql.c:100
> >> >> >> SCL01M01-dir: mem_pool.c:127 sm_get_pool_memory give 80d5260 to mysql.c:103
> >> >> >> SCL01M01-dir: mem_pool.c:127 sm_get_pool_memory give 80d5390 to mysql.c:104
> >> >> >> SCL01M01-dir: mem_pool.c:127 sm_get_pool_memory give 80d54c0 to mysql.c:105
> >> >> >> SCL01M01-dir: mysql.c:141 mysql_init done
> >> >> >> SCL01M01-dir: mysql.c:161 mysql_real_connect done
> >> >> >> SCL01M01-dir: mysql.c:163 db_user=bacula db_name=bacula db_password=
> >> >> >> SCL01M01-dir: sql.c:55 int_handler starts with row pointing at 80db6c8
> >> >> >> SCL01M01-dir: sql.c:58 int_handler finds '8'
> >> >> >> SCL01M01-dir: sql.c:64 int_handler finishes
> >> >> >> SCL01M01-dir: ua_cmds.c:2019 DB bacula opened
> >> >> >> SCL01M01-dir: ua_run.c:269 Done scan.
> >> >> >> SCL01M01-dir: ua_run.c:279 Using catalog=(null)
> >> >> >> SCL01M01-dir: ua_run.c:322 Using storage=EZ17
> >> >> >> SCL01M01-dir: ua_run.c:342 Using pool
> >> >> >> SCL01M01-dir: ua_run.c:362 Using client=SCL01M01-fd
> >> >> >> SCL01M01-dir: mem_pool.c:127 sm_get_pool_memory give 80d7d08 to jcr.c:202
> >> >> >> SCL01M01-dir: mem_pool.c:127 sm_get_pool_memory give 80d8090 to jcr.c:204
> >> >> >> SCL01M01-dir: mem_pool.c:127 sm_get_pool_memory give 80d82c0 to job.c:777
> >> >> >> SCL01M01-dir: ua_run.c:481 JobType=B
> >> >> >> SCL01M01-dir: watchdog.c:286 pthread_cond_timedwait 30
> >> >> >> SCL01M01-dir: ua_run.c:481 JobType=B
> >> >> >> SCL01M01-dir: ua_run.c:481 JobType=B
> >> >> >> SCL01M01-dir: ua_run.c:851 Calling run_job job=80c2200
> >> >> >> SCL01M01-dir: message.c:246 Copy message resource 0x80cc080 to 0x80d8480
> >> >> >> SCL01M01-dir: job.c:108 Open database
> >> >> >> SCL01M01-dir: mysql.c:74 DB REopen 1 bacula
> >> >> >> SCL01M01-dir: job.c:121 DB opened
> >> >> >> -----------------------------------------------------------
> >> >> >>
> >> >> >>
> >> >> >> I hope somebody is able to help me.
> >> >> >>
> >> >> >> Best Regards,
> >> >> >> Tim
> >> >> >>
> >> >> >>
> >> >> >> -------------------------------------------------------
> >> >> >> SF email is sponsored by - The IT Product Guide
> >> >> >> Read honest & candid reviews on hundreds of IT Products from real
> >> >> >> users. Discover which products truly live up to the hype. Start
> >> >> >> reading now.
> >> >> >> http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
> >> >> >> _______________________________________________
> >> >> >> Bacula-users mailing list
> >> >> >> Bacula-users@lists.sourceforge.net
> >> >> >> https://lists.sourceforge.net/lists/listinfo/bacula-users

