Thanks for the response, Bill.  I'll look into some of the points you
mentioned.

Best regards,
David

 

-----Original Message-----
From: Bill Moran [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, September 20, 2006 6:53 AM
To: David Hatcher
Cc: Bacula-users@lists.sourceforge.net
Subject: Re: [Bacula-users] 25-hour backup job

In response to "David Hatcher" <[EMAIL PROTECTED]>:
> Hi folks,
> 
> New user here. I find the following peculiar and wonder if "big" 
> backup jobs take longer to complete than running several consecutive 
> smaller jobs? Here's my story...
> 
> Server = bacula-fd Version: 1.38.9, OS=Linux Fedora Core 4
> Client = labssrv-fd Version: 1.38.4, OS=Windows NT 4.0
> 
> For my first full backup on my client (labssrv), I set up the
> bacula-dir.conf file (see attached) to back up everything on the C, F,
> G, and H drives.  Below is the summary; in particular, the job took 13
> hours to back up 173 GB and resulted in 430 non-fatal FD errors, most
> of which were permission errors.  This sounds fairly reasonable to me.
> 
>   JobId:                  1150
>   Job:                    Labssrv.2006-09-08_19.00.03
>   Backup Level:           Full
>   Client:                 "labssrv-fd" Windows NT 4.0,MVS,NT 4.0.1381
>   FileSet:                "Labssrv FileSet" 2006-09-08 22:52:20
>   Pool:                   "Weekly"
>   Storage:                "SDLT"
>   Scheduled time:         08-Sep-2006 19:00:02
>   Start time:             08-Sep-2006 22:52:23
>   End time:               09-Sep-2006 12:39:33
>   Elapsed time:           13 hours 47 mins 10 secs
>   Priority:               11
>   FD Files Written:       296,222
>   SD Files Written:       296,222
>   FD Bytes Written:       173,119,126,494 (173.1 GB)
>   SD Bytes Written:       173,182,590,769 (173.1 GB)
>   Rate:                   3488.2 KB/s
>   Software Compression:   None
>   Volume name(s):         000103|000077
>   Volume Session Id:      5
>   Volume Session Time:    1157757233
>   Last Volume Bytes:      25,817,913,037 (25.81 GB)
>   Non-fatal FD errors:    430
>   SD Errors:              0
>   FD termination status:  OK
>   SD termination status:  OK
>   Termination:            Backup OK -- with warnings
> 
> 
> Then I went through the details of the permission errors and granted
> access to the respective files and directories on my labssrv machine.
> In addition, I excluded some old archived data that doesn't need to be
> backed up.  Following is the new summary.  The job took 25 hours to
> back up 188 GB and resulted in 0 non-fatal FD errors.  I'm surprised it
> took 12 additional hours to back up 15 GB of data, as if there's an
> exponential problem somewhere.  This doesn't make sense to me.  I'm
> thinking about splitting this job into two separate jobs (job one for
> drives C and F, job two for drives G and H) to see if it will complete
> in under 25 hours.  I have other clients that I back up that run
> fairly quick backup jobs, although the backups are typically less than
> 100 GB.
> 
>   JobId:                  1225
>   Job:                    Labssrv.2006-09-15_19.00.03
>   Backup Level:           Full
>   Client:                 "labssrv-fd" Windows NT 4.0,MVS,NT 4.0.1381
>   FileSet:                "Labssrv FileSet" 2006-09-15 22:50:24
>   Pool:                   "Weekly"
>   Storage:                "SDLT"
>   Scheduled time:         15-Sep-2006 19:00:02
>   Start time:             15-Sep-2006 22:50:27
>   End time:               16-Sep-2006 23:34:34
>   Elapsed time:           1 day 44 mins 7 secs
>   Priority:               11
>   FD Files Written:       309,962
>   SD Files Written:       309,962
>   FD Bytes Written:       187,971,236,795 (187.9 GB)
>   SD Bytes Written:       188,037,056,495 (188.0 GB)
>   Rate:                   2110.9 KB/s
>   Software Compression:   None
>   Volume name(s):         000082|000100
>   Volume Session Id:      5
>   Volume Session Time:    1158352255
>   Last Volume Bytes:      39,196,049,239 (39.19 GB)
>   Non-fatal FD errors:    0
>   SD Errors:              0
>   FD termination status:  OK
>   SD termination status:  OK
>   Termination:            Backup OK
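
As a quick sanity check on the two summaries above, here is a minimal
Python sketch that recomputes the average rate of each job from the
"FD Bytes Written" and "Elapsed time" lines.  It assumes the "Rate"
line is FD bytes divided by elapsed seconds, with 1 KB = 1000 bytes;
that assumption reproduces both reported figures.  The function and
variable names are illustrative and are not part of Bacula.

```python
def elapsed_seconds(days=0, hours=0, mins=0, secs=0):
    """Convert an 'Elapsed time' line into seconds."""
    return ((days * 24 + hours) * 60 + mins) * 60 + secs


def rate_kb_per_s(fd_bytes, seconds):
    """Average throughput, assuming 1 KB = 1000 bytes."""
    return fd_bytes / seconds / 1000


# Figures copied from the two job summaries (JobId 1150 and 1225).
job_1150 = {
    "bytes": 173_119_126_494,
    "seconds": elapsed_seconds(hours=13, mins=47, secs=10),
}
job_1225 = {
    "bytes": 187_971_236_795,
    "seconds": elapsed_seconds(days=1, mins=44, secs=7),
}

rate_1150 = rate_kb_per_s(job_1150["bytes"], job_1150["seconds"])
rate_1225 = rate_kb_per_s(job_1225["bytes"], job_1225["seconds"])

# Rate implied by the differences alone: extra bytes over extra time.
# This is only a rough indicator, because the second FileSet also
# excluded some data that the first one backed up.
extra_bytes = job_1225["bytes"] - job_1150["bytes"]
extra_seconds = job_1225["seconds"] - job_1150["seconds"]
marginal_rate = rate_kb_per_s(extra_bytes, extra_seconds)

print(f"JobId 1150: {rate_1150:.1f} KB/s")
print(f"JobId 1225: {rate_1225:.1f} KB/s")
print(f"Extra data: {marginal_rate:.1f} KB/s")
```

If the assumption holds, the rate implied by the extra data is far
below either job's average, which would be more consistent with a long
stall or a slow region somewhere in the job than with a steady
per-byte cost.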

Lots of information missing here -- difficult to help much without some
additional diagnosis.

What DB are you using?  Is it possible that you've pushed the DB server
past some limit where performance starts to degrade, i.e. created enough
records that inserts have become expensive?

If you monitor CPU and IO usage during the backup, where is the holdup,
and which program (DB server, director, storage daemon, or file daemon)
is using that resource?

Is it possible that you hit something on the DLT tape that caused it to
have to rewind or do a bunch of seeking or something else that put the
whole job in wait mode for a long time?

What is the nature of the data in that directory?  Is it possible that
the FD is contending for access to those files and is spending a lot of
time waiting for them to free up when it tries to grab them?

Mostly guesses here, but hopefully something will be helpful.

--
Bill Moran
Collaborative Fusion Inc.
