Hey Luc,

Generally I wouldn't use the 'Allow Duplicate Jobs' / 'Cancel ... Duplicates' directives on Copy jobs - they already have a built-in duplicate-handling mechanism. Those directives are very useful for actual backup jobs, but I would not turn them on for Copy jobs.
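
Something along these lines should do - this is only a sketch of the Job resource you posted below, with the duplicate directives left out and the concurrency limit set back to 1 (keep your own names):

Job {
  Name = "Transfer to Tape"
  Type = Copy
  Selection Type = PoolUncopiedJobs
  Pool = File
  Client = bacula-main                 # still required, still ignored at runtime
  FileSet = BaculaSet
  Level = Full
  Messages = Standard
  Schedule = TransferToTapeSchedule
  Priority = 10
  Maximum Concurrent Jobs = 1
  # No Allow Duplicate Jobs / Cancel ... Duplicates directives here;
  # rely on the Copy job's built-in duplicate handling instead.
}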

Bryn

On 2015-05-04 12:10 AM, Luc Van der Veken wrote:

Hi all,

I seem to be suffering from a side effect of “Allow Duplicate Jobs = no” and the “Cancel [Queued | Lower Level] Duplicates” settings.

I make full / differential backups to disk over the weekend, and copy those to tape for off-site storage on Monday.

There’s only one copy job definition, with its corresponding schedule. When it ran, it used to create a separate new job for each job it had to copy; these were queued and executed one after the other because Maximum Concurrent Jobs = 1 on the tape device.
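
(That Maximum Concurrent Jobs = 1 sits in the Director’s Storage resource for the tape drive; the sketch below is only to show where it lives, with placeholder address, password and device names rather than the real ones.)

Storage {
  Name = Tape                        # the "Write Storage" shown in the job report below
  Address = tape-sd.example.com      # placeholder
  SDPort = 9103
  Password = "xxx"                   # placeholder
  Device = LTO-Drive                 # placeholder, must match the Device name in bacula-sd.conf
  Media Type = LTO
  Maximum Concurrent Jobs = 1        # so queued copy jobs run one after the other
}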

This morning, all those copy jobs except for the first one failed, saying they were duplicates.

Schedule {
  Name = "TransferToTapeSchedule"
  Run = Full mon at 07:00
}

Job {
  Name = "Transfer to Tape"
  ## used to migrate instead of copy until 2014-10-15
  #Type = Migrate
  Type = Copy
  Pool = File
  #Selection Type = PoolTime
  Selection Type = PoolUncopiedJobs
  Messages = Standard
  Client = bacula-main        # required and checked for validity, but ignored at runtime
  Level = full                # idem
  FileSet = BaculaSet         # ditto
  # DO NOT run at lower priority than backup jobs,
  # it has the adverse effect of holding them up until this job is finished.
  Priority = 10
  ## only for migration jobs
  #Purge Migration Job = yes  # purge migrated jobs after successful migration
  Schedule = TransferToTapeSchedule
  Maximum Concurrent Jobs = 5
  Allow Duplicate Jobs = no
  Cancel Lower Level Duplicates = yes
  Cancel Queued Duplicates = yes
}

Pool {
  Name = File
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 3 months
  Maximum Volume Bytes = 50G
  Maximum Volumes = 100
  LabelFormat = "FileStorage"
  Action On Purge = Truncate
  Storage = File
  Next Pool = Tape
  # data used to be left on disk for 1 week and then moved to tape (Migration Time = 1 week);
  # changed: now we copy to tape, which can be done right away
  Migration Time = 1 second
}

That “Maximum Concurrent Jobs = 5” in the job definition was probably copied in by accident (it should be 1), but I don’t think that is causing the problem.
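
In other words, that line in the Job resource above should simply read:

  Maximum Concurrent Jobs = 1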

The result: a whole bunch of errors like the one below, and only one job copied.

04-May 07:00 bacula-dir JobId 28916: Fatal error: JobId 28913 already running. Duplicate job not allowed.
04-May 07:00 bacula-dir JobId 28916: Copying using JobId=28884 Job=A2sMonitor.2015-05-01_20.05.00_21
04-May 07:00 bacula-dir JobId 28916: Bootstrap records written to /var/lib/bacula/bacula-dir.restore.245.bsr
04-May 07:00 bacula-dir JobId 28916: Error: Bacula bacula-dir 5.2.5 (26Jan12):
  Build OS:               x86_64-pc-linux-gnu ubuntu 12.04
  Prev Backup JobId:      28884
  Prev Backup Job:        A2sMonitor.2015-05-01_20.05.00_21
  New Backup JobId:       28917
  Current JobId:          28916
  Current Job:            TransfertoTape.2015-05-04_07.00.01_50
  Backup Level:           Full
  Client:                 bacula-main
  FileSet:                "BaculaSet" 2013-09-27 20:05:00
  Read Pool:              "File" (From Job resource)
  Read Storage:           "File" (From Pool resource)
  Write Pool:             "Tape" (From Job Pool's NextPool resource)
  Write Storage:          "Tape" (From Storage from Pool's NextPool resource)
  Catalog:                "MyCatalog" (From Client resource)
  Start time:             04-May-2015 07:00:01
  End time:               04-May-2015 07:00:01
  Elapsed time:           0 secs
  Priority:               10
  SD Files Written:       0
  SD Bytes Written:       0 (0 B)
  Rate:                   0.0 KB/s
  Volume name(s):
  Volume Session Id:      0
  Volume Session Time:    0
  Last Volume Bytes:      0 (0 B)
  SD Errors:              0
  SD termination status:
  Termination:            *** Copying Error ***

From: Bryn Hughes [mailto:li...@nashira.ca]
Sent: 30 April 2015 15:07
To: bacula-users@lists.sourceforge.net
Subject: Re: [Bacula-users] Same job started twice

These directives might also be useful to you:

  Allow Duplicate Jobs = no
  Cancel Lower Level Duplicates = yes
  Cancel Queued Duplicates = yes

Bryn

On 2015-04-30 02:57 AM, Luc Van der Veken wrote:

    So simple that I’m a bit embarrassed: a Maximum Concurrent Jobs
    setting in the Job resource itself should prevent it.

    I thought that setting was applicable to all kinds of resources
    except for Job resources themselves; I should have checked the
    documentation sooner.
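
    Concretely, something like this in the Job resource for that
    backup (everything except Maximum Concurrent Jobs = 1 is a
    placeholder here, not my real configuration):

        Job {
          Name = "NAS-Elvis"              # the job that was started twice (see the list below)
          Type = Backup
          Client = nas-backup-fd          # placeholder client name
          FileSet = "NAS-Elvis-Set"       # placeholder
          Schedule = "WeeklyCycle"        # placeholder
          Storage = File
          Pool = File
          Messages = Standard
          Maximum Concurrent Jobs = 1     # a second scheduled run now waits for the first one
        }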

    From: Luc Van der Veken [mailto:luc...@wimionline.com]
    Sent: 30 April 2015 9:09
    To: bacula-users@lists.sourceforge.net
    Subject: [Bacula-users] Same job started twice

    Hi all,

    Is it possible that, in version 5.2.5 (Ubuntu),

    1) An incremental job is started according to schedule before a
    previous full run of the same job has finished?

    2) A nasty side effect when that happens is that the incremental
    job is upgraded to Full because of “Prior failed job found in
    catalog. Upgrading to Full.”, even though there have been no errors?

    I seem to be in that situation now.

    The client has ‘Maximum Concurrent Jobs’ set to 3, because the
    same client is used for backing up different NFS-mounted shares as
    separate jobs. Most of those are small, except for one, and that
    is the one that has the problem.
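
    (Roughly like this in bacula-dir.conf; the name, address and
    password below are placeholders, not the real ones.)

        Client {
          Name = nas-backup-fd                # placeholder
          Address = nas-backup.example.com    # placeholder
          FDPort = 9102
          Catalog = MyCatalog
          Password = "xxx"                    # placeholder
          Maximum Concurrent Jobs = 3         # lets the small NFS-share jobs run in parallel
        }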

    The normal schedule is either full or differential on Friday
    night, incremental on Monday through Thursday, and nothing on
    Saturday or Sunday.
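
    (As a Schedule resource that is roughly the following; the
    resource name and the split between full and differential Fridays
    are illustrative, only the 20:06 start time is real.)

        Schedule {
          Name = "WeeklyCycle"                            # placeholder name
          Run = Level=Full 1st fri at 20:06               # full on, e.g., the first Friday
          Run = Level=Differential 2nd-5th fri at 20:06   # differential on the other Fridays
          Run = Level=Incremental mon-thu at 20:06        # incrementals Monday through Thursday
        }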

    Because many full jobs are scheduled for Friday and only a limited
    number run concurrently, that big job usually only really starts
    on Saturday morning.

    The first overrun of a Full run into the next scheduled run was
    caused not by the job itself taking too long, but by a copy job
    that was copying that job from disk to tape and had to wait too
    long for a new blank tape.

    From there on, I think each run took longer than 24 hours to
    complete because two scheduled runs of the same job were running
    concurrently every time.

    At least that’s what the director and catalog report.

    From Webacula:

    Information from DB Catalog: List of Running Jobs

    Id     Job Name   Status   Level  Errors  Client  Start Time (yy-mm-dd)
    28822  NAS-Elvis  Running  F      -       NAS     2015-04-29 11:29:48
    28851  NAS-Elvis  Running  F      -       NAS     2015-04-29 20:06:00

    Both are incremental jobs upgraded to Full because of a ‘previous
    error’ that never occurred.

    I just canceled the later one to give the other time to finish
    before it’s rescheduled again tonight at 20:06:00.

    Besides that, there must be something else I have to find. I don’t
    think it’s normal that a backup of 600 GB from one NFS share to
    disk on another NFS share takes more than 20 hours, as the last
    ‘normal’ run last Saturday did (the physical machine the job runs
    on is the SD itself, backing up one NFS share to another).


