
In the meantime I have upgraded to 9.0.5, but the problem is still there.

These are the job status lines sent via email after the job is canceled:

16-Nov 21:05 troll-dir JobId 777520: Start Backup JobId 777520, 
16-Nov 21:05 troll-dir JobId 777520: Using Device "FileStorage" to write.
16-Nov 21:05 troll-dir JobId 777520: Sending Accurate information to the FD.
16-Nov 21:11 troll-sd JobId 777520: Spooling data ...
16-Nov 21:11 user1-fd JobId 777520:      /var/lib/nfs/rpc_pipefs is a different filesystem. Will not descend from / into it.
17-Nov 09:05 troll-sd JobId 777520: Fatal error: append.c:184 Error reading data header from FD. n=-2 msglen=0 ERR=Interrupted system call
17-Nov 09:05 troll-dir JobId 777520: Fatal error: Max run time exceeded. Job 
17-Nov 09:05 troll-dir JobId 777520: Bacula troll-dir 9.0.5 (02Nov17):

The storage status shows that job Backup-user1 is despooling, but the job has 
actually been canceled. The despooling never ends, so the FileStorage device 
stays BLOCKED forever (only restarting Bacula helps).

*status storage=File
Connecting to Storage daemon File at troll.obvsg.at:9103

troll-sd Version: 9.0.5 (02 November 2017) x86_64-pc-linux-gnu redhat 
Daemon started 15-Nov-17 09:51. Jobs: run=357, running=1.
 Heap: heap=307,200 smbytes=1,349,350 max_bytes=14,454,879 bufs=279 
 Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0,0 newbsr=0
 Res: ndevices=4 nautochgr=1

Running Jobs:
Writing: Incremental Backup job Backup-user1 JobId=777520 Volume=""
    pool="DiskBackup" device="FileStorage" (/data/bacula/files)
    spooling=0 despooling=1 despool_wait=0
    Files=56,003 Bytes=2,504,128,940 AveBytes/sec=56,423 LastBytes/sec=7,513
    FDReadSeqNo=526,592 in_msg=400588 out_msg=6 fd=37
Reading: Full Copy job CopyDiskToTape JobId=777552 Volume=""
    pool="DiskBackup" device="FileStorage" (/data/bacula/files) newbsr=0
    Files=0 Bytes=0 AveBytes/sec=0 LastBytes/sec=0
    FDSocket closed

Jobs waiting to reserve a drive:
   3602 JobId=777552 File device "FileStorage" (/data/bacula/files) is busy (already reading/writing). read=0, writers=1 reserved=0

Terminated Jobs:
 JobId  Level      Files    Bytes   Status   Finished        Name 
777670  Full         107    92.16 M  OK       17-Nov-17 09:29 CopyDiskToExtClone
777671  Incr         107    92.16 M  OK       17-Nov-17 09:29 Backup-idefix
777673  Full          71    3.694 M  OK       17-Nov-17 09:29 CopyDiskToExtClone
777675  Incr          71    3.694 M  OK       17-Nov-17 09:29 Backup-pcmk1
777677  Full          70    3.611 M  OK       17-Nov-17 09:29 CopyDiskToExtClone
777678  Incr          70    3.611 M  OK       17-Nov-17 09:29 Backup-pcmk2
777681  Full         104    97.59 M  OK       17-Nov-17 09:29 CopyDiskToExtClone
777682  Incr         104    97.59 M  OK       17-Nov-17 09:30 Backup-paladin
777551  Full       1,089    336.3 M  OK       17-Nov-17 09:30 CopyDiskToExtClone
777685  Incr       1,089    336.3 M  OK       17-Nov-17 09:30 Backup-teamwork

Device status:
Autochanger "QTM-Scalar" with devices:
   "QTM-Drive-0" (/dev/qtm-nst0)
   "QTM-Drive-1" (/dev/qtm-nst1)

Device File: "FileStorage" (/data/bacula/files) is not open.
   Device is BLOCKED waiting for media.
   Available Space=2.058 TB

Device File: "FileStorage2" (/data/bacula/files) is not open.
   Available Space=2.058 TB

Device Tape is "QTM-Drive-0" (/dev/qtm-nst0) mounted with:
    Volume:      BACU.130
    Pool:        DiskCopy
    Media type:  LTO-6
    Total Bytes Read=129,024 Blocks Read=2 Bytes/block=64,512
    Positioned at File=0 Block=0
   Slot 22 is loaded in drive 0.

Device Tape is "QTM-Drive-1" (/dev/qtm-nst1) mounted with:
    Volume:      BACX.105
    Pool:        ExtClone
    Media type:  LTO-6
    Total Bytes=291,559,389,184 Blocks=1,112,250 Bytes/block=262,134
    Positioned at File=93 Block=0
   Slot 5 is loaded in drive 1.

Used Volume status:
Reserved volume: BACU.130 on Tape device "QTM-Drive-0" (/dev/qtm-nst0)
    Reader=0 writers=0 reserves=0 volinuse=0
Reserved volume: BACX.105 on Tape device "QTM-Drive-1" (/dev/qtm-nst1)
    Reader=0 writers=0 reserves=0 volinuse=0
Volume: Backup-0169 no device. volinuse=0

Data spooling: 1 active jobs, 2,508,838,698 bytes; 69 total jobs, 47,483,401,984 max bytes/job.
Attr spooling: 1 active jobs, 2,129,346,316 bytes; 214 total jobs, 2,129,346,316 max bytes.
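
The stuck state visible in the output above (a job reporting despooling=1 on a
device that is BLOCKED) could be detected automatically. The following is only
a minimal sketch, assuming the text of "status storage" has already been
captured into a string and that its layout matches the output shown here. The
function name find_stuck_jobs and the regular expressions are my own
illustration, not part of Bacula.

```python
import re

# Patterns are derived only from the status output shown in this mail;
# other Bacula versions may format these lines differently (assumption).
JOB_RE = re.compile(r'^(?:Writing|Reading): .*\bJobId=(\d+)')
DEV_RE = re.compile(r'\bdevice="([^"]+)"')
DESPOOL_RE = re.compile(r'\bdespooling=(\d+)')
DEVSTAT_RE = re.compile(r'^Device \w+(?::| is) "([^"]+)"')


def find_stuck_jobs(status_text):
    """Return sorted (jobid, device) pairs where a job reports
    despooling=1 and the same device is reported as BLOCKED."""
    despooling = {}   # device name -> jobid that is despooling on it
    blocked = set()   # device names reported as BLOCKED
    job_id = job_dev = stat_dev = None

    for raw in status_text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current job or device entry.
            job_id = job_dev = stat_dev = None
            continue

        m = JOB_RE.match(line)
        if m:
            # Start of a new entry in the "Running Jobs" section.
            job_id, job_dev = m.group(1), None
            continue

        if job_id:
            m = DEV_RE.search(line)
            if m:
                job_dev = m.group(1)
            m = DESPOOL_RE.search(line)
            if m and m.group(1) == "1" and job_dev:
                despooling[job_dev] = job_id

        m = DEVSTAT_RE.match(line)
        if m:
            # Start of a new entry in the "Device status" section.
            stat_dev = m.group(1)
        elif stat_dev and "Device is BLOCKED" in line:
            blocked.add(stat_dev)

    return sorted((jid, dev) for dev, jid in despooling.items()
                  if dev in blocked)
```

Run against the status output above, this should report JobId 777520 on
FileStorage. It only flags the condition; it does not tell whether the job
is legitimately still despooling or already canceled, so it would have to be
combined with the job status from the director.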

Any ideas what is going on here?

Best regards

> On 27 October 2017 at 15:53, Ulrich Leodolter <ulrich.leodol...@obvsg.at> wrote:
> Hi,
> I have a problem which seems to be triggered after a job run timeout.
> For our desktop machines we have configured:
>     Max Run Time = 12 hours
> Desktop machines are always somewhat unpredictable, and it happens that we 
> reach the 12 hour timeout. But sometimes the storage device is not released 
> after the job is canceled. 
> Device File: "FileStorage" (/data/bacula/files) is not open.
>    Device is BLOCKED waiting for media.
>    Available Space=2.018 TB
> lsof shows the storage daemon has spool files open even though the 
> corresponding job was canceled.
> Our Bacula server version is 9.0.4, but the problem also happened on 7.x 
> releases. 
> I know this description is somewhat vague, but maybe someone has seen 
> something like this?
> Maybe I should add that we run Copy jobs into 2 tape pools after all backups 
> to disk are finished (or canceled). To allow the Copy jobs to run in parallel, 
> we have defined FileStorage2, which points to the same directory 
> (/data/bacula/files) as the FileStorage device.
> Is it conceivable that jobs canceled after MaxRunTime do not release the File 
> storage device?
> Best regards
> Ulrich
> Ulrich Leodolter <ulrich.leodol...@obvsg.at>
> Oesterreichische Bibliothekenverbund und Service GmbH
> Raimundgasse 1/3, A-1020 Wien
> Fax +43 1 4035158-30
> Tel +43 1 4035158-21
> Web https://www.obvsg.at
> _______________________________________________
> Bacula-devel mailing list
> Bacula-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-devel

Ulrich Leodolter <ulrich.leodol...@obvsg.at>
Oesterreichische Bibliothekenverbund und Service GmbH
Raimundgasse 1/3, A-1020 Wien
Fax +43 1 4035158-30
Tel +43 1 4035158-21
Web https://www.obvsg.at
