The engineer is probably mistaken: the connection clearly seems to have
dropped. Perhaps you are still missing a Heartbeat Interval setting
somewhere -- there are quite a number of places where it can be set.
I would also recommend a Heartbeat Interval of 300.
Another alternative is to upgrade to version 7.2.0, which I believe has
the Heartbeat interval set by default to 300. I also recommend not
setting the Maximum Network Buffer Size as Bacula generally figures that
out for itself. Once you get your backups working, you can experiment
with it.
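
As a rough sketch of the suggestion above, the fragments below show where
the interval could be raised in each daemon's own configuration file. The
FileDaemon and Storage names are taken from the configurations quoted
further down, and the Director name is assumed from the log output. All
other directives are omitted, so this is not a complete configuration.
Other resources may also accept a Heartbeat Interval, so check the manual
for your version.

```
# bacula-fd.conf (on the client)
FileDaemon {
  Name = mmcnsd4-fd
  Heartbeat Interval = 300
  # Maximum Network Buffer Size is left unset so Bacula chooses its own value
}

# bacula-sd.conf (on the storage daemon host)
Storage {
  Name = bkupsvr2-sd
  Heartbeat Interval = 300
}

# bacula-dir.conf (on the director host; name assumed from the log)
Director {
  Name = bkupsvr2-dir
  Heartbeat Interval = 300
}
```

Each daemon has to be restarted or reloaded before a changed interval
takes effect.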
Best regards,
Kern
On 15-09-03 04:16 PM, Jeffrey R. Lang wrote:
First let me say thanks to Kern and all those that have helped make
bacula a great tool.
My backup environment currently consists of a server, a VTL and a
tape library connected by a 10 Gigabit network. Bacula is currently at
5.2.13. I plan on upgrading once I've integrated the tape library and
things are working; that would be a good starting point for an upgrade.
My issue is that when jobs are destined for the tape library I have
enabled job spooling, but these jobs always time out after the first
spooled block of data is written to tape. Here's an example:
03-Sep 11:19 bkupsvr2-dir JobId 12570: No prior Full backup Job record found.
03-Sep 11:19 bkupsvr2-dir JobId 12570: No prior or suitable Full backup found in catalog. Doing FULL backup.
03-Sep 11:19 bkupsvr2-dir JobId 12570: Start Backup JobId 12570, Job=bighorn-home.2015-09-03_11.19.32_03
03-Sep 11:20 bkupsvr2-dir JobId 12570: Using Device "LTO5-0" to write.
03-Sep 11:21 bkupsvr2-sd JobId 12570: 3304 Issuing autochanger "load slot 94, drive 0" command.
03-Sep 11:22 bkupsvr2-sd JobId 12570: 3305 Autochanger "load slot 94, drive 0", status is OK.
03-Sep 11:22 bkupsvr2-sd JobId 12570: Volume "000094L5" previously written, moving to end of data.
03-Sep 11:23 bkupsvr2-sd JobId 12570: Ready to append to end of Volume "000094L5" at file=1724.
03-Sep 11:23 bkupsvr2-sd JobId 12570: Spooling data ...
03-Sep 15:07 bkupsvr2-sd JobId 12570: User specified Device spool size reached: DevSpoolSize=322,122,610,512 MaxDevSpoolSize=322,122,547,200
03-Sep 15:07 bkupsvr2-sd JobId 12570: Writing spooled data to Volume. Despooling 322,122,610,512 bytes ...
03-Sep 15:23 mmcnsd4-fd JobId 12570: Error: bsock.c:429 Write error sending 253977 bytes to Storage daemon:bkupsvr2.gg.uwyo.edu:9103: ERR=Connection timed out
03-Sep 15:23 mmcnsd4-fd JobId 12570: Fatal error: backup.c:1200 Network send error to SD. ERR=Connection timed out
03-Sep 15:23 bkupsvr2-dir JobId 12570: Error: Director's comm line to SD dropped.
03-Sep 15:23 bkupsvr2-dir JobId 12570: Error: Bacula bkupsvr2-dir 5.2.13 (19Jan13):
Build OS: x86_64-unknown-linux-gnu redhat Enterprise release
JobId: 12570
Job: bighorn-home.2015-09-03_11.19.32_03
Backup Level: Full (upgraded from Incremental)
Client: "mmonsd4-fd" 5.2.13 (19Jan13) x86_64-unknown-linux-gnu,redhat,
FileSet: "bighorn-home" 2015-06-12 08:21:10
Pool: "ARCC" (From Job resource)
Catalog: "MyCatalog" (From Client resource)
Storage: "NEO4200" (From Pool resource)
Scheduled time: 03-Sep-2015 11:19:31
Start time: 03-Sep-2015 11:20:26
End time: 03-Sep-2015 15:23:28
Elapsed time: 4 hours 3 mins 2 secs
Priority: 10
FD Files Written: 1,805,322
SD Files Written: 0
FD Bytes Written: 321,716,097,371 (321.7 GB)
SD Bytes Written: 0 (0 B)
Rate: 22062.5 KB/s
Software Compression: None
VSS: no
Encryption: no
Accurate: yes
Volume name(s): 000094L5
Volume Session Id: 1
Volume Session Time: 1441300753
Last Volume Bytes: 1,817,725,514,752 (1.817 TB)
Non-fatal FD errors: 2
SD Errors: 0
FD termination status: Error
SD termination status: Error
Termination: *** Backup Error ***
If I turn off job spooling, the job completes as expected.
I have enabled "heartbeats" on the client, storage daemon and director,
but that didn't help.
My current client configuration is this:
FileDaemon {                          # this is me
  Name = mmcnsd4-fd
  FDport = 9102                       # where we listen for the director
  WorkingDirectory = /usr/local/bacula/working
  Pid Directory = /usr/local/bacula/working
  Maximum Concurrent Jobs = 20
  Maximum Network Buffer Size = 262144
  Heartbeat Interval = 60
}
My storage daemon configuration is:
Storage {                             # definition of myself
  Name = bkupsvr2-sd
  SDPort = 9103                       # Director's port
  WorkingDirectory = "/usr/local/bacula/working"
  Pid Directory = "/usr/local/bacula/working"
  Maximum Concurrent Jobs = 20
  Heartbeat Interval = 60
}
The only thing I can see is that with spooling turned off, data is
constantly flowing over the network connection. With the spooling
turned on there is a quiet period on the network connection.
I've talked with my network engineer about this and he says there's
nothing in the network that would cause the application to close the
connection.
So has anyone seen this problem before?
Any ideas on what to look at to figure this out?
jeff
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users