A copy job will communicate using TCP between the Bacula daemons. A bsock error could indicate that bacula-sd closed the connection unexpectedly and I would expect media errors to be logged.
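To illustrate the point about TCP, here is a minimal Python sketch. It is not Bacula code, and the function names `abrupt_server` and `send_until_error` are made up for this example. It assumes a Linux host. Two endpoints on the same machine talk over the loopback interface, one side drops the connection without a clean shutdown, and the other side sees a socket-level error even though no packet ever left the host.

```python
import socket
import struct
import threading


def abrupt_server(listener: socket.socket) -> None:
    """Accept one connection, read a little, then drop it abruptly."""
    conn, _ = listener.accept()
    conn.recv(1024)
    # SO_LINGER with a zero timeout makes close() send a TCP reset (RST)
    # instead of performing an orderly shutdown.
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    conn.close()


def send_until_error() -> str:
    """Send data over loopback until the socket reports an error.

    Returns the name of the exception class that was raised.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    worker = threading.Thread(target=abrupt_server, args=(listener,), daemon=True)
    worker.start()

    client = socket.create_connection(("127.0.0.1", port))
    client.settimeout(5)  # safety net so the sketch cannot hang forever
    try:
        # The peer reads at most 1 KiB, so this much data cannot be
        # delivered; the loop ends when the reset arrives.
        for _ in range(10000):
            client.sendall(b"x" * 4096)
        return "no error"
    except OSError as exc:
        return type(exc).__name__
    finally:
        client.close()
        worker.join(timeout=5)
        listener.close()


if __name__ == "__main__":
    print(send_until_error())
```

On Linux the reported error is typically a connection reset or a broken pipe. The sender only learns that the peer went away, not why, which is why the storage daemon's own job log is the place to look for the underlying cause.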
Your syslog did include some I/O errors. Are they caused by something else?

Do you have the complete job log (from the Bacula log, not the syslog)?

__Martin


>>>>> On Wed, 13 Sep 2017 09:35:07 -0700, Jerry Lowry said:
>
> Kern,
> My Offsite Backup just failed again on the same drive, different disk. It
> failed with the same bsock error. If the backup is working on the same
> system using the copy function, how far out of the network stack does it
> go. My thinking is it does not get out of the application layer. Is this
> right? Why would I get a bsock error?
>
> I have taken a look at the smart data for the disk and they seem to be
> running okay. I am getting some sector relocation errors, would that cause
> the bsock error during a remap? This procedure has been running flawlessly
> for many years ( except for human error ). I am wondering if I should
> delete the present disk files and let bacula recreate new ones.
>
> thanks for your help!
>
> jerry
>
>
> On Wed, Sep 6, 2017 at 11:26 PM, Kern Sibbald <k...@sibbald.com> wrote:
>
> > Hello,
> >
> > If the job is marked as Incomplete in the catalog ("I" I think), then you
> > can simply restart it and it should pickup where it left off. If not you
> > must run it again from the beginning.
> >
> > If you are switching devices when one is full during a Job, it is unlikely
> > you can restore that job when it terminates. I recommend carefully testing
> > restores on your system.
> >
> > Best regards,
> >
> > Kern
> >
> > On 09/06/2017 05:38 PM, Jerry Lowry wrote:
> >
> > List,
> > I am running, bacula 9.0.3, Mariadb 12.2.8 on Centos 6.9. I got notice
> > last night that my Offsite backup failed due to a bsock error. My offsite
> > drives are attached to an ATTO raid card which gives me hot swap
> > capability. This configuration works great as it allows me to hot swap a
> > drive when it fills up with a new drive to continue with. The problem is
> > included below. The backup that I was doing is to the OffsiteMid drive
> > which is mounted as /dev/sde. Is there a way to restart this backup job or
> > am I left with an incomplete backup going forward.
> >
> > thanks for your help,
> >
> > jerry
> >
> >
> > Sep 5 08:46:01 kilchis bat[4339]: bsock.c:147 Unable to connect to Director daemon on kilchis:9101. ERR=Connection refused
> > Sep 5 10:37:20 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteTop
> > Sep 5 10:39:06 kilchis kernel: scsi 5:0:1:0: Direct-Access ATTO OffsiteTop00 0001 PQ: 0 ANSI: 5
> > Sep 5 10:39:06 kilchis kernel: sd 5:0:1:0: Attached scsi generic sg6 type 0
> > Sep 5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > Sep 5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Write Protect is off
> > Sep 5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> > Sep 5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > Sep 5 10:39:06 kilchis kernel: sdd: unknown partition table
> > Sep 5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > Sep 5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Attached SCSI disk
> > Sep 5 10:39:35 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > Sep 5 10:39:35 kilchis kernel: sdd:
> > Sep 5 10:44:54 kilchis kernel: EXT4-fs (sdd): mounted filesystem with ordered data mode. Opts:
> > Sep 5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep 5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep 5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep 5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep 5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep 5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep 5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > Sep 5 13:45:48 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteMid
> > Sep 5 13:45:53 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteTop
> > Sep 5 13:47:52 kilchis kernel: scsi 5:0:1:0: Direct-Access ATTO OffsiteMid00 0001 PQ: 0 ANSI: 5
> > Sep 5 13:47:52 kilchis kernel: sd 5:0:1:0: Attached scsi generic sg6 type 0
> > Sep 5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > Sep 5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] Write Protect is off
> > Sep 5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> > Sep 5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > Sep 5 13:47:52 kilchis kernel: sde: unknown partition table
> > Sep 5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > Sep 5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] Attached SCSI disk
> > Sep 5 13:48:01 kilchis kernel: EXT4-fs error (device sdd): __ext4_get_inode_loc: unable to read inode block - inode=2, block=1057
> > Sep 5 13:48:01 kilchis kernel: Buffer I/O error on device sdd, logical block 0
> > Sep 5 13:48:01 kilchis kernel: lost page write due to I/O error on sdd
> > Sep 5 13:48:01 kilchis kernel: EXT4-fs error (device sdd) in ext4_reserve_inode_write: IO failure
> > Sep 5 13:48:01 kilchis kernel: EXT4-fs (sdd): previous I/O error to superblock detected
> > Sep 5 13:48:01 kilchis kernel: Buffer I/O error on device sdd, logical block 0
> > Sep 5 13:48:01 kilchis kernel: lost page write due to I/O error on sdd
> > Sep 5 13:48:06 kilchis kernel: Aborting journal on device sdd-8.
> > Sep 5 13:48:06 kilchis kernel: Buffer I/O error on device sdd, logical block 243826688
> > Sep 5 13:48:06 kilchis kernel: lost page write due to I/O error on sdd
> > Sep 5 13:48:06 kilchis kernel: JBD2: I/O error detected when updating journal superblock for sdd-8.
> > Sep 5 13:48:08 kilchis kernel: EXT4-fs error (device sdd): ext4_put_super: Couldn't clean up the journal
> > Sep 5 13:48:08 kilchis kernel: EXT4-fs (sdd): Remounting filesystem read-only
> > Sep 5 13:48:44 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > Sep 5 13:48:44 kilchis kernel: sde:
> > Sep 5 13:54:05 kilchis kernel: EXT4-fs (sde): mounted filesystem with ordered data mode. Opts:
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users