The email below is from the writing side of the copy job and the message:

13-Sep 08:43 kilchis JobId 35203: Error: bsock.c:849 Read error from Storage 
daemon:kilchis:9103: ERR=Connection reset by peer

shows that the connection to the reading side of the job was closed
unexpectedly from the reading end.

Do you have the corresponding email from the reading side?  It will have a
different JobId (but should mention JobId 35203) and should start with
something like "Using Device ... to read."
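
If the job emails are hard to find, the same lines can be searched for in
the Bacula log.  What follows is only a rough sketch: it assumes each log
line has the "date time host JobId N: message" layout seen in the excerpt
quoted below, and that the reading side either prints "Using Device ... to
read" or mentions the writing JobId number somewhere in its message.  The
function name, the sample JobId 35202 and the device name "ExampleDev" are
made up for illustration.

```python
import re

# Assumed layout: "12-Sep 15:09 kilchis-dir JobId 35203: message text"
HEADER = re.compile(r"^\S+ \S+ \S+ JobId (\d+): (.*)$")
READ_DEVICE = re.compile(r"Using Device .* to read")


def candidate_read_jobids(log_lines, write_jobid):
    """Return JobIds (other than write_jobid) that look like the reading side."""
    mentions_write_job = re.compile(r"\b%d\b" % write_jobid)
    candidates = set()
    for line in log_lines:
        match = HEADER.match(line.strip())
        if not match:
            continue  # wrapped continuation line or unrelated text
        jobid, message = int(match.group(1)), match.group(2)
        if jobid == write_jobid:
            continue  # this is the writing side, which we already have
        if READ_DEVICE.search(message) or mentions_write_job.search(message):
            candidates.add(jobid)
    return sorted(candidates)


if __name__ == "__main__":
    # The first line is from the job email; the second is a hypothetical example.
    sample = [
        '12-Sep 15:09 kilchis-dir JobId 35203: Using Device "MidSwap" to write.',
        '12-Sep 15:09 kilchis-dir JobId 35202: Using Device "ExampleDev" to read.',
    ]
    print(candidate_read_jobids(sample, 35203))
```

Any JobId it reports is only a candidate; the full job output for that JobId
is what would show why the reading end closed the connection.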

__Martin


>>>>> On Mon, 18 Sep 2017 13:42:19 -0700, Jerry Lowry said:
> 
> Martin,
> Here is the complete email that was sent just before the "Copy Error"
> message:
> 
> 12-Sep 15:09 kilchis-dir JobId 35203: Using Device "MidSwap" to write.
> 12-Sep 15:09 kilchis JobId 35203: Volume "homeMS-200" previously written, 
> moving to end of data.
> 12-Sep 15:27 kilchis JobId 35203: End of medium on Volume "homeMS-200" 
> Bytes=1,932,735,274,146 Blocks=29,959,317 at 12-Sep-2017 15:27.
> 12-Sep 15:28 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is 
> waiting. Cannot find any appendable volumes.
> Please use the "label" command to create a new Volume for:
>     Storage:      "MidSwap" (/MidSwap)
>     Pool:         OffsiteMid
>     Media type:   File
> 12-Sep 15:36 kilchis JobId 35203: Wrote label to prelabeled Volume 
> "homeMS-201" on File device "MidSwap" (/MidSwap)
> 12-Sep 15:36 kilchis JobId 35203: New volume "homeMS-201" mounted on device 
> "MidSwap" (/MidSwap) at 12-Sep-2017 15:36.
> 12-Sep 19:54 kilchis JobId 35203: End of medium on Volume "homeMS-201" 
> Bytes=1,932,735,281,790 Blocks=29,959,315 at 12-Sep-2017 19:54.
> 12-Sep 19:54 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is 
> waiting. Cannot find any appendable volumes.
> Please use the "label" command to create a new Volume for:
>     Storage:      "MidSwap" (/MidSwap)
>     Pool:         OffsiteMid
>     Media type:   File
> 12-Sep 20:57 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is 
> waiting. Cannot find any appendable volumes.
> Please use the "label" command to create a new Volume for:
>     Storage:      "MidSwap" (/MidSwap)
>     Pool:         OffsiteMid
>     Media type:   File
> 12-Sep 23:03 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is 
> waiting. Cannot find any appendable volumes.
> Please use the "label" command to create a new Volume for:
>     Storage:      "MidSwap" (/MidSwap)
>     Pool:         OffsiteMid
>     Media type:   File
> 13-Sep 03:15 kilchis JobId 35203: Job BackupUsers.2017-09-12_09.05.09_50 is 
> waiting. Cannot find any appendable volumes.
> Please use the "label" command to create a new Volume for:
>     Storage:      "MidSwap" (/MidSwap)
>     Pool:         OffsiteMid
>     Media type:   File
> 13-Sep 08:23 kilchis JobId 35203: Wrote label to prelabeled Volume 
> "homeMS-202" on File device "MidSwap" (/MidSwap)
> 13-Sep 08:23 kilchis JobId 35203: New volume "homeMS-202" mounted on device 
> "MidSwap" (/MidSwap) at 13-Sep-2017 08:23.
> 13-Sep 08:43 kilchis JobId 35203: Error: bsock.c:849 Read error from Storage 
> daemon:kilchis:9103: ERR=Connection reset by peer
> 13-Sep 08:43 kilchis JobId 35203: Fatal error: append.c:271 Network error 
> reading from FD. ERR=Connection reset by peer
> 13-Sep 08:43 kilchis JobId 35203: Elapsed time=04:56:15, Transfer rate=125.6 
> M Bytes/second
> 13-Sep 08:43 kilchis JobId 35203: Sending spooled attrs to the Director. 
> Despooling 1,533,148,574 bytes ...
> 
> I don't have the job log. Interestingly, I did not have any problems with
> this or any other copy job before I upgraded.  I went from 5.2.13 to 9.0.3
> of Bacula and latest version of MySql to Mariadb.  Not saying that this is
> a problem, because I have 5 other copy jobs that work without error still.
> This one just happens to be the biggest one.
> 
> thanks,
> jerry
> 
> On Mon, Sep 18, 2017 at 7:55 AM, Martin Simmons <mar...@lispworks.com>
> wrote:
> 
> > A copy job will communicate using TCP between the Bacula daemons.  A bsock
> > error could indicate that bacula-sd closed the connection unexpectedly and
> > I would expect media errors to be logged.
> >
> > Your syslog did include some I/O errors.  Are they caused by something
> > else?
> >
> > Do you have the complete job log (from the Bacula log, not the syslog)?
> >
> > __Martin
> >
> >
> > >>>>> On Wed, 13 Sep 2017 09:35:07 -0700, Jerry Lowry said:
> > >
> > > Kern,
> > > My Offsite Backup just failed again on the same drive, different disk. It
> > > failed with the same bsock error.  If the backup is working on the same
> > > system using the copy function, how far out of the network stack does it
> > > go?  My thinking is it does not get out of the application layer.  Is this
> > > right?  Why would I get a bsock error?
> > >
> > > I have taken a look at the SMART data for the disks and they seem to be
> > > running okay. I am getting some sector relocation errors; would that cause
> > > the bsock error during a remap?  This procedure has been running flawlessly
> > > for many years (except for human error).  I am wondering if I should
> > > delete the present disk files and let Bacula recreate new ones.
> > >
> > > thanks for your help!
> > >
> > > jerry
> > >
> > >
> > > On Wed, Sep 6, 2017 at 11:26 PM, Kern Sibbald <k...@sibbald.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > If the job is marked as Incomplete in the catalog ("I" I think), then you
> > > > can simply restart it and it should pick up where it left off.  If not, you
> > > > must run it again from the beginning.
> > > >
> > > > If you are switching devices when one is full during a Job, it is unlikely
> > > > you can restore that job when it terminates. I recommend carefully testing
> > > > restores on your system.
> > > >
> > > > Best regards,
> > > >
> > > > Kern
> > > >
> > > > On 09/06/2017 05:38 PM, Jerry Lowry wrote:
> > > >
> > > > List,
> > > > I am running Bacula 9.0.3, MariaDB 12.2.8 on CentOS 6.9.  I got notice
> > > > last night that my Offsite backup failed due to a bsock error.  My offsite
> > > > drives are attached to an ATTO raid card which gives me hot swap
> > > > capability. This configuration works great as it allows me to hot swap a
> > > > drive when it fills up with a new drive to continue with.  The problem is
> > > > included below. The backup that I was doing is to the OffsiteMid drive
> > > > which is mounted as /dev/sde. Is there a way to restart this backup job, or
> > > > am I left with an incomplete backup going forward?
> > > >
> > > > thanks for your help,
> > > >
> > > > jerry
> > > >
> > > >
> > > > Sep  5 08:46:01 kilchis bat[4339]: bsock.c:147 Unable to connect to Director daemon on kilchis:9101. ERR=Connection refused
> > > > Sep  5 10:37:20 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteTop
> > > > Sep  5 10:39:06 kilchis kernel: scsi 5:0:1:0: Direct-Access     ATTO     OffsiteTop00     0001 PQ: 0 ANSI: 5
> > > > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: Attached scsi generic sg6 type 0
> > > > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > > > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Write Protect is off
> > > > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> > > > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > > > Sep  5 10:39:06 kilchis kernel: sdd: unknown partition table
> > > > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > > > Sep  5 10:39:06 kilchis kernel: sd 5:0:1:0: [sdd] Attached SCSI disk
> > > > Sep  5 10:39:35 kilchis kernel: sd 5:0:1:0: [sdd] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > > > Sep  5 10:39:35 kilchis kernel: sdd:
> > > > Sep  5 10:44:54 kilchis kernel: EXT4-fs (sdd): mounted filesystem with ordered data mode. Opts:
> > > > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > > > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > > > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > > > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > > > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > > > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > > > Sep  5 11:02:38 kilchis bacula-dir[4373]: bsock.c:537 Socket has errors=1 on call to client:10.20.10.21:9101
> > > > Sep  5 13:45:48 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteMid
> > > > Sep  5 13:45:53 kilchis attocfgd: [CRIT] [ExpressSAS R608,50:01:08:60:00:57:3d:c0] [FW] RAID Group state now Offline: OffsiteTop
> > > > Sep  5 13:47:52 kilchis kernel: scsi 5:0:1:0: Direct-Access     ATTO     OffsiteMid00     0001 PQ: 0 ANSI: 5
> > > > Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: Attached scsi generic sg6 type 0
> > > > Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > > > Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] Write Protect is off
> > > > Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> > > > Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > > > Sep  5 13:47:52 kilchis kernel: sde: unknown partition table
> > > > Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > > > Sep  5 13:47:52 kilchis kernel: sd 5:0:1:0: [sde] Attached SCSI disk
> > > > Sep  5 13:48:01 kilchis kernel: EXT4-fs error (device sdd): __ext4_get_inode_loc: unable to read inode block - inode=2, block=1057
> > > > Sep  5 13:48:01 kilchis kernel: Buffer I/O error on device sdd, logical block 0
> > > > Sep  5 13:48:01 kilchis kernel: lost page write due to I/O error on sdd
> > > > Sep  5 13:48:01 kilchis kernel: EXT4-fs error (device sdd) in ext4_reserve_inode_write: IO failure
> > > > Sep  5 13:48:01 kilchis kernel: EXT4-fs (sdd): previous I/O error to superblock detected
> > > > Sep  5 13:48:01 kilchis kernel: Buffer I/O error on device sdd, logical block 0
> > > > Sep  5 13:48:01 kilchis kernel: lost page write due to I/O error on sdd
> > > > Sep  5 13:48:06 kilchis kernel: Aborting journal on device sdd-8.
> > > > Sep  5 13:48:06 kilchis kernel: Buffer I/O error on device sdd, logical block 243826688
> > > > Sep  5 13:48:06 kilchis kernel: lost page write due to I/O error on sdd
> > > > Sep  5 13:48:06 kilchis kernel: JBD2: I/O error detected when updating journal superblock for sdd-8.
> > > > Sep  5 13:48:08 kilchis kernel: EXT4-fs error (device sdd): ext4_put_super: Couldn't clean up the journal
> > > > Sep  5 13:48:08 kilchis kernel: EXT4-fs (sdd): Remounting filesystem read-only
> > > > Sep  5 13:48:44 kilchis kernel: sd 5:0:1:0: [sde] 488366336 4096-byte logical blocks: (2.00 TB/1.81 TiB)
> > > > Sep  5 13:48:44 kilchis kernel: sde:
> > > > Sep  5 13:54:05 kilchis kernel: EXT4-fs (sde): mounted filesystem with ordered data mode. Opts:
> > > >
> > > >
> > > >
> > > > ------------------------------------------------------------------------------
> > > > Check out the vibrant tech community on one of the world's most
> > > > engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > Bacula-users mailing list
> > > > Bacula-users@lists.sourceforge.net
> > > > https://lists.sourceforge.net/lists/listinfo/bacula-users
> > > >
> > > >
> > > >
> >
> 

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users