>>>>> On Thu, 13 Jun 2013 08:54:43 -0400, Clark, Patricia A said:
>
> On 6/12/13 10:41 AM, "Josh Fisher" <jfis...@pvct.com> wrote:
>
> >On 6/11/2013 11:10 AM, Leonardo - Mandic wrote:
> >> Hello,
> >>
> >> After upgrading to Bacula 5.2.13 I have Bacula storage problems. It
> >> appears to be a network problem, but it isn't; I have a gigabit
> >> network dedicated to Bacula. The problem occurs on backups that run
> >> for many hours or days (a full backup of 500 GB takes 2 days, for
> >> example).
> >>
> >> The timing is random, but 70% of the servers show these same errors.
> >>
> >> With the old versions I never had this problem, and it is the same
> >> network and the same servers as with the old Bacula versions.
> >>
> >> Does anybody else have this problem on 5.2.13?
> >>
> >> The error is:
> >>
> >> 2013-06-10 23:51:01 servert-fd JobId 266: Error: bsock.c:429 Write
> >> error sending 64562 bytes to Storage daemon:10.1.0.60:9103:
> >> ERR=Connection reset by peer
> >> 2013-06-10 23:51:01 servert-fd JobId 266: Fatal error: backup.c:1200
> >> Network send error to SD. ERR=Connection reset by peer
> >
> >In my experience, it has always been hardware related. In particular,
> >aggressive power saving modes will cause this when one of the systems
> >cuts power to its Ethernet PHY at an inappropriate time. This can be
> >because the device driver's default is geared toward early power savings
> >and the op hasn't changed it, or a buggy device driver shuts off the PHY
> >when it shouldn't. Bacula requires that TCP connections remain up
> >throughout the job lifetime. Anything that might cause a delay could
> >cause this if the power save timeout for the Ethernet controller is
> >shorter than the delay. For example, if the database server is restarted
> >by a nightly cron job and you are not spooling attributes, then the
> >delay could allow the device driver to shut down the PHY due to
> >"inactivity".
>
> I would suggest that that is not the case for this issue. I have had this
> on a server that is busy running multiple backups, where one of them
> will get this error. Everything is on the server, so I am not reaching
> out to a separate client. I do not use any of the power saving features
> on the server either.

The FD error "Network send error to SD. ERR=Connection reset by peer" means
that the FD unexpectedly lost contact (at the TCP level) with the SD while
writing data to it.

I think the only possible causes are:

1. The network broke between the FD and the SD.

2. The SD died.

3. The FD got a different error but reported it incorrectly (not very likely).

__Martin

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
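
For anyone who wants to see what "Connection reset by peer" looks like from the writing side, here is a minimal sketch of cause 2 on the loopback interface. It is not Bacula code: abort_one_connection, send_until_error and run_demo are hypothetical names, the "SD" is just a thread that accepts one connection and aborts it, and the SO_LINGER packing assumes the POSIX struct linger layout (two ints). On Linux the sender typically gets ECONNRESET; other platforms may report EPIPE instead.

```python
import errno
import socket
import struct
import threading
import time


def abort_one_connection(listener):
    """Hypothetical stand-in for an SD that dies: accept once, then reset."""
    conn, _ = listener.accept()
    # SO_LINGER enabled with a zero timeout makes close() abort the
    # connection (TCP RST) instead of closing it gracefully (FIN).
    # Assumes the POSIX struct linger layout of two ints.
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    conn.close()


def send_until_error(address, chunk=b"x" * 4096, max_chunks=500):
    """Hypothetical stand-in for the FD: write until the kernel reports an error."""
    with socket.create_connection(address) as sock:
        try:
            for _ in range(max_chunks):
                sock.sendall(chunk)
                time.sleep(0.01)  # give the reset time to arrive
        except OSError as exc:
            return exc
    return None  # no error seen within max_chunks writes


def run_demo():
    """Run both ends on the loopback interface and return the sender's error."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        peer = threading.Thread(target=abort_one_connection, args=(listener,))
        peer.start()
        error = send_until_error(listener.getsockname())
        peer.join()
    return error


if __name__ == "__main__":
    err = run_demo()
    print(type(err).__name__, errno.errorcode.get(err.errno) if err else None)
```

The point of the sketch is that the writer only learns about the reset on a later write, which is why the FD reports it as a send error rather than saying anything about why the other end went away.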