Hello, thank you for the answer.
Although on the director side it looks like a connection drop, it is not: it happens every time at exactly the same point, so it is not a connection problem. The connection is terminated by the fd with this error (taken from the debug messages):

b3: restore.c:338 File index error

(and then the process is terminated).

I have just identified the exact directory containing the files that cause the problem. When we restore *all* the files, no errors other than those quoted below appear, but when we restore only this particular directory, there is one more message in the director:

17-Jul 17:11 Storage: Forward spacing Volume "FILE0004" to file:block 0:3999760376.
17-Jul 17:11 Storage: XXX Error: block.c:275 Volume data error at 0:3999760376! Wanted ID: "BB02", got "s5". Buffer discarded.

I personally don't know what it means. Does it mean that the volume is damaged?

More details of our setup:
- we save the volumes on disk;
- when backing up we run several concurrent jobs (is it possible that they conflict with each other by saving to the same volume?)

Also, the debug messages at the bacula-fd are different in this case (restoring a single directory):
------
b3: restore.c:727 End Do Restore. Files=85 Bytes=7102334
b3: job.c:1628 bfiled>stored: read close session 24
b3: job.c:1645 End FD msg: 2800 End Job TermCode=84 JobFiles=85 ReadBytes=7090277 JobBytes=7102334 Errors=0 VSS=0 Encrypt=0
b3: job.c:1650 Done in job.c
b3: job.c:235 Quit command loop. Canceled=0
b3: runscript.c:102 runscript: running all RUNSCRIPT object (ClientAfterJob) JobStatus=T
b3: job.c:322 Calling term_find_files
b3: job.c:325 Done with term_find_files
b3: mem_pool.c:376 garbage collect memory pool
b3: job.c:327 Done with free_jcr
------

I am now trying to pin down the exact file, to see what happens and whether we get any more messages.

Regards.

P.S. So that the messages from the full restore of *all* files are not lost, here they are again:
------
b3: restore.c:338 File index error
b3: restore.c:727 End Do Restore. Files=85206 Bytes=2782261929
b3: job.c:1628 bfiled>stored: read close session 21
b3: job.c:1645 End FD msg: 2800 End Job TermCode=102 JobFiles=85206 ReadBytes=2146572346 JobBytes=2782261929 Errors=0 VSS=0 Encrypt=0
b3: job.c:1650 Done in job.c
b3: job.c:235 Quit command loop. Canceled=1
b3: runscript.c:102 runscript: running all RUNSCRIPT object (ClientAfterJob) JobStatus=f
b3: job.c:322 Calling term_find_files
b3: job.c:325 Done with term_find_files
b3: mem_pool.c:376 garbage collect memory pool
b3: job.c:327 Done with free_jcr
------

The messages in the director at this point are:
------
17-Jul 15:24 Storage: XXX Fatal error: read.c:124 Error sending to File daemon. ERR=Connection reset by peer
17-Jul 15:24 Storage: XXX Error: bnet.c:439 Write error sending 32 bytes to client:10.2.1.13:36643: ERR=Connection reset by peer
------

Tuesday, July 17, 2007, 5:09:05 PM:

FS> Doytchin Spiridonov wrote:
>> The messages in the director at this point are:
>> ------
>> 17-Jul 15:24 Storage: XXX Fatal error: read.c:124 Error sending to File
>> daemon. ERR=Connection reset by peer
>> 17-Jul 15:24 Storage: XXX Error: bnet.c:439 Write error sending 32 bytes to
>> client:10.2.1.13:36643: ERR=Connection reset by peer
>> ------
>>
>>
>> Anyone having similar problems?

FS> I've had similar symptoms, at least. There are a good number of problems that
FS> could look the same. Here's what I've gone through so far.

FS> http://www.mail-archive.com/[EMAIL PROTECTED]/msg00024.html
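
For anyone who wants to compare the two bacula-fd summaries quoted above side by side, a small script along these lines could pull the numeric fields out of the "End Job" lines. It is only a sketch: the two input strings are copied from the debug output in this message, and the name parse_end_job is my own, not anything Bacula provides. It also prints each TermCode as a character, since 84 and 102 are the ASCII codes of the JobStatus letters (T and f) that the runscript lines report.

```python
import re

def parse_end_job(line):
    """Collect every KEY=NUMBER pair from an 'End Job' debug line into a dict.

    Hypothetical helper for illustration only; it just pattern-matches the text.
    """
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", line)}

# Both lines are copied from the bacula-fd debug output quoted in this message.
full = parse_end_job(
    "2800 End Job TermCode=102 JobFiles=85206 ReadBytes=2146572346 "
    "JobBytes=2782261929 Errors=0 VSS=0 Encrypt=0"
)
single = parse_end_job(
    "2800 End Job TermCode=84 JobFiles=85 ReadBytes=7090277 "
    "JobBytes=7102334 Errors=0 VSS=0 Encrypt=0"
)

for name, job in (("all files", full), ("single directory", single)):
    print(
        name,
        "status:", chr(job["TermCode"]),   # ASCII code -> status letter
        "files:", job["JobFiles"],
        "JobBytes - ReadBytes:", job["JobBytes"] - job["ReadBytes"],
    )
```

The script only reformats what the logs already say; it does not explain why the full restore stops early.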