Hello,

Thank you for the answer.

At the director it looks like a dropped connection, but it is not: the
failure happens every time at exactly the same point, so it is not a
network problem. The connection is terminated by the fd with this error
(taken from the debug messages):

b3: restore.c:338 File index error

(and then the process is terminated).

I have just identified the exact directory containing the files that
cause the problem.

When we restore *all* the files, no errors appear other than those
cited below. When we restore only this particular directory, there is
one more message in the director:

17-Jul 17:11 Storage: Forward spacing Volume "FILE0004" to file:block 
0:3999760376.
17-Jul 17:11 Storage: XXX Error: block.c:275 Volume data error at 0:3999760376! 
Wanted ID: "BB02", got "s5". Buffer discarded.
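
To compare this message across restore attempts, the position and the two IDs can be pulled out of the text. A minimal sketch in Python, assuming the message wording is exactly as shown above; parse_volume_error is a hypothetical helper written for this mail, not part of Bacula, and it only reads the log text:

```python
import re

# Pattern for the storage daemon message quoted above.
# Hypothetical helper for log inspection; not part of Bacula itself.
VOLUME_ERROR = re.compile(
    r'Volume data error at (\d+):(\d+)!\s*'
    r'Wanted ID: "([^"]*)", got "([^"]*)"'
)


def parse_volume_error(text):
    """Return file, block, wanted ID and found ID, or None if no match."""
    joined = " ".join(text.split())  # undo the mail client's line wrapping
    match = VOLUME_ERROR.search(joined)
    if match is None:
        return None
    return {
        "file": int(match.group(1)),
        "block": int(match.group(2)),
        "wanted": match.group(3),
        "got": match.group(4),
    }


message = '''17-Jul 17:11 Storage: XXX Error: block.c:275 Volume data error at 0:3999760376!
Wanted ID: "BB02", got "s5". Buffer discarded.'''

info = parse_volume_error(message)
print(info)
```

If the same file:block and the same "got" value show up on every attempt, that would fit the observation that the failure is reproducible at one point rather than random.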

I personally don't know what it means. Does it mean that the volume is
damaged? More details of our setup:
- we save the volumes on disk;
- when backing up we run several concurrent jobs (is it possible that
  they conflict with each other by saving to the same volume?)

Also, the debug messages at the bacula-fd are different in this case
(restoring a single directory):

------
b3: restore.c:727 End Do Restore. Files=85 Bytes=7102334

b3: job.c:1628 bfiled>stored: read close session 24
b3: job.c:1645 End FD msg: 2800 End Job TermCode=84 JobFiles=85 
ReadBytes=7090277 JobBytes=7102334 Errors=0 VSS=0 Encrypt=0

b3: job.c:1650 Done in job.c
b3: job.c:235 Quit command loop. Canceled=0
b3: runscript.c:102 runscript: running all RUNSCRIPT object (ClientAfterJob) 
JobStatus=T
b3: job.c:322 Calling term_find_files
b3: job.c:325 Done with term_find_files
b3: mem_pool.c:376 garbage collect memory pool
b3: job.c:327 Done with free_jcr
------


I am now trying to find the exact file, to see what happens and
whether we get any more messages.

Regards.

P.S. So that the messages from the full restore of *all* files are not
lost, here they are again:

------
b3: restore.c:338 File index error

b3: restore.c:727 End Do Restore. Files=85206 Bytes=2782261929
b3: job.c:1628 bfiled>stored: read close session 21
b3: job.c:1645 End FD msg: 2800 End Job TermCode=102 JobFiles=85206 
ReadBytes=2146572346 JobBytes=2782261929 Errors=0
VSS=0 Encrypt=0

b3: job.c:1650 Done in job.c
b3: job.c:235 Quit command loop. Canceled=1
b3: runscript.c:102 runscript: running all RUNSCRIPT object (ClientAfterJob) 
JobStatus=f
b3: job.c:322 Calling term_find_files
b3: job.c:325 Done with term_find_files
b3: mem_pool.c:376 garbage collect memory pool
b3: job.c:327 Done with free_jcr
------
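
The two "End Job" summaries (single directory versus full restore) are easier to compare side by side once the key=value fields are extracted. A minimal sketch in Python; parse_end_job is a hypothetical helper, not a Bacula tool. In the two logs quoted in this mail the TermCode value is the ASCII code of the JobStatus letter (84 and T, 102 and f); that this holds in general is an assumption here.

```python
import re


def parse_end_job(text):
    """Collect the numeric key=value fields from an 'End Job' debug line."""
    joined = " ".join(text.split())  # undo the mail client's line wrapping
    return {key: int(value) for key, value in re.findall(r'(\w+)=(\d+)', joined)}


# The two summaries exactly as quoted in this mail.
single_dir = '''b3: job.c:1645 End FD msg: 2800 End Job TermCode=84 JobFiles=85
ReadBytes=7090277 JobBytes=7102334 Errors=0 VSS=0 Encrypt=0'''

full_restore = '''b3: job.c:1645 End FD msg: 2800 End Job TermCode=102 JobFiles=85206
ReadBytes=2146572346 JobBytes=2782261929 Errors=0
VSS=0 Encrypt=0'''

summaries = {
    "single directory": parse_end_job(single_dir),
    "full restore": parse_end_job(full_restore),
}

for name, fields in summaries.items():
    # chr() turns the numeric TermCode back into a status letter
    # (assumption: TermCode is the ASCII code of JobStatus, as in these logs).
    status = chr(fields["TermCode"])
    gap = fields["JobBytes"] - fields["ReadBytes"]
    print(f"{name}: status={status} files={fields['JobFiles']} "
          f"JobBytes-ReadBytes={gap} errors={fields['Errors']}")
```

Note that both runs report Errors=0, so the status letter and the Canceled flag are the only fields in these summaries that distinguish the failed full restore from the successful single-directory one.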

The messages in the director at this point are:
------
17-Jul 15:24 Storage: XXX Fatal error: read.c:124 Error sending to File daemon. 
ERR=Connection reset by peer
17-Jul 15:24 Storage: XXX Error: bnet.c:439 Write error sending 32 bytes to 
client:10.2.1.13:36643: ERR=Connection
reset by peer
------




Tuesday, July 17, 2007, 5:09:05 PM:

FS> Doytchin Spiridonov wrote:

>> The messages in the director at this point are:
>> ------
>> 17-Jul 15:24 Storage: XXX Fatal error: read.c:124 Error sending to File 
>> daemon. ERR=Connection reset by peer
>> 17-Jul 15:24 Storage: XXX Error: bnet.c:439 Write error sending 32 bytes to 
>> client:10.2.1.13:36643: ERR=Connection reset by peer
>> ------
>> 
>> 
>> Anyone having similar problems?

FS> I've had similar symptoms, at least.  There are a good number of problems
FS> that could look the same.  Here's what I've gone through so far.

FS> http://www.mail-archive.com/[EMAIL PROTECTED]/msg00024.html



_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
