On Tue, 26 Oct 2010 17:01:01 +0100 Pete French wrote:

PF> Actually, I just looked in dmesg on the secondary - it is full
PF> of messages thus:
PF> Oct 26 15:44:59 serpentine-passive hastd[10394]: [serp0] (secondary) Unable to receive request header: RPC version wrong.
PF> Oct 26 15:45:00 serpentine-passive hastd[782]: [serp0] (secondary) Worker process exited ungracefully (pid=10394, exitcode=75).
PF> Oct 26 15:46:59 serpentine-passive hastd[10421]: [serp0] (secondary) Unable to receive request header: RPC version wrong.
PF> Oct 26 15:47:04 serpentine-passive hastd[782]: [serp0] (secondary) Worker process exited ungracefully (pid=10421, exitcode=75).

I saw this too, but only as sporadic messages, so I forgot about it and
did not investigate at the time :-). Now, while running synchronization,
I see them again (still only sporadically).

Setting an assertion and looking at the received header:

(gdb) list
309                     goto fail;
310
311             if (hdr.version != HAST_PROTO_VERSION) {
312                     assert(0);
313                     errno = ERPCMISMATCH;
314                     goto fail;
315             }
316
317             hdr.size = le32toh(hdr.size);
318
(gdb) p/x hdr
$2 = {version = 0x9, size = 0x65657266}

So it looks like garbage.

In hast_proto_send() we send the header and then the data. Could it be
that the remote_send and sync threads interfere and their packets get
mixed? Maybe some synchronization is needed here?

I put a sleep(1) in hast_proto_send() between proto_send(header) and
proto_send(data). The error started to occur frequently.

-- 
Mikolaj Golub
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
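
To illustrate the suspected race: if two threads each write a header and then a payload to the same connection with no lock held across the pair, a receiver can end up parsing payload bytes as a header. The size value above, 0x65657266, is the ASCII bytes "free" when stored on a little-endian host, which is consistent with that reading. Below is a minimal sketch of the kind of synchronization that would prevent it. This is not hastd code: demo_hdr, demo_send(), demo_run(), the in-memory stream, and DEMO_PROTO_VERSION are all hypothetical stand-ins, and whether hast_proto_send() really needs such a lock is exactly the open question.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PROTO_VERSION 1    /* hypothetical stand-in for HAST_PROTO_VERSION */
#define DEMO_MSGS          1000 /* messages sent by each thread */
#define DEMO_PAYLOAD       8    /* payload bytes per message */

/* Hypothetical wire header: protocol version plus payload size. */
struct demo_hdr {
	uint8_t  version;
	uint32_t size;
};

/* In-memory buffer standing in for the shared connection. */
static uint8_t stream[2 * DEMO_MSGS * (sizeof(struct demo_hdr) + DEMO_PAYLOAD)];
static size_t stream_len;
static pthread_mutex_t stream_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t send_lock = PTHREAD_MUTEX_INITIALIZER;

/* One low-level write; atomic on its own, like a single send on a socket. */
static void
stream_write(const void *buf, size_t len)
{
	pthread_mutex_lock(&stream_lock);
	assert(stream_len + len <= sizeof(stream));
	memcpy(stream + stream_len, buf, len);
	stream_len += len;
	pthread_mutex_unlock(&stream_lock);
}

/* Send header and payload as one unit: send_lock covers both writes. */
static void
demo_send(const void *data, uint32_t size)
{
	struct demo_hdr hdr;

	memset(&hdr, 0, sizeof(hdr));
	hdr.version = DEMO_PROTO_VERSION;
	hdr.size = size;

	pthread_mutex_lock(&send_lock);
	stream_write(&hdr, sizeof(hdr));
	stream_write(data, size);
	pthread_mutex_unlock(&send_lock);
}

static void *
sender(void *arg)
{
	for (int i = 0; i < DEMO_MSGS; i++)
		demo_send(arg, DEMO_PAYLOAD);
	return NULL;
}

/*
 * Run two concurrent senders, then parse the stream the way a receiver
 * would. Returns the number of well-formed frames, or -1 on error.
 */
int
demo_run(void)
{
	static char p1[DEMO_PAYLOAD] = "freebsd";
	static char p2[DEMO_PAYLOAD] = "hastd__";
	pthread_t t1, t2;
	size_t off = 0;
	int frames = 0;

	stream_len = 0;
	if (pthread_create(&t1, NULL, sender, p1) != 0)
		return -1;
	if (pthread_create(&t2, NULL, sender, p2) != 0) {
		pthread_join(t1, NULL);
		return -1;
	}
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);

	while (off + sizeof(struct demo_hdr) <= stream_len) {
		struct demo_hdr hdr;

		memcpy(&hdr, stream + off, sizeof(hdr));
		if (hdr.version != DEMO_PROTO_VERSION || hdr.size != DEMO_PAYLOAD)
			return -1;	/* payload bytes were parsed as a header */
		off += sizeof(hdr) + hdr.size;
		frames++;
	}
	return frames;
}
```

With send_lock held across both writes, demo_run() parses every frame cleanly. Removing the lock/unlock pair in demo_send() lets one thread's header land between the other thread's header and payload, and the parser then fails in the same way the secondary does; a delay between the two writes would only widen that window, which matches the sleep(1) experiment.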