Folks,

Just in case there is any doubt remaining: yes, I am an idiot.

As was probably obvious to everyone but me, in the specific case that I have bored everyone to tears with, I am running afoul of a feature that I myself enabled (i.e. timeouts). That is to say, the analysis was correct, but completely pointless. My only real problem is that my timeout is too short.

In my defense, I can only say that it _used_ to be the case that the client got an error message from the server upon timeout. This message seems to have been disabled in the 2.4.3 timeframe because it caused other problems. The absence of this message is, I think, what misled me. (It also used to be the case that my timeout was plenty long enough. Then a lot more files were added....)

I am sincerely sorry to have wasted everyone's time (including my own!) without materially moving the investigation ahead.

So, for those interested in the morals I've learned:

1) if I set a timeout, make darn sure it's long enough, because the client-side error messages will be less than informative in the event of a timeout;

2) make sure that I'm looking at the right part of the server logs (time skew notwithstanding) before I send kilobytes of drivel to the mailing list; and

3) do not repeatedly cry "The sky is falling! The sky is falling!".

Damn. I really thought I had it this time. Back to the drawing board. (The signal handling reentrancy issue is still a genuine problem, though.)

In contrition, I submit the following patch in the hope that it will help others interpret the log a bit more quickly.

--- log.c.orig	Sat Jan 29 06:35:03 2000
+++ log.c	Wed Oct 18 14:01:26 2000
@@ -25,6 +25,29 @@
 
 static FILE *logfile;
 static int log_error_fd = -1;
+static const struct errdesc { int c; const char *s; const char *d; } errcodes[] = {
+	{ RERR_SYNTAX, "RERR_SYNTAX", "syntax or usage error" },
+	{ RERR_PROTOCOL, "RERR_PROTOCOL", "protocol incompatibility" },
+	{ RERR_FILESELECT, "RERR_FILESELECT", "errors selecting input/output files, dirs" },
+	{ RERR_UNSUPPORTED, "RERR_UNSUPPORTED", "requested action not supported" },
+	{ RERR_SOCKETIO, "RERR_SOCKETIO", "error in socket IO" },
+	{ RERR_FILEIO, "RERR_FILEIO", "error in file IO" },
+	{ RERR_STREAMIO, "RERR_STREAMIO", "error in rsync protocol data stream" },
+	{ RERR_MESSAGEIO, "RERR_MESSAGEIO", "errors with program diagnostics" },
+	{ RERR_IPC, "RERR_IPC", "error in IPC code" },
+	{ RERR_SIGNAL, "RERR_SIGNAL", "status returned when sent SIGUSR1, SIGINT" },
+	{ RERR_WAITCHILD, "RERR_WAITCHILD", "some error returned by waitpid()" },
+	{ RERR_MALLOC, "RERR_MALLOC", "error allocating core memory buffers" },
+	{ RERR_TIMEOUT, "RERR_TIMEOUT", "timeout in data send/receive" },
+};
+
+static const struct errdesc *geterrdesc(int code)
+{
+	int i;
+	for ( i = 0; i < sizeof(errcodes)/sizeof(*errcodes); ++i )
+		if ( code == errcodes[i].c ) break;
+	return ( i < sizeof(errcodes)/sizeof(*errcodes) ? &errcodes[i] : NULL );
+}
 
 static void logit(int priority, char *buf)
 {
@@ -324,12 +347,16 @@
 /* called when the transfer is interrupted for some reason */
 void log_exit(int code, const char *file, int line)
 {
+	const struct errdesc *pd = NULL;
 	if (code == 0) {
 		extern struct stats stats;
 		rprintf(FLOG,"wrote %.0f bytes read %.0f bytes total size %.0f\n",
 			(double)stats.total_written,
 			(double)stats.total_read,
 			(double)stats.total_size);
+	} else if ( (pd = geterrdesc(code)) != NULL ) {
+		rprintf(FLOG,"transfer interrupted - %s (code %d, %s) at %s(%d)\n",
+			pd->d, code, pd->s, file, line);
 	} else {
 		rprintf(FLOG,"transfer interrupted (code %d) at %s(%d)\n",
 			code, file, line);

Regards,
Neil

-- 
Neil Schellenberger           | Voice : (613) 599-2300 ext. 8445
CrossKeys Systems Corporation | Fax   : (613) 599-2330
350 Terry Fox Drive           | E-Mail: [EMAIL PROTECTED]
Kanata, Ont., Canada, K2K 2W5 | URL   : http://www.crosskeys.com/
   + Greg Moore (1975-1999), Gentleman racer and great Canadian +