> if one node is just slow enough in responding that it falls outside the timeout, you could get an annoying situation where that node is out-of-step forever after.
worse yet, nodes may be sending more than one line at a time, which circumvents the aggregator. if they do it fast enough it becomes a real mess, and no amount of lookback can guarantee this isn't happening :) i'm routinely seeing syslog brought to its knees around here by a particular piece of cluster management software that logs two lines instead of one for a particular often-failing operation, so instead of 'message repeated X times' (for some very large X) we get 'disk full'...
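to make the failure mode concrete, here's a minimal sketch of syslog-style repeat suppression. it assumes the collapser only compares each line with the one immediately before it; the function name `collapse_repeats` and the sample messages are made up for illustration and are not taken from any real syslog implementation.

```python
def collapse_repeats(lines):
    """Collapse runs of identical consecutive lines into one line plus a
    'message repeated N times' marker (hypothetical, syslog-like behaviour)."""
    out = []
    prev = None
    repeats = 0
    for line in lines:
        if line == prev:
            # Same as the previous line: count it instead of storing it.
            repeats += 1
            continue
        if repeats:
            # The run just ended; emit the summary for it.
            out.append(f"message repeated {repeats} times")
        out.append(line)
        prev = line
        repeats = 0
    if repeats:
        # Flush the summary for a run that lasted until the end of input.
        out.append(f"message repeated {repeats} times")
    return out


# One line per failure: the whole burst collapses to two lines.
one_line = ["op failed"] * 1000
# Two alternating lines per failure (made-up messages): nothing collapses.
two_lines = ["op failed", "retrying op"] * 1000

print(len(collapse_repeats(one_line)))   # 2
print(len(collapse_repeats(two_lines)))  # 2000
```

since no line ever equals its immediate predecessor in the second case, every single line gets written out, which is how you end up at 'disk full'.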