Hi Phillip,
My situation is hypothetical at this point, though based on the reality of carrier-grade messaging systems. I have been building and implementing large-scale systems for many years, so I have a hard time thinking small, which can be good and bad. :)
I understand your setup and it makes perfect sense: two boxes, one primary and one secondary (preferably off site), in case something goes wrong with the first. I get having logging on both boxes.
What I don't get is how you got spamd to pick up the whitelisted entries on both boxes. AFAIK spamd does not look at the logs; it only puts entries in and does not read them.
I think I would want greylisted tuples included as well. If there were, say, 3 boxes behind the primary MX and the load balancer did not always direct the sending MTA to the same box running spamd, the sending MTA could be delayed for a very long time. Load balancers do have persistence, but it usually comes with a timeout period, which the interval between MTA retries will probably exceed.
As it is, with a single box running spamd, a new sending MTA will be delayed by about an hour. The first connection attempt gets the sender greylisted. The next attempt has to wait 30 minutes per the RFCs, and at least 25 minutes per the spamd greylisting default. The third attempt would be about 60 minutes after the first, or even later, depending on the sending MTA's retry behavior. The minimum by the RFCs would be 60 minutes, but it could certainly be longer, since some MTAs extend the time between retries after successive failures.
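To put rough numbers on the single-box and multi-box cases, here is a minimal Python sketch. It models only the behavior described above (first attempt greylists, an attempt at least 25 minutes later whitelists, the attempt after that is delivered) and is not taken from spamd itself. The function name delivered_at, the fixed 30-minute retry schedule, and the strict round-robin load balancer with no shared state are all assumptions made for illustration.

```python
def delivered_at(attempts, passtime=25):
    """Return (attempt_number, minute) of the first delivered attempt.

    attempts: list of (minute, box) pairs in time order, where box is
    the index of the spamd box the load balancer picked (hypothetical).
    Each box keeps its own state; nothing is shared between boxes.
    Returns None if no attempt in the list is delivered.
    """
    first_seen = {}      # box -> minute the sender was first greylisted there
    whitelisted = set()  # boxes that have whitelisted the sender
    for n, (minute, box) in enumerate(attempts, start=1):
        if box in whitelisted:
            return n, minute                  # passes through to the real MTA
        if box not in first_seen:
            first_seen[box] = minute          # first contact: greylisted
        elif minute - first_seen[box] >= passtime:
            whitelisted.add(box)              # whitelisted, but this attempt still fails
    return None


# Assumed sender behavior: one retry every 30 minutes.
retry_minutes = list(range(0, 300, 30))

single_box = [(m, 0) for m in retry_minutes]
three_boxes = [(m, i % 3) for i, m in enumerate(retry_minutes)]

print(delivered_at(single_box))   # (3, 60)
print(delivered_at(three_boxes))  # (7, 180)
```

Under these assumptions the single box delivers on the third attempt at 60 minutes, while three unsynchronized boxes behind a round-robin balancer push delivery out to the seventh attempt at 180 minutes. A sender that backs off between retries would do worse in both cases.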
I know spamd is very lightweight, but a single box is a SPOF and I don't like those. :) Even two boxes with pf, pfsync, and carp would still not replicate the contents of /var/db/spamd, unless I'm missing something about the way pf, pfsync, carp, and spamd integrate. I have not taken the time to actually set up such an environment and test it. If I am wrong and this would work, then maybe someone can point out to me why.
Regards, Chad