Hi, I have a puzzling issue with CARP and wondered whether anyone knows the answer. I have two CARP + pf + pfsync boxes (on OpenBSD 4.2) in a standard failover configuration (master and backup designated by advskew values). When the master is brought down, failover works nicely. When the master comes back up, though, it takes control straight away but doesn't respond to anything for between 5 and 20 seconds. I have found a workaround by enabling portfast on the switch ports that the firewalls are connected to, but it doesn't make sense to me why the firewall behaves this way when portfast is disabled.
Looking at the sequence of events when the master comes up (with portfast disabled), I have:

1. Network interface comes up.
2. Switch port cycles through listening and learning, and finally moves to the forwarding state.
3. At the precise moment the port enters the forwarding state, packets start flowing to and from the firewall, so it wins the election and becomes master again, as one would expect.
4. HOWEVER, although the master now originates and receives traffic, it doesn't respond to any traffic, i.e. it won't send an echo reply to an echo request or ACK any TCP traffic. It stays like this for between 5 and 20 seconds.
5. 5 to 20 seconds later, the machine starts responding to traffic.

With portfast turned on for the switch ports, the sequence is the same except that the 5 to 20 second delay isn't there.

I have looked at the pf logs, and pf seems to have initialised correctly and is passing in the echo requests, but I don't see anything after that. So I have ruled out pf from my investigation.

Has anyone come across anything similar in the past, or does anyone have advice on what to try in order to track down the issue? Although I can fix it by turning on portfast (which is easy to do), I'd like to understand why it does this so that I better understand the system as a whole. If anyone has any hints, I'd really appreciate hearing them.

Thanks!