Hi Steve,

I think I've tracked this down. Can you apply the attached patch on top of the one I posted before and re-run your test?
With both patches, I was able to flip-flop the downed interface multiple times, and in all cases path failover completed and data flow resumed. Here is the modified script I was running:

#!/bin/sh

net1="$1"
net2="$2"

flush() {
    iptables -F
    echo "Flush"
    exit
}

trap flush EXIT

while true; do
    # clear table
    iptables -F
    echo "flushed"
    sleep 5

    # block net1
    iptables -A INPUT -i "$net1" -p sctp -j DROP
    echo "set net1"
    sleep 5

    # clear table
    iptables -F
    echo "flushed"
    sleep 5

    # block net2
    iptables -A INPUT -i "$net2" -p sctp -j DROP
    echo "set net2"
    sleep 5
done

I was able to run this script for 10 minutes while sustaining the message flow.

-vlad
From 72d6856f7e45a17e0910e0eacd1a01d44fafd1c0 Mon Sep 17 00:00:00 2001
From: Vlad Yasevich <[EMAIL PROTECTED]>
Date: Wed, 7 Feb 2007 14:58:25 -0500
Subject: [PATCH] [SCTP] Strike the transport before updating rto

Once we reach a point where we exceed the max.path.retrans, strike the
transport before updating the rto.  This will force transport switch at
the right time, instead of 1 retransmit too late.

Signed-off-by: Vlad Yasevich <[EMAIL PROTECTED]>
---
 net/sctp/sm_statefuns.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index fbbc9e6..801f9d6 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -4605,12 +4605,12 @@ sctp_disposition_t sctp_sf_do_6_3_3_rtx(const struct sctp_endpoint *ep,
 	 * sent as soon as cwnd allows (normally when a SACK arrives).
 	 */
 
-	/* NB: Rules E4 and F1 are implicit in R1.  */
-	sctp_add_cmd_sf(commands, SCTP_CMD_RETRAN, SCTP_TRANSPORT(transport));
-
 	/* Do some failure management (Section 8.2). */
 	sctp_add_cmd_sf(commands, SCTP_CMD_STRIKE, SCTP_TRANSPORT(transport));
 
+	/* NB: Rules E4 and F1 are implicit in R1.  */
+	sctp_add_cmd_sf(commands, SCTP_CMD_RETRAN, SCTP_TRANSPORT(transport));
+
 	return SCTP_DISPOSITION_CONSUME;
 }
 
-- 
1.5.0.rc3.g6506
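
For anyone following along, the off-by-one that the commit message describes can be illustrated with a toy model. This is only a sketch, not kernel code: it assumes, as a simplification, that each timeout queues one retransmission on the same path until that path is marked failed, and the names MAX_PATH_RETRANS and retrans_on_dead_path are hypothetical, chosen for illustration. The value 2 is arbitrary.

```shell
#!/bin/sh
# Toy model only: MAX_PATH_RETRANS and retrans_on_dead_path are
# hypothetical names, not taken from the kernel sources.
MAX_PATH_RETRANS=2

# Print how many retransmissions go out on the dead path before the
# path is marked failed.
# $1 is "retran_first" (old order) or "strike_first" (patched order).
retrans_on_dead_path() {
    order="$1"
    errors=0    # error count of the dead path
    failed=0    # becomes 1 once the path is marked failed
    sent=0      # retransmissions sent on the dead path

    while [ "$failed" -eq 0 ]; do
        # One loop iteration models one retransmission timeout.
        if [ "$order" = "retran_first" ]; then
            # Retransmit is queued before the strike is counted.
            sent=$((sent + 1))
        fi

        # Strike: count the error and fail the path past the threshold.
        errors=$((errors + 1))
        if [ "$errors" -gt "$MAX_PATH_RETRANS" ]; then
            failed=1
        fi

        if [ "$order" = "strike_first" ] && [ "$failed" -eq 0 ]; then
            # Retransmit sees the path state already updated by the strike.
            sent=$((sent + 1))
        fi
    done

    echo "$sent"
}

retrans_on_dead_path retran_first
retrans_on_dead_path strike_first
```

Under these assumptions the old order sends one more retransmission on the dead path than the patched order does, which is the "1 retransmit too late" case. The real path selection in the kernel is more involved than this model.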