I am *NOT* an expert on fault tolerance, but I have studied it a little (long ago, if not so far away), and I worked on Network Alchemy's fault tolerant implementation of an IPsec gateway (a decade ago, and a little farther away). So, some suggestions on the terminology for the HA&LS draft.
Terminology:

"High Availability" refers to a gateway or cluster whose expected downtime is low on e.g. an annual basis. High availability may be achieved using fault tolerant techniques such as hardware/software clustering. It may also be achieved by e.g. using extremely robust components and having a very low reboot time.

"Fault Tolerant" refers to a gateway or cluster that will maintain service availability even when a specified set of fault conditions occurs, such as the loss of one or more cluster members to a hardware or software fault.

Clusters whose purpose is improving availability may operate using a "hot standby" model, in which one or more gateways are active and one or more gateways are held in reserve and activated when the failure of an active member is detected. Clusters whose purpose is improving scalability (of performance, number of active connections, etc.), using a "load sharing" model, have more than one member active.

IPsec gateways must be prepared for their peers to lose state, e.g. as a result of a reboot, resulting in each peer attempting to reconnect. The latency of that reconnection, and the computational load of reconnecting a large number of peers, mean that a fast-rebooting gateway alone is not sufficient to provide high availability service, driving the need for fault tolerance in an IPsec gateway implementation.

If a fault is never visible to peers, the cluster is said to be "completely transparent". If some peers must reconnect, or a change of IP address is visible, the cluster is said to be "partially transparent". It is possible to create an implementation with lazy synchronization or an otherwise incompletely redundant state, resulting in e.g. a few percent of peers (or a few percent probability of any given peer) being aware of the fault.

--Rod

_______________________________________________
IPsec mailing list
IPsec@ietf.org
https://www.ietf.org/mailman/listinfo/ipsec