I reviewed draft-ietf-ipsecme-failure-detection before the WG
meeting. Some of the comments below already have tickets, so there is
no need to add them a second time:
----------------------------------------------------------------------

Comments to draft-ietf-ipsecme-failure-detection:

Section 1:

        "However, in many cases the rebooted peer is a
        VPN gateway that protects only servers, "

        What is that supposed to mean?

Section 2:

        "Those "at least several minutes" are a time during part of
        which both peers are active, but IPsec cannot be used."

        Not true! It is the time during which one of the peers is
        active and the other one is rebooting, and the rebooting
        device might even come back up before the time runs out, as
        described in the next few paragraphs.

        I suggest removing the whole sentence.

Section 2:

        "[RFC5996] does not mandate any time limits, but it is
        possible that the peer will start liveness checks even before
        the other end is sending INVALID_SPI notification, as it
        detected that the other end is not sending any packets anymore
        while it is still rebooting or recovering from the situation."

        I think "but it is possible that the peer will start ..." is
        wrong; it should be more like "a good implementation will
        start ...".

        If an implementation supports black hole detection, there is
        no point in doing it with long timeouts. As I said, in our
        implementation that specific timeout is 10 seconds (i.e.
        around 20 times the RTT, which means that with normal TCP etc.
        traffic it never triggers, but it will trigger very quickly
        after the other end goes silent).
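
        To make the timeout argument concrete, here is a minimal
        Python sketch of such a black hole detector. It is only an
        illustration under assumptions: the class name, the method
        names and the injectable clock are hypothetical, and this is
        not the implementation referred to above. It asks for a
        liveness check once outbound traffic has gone unanswered for
        the configured timeout (10 seconds here).

```python
import time


class BlackHoleDetector:
    """Hypothetical sketch: ask for a liveness check when sent traffic
    has gone unanswered for `timeout` seconds."""

    def __init__(self, timeout=10.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        # Time of the first packet sent since we last heard from the peer,
        # or None when nothing is outstanding.
        self.unanswered_since = None

    def on_packet_sent(self):
        # Only the first unanswered packet starts the timer.
        if self.unanswered_since is None:
            self.unanswered_since = self.clock()

    def on_packet_received(self):
        # Any inbound packet on the SA proves the peer is still there.
        self.unanswered_since = None

    def should_check_liveness(self):
        # An idle SA (nothing sent) never triggers a check.
        if self.unanswered_since is None:
            return False
        return self.clock() - self.unanswered_since >= self.timeout
```

        With normal request/response traffic the timer is reset long
        before it expires, so the check only fires after the other end
        has actually gone silent.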

Section 3:

        I still think the protocol would be much easier to implement
        if we limited the QCD token taker role to the initiator and
        the token maker role to the responder.

        There is no point in making the protocol very generic, as
        implementations are not going to implement features before
        there is a real use scenario for them. This means that even if
        the document describes how it can be done, it does not help,
        as implementations will not support it. If someone finds a
        real use scenario where the responder needs to be the token
        taker, then writing a new specification for that is way faster
        than getting the implementations modified.

        I have not yet seen a use scenario for that where QCD would
        help (meaning there are other, already standardized ways in
        IKEv2 which are faster and more efficient, and which
        implementations already support).

Section 4.2:

        "The QCD_TOKEN notification is related to the IKE SA and MUST
        follow the AUTH payload and precede the Configuration payload
        and all payloads related to the child SA."

        RFC5996 removed payload ordering restrictions, so why are we
        adding them back here? I suggest removing the whole paragraph.

Section 5.2:

        I would remove this whole section.

Section 7:

        I would remove this whole section. It was good to have it
        there, but I do not think we need it anymore. At the very
        least, section 7.4 is still completely wrong and is already
        covered by section 2.

Section 8:

        "Before establishing a new IKE SA using Session Resumption, a
        client should ascertain that the gateway has indeed failed.
        This could be done using either a liveness check (as in RFC
        5996) or using the QCD tokens described in this document."

        How do you use QCD tokens to ascertain that the gateway has
        indeed failed? If you receive a QCD token then you know that
        the other end is dead, but to receive a QCD token the active
        operation you perform is to send a liveness check. I think
        this sentence requires some rewriting.

Section 8:

        The example is wrong. The exchange

        HDR, {}             -->
                                   <--  HDR, N(QCD_TOKEN)

        should be

        HDR, SK{}             -->
                                   <--  HDR, N(INVALID_IKE_SPI),
                                             N(QCD_TOKEN)
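
        For illustration, the token taker's handling of that
        unprotected reply could look roughly like the Python sketch
        below. The function name and the dictionary used to represent
        the received notifications are assumptions made for this
        example, not anything defined by the draft. The point is that
        the taker should only conclude that the peer lost the IKE SA
        when both notifications are present and the token matches the
        one it stored earlier.

```python
import hmac


def peer_lost_state(stored_token: bytes, notifies: dict) -> bool:
    """Hypothetical check on an unprotected reply to a liveness check.

    `notifies` maps notification names to their payload data (bytes);
    this representation is an assumption of the sketch.
    """
    # Without INVALID_IKE_SPI the reply does not claim the SA is unknown.
    if "INVALID_IKE_SPI" not in notifies:
        return False
    received = notifies.get("QCD_TOKEN")
    if received is None:
        return False
    # Constant-time comparison against the token stored for this IKE SA.
    return hmac.compare_digest(stored_token, received)
```

        A reply with a missing or wrong token is simply ignored, and
        the normal liveness check retransmissions continue.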

Section 9.1:

        "Implementing the "token maker" side of QCD makes sense for
        IKE implementation where protected connections originate from
        the peer, such as inter-domain VPNs and remote access
        gateways. Implementing the "token taker" side of QCD makes
        sense for IKE implementations where protected connections
        originate, such as inter-domain VPNs and remote access
        clients."

        So the token maker and the token taker are both used "where
        protected connections originate"? What is the difference? This
        text requires clarification.

Section 9.1:

        "To clarify the this discussion:"
                    ^^^^^^^^

Section 9.1:

        "o For inter-domain VPN gateway it makes sense to implement
        both roles, because it can't be known in advance where the
        traffic originates."

        I do not really see that. For inter-domain VPN gateways there
        are two possibilities: symmetric or asymmetric initiation.

        In the asymmetric situation only one end can initiate
        connections (for example because it is behind NAT or similar,
        or because the HQ VPN server is always configured to be the
        responder). In that case the inter-domain VPN case is similar
        to the remote-access client / gateway case, i.e. the
        "initiator end of the inter-domain VPN gateway" is the same as
        the "remote-access client" and the "responder end of the
        inter-domain VPN gateway" is the same as the "remote-access
        server".

        For symmetric situations, where either end can initiate
        connections, there are better and faster ways to handle
        things, as I have already described earlier.

Section 10.1:

        "Specifically, if one taker does not properly secure the QCD
        tokens and an attacker gains access to them, this attacker
        MUST NOT be able to guess other tokens generated by the same
        maker."

        This is a bit misleading, as it is trivial for an attacker to
        get a large number of tokens. It just needs to send one faked
        IKE SA packet with random IKE SPIs to the token maker to get a
        valid token for that IKE SPI pair.
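
        The following Python sketch illustrates the point. It assumes
        the token is a keyed hash over the IKE SPI pair under a secret
        held by the token maker; HMAC-SHA256 is used here only as a
        stand-in, it is not claimed to be the exact construction in
        the draft, and all function names are hypothetical. Harvesting
        tokens needs nothing but the ability to send packets, so the
        real requirement is that harvested tokens do not help in
        guessing the token of any other SPI pair.

```python
import hashlib
import hmac
import os


def make_token(qcd_secret: bytes, spi_i: bytes, spi_r: bytes) -> bytes:
    # Stand-in derivation (assumption): keyed hash over the SPI pair.
    return hmac.new(qcd_secret, spi_i + spi_r, hashlib.sha256).digest()


def harvest(qcd_secret: bytes, count: int) -> dict:
    """Model an attacker sending `count` faked packets with random SPIs.

    Each faked packet yields a valid token for the SPI pair it carried.
    """
    tokens = {}
    for _ in range(count):
        spi_i, spi_r = os.urandom(8), os.urandom(8)
        tokens[(spi_i, spi_r)] = make_token(qcd_secret, spi_i, spi_r)
    return tokens
```

        Every harvested token is valid for its own (random) SPI pair
        only, so the security text should be phrased in terms of
        tokens for SPI pairs the attacker did not choose.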

Section 10.3:

        "An attacker may try to attack QCD if the generation algorithm
        described in Section 5.1 is used."

        I do not think there is that big a difference between 5.1 and
        5.2 here. 5.2 will limit the dictionary to one IP address, but
        as it is already impossibly large it does not matter. I would
        suggest removing the reference to 5.1 in the first sentence.

Section 10.4:

        This also needs a comment that the load balancer switch
        demuxing MUST stay stable, i.e. it can never change. In
        particular, it cannot change even when one device goes
        off-line. Also, there MUST NOT be a way to bypass the load
        balancer using whatever methods are possible (including
        tunneling packets in some other tunneling protocol, adding
        routing headers, etc.). I would add an even stronger warning
        that this setup is extremely dangerous. Luckily section 10.2
        already forbids this:

        "This document does not specify how a load sharing
        configuration of IPsec gateways would work, but in order to
        support this specification, all members MUST be able to tell
        whether a particular IKE SA is active anywhere in the cluster.
        One way to do it is to synchronize a list of active IKE SPIs
        among all the cluster members."
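
        As a rough illustration of the quoted requirement, a cluster
        member could gate token sending on a cluster-wide lookup, as
        in the Python sketch below. The function name and the
        representation of the synchronized state (one set of active
        IKE SPI pairs per member) are assumptions for this example
        only.

```python
def may_send_token(spi_pair, cluster_active_spis) -> bool:
    """Hypothetical gate: reveal a QCD token only when no cluster
    member currently holds an IKE SA for `spi_pair`.

    `cluster_active_spis` is an iterable with one set of active
    (SPI-I, SPI-R) pairs per member.
    """
    return not any(spi_pair in active for active in cluster_active_spis)
```

        If the demuxing changes or can be bypassed, a packet can reach
        a member whose view of this state is incomplete, which is
        exactly why the setup is dangerous.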
-- 
kivi...@iki.fi
_______________________________________________
IPsec mailing list
IPsec@ietf.org
https://www.ietf.org/mailman/listinfo/ipsec
