Andre Oppermann wrote:
Hartmut Brandt wrote:
Rui Paulo wrote:

On 15 Nov 2008, at 20:08, Hartmut Brandt wrote:

Hi,

in tcp_syncache.c:syncache_expand() there is a test that the acknowledgement number and the sequence number of an incoming ACK segment are in the expected range. If they are not, syncache_expand() returns 0 and tcp_input drops the segment and sets a reset. So far so good. But syncache_expand() also deletes the syncache entry, and so destroys the connection. I cannot see why it does it. It seems to me that such a wrong segment should be interpreted as to be from another connection and as such the segment should be ignored (but a reset sent). When the correct ACK comes, the connection could still be established. As it is now, the establishment of incoming connections can seriously be disturbed by someone sending fake ACK packets.

The same test (for the ack number, not for the sequence number) is also further down in tcp_input.c:tcp_do_segment() (just after the header prediction stuff) and here the handling is correct: the goto dropwithreset just sends a reset and drops the segment but leaves the connection in the SYN-RECEIVED state. This test is probably never reached now, because of syncache_expand(), though.

Correct.

Maybe I fail to see something obvious, though...


Well, if the RST is sent, why should we keep the syncache entry?

Because this effectively destroys the connection in SYN-RECEIVED which is wrong according to RFC793. On page 69 the handling of incoming segments for connections in SYN-RECEIVED is described: first you check the sequence number and, if it is wrong, you send an RST (unless the RST bit is set in the incoming segment), but otherwise ignore the segment.
 >
A segment with a bad sequence number in SYN-RECEIVED is either forged or from an old connection. In both cases you don't want to destroy the embryonic connection, because the correct ACK from the correct peer may still arrive.

I see your problem.  Syncookies mitigate this problem (if not disabled) as
the correct ACK will pass that test later even if the syncache entry went
away before (which can also happen due to a generally high SYN load).

RFC793 wants us to do the following:

 Page 69: Send back a challenge ACK with the correct parameters to help
          to re-synchronize the connection when
          !(RCV.NXT =< SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND).

 Page 72: Send back a RST when !(SND.UNA =< SEG.ACK =< SND.NXT).

At the moment we send the RST and delete the syncache entry for both cases.
However we should send the ACK in the former, the RST in the latter, and
and keep the syncache entry in either case.

Fixing this requires some re-shuffling of the syncache_expand().  I'll
post a version later today.

This is a bit more complicated because of interactions with tcp_input()
where syncache_expand() is called from.

The old code (as of December 2002) behaved slightly different.  It would
not remove the syncache entry when (SND.UNA == SEG.ACK) but send a RST.
The (RCV.NXT =< SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND) test wasn't done at
all.  Instead a socket was opened whenever (SND.UNA == SEG.ACK) succeeded.
This gave way to the "LAND" DoS attack which was mostly fixed with a test
for (RCV.IRS < SEG.SEQ).

See the attached patch for fixed version of syncache_expand().  This patch
is untested though.  My development machine is currently down.  Harti, Rui
and Bjoern, please have a look at the patch and review it.

--
Andre

--- tcp_input.c.1.390   Mon Nov 17 21:33:25 2008
+++ tcp_input.c.1.390.mod       Mon Nov 17 21:35:22 2008
@@ -642,10 +642,13 @@ findpcb:
                        if (!syncache_expand(&inc, &to, th, &so, m)) {
                                /*
                                 * No syncache entry or ACK was not
-                                * for our SYN/ACK.  Send a RST.
+                                * for our SYN/ACK.  Send a RST or
+                                * an ACK for re-synchronization.
                                 * NB: syncache did its own logging
                                 * of the failure cause.
                                 */
+                               if (so == 1)
+                                       goto dropunlock;
                                rstreason = BANDLIM_RST_OPENPORT;
                                goto dropwithreset;
                        }
--- tcp_syncache.c.1.160        Mon Nov 17 16:49:01 2008
+++ tcp_syncache.c.1.160.mod    Mon Nov 17 23:35:39 2008
@@ -817,59 +817,47 @@ syncache_expand(struct in_conninfo *inc,
        INP_INFO_WLOCK_ASSERT(&V_tcbinfo);
        KASSERT((th->th_flags & (TH_RST|TH_ACK|TH_SYN)) == TH_ACK,
            ("%s: can handle only ACK", __func__));
+       *lsop = NULL;
 
        sc = syncache_lookup(inc, &sch);        /* returns locked sch */
        SCH_LOCK_ASSERT(sch);
        if (sc == NULL) {
                /*
                 * There is no syncache entry, so see if this ACK is
-                * a returning syncookie.  To do this, first:
-                *  A. See if this socket has had a syncache entry dropped in
-                *     the past.  We don't want to accept a bogus syncookie
-                *     if we've never received a SYN.
-                *  B. check that the syncookie is valid.  If it is, then
-                *     cobble up a fake syncache entry, and return.
+                * a returning syncookie.  If the syncookie is valid,
+                * cobble up a fake syncache entry and create a socket.
+                *
+                * NB: Syncache head is locked for the syncookie access.
                 */
                if (!tcp_syncookies) {
-                       SCH_UNLOCK(sch);
                        if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
                                log(LOG_DEBUG, "%s; %s: Spurious ACK, "
                                    "segment rejected (syncookies disabled)\n",
                                    s, __func__);
-                       goto failed;
+                       goto sendrst;
                }
                bzero(&scs, sizeof(scs));
                sc = syncookie_lookup(inc, sch, &scs, to, th, *lsop);
-               SCH_UNLOCK(sch);
                if (sc == NULL) {
                        if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
                                log(LOG_DEBUG, "%s; %s: Segment failed "
                                    "SYNCOOKIE authentication, segment rejected 
"
                                    "(probably spoofed)\n", s, __func__);
-                       goto failed;
+                       goto sendrst;
                }
-       } else {
-               /* Pull out the entry to unlock the bucket row. */
-               TAILQ_REMOVE(&sch->sch_bucket, sc, sc_hash);
-               sch->sch_length--;
-               V_tcp_syncache.cache_count--;
-               SCH_UNLOCK(sch);
+               goto expand;            /* fully validated through syncookie */
        }
+       SCH_LOCK_ASSERT(sch);
 
        /*
         * Segment validation:
-        * ACK must match our initial sequence number + 1 (the SYN|ACK).
-        */
-       if (th->th_ack != sc->sc_iss + 1 && !TOEPCB_ISSET(sc)) {
-               if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
-                       log(LOG_DEBUG, "%s; %s: ACK %u != ISS+1 %u, segment "
-                           "rejected\n", s, __func__, th->th_ack, sc->sc_iss);
-               goto failed;
-       }
-
-       /*
+        *
         * The SEQ must fall in the window starting at the received
         * initial receive sequence number + 1 (the SYN).
+        * If not the segment may be from an earlier connection.  We send
+        * an ACK to re-synchronize the connection and keep the syncache
+        * entry without ajusting its timers.
+        * See RFC793 page 69, first check sequence number [SYN_RECEIVED].
         */
        if ((SEQ_LEQ(th->th_seq, sc->sc_irs) ||
            SEQ_GT(th->th_seq, sc->sc_irs + sc->sc_wnd)) &&
@@ -877,14 +865,41 @@ syncache_expand(struct in_conninfo *inc,
                if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
                        log(LOG_DEBUG, "%s; %s: SEQ %u != IRS+1 %u, segment "
                            "rejected\n", s, __func__, th->th_seq, sc->sc_irs);
-               goto failed;
+               (void) syncache_respond(sc);
+               *lsop = 1;                      /* prevent RST */
+               goto sendrstkeep;
        }
 
+       /*
+        * ACK must match our initial sequence number + 1 (the SYN|ACK).
+        * If not the segment may be from an earlier connection.  We send
+        * a RST but keep the syncache entry without ajusting its timers.
+        * See RFC793 page 72, fifth check the ACK field, [SYN_RECEIVED].
+        */
+       if (th->th_ack != sc->sc_iss + 1 && !TOEPCB_ISSET(sc)) {
+               if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
+                       log(LOG_DEBUG, "%s; %s: ACK %u != ISS+1 %u, segment "
+                           "rejected\n", s, __func__, th->th_ack, sc->sc_iss);
+               goto sendrstkeep;
+       }
+
+       /*
+        * Remove the entry to unlock the bucket row.
+        * Tests from now on are fatal and remove the syncache entry.
+        */
+       TAILQ_REMOVE(&sch->sch_bucket, sc, sc_hash);
+       sch->sch_length--;
+       V_tcp_syncache.cache_count--;
+
+       /*
+        * If timestamps were not negotiated they must not show up later.
+        * See RFC1312bis, section 1.3, second paragraph 
+        */
        if (!(sc->sc_flags & SCF_TIMESTAMP) && (to->to_flags & TOF_TS)) {
                if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
                        log(LOG_DEBUG, "%s; %s: Timestamp not expected, "
                            "segment rejected\n", s, __func__);
-               goto failed;
+               goto sendrst;
        }
        /*
         * If timestamps were negotiated the reflected timestamp
@@ -896,9 +911,11 @@ syncache_expand(struct in_conninfo *inc,
                        log(LOG_DEBUG, "%s; %s: TSECR %u != TS %u, "
                            "segment rejected\n",
                            s, __func__, to->to_tsecr, sc->sc_ts);
-               goto failed;
+               goto sendrst;
        }
 
+expand:
+       SCH_UNLOCK(sch);
        *lsop = syncache_socket(sc, *lsop, m);
 
        if (*lsop == NULL)
@@ -906,16 +923,18 @@ syncache_expand(struct in_conninfo *inc,
        else
                V_tcpstat.tcps_sc_completed++;
 
-/* how do we find the inp for the new socket? */
        if (sc != &scs)
                syncache_free(sc);
        return (1);
-failed:
-       if (sc != NULL && sc != &scs)
+
+sendrst:
+       if (sc != &scs)
                syncache_free(sc);
+sendrstkeep:
+       SCH_LOCK_ASSERT(sch);
+       SCH_UNLOCK(sch);
        if (s != NULL)
                free(s, M_TCPLOG);
-       *lsop = NULL;
        return (0);
 }
 
@@ -1322,6 +1341,8 @@ syncache_respond(struct syncache *sc)
                 *
                 *      1) path_mtu_discovery is disabled
                 *      2) the SCF_UNREACH flag has been set
+                *
+                * XXXAO: The route lookup comment doesn't make sense.
                 */
                if (V_path_mtu_discovery && ((sc->sc_flags & SCF_UNREACH) == 0))
                       ip->ip_off |= IP_DF;
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

Reply via email to