On Feb 19, 2014, at 4:18, Eggert, Lars <l...@netapp.com> wrote:

> Hi,
>
> Midori Kato has implemented Microsoft's/Stanford's Datacenter TCP (DCTCP) for
> FreeBSD as part of her MS thesis with me. Find a patch attached.
>

Thanks! Any hints on how best to test this code?

Best,
George

> Also note that we're documenting a specification for DCTCP in an IETF draft:
> http://tools.ietf.org/html/draft-bensley-tcpm-dctcp
>
> Microsoft has made a licensing statement (RAND-Z) on the technology to the
> IETF: https://datatracker.ietf.org/ipr/2319/ (I'm not sure what this means
> for an eventual inclusion in FreeBSD.)
>
> Roughly, Midori's patch consists of an extension of the modular congestion
> control framework to expose ECN information to the modules, a module to
> implement DCTCP, and a few experimental variants. See Midori's explanation:
>
>> [1] A change for the modular congestion control framework (See Section 4.1
>> if needed)
>> DCTCP uses different ECN processing from RFC3168. We need to prepare
>> three functions to do the following ECN processing.
>> a) The kernel decides whether an ECE flag should be set in the next outgoing
>> TCP segment by snooping reserved bits in the IP and TCP headers. (tcp_input.c)
>> b) The kernel reacts to congestion if an ECE flag is set in an arriving TCP
>> segment. (tcp_input.c)
>> c) After the outgoing TCP segment is generated, the kernel decides whether
>> an ECT bit should be set in the ECN field of the IP header in the outgoing
>> packet. (tcp_output.c)
>> The current framework has no housekeeping functions for (a) and (b).
>> Therefore, I add two functions to the modular cc framework:
>> ecnpkt_handler() and ect_handler().
>>
>> - ecnpkt_handler() allows the kernel to do the additional ECN processing by
>> snooping the ECN field in the IP and TCP headers. As an option, this function
>> takes a flag, which tells whether it is called while an ACK is being delayed.
>> This function returns an integer value. When the return value is set, the
>> kernel forcibly disables delayed ACK.
>> - ect_handler() allows the kernel to use a different rule from RFC3168 for
>> ECT marking in the outgoing segment. This function returns an
>> integer value.
>> If the value is set, an ECT bit is set in the outgoing
>> segment.
>>
>>
>> [2] Five changes from the original DCTCP algorithm
>> In order to reflect the DCTCP motivation, I modified the following
>> processing. The first four modifications are for senders and the last
>> modification is for receivers.
>>
>> (1) no congestion recovery on receipt of ECE flags (See section 4.2.1 if
>> needed)
>> FreeBSD handles ECN as a congestion event, but that is not appropriate for
>> DCTCP senders. A DCTCP sender uses ECN as a means to understand the extent of
>> congestion. Therefore, I removed congestion recovery mode in every situation
>> for DCTCP senders.
>>
>> (2) selective initial alpha value (See section 4.2.2 if needed)
>> DCTCP defines alpha as a parameter that reflects the depth of congestion. A
>> large alpha value gives a DCTCP sender a saw-toothed CWND behavior.
>> The problem is that the alpha value is not reliable during the first dozen
>> or so RTTs, because there is no way to know the depth of congestion in the
>> network from the beginning. Considering the reliability of alpha, I think
>> the initial alpha should be selectable by users per application. When a user
>> chooses DCTCP for latency-sensitive applications, the initial alpha is
>> preferred. Otherwise, DCTCP senders had better set the initial alpha
>> value to zero, according to my experimental results (See section 7.2 of the
>> attached file).
>> The default alpha value is set to zero in my implementation.
>>
>> (3) alpha value initialization after an idle period (See section 4.2.3 if
>> needed)
>> The length of an idle period is not predictable. Therefore, for a DCTCP
>> sender, using the outdated alpha after an idle period is not a good idea. A
>> DCTCP sender resets alpha to the initial value when an idle period occurs.
>>
>> The following changes are applied to eliminate a compatibility issue with
>> standard ECN as defined in RFC3168.
>> DCTCP and standard ECN servers have no way
>> to identify which mechanism is working on the peer. Thus, we need to
>> eliminate the worst situation in a network mixing DCTCP senders/receivers
>> and standard ECN senders/receivers.
>> (4) using the CWR flag when the ECE flag is found for an RTT (See section 5.1
>> if needed)
>> This change applies to the situation where a sender uses DCTCP and a
>> receiver uses standard ECN.
>> In that situation, I find that a DCTCP sender minimizes CWND. The detailed
>> technical reason is described in section 4.2 of my attached file.
>> Fortunately, the current tcp_input() function already covers this change;
>> thus, there is no modification for it in my patch.
>>
>> (5) enabling delayed ACK on receipt of the CWR flag (See section 5.2 if
>> needed)
>> This change applies to the situation where a sender uses standard ECN and a
>> receiver uses DCTCP. In that situation, I find that a standard ECN sender
>> increases CWND less than expected without this change. The detailed
>> technical reason is described in section 5.2 of my attached file.
>
>
> The patch is attached and should apply to a recent -CURRENT. Midori's thesis
> (which she refers to in the quoted text above) is at
> https://eggert.org/students/kato-thesis.pdf
>
> Lars
>
> <dctcp.patch>
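
The sender-side alpha handling described in items (2) and (3) of the quoted
text can be sketched in C as below. This is only a minimal sketch under stated
assumptions and is not taken from Midori's patch: the struct and function
names, the fixed-point scaling, the gain g = 1/16, and the one-MSS floor are
all hypothetical choices made for illustration. The update rule
alpha = (1 - g) * alpha + g * F and the reduction cwnd = cwnd * (1 - alpha / 2)
follow the DCTCP algorithm as it is generally described.

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical; none come from dctcp.patch. */
#define DCTCP_SHIFT     10                    /* fixed point: 1024 == 1.0 */
#define DCTCP_MAX_ALPHA (1u << DCTCP_SHIFT)
#define DCTCP_G_SHIFT   4                     /* assumed gain g = 1/16 */

struct dctcp_state {
    uint32_t alpha;         /* congestion estimate, scaled by DCTCP_MAX_ALPHA */
    uint32_t bytes_total;   /* bytes ACKed in the current window */
    uint32_t bytes_marked;  /* of those, bytes ACKed with ECE set */
};

/* Item (2): the initial alpha is selectable; the mail's default is zero. */
static void dctcp_init(struct dctcp_state *s, uint32_t initial_alpha)
{
    s->alpha = initial_alpha > DCTCP_MAX_ALPHA ? DCTCP_MAX_ALPHA : initial_alpha;
    s->bytes_total = 0;
    s->bytes_marked = 0;
}

/* Account for one ACK; ece is nonzero if the ACK carried the ECE flag. */
static void dctcp_on_ack(struct dctcp_state *s, uint32_t bytes_acked, int ece)
{
    s->bytes_total += bytes_acked;
    if (ece)
        s->bytes_marked += bytes_acked;
}

/* Once per window: alpha = (1 - g) * alpha + g * F, with F = marked / total. */
static void dctcp_update_alpha(struct dctcp_state *s)
{
    assert(s->bytes_marked <= s->bytes_total);
    if (s->bytes_total > 0) {
        uint32_t f = (uint32_t)(((uint64_t)s->bytes_marked << DCTCP_SHIFT) /
            s->bytes_total);
        s->alpha = s->alpha - (s->alpha >> DCTCP_G_SHIFT) + (f >> DCTCP_G_SHIFT);
        if (s->alpha > DCTCP_MAX_ALPHA)
            s->alpha = DCTCP_MAX_ALPHA;
    }
    s->bytes_total = 0;
    s->bytes_marked = 0;
}

/* Scale cwnd by (1 - alpha / 2); the one-MSS floor is an assumption. */
static uint32_t dctcp_reduce_cwnd(const struct dctcp_state *s, uint32_t cwnd,
    uint32_t mss)
{
    uint32_t cut = (uint32_t)(((uint64_t)cwnd * s->alpha) >> (DCTCP_SHIFT + 1));
    uint32_t next = cwnd - cut;
    return next < mss ? mss : next;
}

/* Item (3): after an idle period, discard the stale estimate. */
static void dctcp_after_idle(struct dctcp_state *s, uint32_t initial_alpha)
{
    dctcp_init(s, initial_alpha);
}
```

With these assumptions, a sender that starts at alpha = 0 and sees one fully
marked window ends up with alpha = 1/16, so its next reduction trims CWND by
roughly 3% rather than halving it, which is the gentle reaction item (1)
relies on.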