On 10/24/2017 03:11 AM, Neal Cardwell wrote:
On Mon, Oct 23, 2017 at 6:15 PM, Steve Ibanez <siba...@stanford.edu> wrote:
Hi All,

I upgraded the kernel on all of our machines to Linux
4.13.8-041308-lowlatency. However, I'm still observing the same
behavior where the source enters a timeout when the CWND=1MSS and it
receives ECN marks.

Here are the measured flow rates:
<https://drive.google.com/file/d/0B-bt9QS-C3ONT0VXMUt6WHhKREE/view?usp=sharing>

Here are snapshots of the packet traces at the sources when they both
enter a timeout at t=1.6sec:

10.0.0.1 timeout event:
<https://drive.google.com/file/d/0B-bt9QS-C3ONcl9WRnRPazg2ems/view?usp=sharing>

10.0.0.3 timeout event:
<https://drive.google.com/file/d/0B-bt9QS-C3ONeDlxRjNXa0VzWm8/view?usp=sharing>

Both still essentially follow the same sequence of events that I
mentioned earlier:
(1) receives an ACK for byte XYZ with the ECN flag set
(2) stops sending for RTO_min=300ms
(3) sends a retransmission for byte XYZ

The cwnd samples reported by tcp_probe still indicate that the sources
are reacting to the ECN marks more than once per window. Here are the
cwnd samples at the same timeout event mentioned above:
<https://drive.google.com/file/d/0B-bt9QS-C3ONdEZQdktpaW5JUm8/view?usp=sharing>

Let me know if there is anything else you think I should try.

Sounds like perhaps cwnd is being set to 0 somewhere in this DCTCP
scenario. Would you be able to add printk statements in
tcp_init_cwnd_reduction(), tcp_cwnd_reduction(), and
tcp_end_cwnd_reduction(), printing the IP:port, tp->snd_cwnd, and
tp->snd_ssthresh?

Based on the output you may be able to figure out where cwnd is being
set to zero. If not, could you please post the printk output and
tcpdump traces (.pcap, headers-only is fine) from your tests?

Hi Steve, do you have any updates on your debugging?

thanks,
neal


Reply via email to