> DCTCP is meant to be used in environments where the switches/routers do
> ECN marking, so it is not surprising that it performs badly when used in
> environments where it was not meant to be used. Has anyone measured the
> effect of this changed when DCTCP is used in environments where all the
> switches/routers do ECN marking? My concern is that we could end up
> hurting performance when DCTCP is used how it was meant to be used in
> order to protect incorrect uses of DCTCP.

The reported results are indeed from a non-optimal setting, and were mostly meant to show that DCTCP was ignoring losses.

In practice, we only use DCTCP with ECN-enabled AQMs and rarely see the loss reaction trigger (e.g., when the initial windows (IW) of a burst of new flows congest the ToR switch, in which case I'd argue the behavior is beneficial). I cannot estimate the impact on FB's workloads, though.

I had originally added a module parameter to make this loss reaction optional, mostly so that people could first check whether it was safe to use with their configuration. In hindsight, I should have waited a bit longer before submitting the v2 that removed it as requested.

Olivier