On Fri, 25 Aug 2017 15:36:22 +0000 "Waskiewicz Jr, Peter" <peter.waskiewicz...@intel.com> wrote:
> On 8/25/17 11:25 AM, Jacob Keller wrote:
> > Under some circumstances, such as with many stacked devices, it is
> > possible that dev_hard_start_xmit will bundle many packets together,
> > and mark them all with xmit_more.
> >
> > Most drivers respond to xmit_more by skipping tail bumps on packet
> > rings, or similar behavior as long as xmit_more is set. This is
> > a performance win since it means drivers can avoid notifying hardware
> > of new packets repeatedly, and thus avoid wasting unnecessary PCIe or
> > other bandwidth.
> >
> > This use of xmit_more comes with a trade-off because bundling too many
> > packets can increase latency of the Tx packets. To avoid this, we
> > should limit the maximum number of packets with xmit_more.
> >
> > Driver authors could modify their drivers to check for some determined
> > limit, but this requires all drivers to be modified in order to gain
> > advantage.
> >
> > Instead, add a sysctl "xmit_more_max" which can be used to configure
> > the maximum number of xmit_more skbs to send in a sequence. This
> > ensures that all drivers benefit, and allows system administrators the
> > option to tune the value to their environment.
> >
> > Signed-off-by: Jacob Keller <jacob.e.kel...@intel.com>
> > ---
> >
> > Stray thoughts and further questions....
> >
> > Is this the right approach? Did I miss any other places where we
> > should limit? Does the limit make sense? Should it instead be a
> > per-device tuning knob instead of a global? Is 32 a good default?
>
> I actually like the idea of a per-device knob. An xmit_more_max that's
> global in a system with 1GbE devices along with a 25/50GbE or more just
> doesn't make much sense to me. Or having heterogeneous vendor devices
> in the same system that have different HW behaviors could mask issues
> with latency.
>
> This seems like another incarnation of possible buffer-bloat if the max
> is too high...
>
> >  Documentation/sysctl/net.txt |  6 ++++++
> >  include/linux/netdevice.h    |  2 ++
> >  net/core/dev.c               | 10 +++++++++-
> >  net/core/sysctl_net_core.c   |  7 +++++++
> >  4 files changed, 24 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/sysctl/net.txt b/Documentation/sysctl/net.txt
> > index b67044a2575f..3d995e8f4448 100644
> > --- a/Documentation/sysctl/net.txt
> > +++ b/Documentation/sysctl/net.txt
> > @@ -230,6 +230,12 @@ netdev_max_backlog
> >  Maximum number of packets, queued on the INPUT side, when the interface
> >  receives packets faster than kernel can process them.
> >
> > +xmit_more_max
> > +-------------
> > +
> > +Maximum number of packets in a row to mark with skb->xmit_more. A value of zero
> > +indicates no limit.
>
> What defines "packet?" MTU-sized packets, or payloads coming down from
> the stack (e.g. TSO's)?

xmit_more is only a hint to the device. The device driver should ignore
it unless there are hardware advantages. The device driver is the place
with HW-specific knowledge (like 4 Tx descriptors being equivalent to
one PCI transaction on this device).

Anything that pushes that optimization out to the user is only useful
for benchmarks and embedded devices.
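
For illustration, the tail-bump pattern most drivers already use looks
roughly like the sketch below. It is not taken from any particular
driver; everything prefixed with my_ is a made-up placeholder. Only the
skb->xmit_more test and the netif_xmit_stopped() check reflect the
common idiom.

	/* Minimal sketch of a driver xmit routine honoring xmit_more.
	 * my_tx_ring, my_pick_ring and my_fill_descriptors are
	 * illustrative placeholders, not real kernel APIs.
	 */
	#include <linux/io.h>
	#include <linux/netdevice.h>
	#include <linux/skbuff.h>

	struct my_tx_ring {
		u16 qid;		/* tx queue index for this ring */
		u16 next_to_use;	/* next free descriptor slot */
		void __iomem *tail;	/* doorbell/tail register */
	};

	/* Driver-specific helpers, declared only so the sketch is complete. */
	struct my_tx_ring *my_pick_ring(struct net_device *dev, struct sk_buff *skb);
	void my_fill_descriptors(struct my_tx_ring *ring, struct sk_buff *skb);

	static netdev_tx_t my_xmit(struct sk_buff *skb, struct net_device *dev)
	{
		struct my_tx_ring *ring = my_pick_ring(dev, skb);

		my_fill_descriptors(ring, skb);

		/* Skip the doorbell while the stack says more packets are
		 * coming (skb->xmit_more set), but always flush if the queue
		 * has been stopped, otherwise the pending descriptors would
		 * never be handed to hardware.
		 */
		if (!skb->xmit_more ||
		    netif_xmit_stopped(netdev_get_tx_queue(dev, ring->qid)))
			writel(ring->next_to_use, ring->tail);

		return NETDEV_TX_OK;
	}

That branch is also the natural place to apply a per-device batching
limit, for example by counting descriptors since the last tail bump and
ringing the doorbell once the count reaches whatever the hardware treats
as a full transaction; that is why driver-level knowledge fits better
than a global sysctl.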