From: Eric Dumazet <eduma...@google.com>
Date: Thu, 20 Aug 2020 08:43:59 -0700

> Currently, tcp sendmsg(MSG_ZEROCOPY) is building skbs with order-0 fragments.
> Compared to standard sendmsg(), these skbs usually contain up to 16 fragments
> on arches with 4KB page sizes, instead of two.
> 
> This adds considerable costs on various ndo_start_xmit() handlers,
> especially when IOMMU is in the picture.
> 
> As high performance applications are often using huge pages,
> we can try to combine adjacent pages belonging to same
> compound page.
> 
> Tested on AMD Rome platform, with IOMMU, nominal single TCP flow speed
> is roughly doubled (~55Gbit -> ~100Gbit), when user application
> is using hugepages.
> 
> For reference, nominal single TCP flow speed on this platform
> without MSG_ZEROCOPY is ~65Gbit.
> 
> Signed-off-by: Eric Dumazet <eduma...@google.com>

Applied, the refcounting in these kinds of patches is always fun to
audit :-)
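
For readers following along, the coalescing idea in the quoted commit message can be sketched in user space. This is only a toy model under stated assumptions, not the code from the patch: `toy_page`, `toy_frag`, and `coalesce_pages()` are hypothetical names, each page is assumed to be used in full, and a 4KB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u /* assumed 4KB pages, as in the commit message */

/* Hypothetical stand-in for a pinned user page (not the kernel's struct page). */
struct toy_page {
    unsigned long pfn;  /* page frame number */
    unsigned long head; /* pfn of the compound head; equals pfn for an order-0 page */
};

/* Hypothetical stand-in for one fragment of an outgoing buffer. */
struct toy_frag {
    unsigned long first_pfn; /* first page covered by this fragment */
    size_t len;              /* bytes covered, a multiple of TOY_PAGE_SIZE here */
};

/*
 * Walk the pages in order and extend the last fragment whenever the next
 * page is physically adjacent to it and shares the same compound head.
 * Otherwise start a new fragment. Stops early if max_frags is reached.
 * Returns the number of fragments written to frags[].
 */
size_t coalesce_pages(const struct toy_page *pages, size_t n,
                      struct toy_frag *frags, size_t max_frags)
{
    size_t nfrags = 0;

    assert(frags != NULL || max_frags == 0);

    for (size_t i = 0; i < n; i++) {
        if (nfrags > 0) {
            struct toy_frag *last = &frags[nfrags - 1];
            unsigned long next_pfn =
                last->first_pfn + last->len / TOY_PAGE_SIZE;

            /* Merge only within one compound page. */
            if (pages[i].pfn == next_pfn &&
                pages[i].head == pages[i - 1].head) {
                last->len += TOY_PAGE_SIZE;
                continue;
            }
        }
        if (nfrags == max_frags)
            break; /* no room for another fragment */
        frags[nfrags].first_pfn = pages[i].pfn;
        frags[nfrags].len = TOY_PAGE_SIZE;
        nfrags++;
    }
    return nfrags;
}
```

In this model, 16 consecutive pages from one huge page collapse into a single 64KB fragment, while 16 order-0 pages stay as 16 fragments even if their frame numbers happen to be contiguous, since they do not share a compound head.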
