On 04.01.2016 00:22, Tom Herbert wrote:
Implement an assembly routine for csum_partial for 64-bit x86. This
primarily speeds up checksum calculation for smaller lengths, such as
those present when doing skb_postpull_rcsum after getting
CHECKSUM_COMPLETE from the device or after a CHECKSUM_UNNECESSARY
conversion.

This implementation is similar to the csum_partial implemented in
checksum_32.S; however, since we are dealing with 8 bytes at a time,
there are more cases for alignment and small lengths. For those we
employ jump tables.
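For reference, here is a minimal C sketch of the operation csum_partial performs: a ones' complement partial sum, accumulated 8 bytes at a time with carry wrap-around and then folded to 32 bits. This is not the kernel's code and the function name `csum_sketch` is made up for illustration; it only shows the arithmetic the assembly routine implements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative sketch, not the kernel implementation: ones' complement
 * sum over a buffer, 8 bytes at a time, with end-around carry. */
static uint32_t csum_sketch(const void *buf, size_t len, uint32_t sum0)
{
	const unsigned char *p = buf;
	uint64_t sum = sum0;

	while (len >= 8) {
		uint64_t w;

		memcpy(&w, p, 8);	/* unaligned-safe 8-byte load */
		sum += w;
		if (sum < w)		/* carry out of bit 63: wrap it around */
			sum++;
		p += 8;
		len -= 8;
	}

	/* Accumulate the trailing 0-7 bytes (little-endian layout). */
	uint64_t tail = 0;
	for (size_t i = 0; i < len; i++)
		tail |= (uint64_t)p[i] << (8 * i);
	sum += tail;
	if (sum < tail)
		sum++;

	/* Fold 64 bits down to 32, preserving ones' complement carries. */
	sum = (sum & 0xffffffffULL) + (sum >> 32);
	sum = (sum & 0xffffffffULL) + (sum >> 32);
	return (uint32_t)sum;
}
```

The assembly version wins on small lengths by dispatching through jump tables instead of looping over the alignment and tail cases.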

Testing:

Verified correctness by testing arbitrary-length buffers filled with
random data. For each buffer I compared the computed checksum against
that of the original algorithm, for each possible alignment (0-7 bytes).

Checksum performance:

Isolating old and new implementation for some common cases:

                        Old      New
Case                    nsecs    nsecs    Improvement
---------------------+--------+--------+-----------------------------
1400 bytes (0 align)   194.4    176.7       9%    (Big packet)
40 bytes (0 align)      10.5      5.7      45%    (IPv6 hdr common case)
8 bytes (4 align)        8.6      7.4      15%    (UDP, VXLAN in IPv4)
14 bytes (0 align)      10.4      6.5      37%    (Eth hdr)
14 bytes (4 align)      10.8      7.8      27%    (Eth hdr in IPv4)

Signed-off-by: Tom Herbert <t...@herbertland.com>

I verified the implementation through tests and can also see a speed-up in almost all cases. Unfortunately, using the _addcarry_u64 intrinsic or __int128 arithmetic to let the compiler emit adc instructions generated even worse code than the current implementation.
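For context, a sketch of the __int128 variant alluded to above: accumulating into a 128-bit integer so the compiler is free to lower the additions to add/adc pairs. This is an assumed reconstruction (the function name `csum_u128` is invented here), and per the observation above, compilers produced worse code from this shape than the hand-written assembly does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch of the __int128 approach (GCC/Clang extension): the 128-bit
 * accumulator lets the compiler use adc for the high half instead of
 * explicit end-around carry handling. */
static uint32_t csum_u128(const void *buf, size_t len, uint32_t sum0)
{
	const unsigned char *p = buf;
	unsigned __int128 sum = sum0;

	while (len >= 8) {
		uint64_t w;

		memcpy(&w, p, 8);
		sum += w;	/* carry lands in the high 64 bits */
		p += 8;
		len -= 8;
	}
	for (size_t i = 0; i < len; i++)
		sum += (unsigned __int128)p[i] << (8 * i);

	/* Fold 128 -> 64 -> 32 bits with end-around carry. */
	uint64_t s = (uint64_t)sum + (uint64_t)(sum >> 64);
	if (s < (uint64_t)sum)
		s++;
	s = (s & 0xffffffffULL) + (s >> 32);
	s = (s & 0xffffffffULL) + (s >> 32);
	return (uint32_t)s;
}
```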

Acked-by: Hannes Frederic Sowa <han...@stressinduktion.org>

Thanks Tom!
