On 8/6/20 9:48 PM, Scott Dial wrote:
The aes-aesni driver is smart enough to use the FPU if it's not busy and
fall back to the CPU otherwise. Unfortunately, the ghash-clmulni driver
does not have that kind of logic in it and only provides an async version,
so we are forced to use the ghash-generic implementation, which is a pure
CPU implementation. The ideal would be for aesni_intel to provide a
synchronous version of gcm(aes) that fell back to the CPU if the FPU is
busy.

I don't know how the AES-NI support works, but since you specifically mentioned aesni_intel, I should point out that this also affects AMD. I just got access to AMD nodes (2 x EPYC 7302) with a Mellanox 10 GbE NIC, ran the same test, and saw a similar performance pattern. I doubt this means much, but I figured it was worth mentioning.

I don't know if the crypto maintainers would be open to such a change, but
if the choice was between reverting and patching the crypto code, then I
would work on patching the crypto code.

I can't comment on anything crypto-related since it is far outside my area of expertise, though it is helpful to hear what is going on.

In any case, you didn't report how many packets arrived out of order, which
was the issue being addressed by my change. It would be helpful to get
the output of "ip -s macsec show" and specifically the InPktsDelayed
counter. Did iperf3 report out-of-order packets with the patch reverted?
Otherwise, if this is the only process running on your test servers,
then you may not be generating any contention for the FPU, which is the
source of the out-of-order issue. Maybe you could run prime95 to busy
the FPU to see the issue that I was seeing.

I ran some tests again on the same servers as before with the Intel NICs. prime95 was running on 27 of the 28 cores in *each* server simultaneously (leaving one core on each for iperf3) throughout the entire test. This was on kernel 5.7.11 with ab046a5d4be4c90a3952a0eae75617b49c0cb01b reverted, so pre-5.7 performance.

MACsec interfaces are deleted and recreated before each test, so counters are always fresh.

== MACSEC WITHOUT ENCRYPTION ==

* Server1:
18: ms1: protect on validate strict sc off sa off encrypt off send_sci on end_station off scb off replay off
    cipher suite: GCM-AES-128, using ICV length 16
    TXSC: 0000000000001234 on SA 0
    stats: OutPktsUntagged InPktsUntagged OutPktsTooLong InPktsNoTag InPktsBadTag InPktsUnknownSCI InPktsNoSCI InPktsOverrun
                         0              0              0        1123            0                0           1             0
    stats: OutPktsProtected OutPktsEncrypted OutOctetsProtected OutOctetsEncrypted
                    3798421                0        30889802591                  0
        0: PN 3799655, state on, key 01000000000000000000000000000000
    stats: OutPktsProtected OutPktsEncrypted
                    3798421                0
    RXSC: 0000000000001234, state on
    stats: InOctetsValidated InOctetsDecrypted InPktsUnchecked InPktsDelayed InPktsOK InPktsInvalid InPktsLate InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
                 30042694872                 0               0           218  3675170             0          0              0                0              0
        0: PN 3676633, state on, key 01000000000000000000000000000000
    stats: InPktsOK InPktsInvalid InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
            3675170             0              0 0              0

* Server2:
18: ms1: protect on validate strict sc off sa off encrypt off send_sci on end_station off scb off replay off
    cipher suite: GCM-AES-128, using ICV length 16
    TXSC: 0000000000001234 on SA 0
    stats: OutPktsUntagged InPktsUntagged OutPktsTooLong InPktsNoTag InPktsBadTag InPktsUnknownSCI InPktsNoSCI InPktsOverrun
                         0              0              0        1227            0                0           1             0
    stats: OutPktsProtected OutPktsEncrypted OutOctetsProtected OutOctetsEncrypted
                    3675399                0        30042696158                  0
        0: PN 3676633, state on, key 01000000000000000000000000000000
    stats: OutPktsProtected OutPktsEncrypted
                    3675399                0
    RXSC: 0000000000001234, state on
    stats: InOctetsValidated InOctetsDecrypted InPktsUnchecked InPktsDelayed InPktsOK InPktsInvalid InPktsLate InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
                 30889801305                 0               0             0  3798410             0          0              0                0              0
        0: PN 3799655, state on, key 01000000000000000000000000000000
    stats: InPktsOK InPktsInvalid InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
            3798410             0              0 0              0


InPktsDelayed was 218 for Server1 and 0 for Server2.
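For anyone who wants to pull these counters out of the output without reading it by eye, here is a minimal Python sketch. It assumes the usual layout in which every `stats:` header line is followed directly by one line of values; the function names `parse_macsec_stats` and `counter` are made up for illustration and are not part of iproute2.

```python
def parse_macsec_stats(text):
    """Return one {counter_name: value} dict per 'stats:' block in the text."""
    blocks = []
    lines = text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if not stripped.startswith("stats:"):
            continue
        names = stripped[len("stats:"):].split()
        if i + 1 >= len(lines):
            continue
        values = lines[i + 1].split()
        # Skip blocks whose value line does not line up with the header.
        if len(values) != len(names) or not all(v.isdigit() for v in values):
            continue
        blocks.append(dict(zip(names, map(int, values))))
    return blocks


def counter(text, name):
    """Return the first occurrence of the named counter, or None if absent."""
    for block in parse_macsec_stats(text):
        if name in block:
            return block[name]
    return None


# RXSC stats block from Server1 in the run without encryption.
sample = """\
    RXSC: 0000000000001234, state on
    stats: InOctetsValidated InOctetsDecrypted InPktsUnchecked InPktsDelayed InPktsOK InPktsInvalid InPktsLate InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
                 30042694872                 0               0           218  3675170             0          0              0                0              0
"""

print(counter(sample, "InPktsDelayed"))  # prints 218
```

In practice the text would come from capturing the output of "ip -s macsec show" on each server after the test.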

== MACSEC WITH ENCRYPTION ==

The same test *with* encryption (MACsec interface again deleted and recreated beforehand, so counters are fresh):
* Server1:
19: ms1: protect on validate strict sc off sa off encrypt on send_sci on end_station off scb off replay off
    cipher suite: GCM-AES-128, using ICV length 16
    TXSC: 0000000000001234 on SA 0
    stats: OutPktsUntagged InPktsUntagged OutPktsTooLong InPktsNoTag InPktsBadTag InPktsUnknownSCI InPktsNoSCI InPktsOverrun
                         0              0              0        1397            0                0           0             0
    stats: OutPktsProtected OutPktsEncrypted OutOctetsProtected OutOctetsEncrypted
                          0          5560714                  0        46931594623
        0: PN 5561948, state on, key 01000000000000000000000000000000
    stats: OutPktsProtected OutPktsEncrypted
                          0          5560714
    RXSC: 0000000000001234, state on
    stats: InOctetsValidated InOctetsDecrypted InPktsUnchecked InPktsDelayed InPktsOK InPktsInvalid InPktsLate InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
                           0       45977049585               0          3771  5417843             0          0              0                0              0
        0: PN 5422860, state on, key 01000000000000000000000000000000
    stats: InPktsOK InPktsInvalid InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
            5417843             0              0 0              0

* Server2:
19: ms1: protect on validate strict sc off sa off encrypt on send_sci on end_station off scb off replay off
    cipher suite: GCM-AES-128, using ICV length 16
    TXSC: 0000000000001234 on SA 0
    stats: OutPktsUntagged InPktsUntagged OutPktsTooLong InPktsNoTag InPktsBadTag InPktsUnknownSCI InPktsNoSCI InPktsOverrun
                         0              0              0        1490            0                0           0             0
    stats: OutPktsProtected OutPktsEncrypted OutOctetsProtected OutOctetsEncrypted
                          0          5421626                  0        45977059885
        0: PN 5422860, state on, key 01000000000000000000000000000000
    stats: OutPktsProtected OutPktsEncrypted
                          0          5421626
    RXSC: 0000000000001234, state on
    stats: InOctetsValidated InOctetsDecrypted InPktsUnchecked InPktsDelayed InPktsOK InPktsInvalid InPktsLate InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
                           0       46931106683               0           109  5560541             0          0              0                0              0
        0: PN 5561948, state on, key 01000000000000000000000000000000
    stats: InPktsOK InPktsInvalid InPktsNotValid InPktsNotUsingSA InPktsUnusedSA
            5560541             0              0 0              0


InPktsDelayed was 3771 for Server1 and 109 for Server2.
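To put the delayed counts in proportion, the short Python sketch below divides each InPktsDelayed value by the matching InPktsOK value from the RXSC stats above. Using InPktsOK as the reference is my own choice for illustration, not something the counters themselves define, and `delayed_ratio` is a hypothetical helper name.

```python
# (InPktsDelayed, InPktsOK) copied from the RXSC stats of each run above.
runs = {
    ("no encryption", "Server1"): (218, 3675170),
    ("no encryption", "Server2"): (0, 3798410),
    ("encryption", "Server1"): (3771, 5417843),
    ("encryption", "Server2"): (109, 5560541),
}


def delayed_ratio(delayed, ok):
    """Delayed packets as a fraction of packets counted as OK."""
    return delayed / ok if ok else 0.0


for (mode, host), (delayed, ok) in runs.items():
    print(f"{mode:14s} {host}: {delayed_ratio(delayed, ok):.4%}")
```

Even in the worst case here (Server1 with encryption) the delayed packets are well under 0.1% of the packets received.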


The performance numbers were:
* 9.87 Gb/s without macsec
* 6.00 Gb/s with macsec WITHOUT encryption
* 9.19 Gb/s with macsec WITH encryption

iperf3 retransmits were:
* 27 without macsec
* 1211 with macsec WITHOUT encryption
* 721 with macsec WITH encryption


Thanks for the reply and for the background on this.

Ryan
