Yes, Bruce, we understand this. But we are processing huge SYN attacks, and those packets are only 64 bytes :(

On Wed, Jul 1, 2015 at 3:59 PM, Bruce Richardson <bruce.richardson at intel.com> wrote:
> On Wed, Jul 01, 2015 at 03:44:57PM +0300, Pavel Odintsov wrote:
>> Thanks for answer, Vladimir! So we need look for x16 NIC if we want
>> achieve 40GE line rate...
>>
> Note that this would only apply for your minimal i.e. 64-byte, packet sizes.
> Once you go up to larger e.g. 128B packets, your PCI bandwidth requirements
> are lower and you can easier achieve line rate.
>
> /Bruce
>
>> On Wed, Jul 1, 2015 at 3:06 PM, Vladimir Medvedkin <medvedkinv at gmail.com>
>> wrote:
>> > Hi Pavel,
>> >
>> > Looks like you ran into pcie bottleneck. So let's calculate xl710 rx only
>> > case.
>> > Assume we have 32byte descriptors (if we want more offload).
>> > DMA makes one pcie transaction with packet payload, one descriptor
>> > writeback and one memory request for free descriptors for every 4
>> > packets. For Transaction Layer Packet (TLP) there is 30 bytes overhead
>> > (4 PHY + 6 DLL + 16 header + 4 ECRC). So for 1 rx packet dma sends
>> > 30 + 64 (packet itself) + 30 + 32 (writeback descriptor) + (16 / 4)
>> > (read request for new descriptors). Note that we do not take into
>> > account PCIe ACK/NACK/FC Update DLLP. So we have 160 bytes per packet.
>> > One lane PCIe 3.0 transmits 1 byte in 1 ns, so x8 transmits 8 bytes in
>> > 1 ns. 1 packet transmits in 20 ns. Thus in theory pcie 3.0 x8 may
>> > transfer not more than 50mpps.
>> > Correct me if I'm wrong.
>> >
>> > Regards,
>> > Vladimir
>> >

--
Sincerely yours, Pavel Odintsov
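
P.S. For anyone who wants to replay Vladimir's arithmetic for other packet sizes or lane counts, here is a rough Python sketch. It only restates the numbers from the thread (30-byte TLP overhead, 32-byte descriptors, one 16-byte descriptor read request per 4 packets, about 1 byte per ns per PCIe 3.0 lane) and, like the original estimate, ignores ACK/NACK/FC Update DLLPs. It also assumes each packet fits in a single TLP, which will not hold for large packets. The function names and the 20-byte Ethernet wire overhead (preamble plus inter-frame gap, used only for the line-rate comparison) are my additions, not something from the thread.

```python
# Back-of-the-envelope PCIe RX budget, following the estimate in this thread.
# All constants below come from the thread unless marked as an assumption.

TLP_OVERHEAD = 30           # bytes: 4 PHY + 6 DLL + 16 header + 4 ECRC
DESCRIPTOR_SIZE = 32        # bytes: 32-byte descriptor writeback
DESC_READ_REQUEST = 16      # bytes: read request for new descriptors
PACKETS_PER_DESC_READ = 4   # one descriptor read request per 4 packets
BYTES_PER_NS_PER_LANE = 1   # PCIe 3.0: roughly 1 byte per ns per lane

# Assumption (not in the thread): 8 bytes preamble/SFD + 12 bytes
# inter-frame gap per frame on the Ethernet wire.
ETHERNET_WIRE_OVERHEAD = 20


def pcie_bytes_per_rx_packet(packet_size):
    """PCIe bytes moved per received packet (payload TLP + writeback + read share)."""
    payload_tlp = TLP_OVERHEAD + packet_size
    writeback_tlp = TLP_OVERHEAD + DESCRIPTOR_SIZE
    read_share = DESC_READ_REQUEST / PACKETS_PER_DESC_READ
    return payload_tlp + writeback_tlp + read_share


def pcie_max_mpps(packet_size, lanes=8):
    """Upper bound on RX packet rate (Mpps) for the given PCIe 3.0 link width."""
    ns_per_packet = pcie_bytes_per_rx_packet(packet_size) / (lanes * BYTES_PER_NS_PER_LANE)
    return 1000.0 / ns_per_packet


def ethernet_line_rate_mpps(packet_size, gbps=40):
    """Packet rate (Mpps) needed to fill an Ethernet link at this frame size."""
    bits_on_wire = (packet_size + ETHERNET_WIRE_OVERHEAD) * 8
    return gbps * 1000.0 / bits_on_wire


if __name__ == "__main__":
    for size in (64, 128):
        for lanes in (8, 16):
            print(size, lanes,
                  round(pcie_max_mpps(size, lanes), 2),
                  round(ethernet_line_rate_mpps(size), 2))
```

Under these assumptions, x8 with 64-byte packets tops out at 50 Mpps against roughly 59.5 Mpps needed for 40GE line rate, while 128-byte packets get about 35.7 Mpps against roughly 33.8 Mpps needed, which matches Bruce's remark about larger packets.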