On 16-09-10 08:36 AM, Tom Herbert wrote:
> On Fri, Sep 9, 2016 at 2:29 PM, John Fastabend <john.fastab...@gmail.com>
> wrote:
>> e1000 supports a single TX queue so it is being shared with the stack
>> when XDP runs XDP_TX action. This requires taking the xmit lock to
>> ensure we don't corrupt the tx ring. To avoid taking and dropping the
>> lock per packet this patch adds a bundling implementation to submit
>> a bundle of packets to the xmit routine.
>>
>> I tested this patch running e1000 in a VM using KVM over a tap
>> device using pktgen to generate traffic along with 'ping -f -l 100'.
>>
>> Suggested-by: Jesper Dangaard Brouer <bro...@redhat.com>
>> Signed-off-by: John Fastabend <john.r.fastab...@intel.com>
>> ---

[...]

>> diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c
>> b/drivers/net/ethernet/intel/e1000/e1000_main.c
>> index 91d5c87..b985271 100644
>> --- a/drivers/net/ethernet/intel/e1000/e1000_main.c
>> +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
>> @@ -1738,10 +1738,18 @@ static int e1000_setup_rx_resources(struct
>> e1000_adapter *adapter,
>>         struct pci_dev *pdev = adapter->pdev;
>>         int size, desc_len;
>>
>> +       size = sizeof(struct e1000_rx_buffer_bundle) *
>> +                       E1000_XDP_XMIT_BUNDLE_MAX;
>> +       rxdr->xdp_buffer = vzalloc(size);
>> +       if (!rxdr->xdp_buffer)
>> +               return -ENOMEM;
>> +
>>         size = sizeof(struct e1000_rx_buffer) * rxdr->count;
>>         rxdr->buffer_info = vzalloc(size);
>> -       if (!rxdr->buffer_info)
>> +       if (!rxdr->buffer_info) {
>> +               vfree(rxdr->xdp_buffer);
>
> This could be deferred until an XDP program is added.

Yep, that would be best, to avoid the overhead in the normal non-XDP case.
Also, I'll move the xdp prog pointer into the rx ring per Jesper's comment,
which I missed in this rev.

[...]

>> +
>> +static void e1000_xdp_xmit_bundle(struct e1000_rx_buffer_bundle
>> *buffer_info,
>> +                                 struct net_device *netdev,
>> +                                 struct e1000_adapter *adapter)
>> +{
>> +       struct netdev_queue *txq = netdev_get_tx_queue(netdev, 0);
>> +       struct e1000_tx_ring *tx_ring = adapter->tx_ring;
>> +       struct e1000_hw *hw = &adapter->hw;
>> +       int i = 0;
>> +
>>         /* e1000 only support a single txq at the moment so the queue is
>> being
>>          * shared with stack. To support this requires locking to ensure the
>>          * stack and XDP are not running at the same time. Devices with
>>          * multiple queues should allocate a separate queue space.
>> +        *
>> +        * To amortize the locking cost e1000 bundles the xmits and sends as
>> +        * many as possible until either running out of descriptors or
>> failing.
>
> Up to E1000_XDP_XMIT_BUNDLE_MAX at least...

Yep, will fix the comment.

[...]

>>
>>         /* use prefetched values */
>> @@ -4498,8 +4536,11 @@ next_desc:
>>         rx_ring->next_to_clean = i;
>>
>>         cleaned_count = E1000_DESC_UNUSED(rx_ring);
>> -       if (cleaned_count)
>> +       if (cleaned_count) {
>> +               if (xdp_xmit)
>> +                       e1000_xdp_xmit_bundle(xdp_bundle, netdev, adapter);
>>                 adapter->alloc_rx_buf(adapter, rx_ring, cleaned_count);
>> +       }
>
> Looks good for XDP path. Is this something we can abstract out into a
> library for use by other drivers?
>

I'm not really sure it can be abstracted much; it's a bit intertwined with
the normal rx receive path. But it should probably be a pattern that gets
copied so we avoid unnecessary tx work.

>
>>
>>         adapter->total_rx_packets += total_rx_packets;
>>         adapter->total_rx_bytes += total_rx_bytes;
>>
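
For anyone wanting to copy the pattern, here is a minimal user-space sketch of
the bundling idea, not the driver code. It assumes a pthread mutex standing in
for the xmit lock, a plain counter standing in for free tx descriptors, and
hypothetical names throughout (BUNDLE_MAX, struct tx_ring, struct bundle,
bundle_add, bundle_flush); none of these come from the patch.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical cap; plays the role of E1000_XDP_XMIT_BUNDLE_MAX. */
#define BUNDLE_MAX 32

/* Toy tx ring shared by two producers (think: stack and XDP). */
struct tx_ring {
	pthread_mutex_t lock;            /* stands in for the xmit lock */
	int slots_free;                  /* stands in for unused descriptors */
	unsigned long lock_acquisitions; /* how many times the lock was taken */
	unsigned long sent;              /* packets placed on the ring */
};

/* Packets collected during one rx pass, sent together later. */
struct bundle {
	int pkts[BUNDLE_MAX];
	size_t count;
};

/* Queue one packet; returns 0 on success, -1 if the bundle is full. */
static int bundle_add(struct bundle *b, int pkt)
{
	if (b->count >= BUNDLE_MAX)
		return -1;
	b->pkts[b->count++] = pkt;
	return 0;
}

/*
 * Send the queued packets under a single lock acquisition. Stops early
 * when the ring runs out of slots; anything left over is dropped.
 * Returns the number of packets sent and resets the bundle.
 */
static size_t bundle_flush(struct tx_ring *ring, struct bundle *b)
{
	size_t i, sent = 0;

	if (b->count == 0)
		return 0;

	pthread_mutex_lock(&ring->lock);
	ring->lock_acquisitions++;
	for (i = 0; i < b->count && ring->slots_free > 0; i++) {
		ring->slots_free--;
		ring->sent++;
		sent++;
	}
	pthread_mutex_unlock(&ring->lock);

	b->count = 0;
	return sent;
}
```

The caller would invoke bundle_add() for each packet that wants to be
transmitted during the rx pass and bundle_flush() once at the end of it, so
the lock cost is paid once per pass rather than once per packet.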