Re: [OpenWrt-Devel] Ethernet performance for transfers between VLANs (bcm47xx)

Robert Bradley Sun, 11 Aug 2013 10:11:52 -0700

On 11/08/13 17:41, Rafał Miłecki wrote:

2013/8/11 Robert Bradley <robert.bradl...@gmail.com>:

On 11/08/13 16:08, Rafał Miłecki wrote:

2013/8/9 Florian Fainelli <f.faine...@gmail.com>:

I am looking at bgmac_dma_rx_read() and I do not quite understand why
you would need to copy data to the newly allocated SKB as it might
really be killing performance here. Looking at b44, the code path
doing this is just when the packet is smaller (say less than 256
bytes) because in that case, the cost of a data cache invalidate might
be higher than a fresh allocation plus memcpy(). Rather, the logic I
would use is the following:


- consume a packet from the DMA RX ring at a given index
- dma_sync_single_for_cpu() this packet
- call netif_receive_skb() for this packet
- allocate a new SKB for the same RX ring index

Eventually if you realize that for small packets you had better do a
new allocation plus memcpy() (aka: copybreak) you could try that.

I've implemented that solution, but it didn't really help much :(

Well, http://patchwork.ozlabs.org/patch/220961/ seems to suggest that bgmac
can produce unaligned accesses, so I assume the memcpy() is used to avoid
that.  You could try removing the new allocation and memcpy(), add in the IP
stack unaligned access patches from ar71xx and see if that helps...

That patch from Hauke was applied and I'm testing kernels having it.
Is there any extra unaligned access you're aware of, or you suspect?

I'm no expert when it comes to bgmac, but I expect that no unaligned
access currently exists. The only issue with that is that the unaligned
access is then traded for the expense of copying the packet (which
occurred even before the patch; the patch merely fixes the alignment of
the new SKB's data). In the past, a similar trick was done within the
ag71xx Ethernet drivers for ar71xx, but was reverted later since large
packets are/were costly to realign.


https://dev.openwrt.org/changeset/20506
https://dev.openwrt.org/changeset/20892
https://dev.openwrt.org/changeset/21166
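
For readers unfamiliar with where the alignment problem comes from, here
is a small userspace sketch in C. It assumes a receive buffer that starts
on a 4-byte boundary and a plain 14-byte Ethernet header; the 2-byte pad
mirrors the kernel's default NET_IP_ALIGN. The helper names are
hypothetical, and any extra hardware header bgmac places in front of the
frame is ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN     14  /* length of a plain Ethernet header */
#define IP_ALIGN_PAD 2   /* same value as the kernel's default NET_IP_ALIGN */

/* Offset of the IP header from the (4-byte aligned) start of the buffer,
 * given how many bytes of headroom were reserved before the frame. */
size_t ip_header_offset(size_t headroom)
{
    return headroom + ETH_HLEN;
}

/* Non-zero when 32-bit fields in the IP header can be loaded directly. */
int ip_header_is_aligned(size_t headroom)
{
    return ip_header_offset(headroom) % 4 == 0;
}

/* Byte-wise big-endian load: safe at any alignment, but slower than a
 * single word load, which is the cost the stack pays on unaligned data. */
uint32_t load_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

With no headroom the IP header sits at offset 14, which is not a multiple
of 4; reserving 2 bytes moves it to offset 16. A driver that cannot make
the hardware write at that offset has to choose between copying the
packet and letting the stack cope with unaligned fields.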

A similar thing may apply here, where the cost of memcpy() is greater
than the unaligned performance hit. However, since we now have patches
to avoid unaligned access in the first place
(https://dev.openwrt.org/browser/trunk/target/linux/ar71xx/patches-3.10/902-unaligned_access_hacks.patch),
it might be worth testing a build with these applied and use Florian's
method instead (pass the current SKB to the stack as-is and create a new
one for the next DMA read).
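
To make the two strategies concrete, here is a rough userspace model in
C of Florian's method combined with a copybreak for small packets. It is
a sketch only, not bgmac code: malloc() stands in for SKB allocation, the
DMA sync step is left out, and every name (rx_ring, rx_consume, the
COPYBREAK value of 256) is a hypothetical choice for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define RX_RING_SLOTS 4
#define RX_BUF_SIZE   2048
#define COPYBREAK     256  /* illustrative threshold; tune by measurement */

/* One RX ring slot: a buffer the (simulated) hardware has filled. */
struct rx_slot {
    unsigned char *data;
    size_t len;
};

struct rx_ring {
    struct rx_slot slot[RX_RING_SLOTS];
};

/* Give every slot a buffer. Returns 0 on success, -1 on failure. */
int rx_ring_init(struct rx_ring *ring)
{
    for (int i = 0; i < RX_RING_SLOTS; i++) {
        ring->slot[i].data = malloc(RX_BUF_SIZE);
        ring->slot[i].len = 0;
        if (!ring->slot[i].data) {
            while (i-- > 0)
                free(ring->slot[i].data);
            return -1;
        }
    }
    return 0;
}

void rx_ring_free(struct rx_ring *ring)
{
    for (int i = 0; i < RX_RING_SLOTS; i++)
        free(ring->slot[i].data);
}

/*
 * Consume the packet at ring index idx and return a buffer that the
 * caller (the "stack") now owns and must free().
 *  - small packet: copy into a right-sized buffer, reuse the ring buffer
 *  - large packet: hand the ring buffer over, refill the slot
 * Returns NULL if allocation fails; the slot is then left untouched.
 */
unsigned char *rx_consume(struct rx_ring *ring, unsigned idx,
                          size_t *len, int *copied)
{
    struct rx_slot *s = &ring->slot[idx % RX_RING_SLOTS];
    unsigned char *out;

    assert(s->len <= RX_BUF_SIZE);

    if (s->len < COPYBREAK) {
        out = malloc(s->len ? s->len : 1);
        if (!out)
            return NULL;
        memcpy(out, s->data, s->len);
        *copied = 1;
    } else {
        unsigned char *fresh = malloc(RX_BUF_SIZE);
        if (!fresh)
            return NULL;
        out = s->data;     /* pass the filled buffer up as-is */
        s->data = fresh;   /* new buffer for the next DMA read */
        *copied = 0;
    }
    *len = s->len;
    s->len = 0;
    return out;
}
```

In this model a 64-byte packet comes back as a copy and the slot keeps
its original buffer, while a 1500-byte packet comes back as the ring
buffer itself and the slot gets a fresh one. Whether the swap path wins
in practice depends on the unaligned-access cost discussed above.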


--
Robert Bradley
_______________________________________________
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel
