On Fri, 2016-01-29 at 14:08 -0800, Alexander Duyck wrote:

> It also means DMA becomes dramatically slower as it introduces a
> partial write access for the start of every frame. It is why we had
> set NET_IP_ALIGN to 0 on x86 since DMA was becoming more expensive
> when unaligned than reading IP unaligned headers.

Well, I guess that if you have an arch where DMA accesses are slow and
NET_IP_ALIGN = 2, you are out of luck.

This is why some platforms are better than others.

>
> The gain on recvmsg would probably be minimal. The only time I have
> seen any significant speed-up for copying is if you can get both ends
> aligned to something like 16B.

On modern Intel CPUs, this does not matter at all, sure.

It took a while before "rep movsb" finally did the right thing.

Implementations of memcpy() and friends are much slower on some older
arches (when dealing with unaligned src/dst).

arch/mips/lib/memcpy.S is a gem ;)
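
For readers following the NET_IP_ALIGN trade-off discussed above, here is a
minimal sketch of the arithmetic. It assumes a receive buffer that starts on
an aligned boundary and a plain 14-byte Ethernet header (no VLAN tag). The
helper names ip_header_offset() and is_aligned() are hypothetical and exist
only for illustration; they are not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN 14 /* Ethernet header length in bytes (no VLAN tag assumed) */

/*
 * Offset of the IP header from the start of an aligned receive buffer,
 * where 'pad' is the number of bytes reserved in front of the Ethernet
 * header (the role NET_IP_ALIGN plays). Hypothetical helper.
 */
size_t ip_header_offset(size_t pad)
{
	return pad + ETH_HLEN;
}

/*
 * Returns nonzero if 'offset' is a multiple of 'align'.
 * 'align' must be a nonzero power of two. Hypothetical helper.
 */
int is_aligned(size_t offset, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (offset & (align - 1)) == 0;
}
```

With pad = 0, the frame (and so the DMA write) starts at the aligned buffer
start, but the IP header sits at offset 14, which is not 4-byte aligned.
With pad = 2, the IP header lands at offset 16 and is aligned, but the DMA
write now starts at offset 2, which is the partial write access mentioned in
the quoted text.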