From: Eric Dumazet
> Sent: 29 January 2016 22:29
...
> On modern intel cpus, this does not matter at all, sure. It took a while
> before "rep movsb" finally did the right thing.
Unfortunately memcpy_toio() etc. now map to 'rep movsb', and that can only be optimised for cached addresses. So copies to/from PCIe space get done as byte copies, slower than slow. The same is true when userspace has used mmap() to directly access PCIe memory. There isn't a standard wrapper that generates 'rep movsd'.

David
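
For illustration only, a wrapper of the kind described as missing might look roughly like the sketch below. The name copy_dwords() is hypothetical and is not an existing kernel or libc interface. The sketch assumes GCC/Clang inline asm, 4-byte-aligned pointers and a length counted in 32-bit words; 'rep movsl' is the AT&T spelling of 'rep movsd'. Whether the hardware then issues 32-bit accesses to uncached PCIe space is the expectation stated above, not something this code can guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an existing API): copy `count` 32-bit words.
 * On x86 it uses 'rep movsl' (Intel syntax: 'rep movsd'); on other
 * architectures it falls back to a loop of 32-bit volatile accesses.
 */
static inline void copy_dwords(volatile void *dst, const volatile void *src,
                               size_t count)
{
    /* Assumption of this sketch: both pointers are 4-byte aligned. */
    assert(((uintptr_t)dst & 3) == 0 && ((uintptr_t)src & 3) == 0);

#if defined(__x86_64__) || defined(__i386__)
    /* (R|E)DI = destination, (R|E)SI = source, (R|E)CX = word count.
     * The ABI guarantees the direction flag is clear on entry. */
    __asm__ volatile("rep movsl"
                     : "+D"(dst), "+S"(src), "+c"(count)
                     :
                     : "memory");
#else
    volatile uint32_t *d = dst;
    const volatile uint32_t *s = src;

    while (count--)
        *d++ = *s++;
#endif
}
```

A caller with a byte length would pass len / 4 and handle any trailing 1 to 3 bytes separately; that tail handling is left out here to keep the sketch short.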