Folks,
In many places in the code we have clusters of prefetch instructions, which
seems to be a bad idea.
I have already noticed that performance is better when prefetch instructions
are interleaved with other code,
and there is a nice section explaining exactly that in the Intel Optimization
Manual:
http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html
Copy/paste from the document:
---
It may seem convenient to cluster all of PREFETCH instructions at the beginning
of a loop body or before a loop, but this can lead to severe performance
degradation. In order to achieve the best possible performance, PREFETCH
instructions must be interspersed with other computational instructions in the
instruction sequence rather than clustered together. If possible, they should
also be placed apart from loads. This improves the instruction level
parallelism and reduces the potential instruction resource stalls. In addition,
this mixing reduces the pressure on the memory access resources and in turn
reduces the possibility of the prefetch retiring without fetching data.
---
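To make the difference concrete, here is a minimal sketch of the two patterns.
It is not taken from our code: buf_t, process_one, sum_clustered and
sum_interleaved are hypothetical names, and the loop assumes GCC or Clang for
__builtin_prefetch. Both functions compute the same result; only the placement
of the prefetches differs. The compiler is free to reschedule instructions, so
any gain should be confirmed by measurement on the target CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet buffer, used only for illustration. */
typedef struct
{
  uint8_t data[64];
} buf_t;

/* Stand-in for per-packet work: sum the bytes of one buffer. */
static inline uint32_t
process_one (const buf_t *b)
{
  uint32_t sum = 0;
  for (size_t i = 0; i < sizeof (b->data); i++)
    sum += b->data[i];
  return sum;
}

/* Clustered: the four prefetches for the next batch are issued back to back
   at the top of the loop body. */
uint32_t
sum_clustered (buf_t **bufs, size_t n)
{
  uint32_t total = 0;
  size_t i = 0;
  assert (bufs != NULL || n == 0);
  for (; i + 8 <= n; i += 4)
    {
      __builtin_prefetch (bufs[i + 4]);
      __builtin_prefetch (bufs[i + 5]);
      __builtin_prefetch (bufs[i + 6]);
      __builtin_prefetch (bufs[i + 7]);
      total += process_one (bufs[i + 0]);
      total += process_one (bufs[i + 1]);
      total += process_one (bufs[i + 2]);
      total += process_one (bufs[i + 3]);
    }
  /* Tail: remaining buffers, no prefetching. */
  for (; i < n; i++)
    total += process_one (bufs[i]);
  return total;
}

/* Interleaved: each prefetch for the next batch is followed by work on the
   current batch, so prefetches are spread through the instruction stream. */
uint32_t
sum_interleaved (buf_t **bufs, size_t n)
{
  uint32_t total = 0;
  size_t i = 0;
  assert (bufs != NULL || n == 0);
  for (; i + 8 <= n; i += 4)
    {
      __builtin_prefetch (bufs[i + 4]);
      total += process_one (bufs[i + 0]);
      __builtin_prefetch (bufs[i + 5]);
      total += process_one (bufs[i + 1]);
      __builtin_prefetch (bufs[i + 6]);
      total += process_one (bufs[i + 2]);
      __builtin_prefetch (bufs[i + 7]);
      total += process_one (bufs[i + 3]);
    }
  /* Tail: remaining buffers, no prefetching. */
  for (; i < n; i++)
    total += process_one (bufs[i]);
  return total;
}
```

The loop bound i + 8 <= n keeps every prefetched pointer inside the array, and
the tail loop handles whatever is left over.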
--
Damjan