On 29/04/16 11:31, Claudiu Zissulescu wrote:
> It should do the job, at least for EM, where the jump takes 2 cycles, and by
> means of using delay slots we can make all the cycles count. HS has a branch
> prediction mechanism; hence, filling up the delay slot doesn't have such a big
> impact as in EM or even earlier CPUs.
No, the alternative is to hide the delay slot, so if the branch is
predicted properly, the case with different high words should be faster
without the .d suffix. I.e., eagerly filling the delay slot like this has
a bigger - negative - impact on performance.