Hello,
I'm doing some research on software prefetching, so I've been testing
the ARMv7 "pli" instruction. However, gem5 behaves very curiously
with it.
Let fe_mul() be a function consisting of a single basic block of about 400
instructions, that is, no branches, no jumps, only sequential execution.
When I simulate this with some other code, I get the following stats:
sim_insts 1572969
Now, I edit the assembly of the code to insert a single "pli" instruction
right at the start of fe_mul():
00031620 <fe_mul>:
31620: e92d4ff0 push {r4, r5, r6, r7, r8, r9, sl, fp, lr}
31624: e24dd0b0 sub sp, sp, #176 ; 0xb0
31628: e58d0024 str r0, [sp, #36] ; 0x24
3162c: f45ff014 pli [pc, #-20] ; 31620 <fe_mul>
Notice the "pli" instruction on the last line of the listing (address 3162c).
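As a sanity check on the encoding, here is a minimal Python sketch that decodes the word f45ff014 by hand and recomputes the prefetch target. It assumes the A32 "PLI (literal)" layout (U bit at bit 23, imm12 in the low 12 bits) and that the PC reads as the instruction address plus 8 in ARM state. The function name decode_pli_literal is made up for this illustration and is not part of gem5 or any toolchain.

```python
def decode_pli_literal(word, addr):
    """Return the prefetch target of an A32 PLI whose base register is the PC.

    word: the 32-bit instruction encoding
    addr: the address the instruction is located at
    """
    # Bits 31..24 are 0xF4 for the unconditional PLI (immediate, literal) form.
    assert (word >> 24) == 0xF4, "not a PLI immediate/literal encoding"
    # Bits 22..20 must be 0b101 for PLI.
    assert (word >> 20) & 0x7 == 0b101, "not a PLI immediate/literal encoding"
    # Bits 19..16 hold Rn; 15 means the base register is the PC.
    assert (word >> 16) & 0xF == 15, "base register is not the PC"

    imm12 = word & 0xFFF          # unsigned 12-bit offset
    add = (word >> 23) & 1        # U bit: 1 = add offset, 0 = subtract
    pc = addr + 8                 # ARM state: PC reads as instruction address + 8
    base = pc & ~3                # Align(PC, 4)
    return base + imm12 if add else base - imm12


target = decode_pli_literal(0xF45FF014, 0x3162C)
print(hex(target))  # 0x31620, the start of fe_mul
```

So the encoding really is a "pli" pointing back at the start of fe_mul(), which matches what the disassembler prints.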
What should we expect here? This prefetch is pretty much useless, but since
fe_mul() is called about 1400 times, we would expect the number of executed
instructions to increase by about 1400. Instead, gem5
reports:
sim_insts 1572969
Exactly the same number! I've looked into the assembly, and the two
versions of this function differ by exactly one instruction, the "pli" one.
The compiler is not doing any tricks here.
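The comparison can be sketched in a few lines of Python. This is only an illustration: the two strings stand in for the stats output of each run and contain just the numbers quoted above, the call count of 1400 is approximate, and the helper name sim_insts() is made up for this example.

```python
import re


def sim_insts(stats_text):
    """Return the sim_insts value found in a block of gem5 stats text."""
    m = re.search(r"^sim_insts\s+(\d+)", stats_text, re.MULTILINE)
    if m is None:
        raise ValueError("sim_insts not found")
    return int(m.group(1))


# Stand-ins for the stats of the two runs (values taken from this message).
baseline = "sim_insts 1572969\n"
with_pli = "sim_insts 1572969\n"

calls = 1400                  # approximate number of fe_mul() calls
expected_delta = calls * 1    # one extra instruction per call
observed_delta = sim_insts(with_pli) - sim_insts(baseline)

print(expected_delta, observed_delta)  # 1400 0
```

The expected difference is about 1400 instructions, and the observed difference is zero.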
I've done some tests inserting this instruction at different positions in
the code, with the same result. If I instead insert a harmless
instruction such as "add r0, r0, #0", the instruction count increases as expected.
Any insights would be appreciated. Thank you.
--
Felipe
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users