Alan Modra wrote:
These are most likely 64-bit address constant loads. On ppc32, a 32-bit address constant can be computed in at most 2 instructions. A 64-bit address takes up to 5 instructions to compute in-line, or else a memory load from the TOC.
One general principle at work here is that nothing guarantees 64-bit code will be faster than 32-bit code. On the contrary, one should expect 32-bit code to be faster in many cases.
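To make the 2-versus-5 instruction counts concrete, here is a rough C sketch that mimics one conventional way of building an address constant in a register, 16 bits at a time. The function names (build32, build64) and the exact instruction sequences in the comments are assumptions for illustration; a real compiler may pick a shorter sequence for particular values, or use the TOC load instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a ppc32 sequence: lis + ori. */
static uint32_t build32(uint32_t addr, int *count)
{
    uint32_t r;
    *count = 0;
    r = (addr >> 16) << 16;  (*count)++;  /* lis r, addr@h    */
    r |= addr & 0xffffu;     (*count)++;  /* ori r, r, addr@l */
    return r;
}

/* Hypothetical model of a ppc64 in-line sequence:
 * lis, ori, sldi, oris, ori. */
static uint64_t build64(uint64_t addr, int *count)
{
    uint64_t r;
    *count = 0;

    /* lis: load the top 16 bits shifted left by 16; in 64-bit mode the
     * 32-bit result is sign-extended into the upper half. */
    r = ((addr >> 48) & 0xffffu) << 16;
    if (r & 0x80000000u)
        r |= 0xffffffff00000000ULL;
    (*count)++;

    r |= (addr >> 32) & 0xffffu;          (*count)++;  /* ori  */
    r <<= 32;                             (*count)++;  /* sldi: drops the sign-extension bits */
    r |= ((addr >> 16) & 0xffffu) << 16;  (*count)++;  /* oris */
    r |= addr & 0xffffu;                  (*count)++;  /* ori  */
    return r;
}
```

Calling build32 and build64 with any address returns that same address, with the count set to 2 and 5 respectively. The TOC alternative trades those 5 instructions for a single load, at the cost of a memory access.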