Re: [fpc-devel] Optimisation and memory alignment question

Florian Klämpfl via fpc-devel Sun, 28 Feb 2021 02:57:03 -0800

On 28.02.21 at 11:11, J. Gareth Moreton via fpc-devel wrote:

Hi everyone,
To get to the point, I've spotted another potential peephole optimisation, specifically on x86_64:
     movq    (%rdx),%rax
     shrq    $32,%rax

Is it acceptable to change this to the following?

     movl    4(%rdx),%eax

Yes. If (%rdx) is naturally aligned (that is, to an 8-byte boundary), 4(%rdx) is aligned to at least a 4-byte boundary and is thus naturally aligned for the 32-bit load.

Logically it's equivalent thanks to the guarantee that the upper 32 bits of the destination register will be zeroed, but I know there might sometimes be a penalty for reading from memory that isn't aligned to, say, a 16-byte boundary.

x86 is very robust against misalignment, and the example code is naturally aligned anyway. Any alignment beyond natural alignment is coincidental.
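To make the equivalence and the alignment argument concrete, here is a minimal C sketch. It is an illustration only, not compiler code: the function names are made up for this example, and it assumes a little-endian host such as x86_64, where the upper half of a 64-bit value lives at byte offset 4.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Models: movq (%rdx),%rax ; shrq $32,%rax */
static uint64_t load64_shift32(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);            /* 64-bit load */
    return v >> 32;                     /* keep the upper 32 bits */
}

/* Models: movl 4(%rdx),%eax
 * Writing %eax zeroes the upper 32 bits of %rax, which the
 * widening from uint32_t to uint64_t mirrors here. */
static uint64_t load32_at_offset4(const void *p)
{
    uint32_t v;
    memcpy(&v, (const unsigned char *)p + 4, sizeof v);
    return v;
}

/* Returns 1 if both forms give the same result for 'value'.
 * Assumption: little-endian byte order. */
static int forms_agree(uint64_t value)
{
    return load64_shift32(&value) == load32_at_offset4(&value);
}

/* Returns 1 if an 8-byte aligned base address implies that
 * base + 4 is 4-byte aligned (natural alignment for movl).
 * Addresses that are not 8-byte aligned trivially pass. */
static int offset4_keeps_natural_alignment(uintptr_t base)
{
    if (base % 8 != 0)
        return 1;
    return (base + 4) % 4 == 0;
}
```

Calling forms_agree() with a few values and offset4_keeps_natural_alignment() with a few addresses shows both points: the results match, and the narrower load never ends up less than naturally aligned when the original load was.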

A "movl; shrl $16" version may be possible with movzx, but I'm not certain whether that would be even more inefficient, since the offset is now 2 rather than 4.
Gareth aka. Kit
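The 16-bit variant mentioned above can be sketched the same way. This only illustrates the logical equivalence, under the same little-endian assumption and with hypothetical function names; it says nothing about which form is faster. The assumed instruction pair is "movl (%rdx),%eax; shrl $16,%eax" against "movzwl 2(%rdx),%eax".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Models: movl (%rdx),%eax ; shrl $16,%eax */
static uint32_t load32_shift16(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);            /* 32-bit load */
    return v >> 16;                     /* keep the upper 16 bits */
}

/* Models: movzwl 2(%rdx),%eax (16-bit load, zero-extended) */
static uint32_t load16_at_offset2(const void *p)
{
    uint16_t v;
    memcpy(&v, (const unsigned char *)p + 2, sizeof v);
    return v;                           /* widening zero-extends */
}

/* Returns 1 if both forms give the same result for 'value'.
 * Assumption: little-endian byte order. */
static int forms_agree16(uint32_t value)
{
    return load32_shift16(&value) == load16_at_offset2(&value);
}

/* Returns 1 if a 4-byte aligned base address implies that
 * base + 2 is 2-byte aligned (natural alignment for a 16-bit
 * load). Addresses that are not 4-byte aligned trivially pass. */
static int offset2_keeps_natural_alignment(uintptr_t base)
{
    if (base % 4 != 0)
        return 1;
    return (base + 2) % 2 == 0;
}
```

As with the 32-bit case, an offset of 2 from a naturally aligned 4-byte location is still naturally aligned for a 2-byte load.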


_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
