Hi everyone,

I've found with the peephole optimizer, at least on x86, that running pass 2 more than once often catches optimisations that are otherwise missed.  At the same time, I've found some bugs that are only triggered when pass 2 is run again (which is why I asked about RegLoadedWithNewValue in another thread).

I'm working out how best to permit this, given that pass 2 has only ever been run once and that this is a cross-platform change that will slow compilation across the board.  I figure that if pass 2 only runs multiple times on -O3 and above, then the slower compilation is acceptable.
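
To illustrate the idea of repeating pass 2 only at -O3 and above, here is a minimal sketch in Python rather than the compiler's own Pascal. Every name in it (run_pass2_repeatedly, toy_pass2, MAX_PASS2_RUNS) is hypothetical, and the cap of 4 runs is an assumed value, not something measured.

```python
# Hypothetical sketch: none of these names exist in the compiler.
MAX_PASS2_RUNS = 4  # assumed cap on repeats, not a measured value


def run_pass2_repeatedly(instructions, pass2, opt_level):
    """Run pass2 once below -O3; at -O3 and above, repeat it until a run
    changes nothing or the cap is reached.

    pass2 takes a list of instructions and returns (new_list, changed).
    Returns (final_list, number_of_runs).
    """
    limit = MAX_PASS2_RUNS if opt_level >= 3 else 1
    runs = 0
    while runs < limit:
        instructions, changed = pass2(instructions)
        runs += 1
        if not changed:
            break  # fixed point reached, further runs would find nothing
    return instructions, runs


def toy_pass2(instructions):
    """Toy stand-in for pass 2: removes at most one 'nop' per run."""
    if "nop" in instructions:
        out = list(instructions)
        out.remove("nop")
        return out, True
    return instructions, False
```

With this toy pass, a list containing two "nop" entries keeps one of them at -O2 (a single run) and loses both at -O3, at the cost of one extra run that only confirms nothing is left to do.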

Additionally, I've found that running certain elements of pass 1 again also yields some new optimisations.  In this instance I figure it's best to just run these optimisations again in pass 2 instead of falling back to pass 1, although I'll have to experiment to see if this catches all eventualities.

On another note, I do wonder if the pre-peephole pass should be merged into pass 1, and then pass 1 be run up to 3 times on -O2 instead of twice so the level of optimisation is identical.  Then again, I'm not certain if other platforms do some special instruction manipulation that would be incompatible with a regular pass.

Gareth aka. Kit

P.S. Just some examples... in ninl, for example - before:

.Lj1162:
    movq    %r13,%rcx
    call    NCON_$$_GENENUMNODE$TENUMSYM$$TORDCONSTNODE
    movq    %rax,56(%rsp)
    movq    56(%rsp),%rdi
    jmp    .Lj1141
    .balign 16,0x90

After:

.Lj1162:
    movq    %r13,%rcx
    call    NCON_$$_GENENUMNODE$TENUMSYM$$TORDCONSTNODE
    movq    %rax,56(%rsp)
    movq    %rax,%rdi
    jmp    .Lj1141
    .balign 16,0x90
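
The change in that pair of listings (a register is stored to the stack and the same slot is immediately reloaded) can be sketched as a toy peephole in Python. This is only an illustration under assumptions: the (opcode, source, destination) tuple format and the function name are made up, only adjacent movq pairs are considered, and none of the register tracking the real optimizer needs is modelled.

```python
def forward_stored_register(instructions):
    """Rewrite 'movq %reg,mem ; movq mem,%dst' as 'movq %reg,mem ; movq %reg,%dst'.

    Each instruction is a hypothetical (opcode, source, destination) tuple
    in AT&T operand order. Only adjacent pairs are checked.
    Returns (new_list, changed).
    """
    out = list(instructions)
    changed = False
    for i in range(len(out) - 1):
        op1, src1, dst1 = out[i]
        op2, src2, dst2 = out[i + 1]
        if (op1 == op2 == "movq"
                and src1.startswith("%")       # first instruction stores a register
                and not dst1.startswith("%")   # ... into a memory operand
                and src2 == dst1               # second reloads that same operand
                and dst2.startswith("%")):     # ... into a register
            # The store is kept; the reload reads the register directly.
            out[i + 1] = (op2, src1, dst2)
            changed = True
    return out, changed


before = [
    ("movq", "%rax", "56(%rsp)"),
    ("movq", "56(%rsp)", "%rdi"),
    ("jmp", ".Lj1141", ""),
]
after, changed = forward_stored_register(before)
```

Running it a second time on the result reports no change, which is the kind of "changed" flag a repeated pass could use to decide when to stop.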

In SysUtils, this sequence appears surprisingly often on x86_64-win64:

.Lj7572:
    movq    -40(%rbp),%rax
    cmpb    $0,-292(%rax)
    jne    .Lj7577
    movq    -40(%rbp),%rcx
    movl    $1,%r8d
    movq    -48(%rbp),%rdx
    call    SYSUTILS$_$DATETIMETOSTRING$hxuwovHuJEHC_$$_STORESTR$PCHAR$LONGINT
    movb    %sil,%dil
    movb    -4(%rbp),%sil
    movb    %sil,%dil
    movb    -4(%rbp),%sil
    jmp    .Lj7447
    .p2align 4,,10
    .p2align 3

And this is optimised by additional passes and optimisations:

.Lj7572:
    movq    -40(%rbp),%rax
    cmpb    $0,-292(%rax)
    jne    .Lj7577
    movq    -40(%rbp),%rcx
    movl    $1,%r8d
    movq    -48(%rbp),%rdx
    call    SYSUTILS$_$DATETIMETOSTRING$hxuwovHuJEHC_$$_STORESTR$PCHAR$LONGINT
    movb    -4(%rbp),%dil
    movb    %dil,%sil
    jmp    .Lj7447
    .p2align 4,,10
    .p2align 3



_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
