https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71785

            Bug ID: 71785
           Summary: Computed gotos are mostly optimized away
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andres at anarazel dot de
  Target Milestone: ---

Hi,

I'm working on some interpreter-like constructs in postgres. To reduce the
number of branch mispredictions I wanted to use the "typical" jump threading
approach (computed gotos, with a separate indirect jump at the end of each
opcode handler). Unfortunately, with gcc-6 (gcc-6 (Debian 6.1.1-8) 6.1.1
20160630) and up to a recent snapshot ((Debian 20160612-1) 7.0.0 20160612
(experimental) [trunk revision 237336]), gcc merges some of the gotos into a
common label and jumps there.

In the attached file (a small artificial case showing the problem), compiling
with -O3 results in
CASE_OP_A:
        someglobal++;
        op++;
        goto *dispatch_table[op->opcode];
CASE_OP_B:
        do_stuff_b(op->arg);
        op++;
        goto *dispatch_table[op->opcode];

being implemented as
.L5:
        addq    $8, %rbx
        jmp     *%rax
...
.L3:
        movl    (%rbx), %eax
        addl    $1, someglobal(%rip)
        movq    dispatch_table.1772(,%rax,8), %rax
        jmp     .L5
...
.L4:
        movl    -4(%rbx), %edi
        call    do_stuff_b
        movl    (%rbx), %eax
        movq    dispatch_table.1772(,%rax,8), %rax
        jmp     .L5


I've tried -fno-gcse and -fno-crossjumping, and neither seems to fix the
problem.
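
For readers without the attachment, here is a minimal sketch of the kind of
dispatch loop described above. It is a hypothetical reconstruction, not the
attached file: the struct op layout, the opcode enum, OP_DONE, run(), b_total
and the body of do_stuff_b() are all assumptions made for illustration. Only
someglobal, dispatch_table, do_stuff_b and the two CASE_OP_* handlers come
from the report.

```c
/* Hypothetical reconstruction of a computed-goto dispatch loop.
 * Uses the GNU C "labels as values" extension (&&label, goto *ptr). */

enum opcode { OP_A, OP_B, OP_DONE };   /* assumed opcode set */

struct op {
    unsigned opcode;   /* index into dispatch_table */
    int arg;           /* operand, only used by OP_B here */
};

int someglobal;
int b_total;           /* assumed: gives do_stuff_b an observable effect */

/* Assumed body; noinline keeps it a real call, as in the report's asm. */
__attribute__((noinline)) void do_stuff_b(int arg)
{
    b_total += arg;
}

void run(const struct op *op)
{
    static void *const dispatch_table[] = {
        [OP_A] = &&CASE_OP_A,
        [OP_B] = &&CASE_OP_B,
        [OP_DONE] = &&CASE_OP_DONE,
    };

    /* Initial dispatch. */
    goto *dispatch_table[op->opcode];

CASE_OP_A:
    someglobal++;
    op++;
    /* Each handler ends in its own indirect jump. */
    goto *dispatch_table[op->opcode];
CASE_OP_B:
    do_stuff_b(op->arg);
    op++;
    goto *dispatch_table[op->opcode];
CASE_OP_DONE:
    return;
}
```

Compiling something like this with -O3 -S and looking for how many
"jmp *%reg" instructions remain should show whether the per-handler jumps
were kept or merged; the exact output will depend on the gcc version.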

It's also kind of weird that the load from the dispatch table is still
performed in the individual branches; only the final jmp *%rax happens in the
common location (.L5 here).  In the actual case I'm fighting with, gcc
"inlines" the jmp *%rax in one of the dispatches, but not in the other 8.

Any additional information I can provide?

Regards,

Andres
