The following code:

#include <stdio.h>

void Switch4(int x) {
    switch (x & 7) {
    case 0: printf("0\n"); break;
    case 1: printf("1\n"); break;
    case 2: printf("2\n"); break;
    case 3: printf("3\n"); break;
    case 4: printf("4\n"); break;
    case 5: printf("5\n"); break;
    case 6: printf("6\n"); break;
    case 7: printf("7\n"); break;
    }
}

void Switch256(int x) {
    switch ((unsigned char) x) {
    case 0: printf("0\n"); break;
    case 1: printf("1\n"); break;
    case 2: printf("2\n"); break;
    // ... (all 256 cases)
    }
}

when compiled with:
gcc -S -O3 fullswitch.c

produces the following:

        ...
.globl _Switch4
        .def    _Switch4;       .scl    2;      .type   32;     .endef
_Switch4:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %eax
        andl    $7, %eax
        cmpl    $7, %eax
        ja      L12
        jmp     *L11(,%eax,4)
        ...
.globl _Switch256
        .def    _Switch256;     .scl    2;      .type   32;     .endef
_Switch256:
        pushl   %ebp
        movl    %esp, %ebp
        movzbl  8(%ebp), %eax
        cmpl    $255, %eax
        ja      L273
        jmp     *L272(,%eax,4)


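The cmpl+ja pair is redundant in both cases: after "andl $7" the value in
%eax is always in the range 0..7, and after movzbl it is always in the
range 0..255, so the ja branch can never be taken.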
Do you think it is possible for gcc to optimize them away?

An example of a real program with all 256 cases for an unsigned char is Atari800.
There we use a table-driven goto * (GCC's computed goto) instead of a switch
for better performance.
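
For illustration, here is a minimal sketch of the table-driven goto * technique,
reduced to four cases. It is not the Atari800 source: the function name
Dispatch4, the label names and the return value are made up for this example.
It relies on GCC's "labels as values" extension (&&label and goto *), so it is
not portable ISO C.

```c
#include <stdio.h>

/* Hypothetical example, not taken from Atari800.
   Dispatches on the low two bits of op through a table of label addresses
   (GCC extension: &&label yields a void *, goto * jumps to it). */
int Dispatch4(unsigned char op)
{
    /* One entry for every value that (op & 3) can take. */
    static void *const table[4] = { &&op0, &&op1, &&op2, &&op3 };

    /* The index is masked to 0..3, so the jump is written without a range check. */
    goto *table[op & 3];

op0: printf("0\n"); return 0;
op1: printf("1\n"); return 1;
op2: printf("2\n"); return 2;
op3: printf("3\n"); return 3;
}
```

Because the table lookup is written out by hand, there is no switch statement
for the compiler to guard, which is why we fall back on this form. Calling
Dispatch4(2) prints "2" and returns 2.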
