Hello.

Firstly I want to apologize for this long post, but in a way this post also
is meant for documenting the work that I have done hunting down this issue.
Secondly I must say that I do not have much insights on the GCC internals,
only the basic stuff. I know what a function prologue and epilogue is, but
I am not able to read GIMPLE or RTL for instance...

So I have a C function that basically starts this way (after CPP
pre-processing):

void e_printf( const charptr ctrl1, ...){
    int long_flag;
    int dot_flag;
    int skip_switch;
    params_t par;
    char ch;
    va_list argp;
    charptr ctrl = ctrl1;
    __builtin_va_start(argp,ctrl1);
    for ( ; *ctrl; ctrl++) {
    (....)

This C code function when compiled with the GCC version currently in git
trunk, compiled for target lm32-elf, generates the following assembly code:
(...)
 .global e_printf
.type e_printf, @function
e_printf:
addi     sp, sp, -116
sw       (sp+56), r11
sw       (sp+52), r12
sw       (sp+48), r13
sw       (sp+44), r14
sw       (sp+40), r15
sw       (sp+36), r16
sw       (sp+32), r17
sw       (sp+28), r18
sw       (sp+24), r19
sw       (sp+20), r20
sw       (sp+16), r21
sw       (sp+12), r22
sw       (sp+8), r23
sw       (sp+4), ra
lw       r9, (sp+88)
sw       (sp+92), r2
sw       (sp+88), r1
lbu      r1, (r9+0)
sw       (sp+96), r3
sw       (sp+100), r4
sw       (sp+104), r5
sw       (sp+108), r6
sw       (sp+112), r7
sw       (sp+116), r8
sw       (sp+60), r9
be     r1,r0,.L1
(...)

Now this code in part is suboptimal because all 'pushed registers' from r2
to r8 to the stack are never "poped" back but that is not the issue I am
referring.
Assuming that the "const charptr ctrl1" is passed to the function in
register r1, the following sequence is obviously wrong (the "sw
(sp+88), r1" instruction should appear first in this small sequence of
instructions for this snip of code to be correct):
lw       r9, (sp+88)
sw       (sp+92), r2
sw       (sp+88), r1
lbu      r1, (r9+0)

Something went horrible wrong in this reordering instructions in the stack.

So, I compiled about a dozen of GCC released versions and determined that
GCC 4.9.2 was the last version where this code was compiled into assembly
correctly and GCC 5.1.0 is the first one containing erroneous  generated
assembly. All versions tested after GCC 5.1.0 all generated the same above
assembly erroneous code. I should also mention that GCC 4.9.4 also
generates assembly code correctly, but it seems that GCC 5.1.0 evolved from
GCC 4.9.2...

The correct assembly code is the following (although suboptimal, but not
the issue I am reporting):

 .global e_printf
.type e_printf, @function
e_printf:
addi     sp, sp, -124
sw       (sp+64), r11
sw       (sp+60), r12
sw       (sp+56), r13
sw       (sp+52), r14
sw       (sp+48), r15
sw       (sp+44), r16
sw       (sp+40), r17
sw       (sp+36), r18
sw       (sp+32), r19
sw       (sp+28), r20
sw       (sp+24), r21
sw       (sp+20), r22
sw       (sp+16), r23
sw       (sp+12), r24
sw       (sp+8), r25
sw       (sp+4), ra
lbu      r9, (r1+0)
sw       (sp+96), r1
sw       (sp+100), r2
sw       (sp+104), r3
sw       (sp+108), r4
sw       (sp+112), r5
sw       (sp+116), r6
sw       (sp+120), r7
sw       (sp+124), r8
sw       (sp+68), r1
be     r9,r0,.L1

I then cloned GCC git repository and run a "git bisect" command. Something
in the lines of:

git bisect start
git bisect bad 659e39f1bcb277ddcc701b9ed7d41776ab8b4214
git bisect good 5fbb36f4a4e85956c241dafe603f1fc1be10472e
git bisect run ../test/git_bisect_testscript.sh

I determined that commit d5e254e19c59fcc49265dda64007690af08b6e28 from Nov
5th 2014 was the first commit where GCC compiled for target lm32-elf
started to emit the wrong sequence of instructions.
This commit has a huge amount of changes, but I was able to pinpoint that
erasing the following code:

      assign_parm_adjust_stack_rtl (&data);

      if (assign_parm_setup_block_p (&data))
assign_parm_setup_block (&all, parm, &data);
      else if (data.passed_pointer || use_register_for_decl (parm))
assign_parm_setup_reg (&all, parm, &data);
      else
assign_parm_setup_stack (&all, parm, &data);

from the gcc/functions.c file around line 3670 was the major culprit for
the generation of erroneous sequence of instructions.

Meanwhile this piece of code was at some point in time reinstated to
gcc/functions.c file but instead of 'data.passed_pointer' there is now a
'data.arg.pass_by_reference'. The struct data does not have a
passed_pointer field anymore and operations that were dependent of that
struct field were deleted from the  gcc/functions.c file, so I cannot
simply add that condition back and see what happens.

I understand that most likely there is a bug in the backed port of lm32,
and not in the common code of GCC, but there have been no updates to the
lm32 port (besides maintenance ones), and as I mentioned I do not have much
knowledge of GCC internals, so my focus was more identifying which change
led to this erroneous behavior.

But because this change was performed 7 years ago, many other improvements
were implemented, and now there is no chance to simply revert the code
change that I have identified as the point in time were things broke...

I am attaching my testcase (e_printf() C function), and script used in my
git bisect investigation that has the options with which I have been
compiling GCC if someone is willing to have a look.

I can use GCC version 4.9.4 but I have an activity which implies comparing
performance of several ISAs and I would prefer to use recent released
versions of GCC.

Thank you,
Nelson  Ribeiro
typedef __builtin_va_list __gnuc_va_list;

typedef __gnuc_va_list va_list;

typedef unsigned char uint8;
typedef char int8;
typedef unsigned short uint16;
typedef short int16;
typedef unsigned long uint32;
typedef long int32;
typedef unsigned long long uint64;
typedef long long int64;

typedef unsigned char uint8_t;
typedef char int8_t;
typedef unsigned short uint16_t;
typedef short int16_t;
typedef unsigned long uint32_t;
typedef long int32_t;
typedef unsigned long long uint64_t;
typedef long long int64_t;

typedef unsigned char* charptr;
void e_printf( const charptr ctrl1, ...);


typedef struct params_s {
    int len;
    int num1;
    int num2;
    char pad_character;
    int do_padding;
    int left_flag;
} params_t;

extern void outbyte (char);
extern void outs( charptr lp, params_t *par);
extern void outnum( const long n, const long base, params_t *par);
extern int getnum( charptr* linep);

void e_printf( const charptr ctrl1, ...)
{

    int long_flag;
    int dot_flag;
    int skip_switch;

    params_t par;

    char ch;
    va_list argp;
    charptr ctrl = ctrl1;

    __builtin_va_start(argp,ctrl1);

    for ( ; *ctrl; ctrl++) {


        if (*ctrl != '%') {
            outbyte(*ctrl);
            continue;
        }


        dot_flag = long_flag = par.left_flag = par.do_padding = 0;
        par.pad_character = ' ';
        par.num2=32767;

 try_next:
        ch = *(++ctrl);
        skip_switch = 0;

        if (((ch >= '0') && (ch <= '9'))) {
            if (dot_flag)
                par.num2 = getnum(&ctrl);
            else {
                if (ch == '0')
                    par.pad_character = '0';

                par.num1 = getnum(&ctrl);
                par.do_padding = 1;
            }
            ctrl--;
            skip_switch = 1;
        }

        if (skip_switch == 0){
           switch (((((ch>='A') && (ch<='Z')) ? (ch) - 'A' + 'a' : ch))) {
               case '%':
                   outbyte( '%');
                   continue;

               case '-':
                   par.left_flag = 1;
                   break;

               case '.':
                   dot_flag = 1;
                   break;

               case 'l':
                   long_flag = 1;
                   break;

               case 'd':
                   if (long_flag || ch == 'D') {
                       outnum( __builtin_va_arg(argp,long), 10L, &par);
                       continue;
                   }
                   else {
                       outnum( __builtin_va_arg(argp,int), 10L, &par);
                       continue;
                   }
               case 'x':
                   outnum((long)__builtin_va_arg(argp,int), 16L, &par);
                   continue;

               case 's':
                   outs( __builtin_va_arg(argp,charptr), &par);
                   continue;

               case 'c':
                   outbyte( __builtin_va_arg(argp,int));
                   continue;

               case '\\':
                   switch (*ctrl) {
                       case 'a':
                           outbyte( 0x07);
                           break;
                       case 'h':
                           outbyte( 0x08);
                           break;
                       case 'r':
                           outbyte( 0x0D);
                           break;
                       case 'n':
                           outbyte( 0x0D);
                           outbyte( 0x0A);
                           break;
                       default:
                           outbyte( *ctrl);
                           break;
                   }
                   ctrl++;
                   break;

               default:
                   continue;
           }
        }
        goto try_next;
    }
    __builtin_va_end(argp);
}

Attachment: git_bisect_testscript
Description: Binary data

Reply via email to