https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90434

            Bug ID: 90434
           Summary: [regression from 8.x] Incorrect code generation for
                    __builtin_strcmp with LTO and freestanding binary
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: lto
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andrew.cooper3 at citrix dot com
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Hello,

Apologies for this being a complicated bug report, but I think I may need some
help tracking things down further.  Nevertheless, I am moderately confident it
is a regression in GCC 9.x, because 8.x works correctly.


I have two versions of GCC built today from the tip of stable branches.

$ PATH=/local/bin/gcc-8.x/bin:$PATH gcc --version
gcc (GCC) 8.3.1 20190511
...

$ PATH=/local/bin/gcc-9.x/bin:$PATH gcc --version
gcc (GCC) 9.1.1 20190511

The 8.x version works correctly, while the 9.x version results in incorrect
code generation, which leads to a crash when executed.  Both versions work fine
when LTO is not in use.


The environment is http://xenbits.xen.org/docs/xtf/ which is a set of
microkerenls.  As such, they are freestanding builds and linked to a single
final executable, but are standard x86 32bit and 64bit ELF binaries.

Because -ffreestanding disables a whole swathe of builtin optimisations, the
headers locally reinsert the optimisations in the following manner:

int strcmp(const char *s1, const char *s2);
#define strcmp(s1, s2) __builtin_strcmp(s1, s2)

An implementation of strcmp() is also provided for when a real call is emitted
by the builtin:

int (strcmp)(const char *_s1, const char *_s2)
{
    unsigned char s1, s2;

    do {
        s1 = *_s1++;
        s2 = *_s2++;
    } while ( s1 && s1 == s2 );

    return (s1 < s2) ? -1 : (s1 > s2);
}


The problematic test in question calls strcmp() from its main function.  With
8.x, the builtin is expanded inline, and (having no callers) the strcmp()
function itself is omitted from the final binary.

With 9.x, the builtin is not expanded inline, and a call to strcmp() is
emitted.  strcmp() itself crashes when it first dereferences the second
parameter.


First of all, I found that if I replace the above strcmp() implementation with
this alternative (which I borrowed from the Linux Kernel):

int (strcmp)(const char *_s1, const char *_s2)
{
    unsigned char s1, s2;

    while ( 1 )
    {
        s1 = *_s1++;
        s2 = *_s2++;

        if ( s1 != s2 )
            return s1 < s2 ? -1 : 1;
        if ( !s1 )
            break;
    }
    return 0;
}

then 9.x *does* expand the builtin inline, and the strcmp() function itself is
omitted from the final binary.  This binary also executes correctly.


Therefore, something is clearly inspecting the content of my locally provided
strcmp() function and deciding that it can't expand the builtin inline. 
However, I'm not sure what it is checking for, because AFAICT, the ABI of those
two variations is identical.


That on its own would only cause an efficiently problem, not a correctness
problem.

When inspecting the disassembly of the bad binary, I see that strcmp() has its
parameters passed in registers rather than on the stack.  AFAICT, the value of
the _s1 parameter is always correct, while the value of _s2 seems to be less
than 256 (possibly a single char?).  The crash occurs because the deference of
_s2 hits the zero page.

As a result, it is clear that the strcmp() function has been compiled with a
non-standard ABI.  AFAICT, the calling code in main() does conform to the
expected ABI, by passing the parameters on the stack.

Therefore, I wonder whether the code actually emitted for strcmp() was supposed
to be a specialised version (strcmp.isra.$foo perhaps, or something else),
which actually got called with the original ABI.


Anyway - that is as far as I've got debugging.  Can anyone suggest further
steps to help narrow this down?

Reply via email to