Hello! Attached patch fixes a failure, where no base register can be found to save the remaining SSE (and integer) registers to stack. Since we are able to restore registers in the prollogue using:
m->fs.sp_valid = (!frame_pointer_needed || (crtl->sp_is_unchanging && !stack_realign_fp)); we can use similar approach at the end of the ix86_expand_prologue: m->fs.sp_valid = !frame_pointer_needed; This way, we will always use frame pointer, if one is available or if frame pointer is not available, the (realigned) stack register for saves. 2016-01-06 Uros Bizjak <ubiz...@gmail.com> PR target/69140 * config/i386/i386.c (ix86_expand_prologue): Declare fs.sp_valid depending on frame_pointer_needed before remaining integer and SSE registers are saved. testsuite/ChangeLog: 2016-01-06 Uros Bizjak <ubiz...@gmail.com> PR target/69140 * gcc.target/i386/pr69140.c: New test Patch was bootstrapped and regression tested on x86_64-linux-gnu for all default languages, Obj-C++ and Go. The patch was also tested by the reporter by building and running Wine without problems. I have to admit that prologue generation is hard to test, since it has many different code paths. I guess committing the one-liner in the hope that it won't break some target is the best approach one can do. So, the patch was committed to mainline. I plan to backport it to gcc-5 branch after a week or two without problems in mainline. Uros.
Index: config/i386/i386.c =================================================================== --- config/i386/i386.c (revision 232102) +++ config/i386/i386.c (working copy) @@ -13065,6 +13065,8 @@ ix86_expand_prologue (void) m->fs.fp_valid = true; } + m->fs.sp_valid = !frame_pointer_needed; + if (!int_registers_saved) ix86_emit_save_regs_using_mov (frame.reg_save_offset); if (!sse_registers_saved) Index: testsuite/gcc.target/i386/pr69140.c =================================================================== --- testsuite/gcc.target/i386/pr69140.c (nonexistent) +++ testsuite/gcc.target/i386/pr69140.c (working copy) @@ -0,0 +1,22 @@ +/* { dg-do compile { target lp64 } } */ +/* { dg-options "-O2 -mincoming-stack-boundary=3" } */ + +typedef struct { + unsigned int buf[4]; + unsigned char in[64]; +} MD4_CTX; + +static void MD4Transform(unsigned int buf[4], const unsigned int in[16]) +{ + unsigned int a, b, c, d; + (b) += ((((c)) & ((d))) | ((~(c)) & ((a)))) + (in[7]); + (a) += ((((b)) & ((c))) | ((~(b)) & ((d)))) + (in[8]); + (d) += ((((a)) & ((b))) | ((~(a)) & ((c)))) + (in[9]); + buf[3] += d; +} + +void __attribute__((ms_abi)) MD4Update( MD4_CTX *ctx, const unsigned char *buf) +{ + MD4Transform( ctx->buf, (unsigned int *)ctx->in); + MD4Transform( ctx->buf, (unsigned int *)ctx->in); +}