------- Additional Comments From uros at kss-loka dot si 2004-10-25 14:35 -------
The problem is triggered in the reload() function, around line 950, in the
following part (the #ifdef'd part was added by me):
  for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
    if (reg_renumber[i] < 0 && reg_equiv_memory_loc[i])
      {
        rtx x = eliminate_regs (reg_equiv_memory_loc[i], 0, NULL_RTX);
#ifdef DEBUG
        debug_rtx (reg_equiv_memory_loc[i]);
        debug_rtx (x);
#endif
        if (strict_memory_address_p (GET_MODE (regno_reg_rtx[i]),
                                     XEXP (x, 0)))
          reg_equiv_mem[i] = x, reg_equiv_address[i] = 0;
...
For the testcase from comment #9 (converted to plain C), 'gcc -O2 -msse'
produces the following relevant debug information:
IN:
(mem:V4SF (plus:SI (reg/f:SI 20 frame)
(const_int -32 [0xffffffe0])) [6 S16 A8])
OUT:
(mem:V4SF (plus:SI (reg/f:SI 6 bp)
(const_int -56 [0xffffffc8])) [6 S16 A8])
So the problem is inside the eliminate_regs() function, which misaligns an
otherwise aligned address. This unaligned address is passed down, and somewhere
around line 1214 the following code is triggered:
  for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
    {
      rtx addr = 0;
      if (reg_equiv_mem[i])
        addr = XEXP (reg_equiv_mem[i], 0);
...
and this addr is used as the new (unaligned) address on the stack.
To analyze this issue further: the following RTX is passed to eliminate_regs():
(mem:V4SF (plus:SI (reg/f:SI 20 frame)
(const_int -32 [0xffffffe0])) [6 S16 A8])
After passing through the MEM case, the function recurses into the PLUS case,
where the following RTX is processed:
(plus:SI (reg/f:SI 20 frame)
(const_int -32 [0xffffffe0]))
and this code is triggered:
...
  else
    return gen_rtx_PLUS (Pmode, ep->to_rtx,
                         plus_constant (XEXP (x, 1),
                                        ep->previous_offset));
where ep->previous_offset (when substituting the frame pointer with ebp) equals
-24. The resulting sum is then -32 + (-24) = -56.
I'm a little lost here as to what the previous_offset field represents; perhaps
someone with more knowledge could determine whether the magic number (-24) is
OK [it is not!].
BTW: when -fomit-frame-pointer is added to the compilation flags, the testcase
from comment #9 produces a correctly aligned address, because
ep->previous_offset, when substituting the frame pointer with esp, equals 64:
-32 + 64 = 32 in this case.
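The two cases can be compared with a small standalone sketch. This is not GCC
code: the helper names eliminated_displacement() and displacement_aligned_p()
are made up for illustration, and the check assumes that the base register
itself is 16-byte aligned, which the numbers above imply but which is not
verified here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the constant folding in the PLUS case:
   the frame-relative displacement plus the elimination offset
   (the role played by ep->previous_offset in the excerpt above).  */
static long
eliminated_displacement (long frame_disp, long elim_offset)
{
  return frame_disp + elim_offset;
}

/* Return true if DISP is a multiple of ALIGN.  This says something
   about the final address only if the base register is itself
   ALIGN-byte aligned (an assumption of this sketch).  */
static bool
displacement_aligned_p (long disp, long align)
{
  assert (align > 0);
  return disp % align == 0;
}
```

With the values from the dumps, eliminated_displacement (-32, -24) gives -56,
which is not a multiple of 16 (the S16 size of the V4SF slot), while
eliminated_displacement (-32, 64) gives 32, which is.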
Regarding comment #17: perhaps the original testcase still uses ebp, even with
'-fomit-frame-pointer'.
Uros.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17990