Hello Richard,
On Sun, Nov 23, 2008 at 05:14:27PM +0000, Richard Sandiford wrote:
> Richard Sandiford <[EMAIL PROTECTED]> writes:
> > Manuel Lauss <[EMAIL PROTECTED]> writes:
> >> Admittedly my understanding of mips assembly is not yet very advanced, am I
> >> missing something or is this a bug?
> >
> > Well, it's a missed optimisation, certainly. Fortunately,
> > it's conceptually fairly easy to fix.
>
> FWIW, I've attached a patch against GCC 4.4 below. Tested on
> mipsisa64-elfoabi.
>
> I don't think it's appropriate to apply the patch at this stage
> in the development cycle, so I'll hold it back for GCC 4.5.
> I'll add some testcases then too.
I tested this with 4.3.2 and it works very well. One minor thing though
if you don't mind. Consider this C snippet:
-------------- 8< ------------------------- >8 ----------------
#define SYS_CNTRL_M21 (1 << 19)
#define SYS_COUNTER_CNTRL 0xb1900014
#define SYS_RTCMATCH2 0xb1900054
#define SYS_RTCREAD 0xb1900058
int au1x_rtcmatch2_set_next_event(unsigned long delta, void *x)
{
unsigned long next;
next = *(unsigned long *)(SYS_RTCREAD) + delta;
/* wait for register access */
while (*(unsigned long *)(SYS_COUNTER_CNTRL) & SYS_CNTRL_M21)
;
*(unsigned long *)(SYS_RTCMATCH2) = next;
asm volatile ("sync");
}
-------------- 8< ------------------------- >8 -------------
My patched gcc generates this with -O2 (-Os is actually worse):
0x00000028 <au1x_rtcmatch2_set_next_event+0>: lui v0,0xb190
0x0000002c <au1x_rtcmatch2_set_next_event+4>: lw v1,88(v0)
0x00000030 <au1x_rtcmatch2_set_next_event+8>: lui a2,0xb190
0x00000034 <au1x_rtcmatch2_set_next_event+12>: lui a1,0x8
0x00000038 <au1x_rtcmatch2_set_next_event+16>: lw v0,20(a2)
0x0000003c <au1x_rtcmatch2_set_next_event+20>: and v0,v0,a1
0x00000040 <au1x_rtcmatch2_set_next_event+24>: bnez v0,0x38
<au1x_rtcmatch2_set_next_event+16>
0x00000044 <au1x_rtcmatch2_set_next_event+28>: lui v0,0xb190
0x00000048 <au1x_rtcmatch2_set_next_event+32>: addu v1,a0,v1
0x0000004c <au1x_rtcmatch2_set_next_event+36>: sw v1,84(v0)
0x00000050 <au1x_rtcmatch2_set_next_event+40>: sync
0x00000054 <au1x_rtcmatch2_set_next_event+44>: jr ra
0x00000058 <au1x_rtcmatch2_set_next_event+48>: nop
IMO gcc's register usage is a bit odd; when the 'addu' at 0x48
is moved to 0x30 (as 'addu a0,a0,v1'), v1 could be reused and the
two additional 'lui v0,0xb190' become unecessary, saving 2 insns.
(This code is called fairly often, so I do care about performance ;-))
Thanks again,
Manuel Lauss