On 17.07.24 at 19:51, Jeff Law wrote:
On 7/17/24 11:13 AM, Georg-Johann Lay wrote:
On 17.07.24 at 17:55, Jeff Law wrote:
On 7/17/24 9:26 AM, Georg-Johann Lay wrote:
It looks fine for the trunk. Out of curiosity, does the avr port
implement linker relaxing for this case? That would seem to be
No. avr-ld performs relaxing, but only the two cases of
- JMP/CALL to RJMP/RCALL provided the offset fits.
- [R]CALL+RET to [R]JMP provided there's no label between.
Yea, the first could be comparable to other targets. The second is
probably not all that common since the compiler should be doing that
tail call elimination.
It should. But there are cases where gcc doesn't optimize, like
float add (float a, float b)
{
return a + b;
}
Presumably the a+b is handled via a libcall rather than a normal call? I
guess there might be something in the path where that needs special
handling. It's been like 20+ years since I was last in that code.
Conceptually I don't see a reason why libcalls would need to be special.
Then there are the calls that are not visible to the compiler, like
long mul (long a, long b)
{
return b * a;
}
so that the linker relaxations still have something to do.
Yea, if you're emitting the call behind the back of the compiler for
this kind of case, then the linker is your only real shot. I did
something like that for a few key operations on the mn102 chip eons ago.
One job for Binutils could be optimizing fixed registers like in
char mul3 (char a, char b, char c)
{
return a * b * c;
}
mul3:
mul r22,r20 ; 21 [c=12 l=3] *mulqi3_enh
mov r22,r0
clr r1
mul r22,r24 ; 22 [c=12 l=3] *mulqi3_enh
mov r24,r0
clr r1
ret ; 25 [c=0 l=1] return
The first "clr r1" is redundant because the following mul overwrites r1 anyway.
Just like GCC PR20296, the only feasible solution is to let Binutils
do the job. But I have no idea how to adjust branches without labels,
like RJMP .+20, that cross an instruction that's optimized out.
I suspect the most important step is to prevent the assembler from
resolving pc-relative jumps and instead emit a suitable relocation. Once
that's done I think the branches should get adjusted automatically.
Maybe that's already the case with -mlink-relax? IIRC that was
introduced to keep the assembler from resolving label differences
when the linker may relax and hence change label differences, because
it shredded debug info.
...appears to work:
void trelax (void)
{
__asm ("rjmp .+4" "\n\t"
"rcall main" "\n\t"
"ret" "\n\t"
"inc r0");
}
int main (void)
{
return 0;
}
With -mrelax, the code is:
0000004c <trelax>:
4c: 01 c0 rjmp .+2 ; 0x50 <L0^A+0x2>
0000004e <L0^A>:
4e: 15 c0 rjmp .+42 ; 0x7a <main>
50: 03 94 inc r0
52: 08 95 ret
so the RJMP is still targeting the INC, although this particular
optimization is performed by ld and not by gas.
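
To make the required bookkeeping concrete, here is a minimal sketch in C of
what "adjusting a branch that crosses a deleted instruction" amounts to. It is
only a toy model and not how ld implements relaxation; the names insn_t and
delete_insn are made up for illustration, and the model assumes the offset N of
a pc-relative branch is counted from the end of the branch instruction
(target = addr + size + N), as in the disassembly above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one instruction; not a Binutils data structure. */
typedef struct {
    int addr;      /* byte address of the instruction */
    int size;      /* size in bytes */
    int is_branch; /* nonzero for a pc-relative branch like RJMP .+N */
    int offset;    /* N: target = addr + size + N */
} insn_t;

/* Delete code[victim] and fix up every pc-relative branch that crosses it.
   Returns the new instruction count. */
size_t delete_insn (insn_t *code, size_t n, size_t victim)
{
    assert (victim < n);
    int del_addr = code[victim].addr;
    int del_size = code[victim].size;

    for (size_t i = 0; i < n; i++) {
        if (i == victim || !code[i].is_branch)
            continue;
        int src = code[i].addr + code[i].size; /* PC the offset is relative to */
        int dst = src + code[i].offset;        /* branch target */
        if (src <= del_addr && dst > del_addr)
            code[i].offset -= del_size;        /* forward branch over the victim */
        else if (dst <= del_addr && src > del_addr)
            code[i].offset += del_size;        /* backward branch over the victim */
    }

    /* Close the gap: move later instructions down and shift their addresses. */
    for (size_t i = victim + 1; i < n; i++) {
        code[i].addr -= del_size;
        code[i - 1] = code[i];
    }
    return n - 1;
}
```

Fed with the four instructions from the asm in trelax (RJMP .+4 at 0, RCALL at
2, RET at 4, INC at 6) and the RET as victim, the RJMP offset becomes 2, which
matches the "rjmp .+2" in the dump.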
And there is the complication that a zero_reg optimization
must only be performed on asm code from C/C++ that is using
the avr-gcc ABI. But that could be handled by options, so
we'd have a change to the device-specs again :-/
Or maybe better by a directive like .abi gcc or so.
Johann