Hi,

My goal is to create optimal C bindings for Atari ST system calls, using
m68k-elf-gcc (tested with version 7.1.0). Basically, system calls are
similar to function calls: parameters are pushed on the stack in reverse
order, and the function number is pushed last. But there are two differences
from the standard C calling convention:
- the actual jump to subroutine is replaced by trap #1
- the clobbered registers include d2/a2 in addition to the standard d0-d1/a0-a1
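
To make the stack layout concrete, here is a small host-side sketch (not Atari code) of the argument block that trap #1 sees for the Maddalt call used in the example further down. The struct name and the use of GCC's packed attribute are my own assumptions for illustration; only the sizes (a 16-bit function number followed by two 32-bit parameters) come from the wrapper itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model only: mirrors the 10 bytes the Maddalt wrapper pushes.
   Field order is the reverse of push order, since the stack grows down. */
struct maddalt_frame {
    uint16_t opcode;  /* move.w #20,-(sp): pushed last, lowest address */
    uint32_t start;   /* move.l %1,-(sp): pointers are 32-bit on m68k */
    uint32_t size;    /* move.l %2,-(sp): pushed first, highest address */
} __attribute__((packed));  /* GCC/Clang extension, assumed available */

/* Offsets from sp at the time of the trap. */
static_assert(offsetof(struct maddalt_frame, opcode) == 0, "opcode at 0(sp)");
static_assert(offsetof(struct maddalt_frame, start) == 2, "start at 2(sp)");
static_assert(offsetof(struct maddalt_frame, size) == 6, "size at 6(sp)");

/* Total to pop after the trap, matching "lea 10(%sp),%sp". */
static_assert(sizeof(struct maddalt_frame) == 10, "10 bytes to pop");
```

The static assertions are checked at compile time, so simply compiling this file on any host with GCC or Clang confirms the 2 + 4 + 4 = 10 byte layout.
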
Here is the problem. If I use "g" as the parameter constraint, everything
looks fine, but GCC may replace the operand with a reference to a stack
location (especially when compiling with -fomit-frame-pointer). If another
parameter has already been pushed on the stack inside the template, the
reference to the second one is not adjusted, so the push reads from the
wrong stack slot.

What I would like is a parameter constraint that says: "use whatever you
want as operand, but nothing relative to the stack pointer".
Any idea how to achieve that?

I know that I can use "r" as the constraint to force GCC to load the data
into registers before entering the assembler template, but that produces
extra instructions, hence suboptimal code.

Ideally, GCC would adjust the stack offsets depending on what has already
been pushed. But I know this is not possible, as GCC doesn't interpret the
assembly template, so it can't know that it contains stack pushes.
Here is a concrete example. To see the result, run:
m68k-elf-gcc -c test.c -O2 -fomit-frame-pointer && m68k-elf-objdump -d test.o
/* test.c */

static __inline__
long Maddalt(void *start, long size)
{
    register long ret __asm__("d0");    /* the system call returns in d0 */

    __asm__ volatile (
        "move.l %2,-(%%sp)\n\t"         /* push size */
        "move.l %1,-(%%sp)\n\t"         /* push start */
        "move.w #20,-(%%sp)\n\t"        /* push the function number */
        "trap #1\n\t"
        "lea 10(%%sp),%%sp"             /* pop the 10 bytes pushed above */
        : "=r"(ret)
        : "g"(start), "g"(size)
        : "d1", "d2", "a0", "a1", "a2"
    );

    return ret;
}

long f(long dummy, void *start, long size)
{
    return Maddalt(start, size);
}
/* Result */
0: 2f0a movel %a2,%sp@-
2: 2f02 movel %d2,%sp@-
4: 2f2f 0014 movel %sp@(20),%sp@-
8: 2f2f 0010 movel %sp@(16),%sp@- // Bug: should be %sp@(20)
c: 3f3c 0014 movew #20,%sp@-
10: 4e41 trap #1
12: 4fef 000a lea %sp@(10),%sp
16: 241f movel %sp@+,%d2
18: 245f moveal %sp@+,%a2
1a: 4e75 rts
At first glance, this looks optimal:
- The clobbered registers d2 and a2 are properly saved/restored on the stack.
- The parameters are read from the stack and pushed again.

%2 is replaced by %sp@(20). This offset is correct: the stack contains d2/a2
pushed by GCC (8 bytes), the return address (4 bytes), and the "dummy" and
"start" parameters (8 bytes), for a total of 20 bytes. This is indeed the
right offset to reach the "size" parameter.

%1 is replaced by %sp@(16). This was correct at the beginning of the
template. But as we have just pushed a long, 4 must be added to the offset,
so the right value would actually be %sp@(20). In this case, GCC produces
wrong code. This is the issue I want to solve.
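
The arithmetic above can be written down as a small host-side model (again, not Atari code). The enum and function names are made up for illustration; the byte counts are the ones from the listing.

```c
#include <assert.h>

/* Host-side model only. All names here are illustrative. */

/* sp-relative offsets as GCC computes them at the start of the asm
   template, i.e. after its prologue has pushed a2 and d2. */
enum {
    SAVED_REGS = 8,                      /* a2 + d2 saved by GCC */
    RET_ADDR   = 4,                      /* return address of f() */
    OFF_DUMMY  = SAVED_REGS + RET_ADDR,  /* 12 */
    OFF_START  = OFF_DUMMY + 4,          /* 16: what GCC substitutes for %1 */
    OFF_SIZE   = OFF_START + 4           /* 20: what GCC substitutes for %2 */
};

/* Offset actually needed once the template itself has already pushed
   `pushed` bytes: every push moves sp down, so the operand is that much
   further away. */
static int effective_offset(int offset_at_entry, int pushed)
{
    return offset_at_entry + pushed;
}
```

With these numbers, effective_offset(OFF_SIZE, 0) is 20 for the first push, which matches the listing. effective_offset(OFF_START, 4) is also 20 for the second push, whereas GCC emitted OFF_START unchanged, i.e. 16.
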
Here are some alternatives; none of them is satisfactory.

1) Use "r" as the constraint for both parameters:
0: 48e7 3020 moveml %d2-%d3/%a2,%sp@-
4: 202f 0014 movel %sp@(20),%d0
8: 262f 0018 movel %sp@(24),%d3
c: 2f03 movel %d3,%sp@-
e: 2f00 movel %d0,%sp@-
10: 3f3c 0014 movew #20,%sp@-
14: 4e41 trap #1
16: 4fef 000a lea %sp@(10),%sp
1a: 4cdf 040c moveml %sp@+,%d2-%d3/%a2
1e: 4e75 rts
This produces correct code, but at the cost of extra intermediate
registers, which may have to be saved/restored (d3 here).

2) Disable -fomit-frame-pointer for the wrapper function:
__attribute__ ((optimize("no-omit-frame-pointer")))
static __inline__
long Maddalt(void *start, long size)
...
00000000 <Maddalt>:
0: 4e56 0000 linkw %fp,#0
4: 2f0a movel %a2,%sp@-
6: 2f02 movel %d2,%sp@-
8: 2f2e 000c movel %fp@(12),%sp@-
c: 2f2e 0008 movel %fp@(8),%sp@-
10: 3f3c 0014 movew #20,%sp@-
14: 4e41 trap #1
16: 4fef 000a lea %sp@(10),%sp
1a: 241f movel %sp@+,%d2
1c: 245f moveal %sp@+,%a2
1e: 4e5e unlk %fp
20: 4e75 rts
00000022 <f>:
22: 2f6f 0008 0004 movel %sp@(8),%sp@(4)
28: 2f6f 000c 0008 movel %sp@(12),%sp@(8)
2e: 60d0 bras 0 <Maddalt>
Again, this produces correct code, but it adds useless linkw/unlk
instructions. Moreover, it prevents the binding function from being inlined.

Any hint on how to solve this issue would be appreciated.
--
Vincent Rivière