Hi,

My goal is to create optimal C bindings for Atari ST system calls, using m68k-elf-gcc (tested with version 7.1.0). Basically, system calls are similar to function calls: parameters are pushed on the stack in reverse order, and the function number is pushed last. But there are 2 differences from the standard C calling convention:
- actual jump to subroutine is replaced by trap #1
- clobbered registers include d2/a2 in addition to standard d0-d1/a0-a1

Here is the problem. If I use "g" as the parameter constraint, everything looks fine, but GCC may replace the parameter token with a reference to a stack location (especially when compiling with -fomit-frame-pointer). If the template has already pushed another parameter on the stack, that sp-relative reference is not adjusted for the push, so the wrong stack slot is read and the wrong value is pushed.

What I would like is a parameter constraint that says: "use whatever you want as the operand, but nothing relative to the stack pointer".
Any idea how to achieve that?
I know that I can use "r" as the constraint to force GCC to load the data into registers before the assembler template, but that produces extra instructions, hence suboptimal code. Ideally, GCC would adjust the stack offsets depending on what has already been pushed. But I know this is not possible: GCC doesn't interpret the assembly template, so it can't know that it contains stack pushes.

Here is a concrete example.
See result with:
m68k-elf-gcc -c test.c -O2 -fomit-frame-pointer && m68k-elf-objdump -d test.o

/* test.c */
static __inline__
long Maddalt(void *start, long size)
{
    register long ret __asm__("d0");

    __asm__ volatile (
        "move.l  %2,-(%%sp)\n\t"
        "move.l  %1,-(%%sp)\n\t"
        "move.w  #20,-(%%sp)\n\t"
        "trap    #1\n\t"
        "lea     10(%%sp),%%sp"
        : "=r"(ret)
        : "g"(start), "g"(size)
        : "d1", "d2", "a0", "a1", "a2"
    );

    return ret;
}

long f(long dummy, void *start, long size)
{
    return Maddalt(start, size);
}

/* Result */
   0:   2f0a            movel %a2,%sp@-
   2:   2f02            movel %d2,%sp@-
   4:   2f2f 0014       movel %sp@(20),%sp@-
   8:   2f2f 0010       movel %sp@(16),%sp@- // Bug: should be %sp@(20)
   c:   3f3c 0014       movew #20,%sp@-
  10:   4e41            trap #1
  12:   4fef 000a       lea %sp@(10),%sp
  16:   241f            movel %sp@+,%d2
  18:   245f            moveal %sp@+,%a2
  1a:   4e75            rts

At first glance, this looks optimal:
- Clobbered registers d2 and a2 are properly saved/restored on the stack.
- Parameters are taken from the stack and pushed again, without going through intermediate registers.

%2 is replaced by %sp@(20). Offset is correct: the stack contains d2/a2 pushed by GCC (8 bytes), the return address (4 bytes), the "dummy" and "start" parameters (8 bytes), for a total of 20 bytes. This is indeed the right offset to reach the "size" parameter.

%1 is replaced by %sp@(16). This was correct at the beginning of the template. But as we have just pushed a long, we must add 4 to the offset. So the right value would actually be %sp@(20). In this case, GCC produces wrong code. This is the issue I want to solve.
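
The offset arithmetic above can be double-checked with a small host-side model. This is only a sketch for illustration: the function names arg_offset_at_entry and sp_offset_after_pushes are hypothetical, the code is not part of test.c, and it runs on any host compiler rather than on the m68k target.

```c
/* Hypothetical host-side model of the sp-relative offsets discussed above.
   It is not part of test.c and contains nothing m68k-specific. */

/* Offset from sp of the n-th long argument (0-based) of f() when the asm
   template starts: registers saved by GCC come first, then the return
   address, then the arguments in order. */
long arg_offset_at_entry(long saved_reg_bytes, int arg_index)
{
    const long return_address_bytes = 4;
    return saved_reg_bytes + return_address_bytes + 4L * arg_index;
}

/* Offset from the current sp of an operand that was at `initial_offset`
   at template entry, once the template itself has pushed `pushed_bytes`.
   This is the adjustment GCC cannot make, because it does not look
   inside the template. */
long sp_offset_after_pushes(long initial_offset, long pushed_bytes)
{
    return initial_offset + pushed_bytes;
}
```

With d2/a2 saved (8 bytes), arg_offset_at_entry(8, 2) gives 20 for "size" and arg_offset_at_entry(8, 1) gives 16 for "start". After the first move.l has pushed 4 bytes, sp_offset_after_pushes(16, 4) gives 20, which is the offset the second move.l should have used.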

Here are some alternatives; none of them is satisfactory.

1) Use "r" as constraints:
   0:   48e7 3020       moveml %d2-%d3/%a2,%sp@-
   4:   202f 0014       movel %sp@(20),%d0
   8:   262f 0018       movel %sp@(24),%d3
   c:   2f03            movel %d3,%sp@-
   e:   2f00            movel %d0,%sp@-
  10:   3f3c 0014       movew #20,%sp@-
  14:   4e41            trap #1
  16:   4fef 000a       lea %sp@(10),%sp
  1a:   4cdf 040c       moveml %sp@+,%d2-%d3/%a2
  1e:   4e75            rts

This produces correct code, but at the cost of extra intermediate registers (d3 here), which may have to be saved/restored.
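
For reference, the source for this variant would look roughly as follows. It is the binding from test.c with only the two input constraints changed from "g" to "r". The name Maddalt_r is hypothetical, used here just to tell it apart from the "g" version, and the #else branch is a placeholder stub that exists only so the file still compiles on a non-m68k host; it performs no system call.

```c
/* Sketch of alternative 1: same template as test.c, "r" instead of "g".
   Maddalt_r is a hypothetical name, not an existing binding. */
static __inline__
long Maddalt_r(void *start, long size)
{
#if defined(__m68k__)
    register long ret __asm__("d0");

    __asm__ volatile (
        "move.l  %2,-(%%sp)\n\t"      /* push size (now always a register) */
        "move.l  %1,-(%%sp)\n\t"      /* push start (now always a register) */
        "move.w  #20,-(%%sp)\n\t"     /* push the function number */
        "trap    #1\n\t"
        "lea     10(%%sp),%%sp"       /* pop the 10 bytes pushed above */
        : "=r"(ret)
        : "r"(start), "r"(size)       /* registers only, never sp-relative */
        : "d1", "d2", "a0", "a1", "a2"
    );

    return ret;
#else
    /* Placeholder for non-m68k hosts: no trap is issued. */
    (void)start;
    (void)size;
    return -1L;
#endif
}
```

Since both operands are forced into registers, the pushes inside the template can no longer invalidate them, which is why the generated code above is correct.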

2) Disable -fomit-frame-pointer for the wrapper function:
__attribute__ ((optimize("no-omit-frame-pointer")))
static __inline__
long Maddalt(void *start, long size)
...

00000000 <Maddalt>:
   0:   4e56 0000       linkw %fp,#0
   4:   2f0a            movel %a2,%sp@-
   6:   2f02            movel %d2,%sp@-
   8:   2f2e 000c       movel %fp@(12),%sp@-
   c:   2f2e 0008       movel %fp@(8),%sp@-
  10:   3f3c 0014       movew #20,%sp@-
  14:   4e41            trap #1
  16:   4fef 000a       lea %sp@(10),%sp
  1a:   241f            movel %sp@+,%d2
  1c:   245f            moveal %sp@+,%a2
  1e:   4e5e            unlk %fp
  20:   4e75            rts

00000022 <f>:
  22:   2f6f 0008 0004  movel %sp@(8),%sp@(4)
  28:   2f6f 000c 0008  movel %sp@(12),%sp@(8)
  2e:   60d0            bras 0 <Maddalt>

Again this produces correct code, but it adds useless linkw/unlk instructions. Moreover, it prevents the binding function from being inlined.

Any hint on how to solve this issue would be appreciated.

--
Vincent Rivière
