Hello,

I've been working with some old programs that were built with other compilers, and I am moving them to GCC. The code is for an embedded m68k (CPU32) application with no onboard OS (yet). I've been disappointed with the size of the code generated by the compiler, and after looking at the disassembly, I think that a lot of the bloat comes from the compiler making little to no use of indirect addressing, even with -Os. Here's an example of something I might see:

file: test.c

static int main_line = 0;
int foo(int *a, int *b, int *c, int *d)
{
#define DOLINE if(*pmain_line < __LINE__) *pmain_line = __LINE__
  register int *const pmain_line = &main_line;
  DOLINE; *a = *b+*c;
  DOLINE; *b = *c+*d;
  DOLINE; *c = *d+*a;
  DOLINE; *d = *a+*b;
  return *a;
}

m68k-elf-gcc -mcpu32 -Os -g -c test.c -o test
m68k-elf-objdump -S test

test:     file format elf32-m68k

Disassembly of section .text:

00000000 <foo>:
static int main_line = 0;
int foo(int *a, int *b, int *c, int *d)
{
   0:   4e56 0000        linkw %fp,#0
   4:   2f0b             movel %a3,%sp@-
   6:   2f0a             movel %a2,%sp@-
   8:   226e 0008        moveal %fp@(8),%a1
   c:   266e 000c        moveal %fp@(12),%a3
  10:   206e 0010        moveal %fp@(16),%a0
  14:   246e 0014        moveal %fp@(20),%a2
#define DOLINE if(*pmain_line < __LINE__) *pmain_line = __LINE__
  register int *const pmain_line = &main_line;
  DOLINE; *a = *b+*c;
  18:   7005             moveq #5,%d0
  1a:   b0b9 0000 0000   cmpl 0 <foo>,%d0
  20:   6d0a             blts 2c <foo+0x2c>
  22:   103c 0006        moveb #6,%d0
  26:   23c0 0000 0000   movel %d0,0 <foo>
  2c:   2013             movel %a3@,%d0
  2e:   d090             addl %a0@,%d0
  30:   2280             movel %d0,%a1@
  DOLINE; *b = *c+*d;
  32:   7006             moveq #6,%d0
  34:   b0b9 0000 0000   cmpl 0 <foo>,%d0
  3a:   6d0a             blts 46 <foo+0x46>
  3c:   103c 0007        moveb #7,%d0
  40:   23c0 0000 0000   movel %d0,0 <foo>
  46:   2010             movel %a0@,%d0
  48:   d092             addl %a2@,%d0
  4a:   2680             movel %d0,%a3@
  DOLINE; *c = *d+*a;
  4c:   7007             moveq #7,%d0
  4e:   b0b9 0000 0000   cmpl 0 <foo>,%d0
  54:   6d0a             blts 60 <foo+0x60>
  56:   103c 0008        moveb #8,%d0
  5a:   23c0 0000 0000   movel %d0,0 <foo>
  60:   2012             movel %a2@,%d0
  62:   d091             addl %a1@,%d0
  64:   2080             movel %d0,%a0@
  DOLINE; *d = *a+*b;
  66:   7008             moveq #8,%d0
  68:   b0b9 0000 0000   cmpl 0 <foo>,%d0
  6e:   6d0a             blts 7a <foo+0x7a>
  70:   103c 0009        moveb #9,%d0
  74:   23c0 0000 0000   movel %d0,0 <foo>
  7a:   2011             movel %a1@,%d0
  7c:   d093             addl %a3@,%d0
  7e:   2480             movel %d0,%a2@
  return *a;
}
  80:   2011             movel %a1@,%d0
  82:   245f             moveal %sp@+,%a2
  84:   265f             moveal %sp@+,%a3
  86:   4e5e             unlk %fp
  88:   4e75             rts

Here I've used a macro to keep track of the farthest
place reached in the code. As you can see, I've even tried to set it up so that the compiler would access the value through a register. However, I don't get that result; I guess the register pointer is optimized away. Instead, each comparison uses the full absolute address of the variable, which costs two extra words for the read and two more for the write. I'd prefer a sequence that reads something like:

        movel #main_line, %a0   /* only once, at the start of the function */

        moveq #(LINE-1), %d0
        cmpl %a0@, %d0
        blt skip
        moveb #LINE, %d0
        movel %d0,%a0@
skip:
        ...

I haven't seen any options that encourage more use of indirect addressing. Are there any that I have missed? If not, I assume I will need to work with the machine description. I've downloaded the GCC internals book, but it's a lot of material and it's hard to figure out where to start. Can anybody point me in the right direction?

Thanks,
Luke Powell
Project Engineer
BJ Services
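
P.S. One source-level workaround I have not measured yet is to hide the pointer's value from the optimizer with an empty asm statement, so that GCC can no longer fold &main_line back into every access. The sketch below is only an assumption about what might help, not a confirmed fix: HIDE_PTR and last_line_reached are names I made up for illustration, and the pointer has to lose its const qualifier because it becomes an asm output. On m68k the "a" constraint (address register) could replace the generic "r" used here.

```c
/* Sketch only: keep the address of main_line in a register by hiding
   its value from the optimizer. Whether this shrinks the m68k code is
   an untested assumption. */
static int main_line = 0;

/* Hypothetical helper, not part of the original test.c. The empty asm
   claims to modify p, so the compiler can no longer treat p as the
   known constant &main_line. */
#define HIDE_PTR(p) __asm__ ("" : "+r" (p))

#define DOLINE if (*pmain_line < __LINE__) *pmain_line = __LINE__

int foo(int *a, int *b, int *c, int *d)
{
    int *pmain_line = &main_line;   /* not const: it is an asm output */
    HIDE_PTR(pmain_line);

    DOLINE; *a = *b + *c;
    DOLINE; *b = *c + *d;
    DOLINE; *c = *d + *a;
    DOLINE; *d = *a + *b;
    return *a;
}

/* Hypothetical accessor so a caller can read the high-water mark. */
int last_line_reached(void)
{
    return main_line;
}
```

The function should behave exactly as before; only the way the compiler is allowed to address main_line changes. The resulting disassembly would still need to be compared against the listing above to see whether the absolute addresses actually disappear.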