> Interestingly the code (bar4) that four times expands the whole
> 'get position' calculation is smaller (585 < 622) than the more
> conventionally coded subroutine (bar5) based version.
>
> And, maybe not surprisingly, it is about four times faster.
> (The execution times are of course estimate based on code
> size and the execution path assumption).
>
> Without testing conventional wisdom would have led me to
> believe that the macro version would be much bigger than
> the subroutine based one.

I'm not actually surprised, and I've seen similar in Z80 code generation.

SDCC isn't bright enough to do massive inlining of functions, which means
it's also not bright enough to then realise that i is a constant in each
case. Very few compilers except gcc will figure that one out unaided.

Your first version can generate constant loads; the latter has to do
indexing to compute a pointer and then add on offsets.

You might also want to compare passing &g_stepper_states[i] instead in the
function case. It ought to produce the same results, but SDCC doesn't
always figure that out (and in some cases, such as assignments, it can't
always do so due to the rather braindead aliasing rules in C).

The asm approach on many processors would be to use self-modifying code,
but SDCC (probably quite sensibly) doesn't do this even on those where it
could 8)

Alan

_______________________________________________
Sdcc-user mailing list
Sdcc-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sdcc-user
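
For anyone who wants to try the comparison discussed above, here is a minimal sketch of the three shapes: index-passing function, pointer-passing function, and macro expansion with a literal index. Only g_stepper_states comes from the thread; the struct layout, NUM_STEPPERS, and every function and macro name here are hypothetical stand-ins, since the original bar4/bar5 source is not shown. It only illustrates what to compile and compare; it makes no claim about the byte counts SDCC will produce.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_STEPPERS 4  /* assumption: four steppers, as in "four times expands" */

/* Hypothetical layout; the real struct in the original code is not shown. */
struct stepper_state {
    int32_t position;
    int32_t offset;
};

struct stepper_state g_stepper_states[NUM_STEPPERS];

/* Subroutine style, index argument: inside the body i is a run-time value,
 * so the compiler must compute the element address before each access. */
int32_t get_position_by_index(uint8_t i)
{
    return g_stepper_states[i].position + g_stepper_states[i].offset;
}

/* Subroutine style, pointer argument: the caller passes &g_stepper_states[i],
 * and the body only adds fixed member offsets to that pointer. */
int32_t get_position_by_ptr(const struct stepper_state *s)
{
    return s->position + s->offset;
}

/* Macro style: the index is a literal at each expansion site, so every
 * operand address is a link-time constant. */
#define GET_POSITION(i) \
    (g_stepper_states[(i)].position + g_stepper_states[(i)].offset)

int32_t sum_positions_macro(void)
{
    return GET_POSITION(0) + GET_POSITION(1) + GET_POSITION(2) + GET_POSITION(3);
}

int32_t sum_positions_index(void)
{
    return get_position_by_index(0) + get_position_by_index(1)
         + get_position_by_index(2) + get_position_by_index(3);
}

int32_t sum_positions_ptr(void)
{
    return get_position_by_ptr(&g_stepper_states[0])
         + get_position_by_ptr(&g_stepper_states[1])
         + get_position_by_ptr(&g_stepper_states[2])
         + get_position_by_ptr(&g_stepper_states[3]);
}

/* Fill in known values and confirm that all three variants agree. */
void self_check(void)
{
    uint8_t i;
    for (i = 0; i < NUM_STEPPERS; i++) {
        g_stepper_states[i].position = 100 * (i + 1);
        g_stepper_states[i].offset = i;
    }
    assert(sum_positions_macro() == sum_positions_index());
    assert(sum_positions_macro() == sum_positions_ptr());
}
```

Compiling each sum_positions_* function on its own and comparing the generated asm listings should show whether the pointer-passing variant closes any of the gap to the macro version on a given target.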