> Interestingly the code (bar4) that four times expands the whole
> 'get position'  calculation is smaller (585 < 622) than the more
> conventionally coded subroutine (bar5) based version.
> 
> And, maybe not surprisingly, it is about four times faster.
> (The execution times are of course estimates based on code
> size and an assumed execution path.)
> 
> Without testing, conventional wisdom would have led me to
> believe that the macro version would be much bigger than
> the subroutine based one.

I'm not actually surprised; I've seen similar in Z80 code generation.
SDCC isn't bright enough to do massive inlining of functions which means
it's also not bright enough to then realise that i is a constant in each
case. Very few compilers except gcc will figure that one out unaided.

Your first version can generate loads from constant addresses; the
second has to index to compute a pointer and then add on offsets.
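
To make the difference concrete, here is a minimal sketch of the two
styles. The struct layout, the field names and every function and macro
name are hypothetical (the original bar4/bar5 sources are not shown
here); only g_stepper_states comes from the thread.

```c
#include <stdint.h>

/* Hypothetical layout: the real struct is not shown in this thread. */
struct stepper_state {
    int32_t position;
    int16_t offset;
};

#define NUM_STEPPERS 4
struct stepper_state g_stepper_states[NUM_STEPPERS];

/* Macro style: n is a literal at every expansion site, so the address
 * of each field is a constant the compiler can load from directly. */
#define GET_POSITION(n) \
    (g_stepper_states[(n)].position + g_stepper_states[(n)].offset)

/* Subroutine style: i is only known at run time, so without inlining
 * the compiler has to scale i, add the array base to form a pointer,
 * and then add the field offsets. */
int32_t get_position(uint8_t i)
{
    return g_stepper_states[i].position + g_stepper_states[i].offset;
}

/* Four expansions, each with a constant index. */
int32_t sum_positions_macro(void)
{
    return GET_POSITION(0) + GET_POSITION(1)
         + GET_POSITION(2) + GET_POSITION(3);
}

/* Four calls through the indexed subroutine. */
int32_t sum_positions_call(void)
{
    int32_t sum = 0;
    uint8_t i;

    for (i = 0; i < NUM_STEPPERS; i++)
        sum += get_position(i);
    return sum;
}
```

Both functions return the same value; comparing the generated asm for
the two is what shows the size and speed difference.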

You might also want to compare passing &g_stepper_states[i] instead in
the function case. It ought to produce the same results but SDCC doesn't
always figure that out (and in some cases such as assignments it can't
always do so due to the rather braindead aliasing rules in C).
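
A rough sketch of the pointer-passing variant, plus one case where
aliasing gets in the way. Again the struct fields and the function
names are made up for illustration; only g_stepper_states is from the
thread.

```c
#include <stdint.h>

/* Hypothetical layout: the real struct is not shown in this thread. */
struct stepper_state {
    int32_t position;
    int16_t offset;
};

#define NUM_STEPPERS 4
struct stepper_state g_stepper_states[NUM_STEPPERS];

/* Index version: the element address is recomputed from i inside the
 * function unless the compiler spots the common subexpression. */
int32_t get_position_by_index(uint8_t i)
{
    return g_stepper_states[i].position + g_stepper_states[i].offset;
}

/* Pointer version: the caller computes &g_stepper_states[i] once and
 * the function only adds field offsets to that pointer. */
int32_t get_position_by_ptr(const struct stepper_state *s)
{
    return s->position + s->offset;
}

/* Aliasing: delta may legally point at s->offset, so the compiler
 * cannot keep *delta in a register across the first store and has to
 * reload it for the second addition. */
void advance(struct stepper_state *s, int16_t *delta)
{
    s->offset += *delta;
    s->offset += *delta;
}
```

Calling get_position_by_ptr(&g_stepper_states[i]) should give the same
answer as get_position_by_index(i); whether the generated code is the
same size is the thing worth measuring.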

The asm approach on many processors would be to use self modifying code
but sdcc (probably quite sensibly) doesn't do this even on those it could
8)

Alan


_______________________________________________
Sdcc-user mailing list
Sdcc-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sdcc-user