Hello
> > Does anyone know the reason why they are not clobbered?
>
> So that they don't have to be saved. This function is supposed to be
> very fast. If you want to use a slow implementation, write an
> assembly wrapper which saves additional registers.
That may have been the original intent.
But does it hold in practice?
Without clobbering the registers r1-r3
the compiler generates something like this:
ldr r3,[pc, #48]
bl __aeabi_read_tp
adds r7, r0, r3
..
In addition, __aeabi_read_tp() itself might need to push and pop
r1-r3.
With r1-r3 clobbered I see:
bl __aeabi_read_tp
ldr r3,[pc, #48]
adds r7, r0, r3
..
Here the clobbered version is faster, since the callee does not have to
save and restore r1-r3.
Maybe there is another reason not to clobber them.
> > The next point is that the __builtin_thread_pointer() call isn't
> > ARM/Thumb interwork safe.
> > To use the "hard" Coprocessor fetch instruction the calling function
> > must run in ARM mode.
>
> True (or Thumb-2, I think).
>
> > To use "soft" implementation caller and __aeabi_read_tp() must run in
> > the same mode.
>
> I don't believe that this is true. In what way is it not safe?
A "bl __aeabi_read_tp" call does not switch the instruction set mode.
So if caller and callee run in different modes, the program simply crashes.
A "blx" instruction does perform the mode switch,
but it only exists from ARMv5 onwards, so it does not help on
ARMv4T (e.g. ARM7TDMI).
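
The cases above can be sketched as a compile-time check in C. This is
only an illustration: read_tp_call_strategy() is a hypothetical helper,
not part of GCC or the EABI. It assumes the GCC-style predefined macros
__arm__, __thumb__ and __ARM_ARCH; a toolchain that does not define
__ARM_ARCH is treated as ARMv4T here, which may be wrong for older
compilers.

```c
/* Hypothetical helper (not part of GCC or the ARM EABI).
   Reports, per the reasoning in this mail, how a call to
   __aeabi_read_tp behaves for the target being compiled. */
const char *read_tp_call_strategy(void)
{
#if !defined(__arm__)
    /* Not a 32-bit ARM target; the question does not arise. */
    return "not an ARM target";
#elif defined(__ARM_ARCH) && __ARM_ARCH >= 5
    /* blx exists, so the call itself can switch ARM/Thumb mode. */
    return "blx available: call can switch mode";
#elif defined(__thumb__)
    /* Assumed ARMv4T, Thumb caller: a plain bl stays in Thumb mode. */
    return "no blx, Thumb caller: plain bl needs a Thumb callee";
#else
    /* Assumed ARMv4T, ARM caller: a plain bl stays in ARM mode. */
    return "no blx, ARM caller: plain bl needs an ARM callee";
#endif
}
```

Printing the returned string from a small test program built once with
-marm and once with -mthumb for an ARMv4T target should show the two
"no blx" cases.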
Long calls also seem not to be handled here.
(There might be a reason not to handle them.)
That's why I'm asking.
Is the implementation still incomplete?
Regards,
Thomas