Hi,

I'd like to gather some opinions and advice on the expansion of
__builtin_frame_address, as discussed on gcc-patches last year [1, 2].
This centres on the following comment in expand_builtin_return_addr
arising from revision 103294 last year:

/* For a zero count, we don't care what frame address we return, so frame
   pointer elimination is OK, and using the soft frame pointer is OK.
   For a non-zero count, we require a stable offset from the current frame
   pointer to the previous one, so we must use the hard frame pointer, and
   we must disable frame pointer elimination.  */

I believe that, when this function is used to expand
__builtin_frame_address (), the first sentence isn't true: in some cases,
a function does care about the value at count == 0.  A concrete
example is the glibc backtrace () function [3] which uses the expression
__builtin_frame_address (0) to determine the starting point for stack
traversal.  (It performs subsequent dereferences back down the chain
of frame pointers by itself.)
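
To illustrate the kind of use I mean, here is a minimal sketch (not the
glibc code).  The struct frame_layout record and the walk_frames and
have_frame_start helpers are made up for this mail, and the two-word
frame record is an assumption: real layouts are target-specific.  The
walk is run over a hand-built chain only; in a backtrace ()-style
function the starting pointer would be the value of
__builtin_frame_address (0), which is why that value has to be right.

```c
#include <stddef.h>

/* Assumed frame record: a link to the caller's record followed by the
   return address.  Hypothetical; real targets differ.  */
struct frame_layout
{
  struct frame_layout *next;
  void *return_address;
};

/* Follow the chain from START, storing at most SIZE return addresses
   in OUT.  Returns the number of entries stored.  */
size_t
walk_frames (const struct frame_layout *start, void **out, size_t size)
{
  size_t n = 0;
  for (const struct frame_layout *f = start; f != NULL && n < size;
       f = f->next)
    out[n++] = f->return_address;
  return n;
}

/* On a real target the walk would start from this value.  Whether it
   points at a frame_layout record is target-specific, so this sketch
   only checks that it is non-null and never dereferences it.  */
int
have_frame_start (void)
{
  void *start = __builtin_frame_address (0);
  return start != NULL;
}
```

If the value of __builtin_frame_address (0) is off by even a fixed
offset, the first f->next load in such a walk reads the wrong word and
everything after it is garbage.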

The wording of the comment in itself is unfortunately not the end of
the issue.  Due to the subsequent conditional testing count == 0, the
builtin can expand to an erroneous value when the following conditions
are met:

- the expansion function is invoked with count set to zero; and

- the target is such that frame_pointer_rtx and hard_frame_pointer_rtx
do not ultimately yield the same address.

An example of such an invocation occurs during compilation of the
aforementioned backtrace function, and an example of such a target is the
ARM.  Currently, calling backtrace () on such a target yields a failure.

Let us just consider how to fix the ARM case for a moment.  The obvious
thing to do is to define INITIAL_FRAME_ADDRESS_RTX.  However, the correct
semantics of such a macro definition would be to:

- set current_function_accesses_prior_frames to 1, so that FP is not
eliminated in this function; and

- return hard_frame_pointer_rtx.

I hope I'm not alone in thinking that such a side-effecting macro would
be in bad taste.
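
To make the objection concrete, the definition would have to look
something like the sketch below.  Everything except the macro itself is
a stand-in so that the fragment compiles on its own: the mock rtx type,
the mock variables and the initial_frame_address wrapper are made up for
this mail and are not the compiler's real declarations.

```c
/* Stand-ins for GCC internals (hypothetical, for illustration only).  */
typedef struct mock_rtx { const char *name; } *rtx;

static struct mock_rtx mock_hard_fp = { "hard_frame_pointer" };
static rtx hard_frame_pointer_rtx = &mock_hard_fp;
static int current_function_accesses_prior_frames = 0;

/* The side-effecting definition: the comma expression first sets the
   flag that prevents FP elimination, then yields the hard frame
   pointer as the value of the macro.  */
#define INITIAL_FRAME_ADDRESS_RTX \
  (current_function_accesses_prior_frames = 1, hard_frame_pointer_rtx)

/* Hypothetical caller: merely asking for the initial frame address
   changes state elsewhere.  */
rtx
initial_frame_address (void)
{
  return INITIAL_FRAME_ADDRESS_RTX;
}
```

The flag changes as a by-product of evaluating what looks like a plain
value, which is exactly the part I find distasteful.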

Let us come back to the more general case.  As identified above, when
expand_builtin_return_addr is used to expand __builtin_frame_address (),
we do care about the count == 0 case.  I believe that the comment should
be adjusted to reflect this whatever other changes, if any, are made.
As for the remaining problem, I suggest that we could:

(i) always return the hard frame pointer, and disable FP elimination in
the current function; or

(ii) remove this logic entirely (but preserve the means to communicate any
necessity to prevent FP elimination to reload) and insist that all
targets define INITIAL_FRAME_ADDRESS_RTX.  Any such solution should
probably adjust INITIAL_FRAME_ADDRESS_RTX so that it doesn't have to
cause a side-effect in order to communicate the information about the
frame pointer.  Or...

(iii) ...the same as option (i), but allow targets to define another macro
that will cause the default code to use the soft frame pointer rather than
the hard frame pointer, and hence allow FP elimination.  (If such a macro
were set by a particular target, am I right in thinking that it would be
safe to use the soft frame pointer even in the count >= 1 cases?)
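
As a rough way of comparing these, here is a toy model of which frame
pointer gets used and whether FP elimination stays possible.  It is not
GCC source: the enum, the struct and the function names are invented for
this mail, target_uses_soft_fp stands for the as yet unnamed macro of
option (iii), and option_iii assumes the answer to my question above is
"yes".  Option (ii) is not modelled because there the choice moves
entirely into each target's INITIAL_FRAME_ADDRESS_RTX.

```c
#include <stdbool.h>

enum frame_base { SOFT_FP, HARD_FP };

struct expansion
{
  enum frame_base base;     /* pointer the expansion starts from */
  bool fp_elimination_ok;   /* may the FP still be eliminated? */
};

/* Behaviour described by the existing comment.  */
struct expansion
current_behaviour (unsigned count)
{
  if (count == 0)
    return (struct expansion) { SOFT_FP, true };
  return (struct expansion) { HARD_FP, false };
}

/* Option (i): hard FP and no elimination, whatever the count.  */
struct expansion
option_i (unsigned count)
{
  (void) count;
  return (struct expansion) { HARD_FP, false };
}

/* Option (iii): as option (i) by default, but a target may opt in to
   the soft FP.  Assumes this is safe for every count.  */
struct expansion
option_iii (unsigned count, bool target_uses_soft_fp)
{
  (void) count;
  if (target_uses_soft_fp)
    return (struct expansion) { SOFT_FP, true };
  return (struct expansion) { HARD_FP, false };
}
```

In this model only current_behaviour depends on count at all, and only
current_behaviour can hand back the soft FP on a target that has not
asked for it.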

Option (ii) is in some ways the most satisfactory, as it would provide
more certainty (particularly if another target were to be added) that
the code is doing the right thing.  However it involves a moderate amount
of work and I would assume that the current level of enthusiasm in this
regard is approximately the same as last year :-)

Option (i), which is in all but name the "solution 5" approach [1] proposed
last year, means that the "count == 0" case is elevated to the same level
of importance as the "count > 0" cases, in line with the use in
backtrace ().  The problem with this is that on platforms where the
soft and hard FPs coincide there is going to be a slight
performance degradation, as identified previously, whenever these
builtins are used.  Someone with more experience will have to enlighten
me as to whether this penalty would actually be significant: my
intuition tells me that in fact it would not be, though perhaps there are
oft-used occurrences of these builtin expansions that I don't know about.
(On how many platforms is it the case that the soft and hard FPs actually
coincide?  If there are platforms where they do _not_ coincide, then
presumably those targets can be affected in the same way as I identify
above for ARM?  I'm confused because the current code seems to assume
(in the count == 0 case) that the soft and hard FPs coincide, yet uses
the hard rather than the soft FP for count >= 1.)

Option (iii) gives the advantage of a working default and removes the
pessimization on a target-by-target basis, just when it is known to be
safe.  If I'm correct in thinking that the setting of the "soft frame
pointer" macro would enable the soft frame pointer to be used no matter
what the value of count, then this option actually permits more FP
elimination than the present code.

I tend to think that option (iii) might be best, although perhaps it
is overkill and option (i) would do.  But I'm not entirely sure: still
being a gcc novice, I have to admit that I'm not thoroughly clear on
this myself at this stage.  So any advice or comments would be
appreciated!

Thanks,
Mark

[1] http://gcc.gnu.org/ml/gcc-patches/2005-08/msg01068.html

[2] http://gcc.gnu.org/ml/gcc-patches/2005-08/msg01194.html

[3] sysdeps/generic/backtrace.c in a glibc distribution.
