Hi,

I'd like to gather some opinions and advice on the expansion of __builtin_frame_address, as discussed on gcc-patches last year [1, 2]. This centres on the following comment in expand_builtin_return_addr, arising from revision 103294 last year:

  /* For a zero count, we don't care what frame address we return, so frame
     pointer elimination is OK, and using the soft frame pointer is OK.
     For a non-zero count, we require a stable offset from the current frame
     pointer to the previous one, so we must use the hard frame pointer, and
     we must disable frame pointer elimination.  */

I believe that, when this function is used to expand __builtin_frame_address (), the first sentence isn't true: in some cases, a function does care about the value at count == 0. A concrete example is the glibc backtrace () function [3], which uses the expression __builtin_frame_address (0) to determine the starting point for stack traversal. (It performs subsequent dereferences back down the chain of frame pointers by itself.)

The wording of the comment in itself is unfortunately not the end of the issue. Due to the subsequent conditional testing count == 0, the builtin can expand to an erroneous value when the following conditions are met:

- the expansion function is invoked with count set to zero; and

- the target is such that frame_pointer_rtx and hard_frame_pointer_rtx do not ultimately yield the same address.

An example of such an invocation occurs during compilation of the aforementioned backtrace function, and an example of such a target is the ARM. Currently, calling backtrace () on such a target yields a failure.

Let us just consider how to fix the ARM case for a moment. The obvious thing to do is to define INITIAL_FRAME_ADDRESS_RTX. However, the correct semantics of such a macro definition would be to:

- set current_function_accesses_prior_frames to 1, so that FP is not eliminated in this function; and

- return hard_frame_pointer_rtx.

I hope I'm not alone in thinking that such a side-effecting macro would be in bad taste.

Let us come back to the more general case. As identified above, when expand_builtin_return_addr is used to expand __builtin_frame_address (), we do care about the count == 0 case.
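
To make the count == 0 dependency concrete, here is a minimal sketch of the kind of user code that relies on __builtin_frame_address (0) yielding a meaningful address. The helper names frame_of_callee and frames_are_distinct are hypothetical and are not taken from glibc or GCC; the sketch assumes a compiler that supports the builtin and the noinline attribute, and a target where the builtin expands correctly (which, as described above, is exactly what is in doubt on ARM).

```c
#include <stddef.h>

/* Hypothetical helper: return the frame address of this function.
   noinline keeps it in a frame of its own, separate from the caller's.  */
__attribute__ ((noinline)) void *
frame_of_callee (void)
{
  return __builtin_frame_address (0);
}

/* Hypothetical check: a caller and its (non-inlined) callee are expected
   to see different, non-null frame addresses.  A backtrace-style walker
   depends on this starting value being the real frame address.  */
__attribute__ ((noinline)) int
frames_are_distinct (void)
{
  void *mine = __builtin_frame_address (0);
  void *callee = frame_of_callee ();

  return mine != NULL && callee != NULL && mine != callee;
}
```

A caller would simply test the result of frames_are_distinct (). If the count == 0 expansion returned an address unrelated to the real frame, code of this shape would have no reliable starting point for walking the chain of frame pointers.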
I believe that the comment should be adjusted to reflect the importance of the count == 0 case, whatever other changes, if any, are made. As for the remaining problem, I suggest that we could:

(i) always return the hard frame pointer, and disable FP elimination in the current function; or

(ii) remove this logic entirely (but preserve the means to communicate any necessity to prevent FP elimination to reload) and insist that all targets define INITIAL_FRAME_ADDRESS_RTX. Any such solution should probably adjust INITIAL_FRAME_ADDRESS_RTX so that it doesn't have to cause a side-effect in order to communicate the information about the frame pointer.

Or...

(iii) ...the same as option (i), but allow targets to define another macro that will cause the default code to use the soft frame pointer rather than the hard frame pointer, and hence allow FP elimination. (If such a macro were set by a particular target, am I right in thinking that it would be safe to use the soft frame pointer even in the count >= 1 cases?)

Option (ii) is in some ways the most satisfactory, as it would provide more certainty (particularly if another target were to be added) that the code is doing the right thing. However, it involves a moderate amount of work, and I would assume that the current level of enthusiasm in this regard is approximately the same as last year :-)

Option (i), which is in all but name the "solution 5" approach [1] proposed last year, means that the "count == 0" case is elevated to the same level of importance as the "count > 0" cases, in line with the use in backtrace (). The problem with this is that on platforms where the soft and hard FPs coincide there is going to be a slight performance degradation, as identified previously, whenever these builtins are used.
Someone with more experience will have to enlighten me as to whether this penalty would actually be significant: my intuition tells me that in fact it would not be, though perhaps there are oft-used occurrences of these builtin expansions that I don't know about. (On how many platforms is it the case that the soft and hard FPs actually coincide? If there are platforms where they do _not_ coincide, then presumably those targets can be affected in the same way as I identify above for ARM? I'm confused because the current code seems to assume (in the count == 0 case) that the soft and hard FPs coincide, yet uses the hard rather than the soft FP for count >= 1.)

Option (iii) gives the advantage of a working default and removes the pessimization on a target-by-target basis, just when it is known to be safe. If I'm correct in thinking that the setting of the "soft frame pointer" macro would enable the soft frame pointer to be used no matter what the value of count, then this option actually permits more FP elimination than the present code.

I tend to think that option (iii) might be best, although perhaps it is overkill and option (i) would do. But I'm not entirely sure; still being a gcc novice, I have to admit to not being thoroughly clear on this myself at this stage. So any advice or comments would be appreciated!

Thanks,
Mark

[1] http://gcc.gnu.org/ml/gcc-patches/2005-08/msg01068.html
[2] http://gcc.gnu.org/ml/gcc-patches/2005-08/msg01194.html
[3] sysdeps/generic/backtrace.c in a glibc distribution.